HPC AI500: A Benchmark Suite for HPC AI Systems

 

HPC AI500 Ranking, Image Classification, Free Level, July 2, 2020




Rank | Source | VPFLOPS | Time (min) | Quality (Top-1 accuracy) | AI accelerator | Framework
1 | Fujitsu [1] | 31.41 | 1.2 | 75.1% | Tesla V100 * 2048 | MXNet
2 | Google [2] | 20.10 | 2.2 | 76.3% | TPU V3 * 1024 | TensorFlow
3 | Sony [3] | 10.02 | 3.7 | 75.0% | Tesla V100 * 2176 | NNL
4 | Tencent [4] | 6.44 | 6.6 | 76.0% | Tesla P40 * 2048 | TensorFlow
5 | Preferred Networks [5] | 2.41 | 15 | 74.9% | Tesla P100 * 1024 | Chainer
6 | Berkeley [6] | 1.95 | 20 | 75.4% | KNL * 2048 | Intel Caffe
7 | Intel [7] | 1.27 | 28 | 74.6% | KNL * 1536 | Intel Caffe
8 | IBM [8] | 0.75 | 50 | 75.0% | Tesla P100 * 256 | Caffe
9 | Facebook [9] | 0.70 | 60 | 76.3% | Tesla P100 * 1024 | Caffe2

The data (unverified) were collected from the original papers and technical reports.
Submission Contact: Please contact benchcouncil@gmail.com to submit a new benchmarking result.

Top 3 HPC AI Systems

  • Rank 1: Fujitsu. Achieves 31.41 VPFLOPS (Valid PFLOPS) and finishes Image Classification in 1.2 minutes. The hardware consists of 2048 NVIDIA Tesla V100 GPUs. They propose a novel communication algorithm that optimally schedules grouped layers and implement a CUDA kernel dedicated to calculating norms in parallel. They also leverage the Tensor Cores of the Tesla V100 through mixed-precision training.
  • Rank 2: Google. Achieves 20.10 VPFLOPS and finishes Image Classification in 2.2 minutes. The hardware consists of 1024 TPU V3 chips. They propose a 2D-mesh all-reduce for highly efficient communication and implement batch normalization in a distributed manner. They leverage BFLOAT16, the TPU-specific precision format, for mixed-precision training.
  • Rank 3: Sony. Achieves 10.02 VPFLOPS and finishes Image Classification in 3.7 minutes. The hardware consists of 2176 NVIDIA Tesla V100 GPUs. They propose a 2D-torus all-reduce for highly efficient communication (the reduction pattern is sketched after this list) and eliminate the moving average in batch normalization. They also leverage the Tensor Cores of the Tesla V100 through mixed-precision training.
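The 2D-mesh and 2D-torus all-reduce schemes mentioned above are hierarchical: gradients are reduce-scattered along one dimension of the process grid, all-reduced along the other, and then all-gathered back. The sketch below illustrates that pattern with mpi4py and NumPy. The function name and the rows/cols arguments are assumptions for illustration only; the actual submissions rely on NCCL or TPU-specific collectives rather than plain MPI.

```python
# Illustrative sketch only (assumption: mpi4py + NumPy stand in for the
# NCCL/TPU collectives actually used by the submissions). It shows the
# hierarchical 2D-torus all-reduce pattern: reduce-scatter along rows,
# all-reduce along columns, all-gather along rows.
import numpy as np
from mpi4py import MPI

def torus_allreduce_2d(grad, rows, cols):
    """Sum-reduce a 1-D float32 gradient buffer over a rows x cols process grid.

    Requires world size == rows * cols and grad.size divisible by cols.
    """
    world = MPI.COMM_WORLD
    assert world.Get_size() == rows * cols
    rank = world.Get_rank()
    row_comm = world.Split(rank // cols, rank % cols)  # processes in the same row
    col_comm = world.Split(rank % cols, rank // cols)  # processes in the same column

    chunk = np.empty(grad.size // cols, dtype=grad.dtype)
    # 1) reduce-scatter within the row: each rank ends up with one reduced chunk
    row_comm.Reduce_scatter_block(grad, chunk, op=MPI.SUM)
    # 2) all-reduce that chunk across the column
    col_comm.Allreduce(MPI.IN_PLACE, chunk, op=MPI.SUM)
    # 3) all-gather within the row to rebuild the fully reduced gradient
    row_comm.Allgather(chunk, grad)
    return grad
```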

Benchmarks

HPC AI500 Benchmarks

Problem Domain | Dataset | Target Quality | Epochs
Image Classification | ImageNet | Top-1 Accuracy = 0.763 | 90
Extreme Weather Analytics | The Extreme Weather Dataset | mAP@[IoU=0.5] = 0.35 | 50

Metrics

The primary metric is Valid FLOPS (VFLOPS), which is calculated by the following equation:

VFLOPS = FLOPS * (achieved quality / target quality)^n

Achieved quality is the actual model quality reached in the evaluation; target quality is the state-of-the-art model quality predefined in the HPC AI500 benchmark. n is a positive integer indicating the sensitivity to model quality. For Image Classification, the target quality is Top-1 accuracy = 0.763 and n is 5 by default. For Extreme Weather Analytics, the target quality is mAP@[IoU=0.5] = 0.35 and n is 10 by default.
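As a quick illustration of the formula, the short script below computes VFLOPS for both benchmarks. The measured FLOPS and achieved-quality values in it are made-up placeholders, not results from the ranking above.

```python
# Minimal sketch of the VFLOPS metric defined above.
# The measured_flops and achieved_quality values are placeholders.
def vflops(measured_flops, achieved_quality, target_quality, n):
    """VFLOPS = FLOPS * (achieved quality / target quality) ** n"""
    return measured_flops * (achieved_quality / target_quality) ** n

# Image Classification: target Top-1 accuracy 0.763, n = 5 by default
print(vflops(30.0e15, achieved_quality=0.751, target_quality=0.763, n=5))
# Extreme Weather Analytics: target mAP@[IoU=0.5] 0.35, n = 10 by default
print(vflops(5.0e15, achieved_quality=0.36, target_quality=0.35, n=10))
```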

Methodology

As shown in Figure 2, the HPC AI500 benchmarking methodology provides three benchmarking levels: the hardware level, the system level, and the free level.

  • Hardware level: users can change Layers 1 to 4. For the other layers, benchmark users can only change parallel modes in Layer 6 or tune learning-rate policies and batch-size settings in Layer 8.
  • System level: in addition to the changes allowed at the hardware level, users are allowed to re-implement the algorithms on a different or even customized AI framework (Layer 5).
  • Free level: users can change any layer from Layer 1 to Layer 8 while keeping Layer 9 intact. The dataset, target quality, and training epochs are defined in Layer 9, while the other layers are open for optimization. (A configuration-style summary of the three levels is sketched after Figure 2.)

Figure 2: HPC AI500 V2.0 Methodology.
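To make the three levels easier to compare, here is a rough configuration-style summary of which layers are open at each level. The dictionary itself is an assumption made for readability, not part of the HPC AI500 specification.

```python
# Assumed, readability-only summary of the three benchmarking levels.
# Layer numbering follows the text; Layer 9 (dataset, target quality, epochs)
# stays fixed at every level.
TUNABLE_LAYERS = {
    # Layers 1-4 freely; only the parallel mode in Layer 6 and the
    # learning-rate/batch-size settings in Layer 8 may be changed otherwise
    "hardware": {1, 2, 3, 4, 6, 8},
    # hardware level plus re-implementation on a different or customized
    # AI framework (Layer 5)
    "system": {1, 2, 3, 4, 5, 6, 8},
    # Layers 1-8 fully open
    "free": set(range(1, 9)),
}
```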

References

1. M. Yamazaki, A. Kasagi, A. Tabuchi, T. Honda, M. Miwa, N. Fukumoto, T. Tabaru, A. Ike, and K. Nakashima, "Yet another accelerated SGD: ResNet-50 training on ImageNet in 74.7 seconds," arXiv preprint arXiv:1903.12650, 2019.
2. C. Ying, S. Kumar, D. Chen, T. Wang, and Y. Cheng, "Image classification at supercomputer scale," arXiv preprint arXiv:1811.06992, 2018.
3. Y. Tanaka and Y. Kageyama, "ImageNet/ResNet-50 training in 224 seconds."
4. X. Jia, S. Song, W. He, Y. Wang, H. Rong, F. Zhou, L. Xie, Z. Guo, Y. Yang, L. Yu, T. Chen, G. Hu, S. Shi, and X. Chu, "Highly scalable deep learning training system with mixed-precision: Training ImageNet in four minutes," arXiv preprint arXiv:1807.11205, 2018.
5. T. Akiba, S. Suzuki, and K. Fukuda, "Extremely large minibatch SGD: Training ResNet-50 on ImageNet in 15 minutes," arXiv preprint arXiv:1711.04325, 2017.
6. Y. You, Z. Zhang, C.-J. Hsieh, J. Demmel, and K. Keutzer, "ImageNet training in minutes," in Proceedings of the 47th International Conference on Parallel Processing (ICPP 2018), New York, NY, USA: Association for Computing Machinery, 2018.
7. V. Codreanu, D. Podareanu, and V. Saletore, "Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train," arXiv preprint arXiv:1711.04291, 2017.
8. M. Cho, U. Finkler, S. Kumar, D. Kung, V. Saxena, and D. Sreedhar, "PowerAI DDL," arXiv preprint arXiv:1708.02188, 2017.
9. P. Goyal, P. Dollár, R. Girshick, P. Noordhuis, L. Wesolowski, A. Kyrola, A. Tulloch, Y. Jia, and K. He, "Accurate, large minibatch SGD: Training ImageNet in 1 hour," arXiv preprint arXiv:1706.02677, 2017.