Tutorial: BenchCouncil AIBench

---An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite

News: All the AIBench slide presentations and hands-on tutorial videos are publicly available from Tutorial_Link. A separate link for each talk is also provided in the Schedule section. Additionally, the videos show how to use our benchmarks on a publicly available testbed.


The International Open Benchmark Council (BenchCouncil) is a non-profit international benchmark organization that aims to promote the standardization, benchmarking, evaluation, incubation, and promotion of open-source chip, AI, and big data technologies. This tutorial presents AIBench. We are glad to introduce the following topics:
(1) The challenges of characterizing modern workloads.
(2) What is an end-to-end benchmark?
(3) The motivation for agile domain-specific benchmarking methodology.
(4) What is agile domain-specific benchmarking methodology.
(5) Seventeen Industry Partners’ Benchmarking Requirements.
(6) Ten end-to-end application scenarios distilled from the industry-scale applications.
(7) The 16 Representative AI Tasks of AIBench.
(8) The AIBench Micro Benchmarks.
(9) The AIBench Framework.
(10) The Design and Implementation of an end-to-end benchmark---E-commerce Search Intelligence.
(11) The guideline for building other end-to-end benchmarks.
(12) End-to-end Benchmarking is Necessary for Online Services.
(13) Can a Statistical Model Predict the End-to-end Tail latency?
(14) Tradeoff among Service Quality, Model Accuracy, and Model Complexity.
(15) Tradeoff among Model Update Interval, Accuracy Improvement, and Training Overhead Using Offline Trainer.
(16) Why Does the Diversity of AI Tasks Matter for Benchmarking?
(17) Drill Down to Functional-level Code.
(18) AI Benchmark Comparison.

Location and Date

Because of the 2020 coronavirus outbreak, we cannot come to Lausanne to give the presentations in person; we are very sorry. Instead, we will present using a remote conferencing system named Xinxiu. In addition, we will upload videos showing how to use our benchmarks on a publicly available testbed. If you have questions or comments, please sign up for the Xinxiu system, or email us at zhanjianfeng@ict.ac.cn or gaowanling@ict.ac.cn.

Organizers and Presenters

Organizer: Dr. Jianfeng Zhan ICT, Chinese Academy of Sciences, and University of Chinese Academy of Sciences
Presenter: Dr. Jianfeng Zhan ICT, Chinese Academy of Sciences, and University of Chinese Academy of Sciences
Presenter: Chunjie Luo ICT, Chinese Academy of Sciences, and University of Chinese Academy of Sciences
Presenter: Dr. Wanling Gao ICT, Chinese Academy of Sciences, and University of Chinese Academy of Sciences
Presenter: Tianshu Hao ICT, Chinese Academy of Sciences, and University of Chinese Academy of Sciences
Presenter: Zihan Jiang ICT, Chinese Academy of Sciences, and University of Chinese Academy of Sciences
Presenter: Fei Tang ICT, Chinese Academy of Sciences, and University of Chinese Academy of Sciences

Abstract

Domain-specific software and hardware co-design is promising, as it is much easier to achieve efficiency for a smaller set of tasks. Agile domain-specific benchmarking speeds up the co-design process, as it provides not only relevant design inputs but also relevant metrics and tools. Unfortunately, modern workloads like big data, AI, and Internet services dwarf traditional workloads in terms of code size, deployment scale, and execution path, and hence raise serious benchmarking challenges.
AIBench proposes an agile domain-specific benchmarking methodology. Together with seventeen industry partners, we identify ten important end-to-end application scenarios, from which sixteen representative AI tasks are distilled as the AI component benchmarks. We propose permutations of essential AI and non-AI component benchmarks as end-to-end benchmarks. An end-to-end benchmark is a distillation of the essential attributes of an industry-scale application. We design and implement a highly extensible, configurable, and flexible benchmark framework, on the basis of which we propose a guideline for building end-to-end benchmarks and present the first end-to-end Internet service AI benchmark.
The preliminary evaluation shows the value of our benchmark suite, AIBench, against MLPerf and TailBench for hardware and software designers, micro-architectural researchers, and code developers. The specifications, source code, testbed, and results are publicly available from the website http://www.benchcouncil.org/AIBench/index.html.
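The "permutation" idea above can be illustrated with a small sketch: an end-to-end scenario is modeled as an ordered pipeline of AI and non-AI component stages, and one request flows through every stage in turn. All names in this sketch (`Stage`, `run_pipeline`, the stage functions) are hypothetical illustrations, not AIBench's actual API.

```python
# Hypothetical sketch of composing AI and non-AI component benchmarks into
# an end-to-end benchmark pipeline. Illustrative only; not AIBench code.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str
    is_ai: bool                       # AI component vs. non-AI component
    run: Callable[[object], object]   # consumes and produces the request payload


def run_pipeline(stages: List[Stage], request: object) -> object:
    """Push one request through every stage in order."""
    for stage in stages:
        request = stage.run(request)
    return request


# An e-commerce-search-like permutation: non-AI query parsing, then an AI
# ranking stage (cf. DC-AI-C16: Learning to rank), then non-AI rendering.
pipeline = [
    Stage("parse_query", is_ai=False, run=lambda q: q.split()),
    Stage("rank_items", is_ai=True, run=lambda terms: sorted(terms)),
    Stage("render", is_ai=False, run=lambda items: ", ".join(items)),
]

print(run_pipeline(pipeline, "shoes red running"))  # prints "red, running, shoes"
```

Swapping, adding, or removing stages yields different permutations, which is how different end-to-end scenarios reuse the same component benchmarks.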

Schedule

Time Agenda Presenter Resources
08:35-08:40 Opening Remark Wanling Gao [Talk]
08:40-08:55 The challenges of characterizing modern workloads Wanling Gao [Talk]
08:55-09:10 What is an end-to-end benchmark? Wanling Gao [Talk]
09:10-09:35 The motivation for agile domain-specific benchmarking methodology Wanling Gao [Talk]
09:35-09:50 What is agile domain-specific benchmarking methodology Wanling Gao [Talk]
09:50-10:05 Seventeen Industry Partners’ Benchmarking Requirements Fei Tang [Talk]
10:05-10:35 Coffee break
10:35-11:00 Ten end-to-end application scenarios distilled from the industry-scale applications Fei Tang [Talk]
11:00-12:00 16 Representative AI Tasks of AIBench
DC-AI-C1: Image Classification Fanda Fan [Talk]
DC-AI-C2: Image generation Fanda Fan [Talk]
DC-AI-C3: Text-to-Text translation Jianan Chen [Talk]
DC-AI-C4: Image-to-Text Jianan Chen [Talk]
DC-AI-C5: Image-to-Image Fanda Fan [Talk]
DC-AI-C6: Speech recognition Jianan Chen [Talk]
DC-AI-C7: Face embedding Xingwang Xiong [Talk]
DC-AI-C8: 3D Face Recognition Xingwang Xiong [Talk]
12:00-14:00 Lunch
14:00-15:00 DC-AI-C9: Object detection Xingwang Xiong [Talk]
DC-AI-C10: Recommendation Xu Wen [Talk]
DC-AI-C11: Video prediction Xu Wen [Talk]
DC-AI-C12: Image compression Xu Wen [Talk]
DC-AI-C13: 3D object reconstruction Chunjie Luo [Talk]
DC-AI-C14: Text summarization Chunjie Luo [Talk]
DC-AI-C15: Spatial transformer Chunjie Luo [Talk]
DC-AI-C16: Learning to rank Chunjie Luo [Talk]
15:00-15:15 The AIBench Micro Benchmarks Rui Ren [Talk]
15:15-15:30 The AIBench Framework Tianshu Hao [Talk]
15:30-15:45 The Design and Implementation of an end-to-end benchmark---E-commerce Search Intelligence Tianshu Hao [Talk]
15:45-16:00 The guideline for building other end-to-end benchmarks Tianshu Hao [Talk]
16:00-16:15 End-to-end Benchmarking is Necessary for Online Services Fei Tang [Talk]
16:15-16:30 Can a Statistical Model Predict the End-to-end Tail latency? Lei Wang [Talk]
16:30-16:45 Tradeoff among Service Quality, Model Accuracy, and Model Complexity Rui Ren [Talk]
16:45-17:00 Tradeoff among Model Update Interval, Accuracy Improvement, and Training Overhead Using Offline Trainer Zihan Jiang [Talk]
17:00-17:15 Why Does the Diversity of AI Tasks Matter for Benchmarking? Zihan Jiang [Talk]
17:15-17:30 Drill Down to Functional-level Code Zihan Jiang [Talk]
17:30-17:40 AI Benchmark Comparison Wanling Gao [Talk]
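Several of the talks above concern end-to-end tail latency. As a minimal illustration of the metric itself, the sketch below computes nearest-rank percentiles (e.g., p50 and p99) from a set of request latencies; the function and the sample data are illustrative, not AIBench code.

```python
# Minimal illustration of tail latency: the p99 latency is the value that
# 99% of requests complete within. Nearest-rank method; illustrative only.
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample >= p% of all samples."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 13, 14]  # one straggler
print("p50 =", percentile(latencies_ms, 50), "ms")  # p50 = 13 ms
print("p99 =", percentile(latencies_ms, 99), "ms")  # p99 = 250 ms
```

The gap between the median (13 ms) and the tail (250 ms) shows why end-to-end benchmarking of online services reports tail latency rather than averages: a single slow stage in the pipeline dominates the user-visible worst case.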

Publications

AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite. [PDF]
Wanling Gao, Fei Tang, Jianfeng Zhan, Chuanxin Lan, Chunjie Luo, Lei Wang, Jiahui Dai, Zheng Cao, Xingwang Xiong, Zihan Jiang, Tianshu Hao, Fanda Fan, Xu Wen, Fan Zhang, Yunyou Huang, Jianan Chen, Mengjia Du, Rui Ren, Chen Zheng, Daoyi Zheng, Haoning Tang, Kunlin Zhan, Biao Wang, Defei Kong, Minghe Yu, Chongkang Tan, Huan Li, Xinhui Tian, Yatao Li, Gang Lu, Junchao Shao, Zhenyu Wang, Xiaoyu Wang, Hainan Ye. Technical Report, 2020.

BenchCouncil’s View On Benchmarking AI and Other Emerging Workloads. [PDF]
Jianfeng Zhan, Lei Wang, Wanling Gao, and Rui Ren. Technical Report, 2019.

AIBench: An Industry Standard Internet Service AI Benchmark Suite. [PDF]
Wanling Gao, Fei Tang, Lei Wang, Jianfeng Zhan, Chunxin Lan, Chunjie Luo, Yunyou Huang, Chen Zheng, Jiahui Dai, Zheng Cao, Daoyi Zheng, Haoning Tang, Kunlin Zhan, Biao Wang, Defei Kong, Tong Wu, Minghe Yu, Chongkang Tan, Huan Li, Xinhui Tian, Yatao Li, Gang Lu, Junchao Shao, Zhenyu Wang, Xiaoyu Wang, and Hainan Ye. Technical Report, 2019.

AIBench: Towards Scalable and Comprehensive Datacenter AI Benchmarking. [PDF]
Wanling Gao, Chunjie Luo, Lei Wang, Xingwang Xiong, Jianan Chen, Tianshu Hao, Zihan Jiang, Fanda Fan, Mengjia Du, Yunyou Huang, Fan Zhang, Xu Wen, Chen Zheng, Xiwen He, Jiahui Dai, Hainan Ye, Zheng Cao, Zhen Jia, Kent Zhan, Haoning Tang, Daoyi Zheng, Biwei Xie, Wei Li, Xiaoyu Wang, and Jianfeng Zhan. 2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench18).

HPC AI500: A Benchmark Suite for HPC AI Systems. [PDF]
Zihan Jiang, Wanling Gao, Lei Wang, Xingwang Xiong, Yuchen Zhang, Xu Wen, Chunjie Luo, Hainan Ye, Xiaoyi Lu, Yunquan Zhang, Shengzhong Feng, Kenli Li, Weijia Xu, and Jianfeng Zhan. 2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench18).

AIoTBench: Towards Comprehensive Benchmarking Mobile and Embedded device Intelligence. [PDF]
Chunjie Luo, Fan Zhang, Cheng Huang, Xingwang Xiong, Jianan Chen, Lei Wang, Wanling Gao, Hainan Ye, Tong Wu, Runsong Zhou, and Jianfeng Zhan. 2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench18).

Edge AIBench: Towards Comprehensive End-to-end Edge Computing Benchmarking. [PDF]
Tianshu Hao, Yunyou Huang, Xu Wen, Wanling Gao, Fan Zhang, Chen Zheng, Lei Wang, Hainan Ye, Kai Hwang, Zujie Ren, and Jianfeng Zhan. 2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench18).

DCMix: Generating mixed workloads for the cloud data center. [PDF]
Xingwang Xiong, Lei Wang, Wanling Gao, Rui Ren, Ke Liu, Chen Zheng, Yu Wen, and Yi Liang. 2018 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench18).

Data Motifs: A Lens Towards Fully Understanding Big Data and AI Workloads. [PDF]
Wanling Gao, Jianfeng Zhan, Lei Wang, Chunjie Luo, Daoyi Zheng, Fei Tang, Biwei Xie, Chen Zheng, Xu Wen, Xiwen He, Hainan Ye and Rui Ren. The 27th International Conference on Parallel Architectures and Compilation Techniques (PACT 2018).

BigDataBench: a Big Data Benchmark Suite from Internet Services. [PDF]
Lei Wang, Jianfeng Zhan, Chunjie Luo, Yuqing Zhu, Qiang Yang, Yongqiang He, Wanling Gao, Zhen Jia, Yingjie Shi, Shujie Zhang, Cheng Zhen, Gang Lu, Kent Zhan, Xiaona Li, and Bizhu Qiu. The 20th IEEE International Symposium On High Performance Computer Architecture (HPCA-2014), February 15-19, 2014, Orlando, Florida, USA.

Data Motif-based Proxy Benchmarks for Big Data and AI Workloads. [PDF]
Wanling Gao, Jianfeng Zhan, Lei Wang, Chunjie Luo, Zhen Jia, Daoyi Zheng, Chen Zheng, Xiwen He, Hainan Ye, Haibin Wang, and Rui Ren. 2018 IEEE International Symposium on Workload Characterization (IISWC 2018).

BigDataBench: a Scalable and Unified Big Data and AI Benchmark Suite. [PDF]
Wanling Gao, Jianfeng Zhan, Lei Wang, Chunjie Luo, Daoyi Zheng, Rui Ren, Chen Zheng, Gang Lu, Jingwei Li, Zheng Cao, Shujie Zhang, and Haoning Tang. Technical Report, arXiv preprint arXiv:1802.08254, January 27, 2018.

BOPS, Not FLOPS! A New Metric and Roofline Performance Model For Datacenter Computing. [PDF]
Lei Wang, Jianfeng Zhan, Wanling Gao, Zihan Jiang, Rui Ren, Xiwen He, Chunjie Luo, Gang Lu, Jingwei Li. Technical Report, arXiv preprint arXiv:1801.09212, May 3, 2018.

Understanding Big Data Analytics Workloads on Modern Processors. [PDF]
Zhen Jia, Jianfeng Zhan, Lei Wang, Chunjie Luo, Wanling Gao, Yi Jin, Rui Han and Lixin Zhang. IEEE Transactions on Parallel and Distributed Systems, 28(6), 1797-1810, 2017.

Understanding Processors Design Decisions for Data Analytics in Homogeneous Data Centers. [PDF]
Zhen Jia, Wanling Gao, Yingjie Shi, Sally A. McKee, Jianfeng Zhan, Lei Wang, Lixin Zhang. IEEE Transactions on Big Data, 2017.

A Dwarf-based Scalable Big Data Benchmarking Methodology. [PDF]
Wanling Gao, Lei Wang, Jianfeng Zhan, Chunjie Luo, Daoyi Zheng, Zhen Jia, Biwei Xie, Chen Zheng, Qiang Yang, and Haibin Wang. arXiv preprint arXiv: 1711.03229

Characterizing data analysis workloads in data centers. [PDF]
Zhen Jia, Lei Wang, Jianfeng Zhan, Lixin Zhang, Chunjie Luo. 2013 IEEE International Symposium on Workload Characterization (IISWC 2013) (Best paper award).

Characterizing and Subsetting Big Data Workloads. [PDF]
Zhen Jia, Lei Wang, Jianfeng Zhan, Lixin Zhang, Chunjie Luo, Ninghui Sun. 2014 IEEE International Symposium on Workload Characterization (IISWC 2014).

Identifying Dwarfs Workloads in Big Data Analytics. [PDF]
W. Gao, C. Luo, J. Zhan, H. Ye, X. He, L. Wang, Y. Zhu, X. Tian. arXiv preprint arXiv:1505.06872.

BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking. [PDF]
Zijian Ming, Chunjie Luo, Wanling Gao, Rui Han, Qiang Yang, Lei Wang, and Jianfeng Zhan. In Advancing Big Data Benchmarks (pp. 138-154). Springer International Publishing.

Biographies

Jianfeng Zhan
Jianfeng Zhan is a Full Professor at the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. He has supervised over 80 graduate students (both MS and Ph.D.), post-docs, and engineers. His research interests cover a wide spectrum of high-performance and distributed systems. He has made strong and effective efforts to transfer his academic research into advanced technology that impacts general-purpose production systems. Currently, he leads the research efforts on modern datacenter software stacks, including BigDataBench, an open-source big data and AI benchmark suite, and RainForest, an operating system for datacenter computing. Since its publication at HPCA 2014, BigDataBench has been widely used in both academia and industry worldwide. He has transferred more than 40 OS and distributed-system patents to top companies. He founded BenchCouncil, a multidisciplinary international benchmark council, and serves as an associate editor of TPDS. More details about Prof. Zhan are available at http://www.benchcouncil.org/zjf.html

Wanling Gao
Wanling Gao is an Assistant Professor in computer science at the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. Her research interests focus on big data benchmarking and big data analytics. She received her B.S. degree in 2012 from Huazhong University of Science and Technology, and her Ph.D. degree in 2019 from the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences, China.

Tianshu Hao
Tianshu Hao received the B.S. degree from Nankai University, Tianjin, China, in 2015. She is currently pursuing a Ph.D. degree at ICT, CAS. Her research interests focus on big data, edge computing, IoT, and AI benchmarking.

Zihan Jiang
Zihan Jiang is a doctoral student in computer architecture at the Institute of Computing Technology, Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. His research interests include AI benchmarking and distributed deep learning. Currently, he works on the HPC AI500 project.

Fei Tang
Fei Tang received the B.S. degree from Zhengzhou University, Zhengzhou, China, in 2016. He is currently pursuing a Ph.D. degree at ICT, CAS. His research interests focus on big data, benchmarking, and search engines.