A multi-disciplinary research and engineering effort, spanning systems, architecture, data management, and machine learning, and drawing on both industry and academia, the BenchCouncil AI Benchmarks form a comprehensive AI benchmark suite for HPC, datacenter, edge, and IoT. BenchCouncil BigDataBench is a comprehensive big data benchmark suite.
BigDataBench provides 13 representative real-world data sets and 25 benchmarks. The benchmarks cover six workload types, including online services, offline analytics, graph analytics, data warehouse, NoSQL, and streaming, from three important application domains: Internet services (including search engines, social networks, and e-commerce), recognition sciences, and medical sciences. Our benchmark suite includes micro benchmarks, each of which is a single data motif; component benchmarks, which consist of combinations of data motifs; and end-to-end application benchmarks, which are combinations of component benchmarks. Meanwhile, data sets have a great impact on workload behavior and performance (CGO'18). Hence, we consider data varieties across the whole spectrum of data types, including structured, semi-structured, and unstructured data. Currently, the included data sources are text, graph, table, and image data. Using real data sets as seeds, the data generator, BDGS, generates synthetic data by scaling the seed data while preserving the characteristics of the raw data.
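The seed-scaling idea behind BDGS can be illustrated with a minimal sketch. The function below is hypothetical, not BDGS's actual algorithm: it scales a seed text corpus by sampling words from the seed's empirical word-frequency distribution, one simple example of a "data characteristic" that synthetic data can preserve while growing in volume.

```python
import random
from collections import Counter

def generate_synthetic_text(seed_text, scale_factor, rng=None):
    """Generate a synthetic corpus scale_factor times larger than the seed
    by sampling words from the seed's empirical word-frequency distribution
    (a stand-in for the richer characteristics a generator like BDGS keeps)."""
    rng = rng or random.Random(42)  # fixed seed for reproducibility
    words = seed_text.split()
    freq = Counter(words)
    vocab = list(freq.keys())
    weights = [freq[w] for w in vocab]  # sample proportionally to seed frequency
    target_len = len(words) * scale_factor
    return " ".join(rng.choices(vocab, weights=weights, k=target_len))

seed = "big data benchmarks need representative real world data sets data"
synthetic = generate_synthetic_text(seed, scale_factor=100)
print(len(synthetic.split()))  # 100x the seed word count
```

A real generator would also preserve higher-order structure (word co-occurrence, graph degree distributions, table column correlations), but the volume-up, characteristics-preserved pattern is the same.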
AIBench is the first industry-scale AI benchmark suite, developed jointly with seventeen industry partners. First, we present a highly extensible, configurable, and flexible benchmark framework containing multiple loosely coupled modules, such as data input, prominent AI problem domains, online inference, offline training, and automatic deployment tools. We analyze typical AI application scenarios from the three most important Internet service domains, and then abstract and identify sixteen prominent AI problem domains: classification, image generation, text-to-text translation, image-to-text, image-to-image, speech-to-text, face embedding, 3D face recognition, object detection, video prediction, image compression, recommendation, 3D object reconstruction, text summarization, spatial transformer, and learning to rank. We implement sixteen component benchmarks for these AI problem domains, and further profile and implement twelve fundamental units of computation, shared across the component benchmarks, as micro benchmarks.
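A micro benchmark of this kind isolates one fundamental unit of computation and times it in a tight harness. The sketch below is illustrative, not AIBench's implementation: it uses a naive pure-Python matrix multiply as the unit of computation and a best-of-N wall-clock timer as the harness.

```python
import time

def matmul(a, b):
    """Naive matrix multiply: the kind of fundamental unit of
    computation a micro benchmark isolates."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def micro_benchmark(fn, *args, repeats=5):
    """Run fn several times and return the best wall-clock time in
    seconds, reducing scheduling noise."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

a = [[1.0] * 32 for _ in range(32)]
b = [[2.0] * 32 for _ in range(32)]
elapsed = micro_benchmark(matmul, a, b)
print(f"matmul 32x32: {elapsed * 1e3:.3f} ms")
```

Profiling the component benchmarks to find which such units dominate runtime is what justifies promoting them to stand-alone micro benchmarks.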
HPC AI500 is a benchmark suite for evaluating HPC systems that run scientific deep learning (DL) workloads. Each workload in HPC AI500 is based on real scientific DL applications and covers the most representative scientific fields, namely climate analysis, cosmology, high energy physics, gravitational wave physics, and computational biology. Currently, we choose 18 scientific DL benchmarks (for details, see the Specification), characterized by their application scenarios, datasets, and software stacks. Furthermore, we propose a set of metrics for comprehensively evaluating HPC systems, considering accuracy and performance as well as power and cost. In addition, we provide a scalable reference implementation of HPC AI500.
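Combining performance with power and cost typically yields derived efficiency metrics. The formulas below are illustrative only; the exact metric definitions used by HPC AI500 are given in its Specification.

```python
def derived_metrics(images_per_sec, avg_power_watts, system_cost_usd):
    """Combine a raw throughput measurement with power and cost into
    efficiency metrics (illustrative formulas, not HPC AI500's
    official definitions)."""
    return {
        "perf_per_watt": images_per_sec / avg_power_watts,
        "perf_per_dollar": images_per_sec / system_cost_usd,
    }

# Hypothetical measurements for a single training run.
m = derived_metrics(images_per_sec=1200.0, avg_power_watts=400.0,
                    system_cost_usd=10000.0)
print(m)  # {'perf_per_watt': 3.0, 'perf_per_dollar': 0.12}
```

In practice, the throughput term is only meaningful alongside a target accuracy (e.g., time to reach a quality threshold), which is why accuracy appears as a first-class metric above.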
Edge AIBench is a benchmark suite for end-to-end edge computing, including four typical application scenarios: ICU patient monitor, surveillance camera, smart home, and autonomous vehicle, which together reflect the complexity of edge computing AI scenarios. In addition, Edge AIBench provides an end-to-end application benchmarking framework consisting of training, validation, inference, data collection, and other stages, built on a general three-layer edge computing framework. Table 1 shows the component benchmarks of Edge AIBench.
AIoTBench is a comprehensive benchmark suite for evaluating the AI capability of mobile and embedded devices. Our benchmark 1) covers different application domains, e.g., image recognition, speech recognition, and natural language processing; 2) covers different platforms, including Android devices and Raspberry Pi; 3) covers different development tools, including TensorFlow and Caffe2; and 4) offers both end-to-end application workloads and micro workloads.