AIBench: A Comprehensive AI Benchmark Suite for Datacenter, HPC, Edge, and AIoT


BenchCouncil AIBench

As a joint effort with a growing number of industry partners, AIBench is a comprehensive AI benchmark project focusing on methodology, frameworks, and continuous improvement. It systematically tackles five challenges of AI benchmarking: prohibitive cost; conflicting requirements in different stages; the short shelf-life and fast evolution of AI models; the scalability challenge due to the fixed problem scale; and the repeatability challenge due to the stochastic nature of AI.

Overall, AIBench distills and abstracts real-world application scenarios into AI Scenario, Training, Inference, Micro, and Synthetic Benchmarks across Datacenter, HPC, IoT, and Edge.

AIBench Scenario benchmarks are proxies for industry-scale real-world application scenarios. Each scenario benchmark models the critical paths of a real-world application scenario as a permutation of AI and non-AI modules.

  • Edge AIBench is an instance of the scenario benchmark suites, modeling end-to-end performance across IoT, edge, and datacenter. It covers four representative edge scenarios: ICU Patient Monitor, Surveillance Camera, Smart Home, and Autonomous Vehicle.
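To make the idea of a scenario as an ordered arrangement of AI and non-AI modules concrete, here is a minimal Python sketch. It is not AIBench code: the Module class, the run_scenario function, and the three toy stages are hypothetical names invented for illustration. It only shows how a critical path might be chained together and timed end to end, stage by stage.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Module:
    """One stage on a scenario's critical path (hypothetical type)."""
    name: str
    is_ai: bool                  # True for an AI stage, False for a non-AI stage
    run: Callable[[Any], Any]    # takes the previous stage's output


def run_scenario(modules: List[Module], payload: Any) -> Tuple[Any, List[Tuple[str, float]]]:
    """Run the modules in order and record per-stage wall-clock time."""
    timings = []
    for module in modules:
        start = time.perf_counter()
        payload = module.run(payload)
        timings.append((module.name, time.perf_counter() - start))
    return payload, timings


# Toy stand-ins for a surveillance-style pipeline (assumed, not from AIBench).
pipeline = [
    Module("decode_frame", False, lambda pixels: [p / 255.0 for p in pixels]),
    Module("detect_objects", True, lambda values: [v > 0.5 for v in values]),
    Module("raise_alert", False, lambda flags: sum(flags)),
]

result, timings = run_scenario(pipeline, [0, 128, 255])
end_to_end = sum(elapsed for _, elapsed in timings)
print(result, [name for name, _ in timings], end_to_end)
```

Reordering or swapping entries in the pipeline list gives a different permutation of the same modules, which is the sense in which one set of components can model several scenarios.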

Currently, AIBench Training and AIBench Inference cover nineteen representative AI tasks (this count will be updated) with state-of-the-art models to guarantee diversity and representativeness. AIBench Training Subset provides the RPR subset for repeatable performance ranking and the WC subset for workload characterization. We keep these two subsets to a minimum for affordability.

Based on the AIBench Training RPR subset, HPC AI500 aims to evaluate large-scale HPC AI systems. AIoTBench implements the AI inference benchmarks on various IoT and embedded devices, emphasizing diverse lightweight AI frameworks and models.

AIBench Micro provides the intensively used hotspot functions, profiled from the full AIBench benchmarks, for simulation-based architecture research.

AIBench Synthetic is complementary to real-world benchmarks, with scalable problem sizes to model learning dynamics. To achieve scalability, AIBench Synthetic supports auto-generation of diverse models from different combinations of building blocks and connections, which makes it possible to investigate the impact of varying model structures.
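As a rough illustration of generating models from combinations of building blocks and connections, the Python sketch below enumerates model descriptions exhaustively. The block names, connection styles, and the generate_models function are assumptions made for this example; the actual AIBench Synthetic generator is not described here and may work quite differently.

```python
from itertools import product
from typing import Dict, List

# Hypothetical vocabularies; the real building blocks are not listed in this overview.
BLOCKS = ("conv", "fc", "pool")
CONNECTIONS = ("sequential", "residual")


def generate_models(depth: int) -> List[Dict[str, object]]:
    """Enumerate every model spec with `depth` blocks and one connection style."""
    specs = []
    for blocks in product(BLOCKS, repeat=depth):
        for connection in CONNECTIONS:
            specs.append({"blocks": blocks, "connection": connection})
    return specs


specs = generate_models(2)
print(len(specs))   # 3**2 block sequences times 2 connection styles
print(specs[0])
```

The number of generated specs grows as len(BLOCKS) ** depth * len(CONNECTIONS), so depth acts as a simple knob for scaling the problem size in this sketch.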