Summary

Advances in AI have brought breakthroughs in processing images, video, speech, and audio, and have thereby boosted industry-scale deployments of massive AI algorithms, systems, and architectures. Benchmarks accelerate this process because they provide not only the design inputs but also the evaluation methodology and metrics.

The Summary of AI Inference Tasks

To cover a wide spectrum of AI tasks, we thoroughly analyze the core scenarios of three primary Internet services (search engines, social networks, and e-commerce), as shown in Table 1. In total, we identify the following seventeen representative AI problem domains.

  • Image classification. This task extracts thematic classes from input data such as an image or a text file. It is a supervised learning problem: a set of target classes is defined and a model is trained to recognize them. It is a typical task in Internet services and other application domains, and is widely used in scenarios such as category prediction and spam detection.
  • Image generation. This task poses an unsupervised learning problem: mimic the distribution of the data and generate images. A typical scenario is image resolution enhancement, which generates high-resolution images.
  • Text-to-Text translation. This task translates text from one language to another, which is an important field of computational linguistics. It can be used to translate search queries intelligently and to translate dialogue.
  • Image-to-Text. This task generates the description of an image automatically. It can be used to generate image captions and to recognize characters within an image (optical character recognition).
  • Image-to-Image. This task converts an image from one representation to another. It can be used to synthesize images of a face at different ages and to simulate virtual makeup. Face aging can help search for facial images across different age stages.
  • Speech recognition. This task recognizes spoken language and transcribes it into text. It benefits voice search and voice dialogue translation.
  • Face embedding. This task transforms a facial image into a vector in an embedding space. Typical scenarios are facial similarity analysis and face recognition.
  • 3D face recognition. This task recognizes 3D facial information from multiple images taken from different angles. It mainly focuses on three-dimensional images and benefits facial similarity and facial authentication scenarios.
  • Object detection. This task detects the objects within an image. Typical scenarios are vertical search, such as content-based image retrieval, and video object detection.
  • Recommendation. This task provides recommendations. It is widely used for advertisement recommendation, community recommendation, and product recommendation.
  • Video prediction. This task predicts future video frames by predicting the transformations of previous frames. Typical scenarios are video compression and video encoding, for efficient video storage and transmission.
  • Image compression. This task compresses images and reduces redundancy. It is important for Internet services in terms of data storage overhead and data transmission efficiency.
  • 3D object reconstruction. This task predicts and reconstructs 3D objects. Typical scenarios are map search, light field rendering, and virtual reality.
  • Text summarization. This task generates a summary of a text, which is important for search result previews, headline generation, and keyword discovery.
  • Spatial transformer. This task performs spatial transformations. A typical scenario is spatially invariant image retrieval, so that an image can be retrieved even if it is extremely stretched.
  • Learning to rank. This task learns the attributes of searched content and ranks the results by score, which is key for search services.
  • Neural architecture search. This task is to automatically design neural networks.
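
As a rough illustration of the face embedding scenario in the list above, the sketch below compares embedding vectors with cosine similarity, one common way to do facial similarity analysis. It is not taken from AIBench: the function names, the 4-dimensional vectors, and the 0.8 threshold are all assumptions chosen for illustration, and a real model would produce much higher-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_identity(a, b, threshold=0.8):
    """Treat two embeddings as the same face if they are similar enough.

    The threshold is a hypothetical value; in practice it would be tuned
    on validation data for the specific embedding model.
    """
    return cosine_similarity(a, b) >= threshold


# Hypothetical embeddings standing in for the output of a face embedding model.
anchor = [0.9, 0.1, 0.3, 0.2]
same_person = [0.85, 0.15, 0.35, 0.18]
other_person = [0.1, 0.9, 0.2, 0.7]

if __name__ == "__main__":
    print(same_identity(anchor, same_person))
    print(same_identity(anchor, other_person))
```

Because the comparison depends only on the angle between vectors, it is insensitive to the overall scale of the embeddings; face recognition can then be framed as finding the stored embedding most similar to a query.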

Contributors

Prof. Jianfeng Zhan, ICT, Chinese Academy of Sciences, and BenchCouncil    
Dr. Wanling Gao, ICT, Chinese Academy of Sciences    
Fei Tang, ICT, Chinese Academy of Sciences    
Dr. Lei Wang, ICT, Chinese Academy of Sciences    
Xu Wen, ICT, Chinese Academy of Sciences    
Chuanxin Lan, ICT, Chinese Academy of Sciences    
Chunjie Luo, ICT, Chinese Academy of Sciences
Yunyou Huang, ICT, Chinese Academy of Sciences
Dr. Chen Zheng, ICT, Chinese Academy of Sciences, and BenchCouncil    
Dr. Zheng Cao, Alibaba     
Hainan Ye, Beijing Academy of Frontier Sciences and BenchCouncil     
Jiahui Dai, Beijing Academy of Frontier Sciences and BenchCouncil     
Daoyi Zheng, Baidu     
Haoning Tang, Tencent     
Kunlin Zhan, 58.com     
Biao Wang, NetEase     
Defei Kong, ByteDance     
Tong Wu, China National Institute of Metrology     
Minghe Yu, Zhihu     
Chongkang Tan, Lenovo     
Huan Li, PayPal     
Dr. Xinhui Tian, Moqi     
Yatao Li, Microsoft Research Asia     
Dr. Gang Lu, Huawei     
Junchao Shao, JD.com     
Zhenyu Wang, CloudTa     
Xiaoyu Wang, Intellifusion     

Ranking

AIBench results are released.

License

AIBench is available to researchers interested in AI. AIBench is open-source under the Apache License, Version 2.0; please use all files in compliance with the License. The software components of AIBench are all available as open-source software and are governed by their own licensing terms. Researchers intending to use AIBench are required to fully understand and abide by the licensing terms of the various components.

Software developed externally (not by the AIBench group)

  • Redistribution of source code must comply with the license and notice disclaimers.
  • Redistribution in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimers in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE ICT CHINESE ACADEMY OF SCIENCES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.