Federated learning (FL) is a new machine learning paradigm whose goal is to build a machine learning model from datasets distributed across multiple devices (so-called Isolated Data Islands) while keeping the data secure and private. Most existing work manually splits commonly used public datasets into partitions to simulate real-world Isolated Data Islands, and thereby fails to capture the intrinsic characteristics of real-world domain data such as medicine, finance, or AIoT. To bridge this gap, this paper presents and characterizes an Isolated Data Island benchmark suite, named FLBench, for benchmarking federated learning algorithms. FLBench covers three domains: medical, financial, and AIoT. By configuring scenarios across these domains, FLBench can be used to evaluate the important research aspects of federated learning, and hence is a promising platform for developing novel federated learning algorithms. Finally, FLBench is fully open-sourced and evolving rapidly. We package it as an automated deployment tool.
The FLBench framework includes four modules.
Input Data: Most current FL research is carried out in simulated scenarios constructed from commonly used datasets such as CIFAR-10, because it is very difficult for researchers to access real 'Isolated Data Island' scenarios. However, commonly used datasets differ greatly from typical real 'Isolated Data Island' data in both data type and data mode, so FL algorithms developed on simulated data cannot be migrated to real, typical data island scenarios. To solve this issue, we collect data from the three data island scenarios of greatest concern: medicine, finance, and intelligent terminals. In addition, we provide a dedicated pre-processing suite for medical data, since medical data need special processing.
Scenario Configuration: To evaluate algorithms robustly and from multiple angles, we propose a scenario configuration function. First, we survey the innovations in current federated learning research and classify their directions into five categories: communication, scenarios transformation, privacy-preserving, data distribution heterogeneity, and cooperation strategy. Second, for each innovation direction in each domain, we provide a basic configuration that follows the native distribution of the data, and an API for modifying the configuration to simulate various scenarios as required.
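The idea of a basic configuration plus an API for modifying it can be sketched as follows. This is a minimal illustration only: the class ScenarioConfig, its fields, and the function customize are hypothetical names chosen for this example, not FLBench's actual configuration format or API.

```python
from dataclasses import dataclass, replace

# The five innovation directions listed in the text.
RESEARCH_ASPECTS = (
    "communication",
    "scenarios transformation",
    "privacy-preserving",
    "data distribution heterogeneity",
    "cooperation strategy",
)

# Hypothetical schema: the field names are illustrative assumptions.
@dataclass(frozen=True)
class ScenarioConfig:
    domain: str                # "medicine", "finance", or "AIoT"
    research_aspect: str       # one of RESEARCH_ASPECTS
    num_clients: int           # number of simulated data islands
    partition: str = "native"  # "native" keeps the data's own distribution

def customize(base: ScenarioConfig, **overrides) -> ScenarioConfig:
    """Return a modified copy of a basic configuration after validating it."""
    cfg = replace(base, **overrides)  # the base configuration is left unchanged
    if cfg.research_aspect not in RESEARCH_ASPECTS:
        raise ValueError(f"unknown research aspect: {cfg.research_aspect!r}")
    if cfg.num_clients < 1:
        raise ValueError("num_clients must be at least 1")
    return cfg

# A basic configuration and a customized variant derived from it.
basic = ScenarioConfig(
    domain="medicine",
    research_aspect="data distribution heterogeneity",
    num_clients=4,
)
custom = customize(basic, num_clients=10, partition="label_skew")
```

Keeping the basic configuration immutable and returning a modified copy means every algorithm can still be compared on the same unmodified baseline, while individual studies derive their own variants.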
Scenario: A benchmark has two functions: first, it provides an open and fair comparison; second, it gives later researchers a basis for developing more advanced algorithms and for selecting important parameters. To serve these functions, we construct two kinds of scenarios, consistent scenarios and customized scenarios, from the basic configuration. In addition, the most commonly used metrics are used to evaluate FL algorithms.
Automated Deployment Tool: FLBench will be updated step by step to adapt to future development needs. In addition, we will continue to expand the benchmark and provide more APIs, use cases, and benchmarking scenarios. We hope that more people will join our benchmark research and help make the benchmark more complete and comprehensive.
Currently, FLBench contains four datasets (medicine: ADNI, MIMIC-III; finance: Adult dataset; AIoT: iNaturalist-User-120k) and one basic configuration file (the Alzheimer's diagnosis scenario configuration). The Alzheimer's diagnosis scenario configuration can provide various scenarios for non-IID (data distribution heterogeneity) research in the medical domain. Researchers will be able to download FLBench from BenchCouncil soon.
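To illustrate what a data distribution heterogeneity (non-IID) scenario can look like, the sketch below splits labeled samples across clients with label skew drawn from a Dirichlet distribution, a technique commonly used in FL experiments. It is an assumption made for illustration, not a description of how the Alzheimer's diagnosis scenario configuration partitions data; the function name dirichlet_label_partition, its parameters, and the toy labels are all hypothetical.

```python
import random
from collections import defaultdict

def dirichlet_label_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with label-distribution skew.

    For each class, client shares are drawn from a symmetric Dirichlet(alpha);
    a smaller alpha gives a more heterogeneous (non-IID) split.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Normalized Gamma(alpha, 1) draws form one Dirichlet(alpha) sample.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(weights)
        if total == 0.0:  # guard against numerical underflow
            weights, total = [1.0] * num_clients, float(num_clients)
        # Convert the shares into cut points over this class's indices.
        start, cumulative = 0, 0.0
        for client_id, weight in enumerate(weights):
            cumulative += weight
            if client_id == num_clients - 1:
                end = len(indices)  # the last client takes the remainder
            else:
                end = min(len(indices), round(len(indices) * cumulative / total))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients

# Toy labels (not real data): 3 classes with 100 samples each.
labels = [i % 3 for i in range(300)]
parts = dirichlet_label_partition(labels, num_clients=4, alpha=0.5)
```

Every sample index is assigned to exactly one client, and the seed makes the split reproducible, so different algorithms can be compared on an identical partition.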
FLBench is a fully open and evolving benchmark. Next, based on the FLBench framework, we will provide 3*3 = 9 datasets for the three domains (medicine, finance, and AIoT) and 3*3*5 = 45 basic configuration files covering the different research aspects (communication, scenarios transformation, privacy-preserving, data distribution heterogeneity, and cooperation strategy). Each configuration file can provide various scenarios according to the requirements of the specific research.
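The planned counts follow from three datasets per domain and one basic configuration file per dataset and research aspect. The short enumeration below makes that grid explicit; since the planned dataset names are not given, numbered placeholder slots are used, and the variable names are illustrative assumptions.

```python
from itertools import product

DOMAINS = ("medicine", "finance", "AIoT")
DATASETS_PER_DOMAIN = 3  # taken from the plan; dataset names are not specified
RESEARCH_ASPECTS = (
    "communication",
    "scenarios transformation",
    "privacy-preserving",
    "data distribution heterogeneity",
    "cooperation strategy",
)

# Placeholder dataset slots such as ("medicine", 1), not real dataset names.
dataset_slots = [
    (domain, slot)
    for domain in DOMAINS
    for slot in range(1, DATASETS_PER_DOMAIN + 1)
]

# One basic configuration file per (dataset slot, research aspect) pair.
config_slots = list(product(dataset_slots, RESEARCH_ASPECTS))

print(len(dataset_slots), len(config_slots))  # prints: 9 45
```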