TY - JOUR
T1 - Swarm Learning for decentralized and confidential clinical machine learning
AU - COVID-19 Aachen Study (COVAS)
AU - Deutsche COVID-19 Omics Initiative (DeCOI)
AU - Warnat-Herresthal, Stefanie
AU - Schultze, Hartmut
AU - Shastry, Krishnaprasad Lingadahalli
AU - Manamohan, Sathyanarayanan
AU - Mukherjee, Saikat
AU - Garg, Vishesh
AU - Sarveswara, Ravi
AU - Händler, Kristian
AU - Pickkers, Peter
AU - Aziz, N. Ahmad
AU - Ktena, Sofia
AU - Tran, Florian
AU - Bitzer, Michael
AU - Ossowski, Stephan
AU - Casadei, Nicolas
AU - Herr, Christian
AU - Petersheim, Daniel
AU - Behrends, Uta
AU - Kern, Fabian
AU - Fehlmann, Tobias
AU - Schommers, Philipp
AU - Lehmann, Clara
AU - Augustin, Max
AU - Rybniker, Jan
AU - Altmüller, Janine
AU - Mishra, Neha
AU - Bernardes, Joana P.
AU - Krämer, Benjamin
AU - Bonaguro, Lorenzo
AU - Schulte-Schrepping, Jonas
AU - De Domenico, Elena
AU - Siever, Christian
AU - Kraut, Michael
AU - Desai, Milind
AU - Monnet, Bruno
AU - Saridaki, Maria
AU - Siegel, Charles Martin
AU - Drews, Anna
AU - Nuesch-Germano, Melanie
AU - Theis, Heidi
AU - Heyckendorf, Jan
AU - Schreiber, Stefan
AU - Kim-Hellmuth, Sarah
AU - Balfanz, Paul
AU - Colome-Tatche, Maria
AU - Grundmann, Hajo
AU - Janssen, Stefan
AU - Li, Yang
AU - da Rocha, Ulisses Nunes
AU - Singh, Yogesh
AU - Schultze, Joachim L.
N1 - Funding Information:
Acknowledgements: We thank the Michael J. Fox Foundation and the Parkinson’s Progression Markers Initiative (PPMI) for contributing RNA-seq data; the CORSAAR study group for additional blood transcriptome samples; the collaborators who contributed to the collection of COVID-19 samples (B. Schlegelberger, I. Bernemann, J. C. Hellmuth, L. Jocham, F. Hanses, U. Hehr, Y. Khodamoradi, L. Kaldjob, R. Fendel, L. T. K. Linh, P. Rosenberger, H. Häberle and J. Böhne); and the NGS Competence Center Tübingen (NCCT), who contributed to the generation of data and the data sharing (in particular, J. Frick, M. Sonnabend, J. Geissert, A. Angelov, M. Pogoda, Y. Singh, S. Poths, S. Nahnsen and M. Gauder). This work was supported in part by the German Research Foundation (DFG) to J.L.S., O.R., P.R., P.N. (INST 37/1049-1, INST 216/981-1, INST 257/605-1, INST 269/768-1, INST 217/988-1, INST 217/577-1, INST 217/1011-1, INST 217/1017-1 and INST 217/1029-1); under Germany’s Excellence Strategy (DFG – EXC2151 – 390873048); by the HGF Incubator grant sparse2big (ZT-I-0007); by EU projects SYSCID (grant 733100, P.R.) and ImmunoSep (grant 84722, J.L.S.); and by HPE to the DZNE for generating whole blood transcriptome data from patients with COVID-19. J.L.S. was further supported by the BMBF-funded excellence project Diet–Body–Brain (DietBB) (grant 01EA1809A), and J.L.S. and J.R. by NaFoUniMedCovid19 (FKZ: 01KX2021, project acronym COVIM). S.K. is supported by the Hellenic Institute for the Study of Sepsis. The clinical study in Greece was supported by the Hellenic Institute for the Study of Sepsis. E.J.G.-B. received funding from the FrameWork 7 programme HemoSpec (granted to the National and Kapodistrian University of Athens), the Horizon2020 Marie-Curie Project European Sepsis Academy (676129, granted to the National and Kapodistrian University of Athens), and the Horizon 2020 European Grant ImmunoSep (granted to the Hellenic Institute for the Study of Sepsis). P.R. was supported by DFG ExC2167, a stimulus fund from Schleswig-Holstein and the DFG NGS Centre CCGA. The clinical study in Munich was supported by the Care-for-Rare Foundation. S.K.-H. is a scholar of the Reinhard-Frank Stiftung. D.P. is funded by the Hector Fellow Academy. The work was additionally supported by the Michael J. Fox Foundation for Parkinson’s Research under grant 14446. M.G.N. was supported by an ERC Advanced Grant (833247) and a Spinoza Grant of the Netherlands Organization for Scientific Research. R.B. and A.K. were supported by Dr. Rolf M. Schwiete Stiftung, Staatskanzlei des Saarlandes and Saarland University. J.N. is supported by the DFG (SFB TR47, SPP1937) and the Hector Foundation (M88). M.A. is supported by COVIM: NaFoUniMedCovid19 (FKZ: 01KX2021). M. Becker is supported by the HGF Helmholtz AI grant Pro-Gene-Gen (ZT-I-PF-5-23).
Competing interests: H.S., K.L.S., S. Manamohan, Saikat Mukherjee, V.G., R.S., C.S., M.D., B.M., C.M.S., S.C., M.S.W. and E.L.G. are employees of Hewlett Packard Enterprise. Hewlett Packard Enterprise developed the SLL in its entirety as described in this work and has submitted multiple associated patent applications. E.J.G.-B. received honoraria from AbbVie USA, Abbott CH, InflaRx GmbH, MSD Greece, XBiotech Inc. and Angelini Italy and independent educational grants from AbbVie, Abbott, Astellas Pharma Europe, AxisShield, bioMérieux Inc, InflaRx GmbH, and XBiotech Inc. All other authors declare no competing interests.
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/6/10
Y1 - 2021/6/10
AB - Fast and reliable detection of patients with severe and heterogeneous illnesses is a major goal of precision medicine [1,2]. Patients with leukaemia can be identified using machine learning on the basis of their blood transcriptomes [3]. However, there is an increasing divide between what is technically possible and what is allowed, because of privacy legislation [4,5]. Here, to facilitate the integration of any medical data from any data owner worldwide without violating privacy laws, we introduce Swarm Learning, a decentralized machine-learning approach that unites edge computing, blockchain-based peer-to-peer networking and coordination while maintaining confidentiality without the need for a central coordinator, thereby going beyond federated learning. To illustrate the feasibility of using Swarm Learning to develop disease classifiers using distributed data, we chose four use cases of heterogeneous diseases (COVID-19, tuberculosis, leukaemia and lung pathologies). With more than 16,400 blood transcriptomes derived from 127 clinical studies with non-uniform distributions of cases and controls and substantial study biases, as well as more than 95,000 chest X-ray images, we show that Swarm Learning classifiers outperform those developed at individual sites. In addition, Swarm Learning completely fulfils local confidentiality regulations by design. We believe that this approach will notably accelerate the introduction of precision medicine.
U2 - 10.1038/s41586-021-03583-3
DO - 10.1038/s41586-021-03583-3
M3 - Article
C2 - 34040261
AN - SCOPUS:85106583195
SN - 0028-0836
VL - 594
SP - 265
EP - 270
JO - Nature
JF - Nature
IS - 7862
ER -