Federated learning enables big data for rare cancer boundary detection

Pati S, Baid U, Edwards B, Sheller M, Wang SH, Reina GA, Foley P, Gruzdev A, Karkada D, Davatzikos C, Sako C, Ghodasara S, Bilello M, Mohan S, Vollmuth P, Brugnara G, Preetha CJ, Sahm F, Maier-Hein K, Zenk M, Bendszus M, Wick W, Calabrese E, Rudie J, Villanueva-Meyer J, Cha S, Ingalhalikar M, Jadhav M, Pandey U, Saini J, Garrett J, Larson M, Jeraj R, Currie S, Frood R, Fatania K, Huang RY, Chang K, Quintero CB, Capellades J, Puig J, Trenkler J, Pichler J, Necker G, Haunschmidt A, Meckel S, Shukla G, Liem S, Alexander GS, Lombardo J, Palmer JD, Flanders AE, Dicker AP, Sair HI, Jones CK, Venkataraman A, Jiang M, So TY, Chen C, Heng PA, Dou Q, Kozubek M, Lux F, Michálek J, Matula P, Keřkovský M, Kopřivová T, Dostál M, Vybíhal V, Vogelbaum MA, Mitchell JR, Farinhas J, Maldjian JA, Yogananda CGB, Pinho MC, Reddy D, Holcomb J, Wagner BC, Ellingson BM, Cloughesy TF, Raymond C, Oughourlian T, Hagiwara A, Wang C, To MS, Bhardwaj S, Chong C, Agzarian M, Falcão AX, Martins SB, Teixeira BC, Sprenger F, Menotti D, Lucio DR, LaMontagne P, Marcus D, Wiestler B, Kofler F, Ezhov I, Metz M, Jain R, Lee M, Lui YW, McKinley R, Slotboom J, Radojewski P, Meier R, Wiest R, Murcia D, Fu E, Haas R, Thompson J, Ormond DR, Badve C, Sloan AE, Vadmal V, Waite K, Colen RR, Pei L, Ak M, Srinivasan A, Bapuraj JR, Rao A, Wang N, Yoshiaki O, Moritani T, Turk S, Lee J, Prabhudesai S, Morón F, Mandel J, Kamnitsas K, Glocker B, Dixon LV, Williams M, Zampakis P, Panagiotopoulos V, Tsiganos P, Alexiou S, Haliassos I, Zacharaki EI, Moustakas K, Kalogeropoulou C, Kardamakis DM, Choi YS, Lee SK, Chang JH, Ahn SS, Luo B, Poisson L, Wen N, Tiwari P, Verma R, Bareja R, Yadav I, Chen J, Kumar N, Smits M, van der Voort SR, Alafandi A, Incekara F, Wijnenga MM, Kapsas G, Gahrmann R, Schouten JW, Dubbink HJ, Vincent AJ, van den Bent MJ, French PJ, Klein S, Yuan Y, Sharma S, Tseng TC, Adabi S, Niclou SP, Keunen O, Hau AC, Vallières M, Fortin D, Lepage M, Landman B, Ramadass K, Xu K, Chotai S, Chambless LB, Mistry A, Thompson RC, Gusev Y, Bhuvaneshwar K, Sayah A, Bencheqroun C, Belouali A, Madhavan S, Booth TC, Chelliah A, Modat M, Shuaib H, Dragos C, Abayazeed A, Kolodziej K, Hill M, Abbassy A, Gamal S, Mekhaimar M, Qayati M, Reyes M, Park JE, Yun J, Kim HS, Mahajan A, Muzi M, Benson S, Beets-Tan RG, Teuwen J, Herrera-Trujillo A, Trujillo M, Escobar W, Abello A, Bernal J, Gómez J, Choi J, Baek S, Kim Y, Ismael H, Allen B, Buatti JM, Kotrotsou A, Li H, Weiss T, Weller M, Bink A, Pouymayou B, Shaykh HF, Saltz J, Prasanna P, Shrestha S, Mani KM, Payne D, Kurc T, Pelaez E, Franco-Maldonado H, Loayza F, Quevedo S, Guevara P, Torche E, Mendoza C, Vera F, Ríos E, López E, Velastin SA, Ogbole G, Soneye M, Oyekunle D, Odafe-Oyibotha O, Osobu B, Shu’aibu M, Dorcas A, Dako F, Simpson AL, Hamghalam M, Peoples JJ, Hu R, Tran A, Cutler D, Moraes FY, Boss MA, Gimpel J, Veettil DK, Schmidt K, Bialecki B, Marella S, Price C, Cimino L, Apgar C, Shah P, Menze B, Barnholtz-Sloan JS, Martin J, Bakas S (2022)

Publication Type: Journal article

Publication year: 2022

Journal

Nature Communications (Nature Publishing Group)

Volume: 13

Article Number: 7346

Journal Issue: 1

DOI: 10.1038/s41467-022-33407-5

Abstract

Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging/infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to-date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multi-site collaborations, alleviating the need for data-sharing.

How to cite

APA:

Pati, S., Baid, U., Edwards, B., Sheller, M., Wang, S. H., Reina, G. A., ... Bakas, S. (2022). Federated learning enables big data for rare cancer boundary detection. Nature Communications, 13(1), 7346. https://doi.org/10.1038/s41467-022-33407-5

MLA:

Pati, Sarthak, et al. "Federated learning enables big data for rare cancer boundary detection." Nature Communications 13.1 (2022): 7346.
