Hartebrodt A, Nasirigerdeh R, Blumenthal DB, Röttger R (2021)
Publication Type: Conference contribution
Publication year: 2021
Publisher: IEEE
Page Range: 1090-1095
Conference Proceedings Title: 21st IEEE International Conference on Data Mining (ICDM)
Event location: Auckland, New Zealand
DOI: 10.1109/ICDM51629.2021.00127
Federated learning (FL) has emerged as a privacy-aware alternative to centralized data analysis, especially for biomedical analyses such as genome-wide association studies (GWAS). The data remains with the owner, which enables studies previously impossible due to privacy protection regulations. Principal component analysis (PCA) is a frequent preprocessing step in GWAS, where the eigenvectors of the sample-by-sample covariance matrix are used as covariates in the statistical tests. Therefore, a federated version of PCA suitable for vertical data partitioning is required for federated GWAS. Existing federated PCA algorithms exchange the complete sample eigenvectors, a potential privacy breach. In this paper, we present a federated PCA algorithm for vertically partitioned data which does not exchange the sample eigenvectors and is hence suitable for federated GWAS. We show that it outperforms existing federated solutions in terms of convergence behavior and scalability. Additionally, we provide a user-friendly privacy-aware web tool to promote acceptance of federated PCA among GWAS researchers.
APA:
Hartebrodt, A., Nasirigerdeh, R., Blumenthal, D.B., & Röttger, R. (2021). Federated Principal Component Analysis for Genome-Wide Association Studies. In 21st IEEE International Conference on Data Mining (ICDM) (pp. 1090-1095). Auckland, New Zealand: IEEE.
MLA:
Hartebrodt, Anne, et al. "Federated Principal Component Analysis for Genome-Wide Association Studies." Proceedings of the 21st IEEE International Conference on Data Mining (ICDM), Auckland, New Zealand, IEEE, 2021, pp. 1090-1095.