Federated Principal Component Analysis for Genome-Wide Association Studies

Hartebrodt A, Nasirigerdeh R, Blumenthal DB, Röttger R (2021)


Publication Type: Conference contribution

Publication year: 2021

Publisher: IEEE

Page range: 1090-1095

Conference Proceedings Title: 21st IEEE International Conference on Data Mining (ICDM)

Event location: Auckland, New Zealand

DOI: 10.1109/ICDM51629.2021.00127

Abstract

Federated learning (FL) has emerged as a privacy-aware alternative to centralized data analysis, especially for biomedical analyses such as genome-wide association studies (GWAS). The data remains with the owner, which enables studies previously impossible due to privacy protection regulations. Principal component analysis (PCA) is a frequent preprocessing step in GWAS, where the eigenvectors of the sample-by-sample covariance matrix are used as covariates in the statistical tests. Therefore, a federated version of PCA suitable for vertical data partitioning is required for federated GWAS. Existing federated PCA algorithms exchange the complete sample eigenvectors, a potential privacy breach. In this paper, we present a federated PCA algorithm for vertically partitioned data which does not exchange the sample eigenvectors and is hence suitable for federated GWAS. We show that it outperforms existing federated solutions in terms of convergence behavior and scalability. Additionally, we provide a user-friendly privacy-aware web tool to promote acceptance of federated PCA among GWAS researchers.
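
For orientation, the following minimal sketch illustrates the preprocessing step the abstract describes: computing the leading eigenvector of the sample-by-sample covariance matrix, whose entries would serve as one covariate value per sample. It is a centralized toy example, not the federated algorithm presented in the paper. The genotype values, the function names, and the use of plain power iteration are assumptions made purely for illustration.

```python
import math


def center_columns(rows):
    """Subtract each SNP's (column's) mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - mu for x, mu in zip(row, means)] for row in rows]


def sample_covariance(rows):
    """Return X X^T / m for centered rows X (n samples x m SNPs)."""
    m = len(rows[0])
    return [[sum(a * b for a, b in zip(r, s)) / m for s in rows] for r in rows]


def leading_eigenvector(matrix, iterations=200):
    """Power iteration: multiply by the matrix and renormalize repeatedly."""
    n = len(matrix)
    v = [1.0 / (i + 1) for i in range(n)]  # fixed, non-uniform start vector
    for _ in range(iterations):
        w = [sum(a * x for a, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical toy genotypes (allele counts 0/1/2): 4 samples x 3 SNPs.
genotypes = [
    [0, 0, 1],
    [0, 1, 0],
    [2, 2, 1],
    [2, 1, 2],
]

X = center_columns(genotypes)
C = sample_covariance(X)      # 4 x 4 sample-by-sample matrix
pc1 = leading_eigenvector(C)  # one value per sample, usable as a covariate
print([round(x, 3) for x in pc1])
```

In this toy data the first two samples and the last two samples receive opposite signs, so the leading eigenvector separates the two groups. Because the vector carries one entry per sample, sharing it in full reveals sample-level information, which is the exchange the abstract identifies as a potential privacy breach.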

How to cite

APA:

Hartebrodt, A., Nasirigerdeh, R., Blumenthal, D.B., & Röttger, R. (2021). Federated Principal Component Analysis for Genome-Wide Association Studies. In 21st IEEE International Conference on Data Mining (ICDM) (pp. 1090-1095). Auckland, New Zealand: IEEE.

MLA:

Hartebrodt, Anne, et al. "Federated Principal Component Analysis for Genome-Wide Association Studies." Proceedings of the 21st IEEE International Conference on Data Mining (ICDM), Auckland, New Zealand: IEEE, 2021. 1090-1095.
