Sfaira accelerates data and model reuse in single cell genomics

Fischer DS, Dony L, Konig M, Moeed A, Zappia L, Heumos L, Tritschler S, Holmberg O, Aliee H, Theis FJ (2021)


Publication Type: Journal article

Publication year: 2021

Journal: Genome Biology

Book Volume: 22

Article Number: 248

Journal Issue: 1

DOI: 10.1186/s13059-021-02452-6

Abstract

Single-cell RNA-seq datasets are often first analyzed independently without harnessing model fits from previous studies, and are then contextualized with public data sets, requiring time-consuming data wrangling. We address these issues with sfaira, a single-cell data zoo for public data sets paired with a model zoo for executable pre-trained models. The data zoo is designed to facilitate contribution of data sets using ontologies for metadata. We propose an adaption of cross-entropy loss for cell type classification tailored to datasets annotated at different levels of coarseness. We demonstrate the utility of sfaira by training models across anatomic data partitions on 8 million cells.

How to cite

APA:

Fischer, D.S., Dony, L., Konig, M., Moeed, A., Zappia, L., Heumos, L., ... Theis, F.J. (2021). Sfaira accelerates data and model reuse in single cell genomics. Genome Biology, 22(1), Article 248. https://doi.org/10.1186/s13059-021-02452-6

MLA:

Fischer, David S., et al. "Sfaira accelerates data and model reuse in single cell genomics." Genome Biology 22.1 (2021).
