Predicting mean ribosome load for 5’UTR of any length using deep learning

Karollus A, Avsec Z, Gagneur J (2021)


Publication Type: Journal article

Publication year: 2021

Journal: PLoS Computational Biology

Volume: 17

Article Number: e1008982

Journal Issue: 5

DOI: 10.1371/journal.pcbi.1008982

Abstract

The 5’ untranslated region plays a key role in regulating mRNA translation and consequently protein abundance. Therefore, accurate modeling of 5’UTR regulatory sequences shall provide insights into translational control mechanisms and help interpret genetic variants. Recently, a model was trained on a massively parallel reporter assay to predict mean ribosome load (MRL)—a proxy for translation rate—directly from 5’UTR sequence with a high degree of accuracy. However, this model is restricted to sequence lengths investigated in the reporter assay and therefore cannot be applied to the majority of human sequences without a substantial loss of information. Here, we introduced frame pooling, a novel neural network operation that enabled the development of an MRL prediction model for 5’UTRs of any length. Our model shows state-of-the-art performance on fixed length randomized sequences, while offering better generalization performance on longer sequences and on a variety of translation-related genome-wide datasets. Variant interpretation is demonstrated on a 5’UTR variant of the gene HBB associated with beta-thalassemia. Frame pooling could find applications in other bioinformatics predictive tasks. Moreover, our model, released open source, could help pinpoint pathogenic genetic variants.

How to cite

APA:

Karollus, A., Avsec, Z., & Gagneur, J. (2021). Predicting mean ribosome load for 5’UTR of any length using deep learning. PLoS Computational Biology, 17(5), Article e1008982. https://dx.doi.org/10.1371/journal.pcbi.1008982

MLA:

Karollus, Alexander, Ziga Avsec, and Julien Gagneur. "Predicting mean ribosome load for 5’UTR of any length using deep learning." PLoS Computational Biology 17.5 (2021): e1008982.
