Autoblog 2021: The Importance of Language Models for Spontaneous Lecture Speech

Hernandez A, Klumpp P, Das BK, Maier A, Yang SH (2022)


Publication Language: English

Publication Type: Conference contribution, Original article

Publication year: 2022

Publisher: Springer Cham

Series: Lecture Notes in Computer Science

City/Town: Cham

Book Volume: 13502

Pages Range: 291-300

Conference Proceedings Title: Text, Speech, and Dialogue: 25th International Conference, TSD 2022, Brno, Czech Republic, September 6–9, 2022, Proceedings

Event location: Brno, Czech Republic CZ

ISBN: 978-3-031-16270-1

URI: https://link.springer.com/chapter/10.1007/978-3-031-16270-1_24

DOI: 10.1007/978-3-031-16270-1_24

Abstract

The demand for both quantity and quality of online educational resources has skyrocketed during the pandemic of the last two years. Entire course series have since been recorded and distributed online. To reach a broader audience, videos can be transcribed, combined with supplementary material (e.g. lecture slides), and published in the style of blog posts. This was done previously for Autoblog 2020, a corpus of lecture recordings that were converted to blog posts, using automated speech recognition (ASR) for subtitle creation. This work introduces a second series of recorded and manually transcribed lecture videos. The corresponding data includes lecture videos, slides, and blog posts/transcripts with aligned slide images, and is published under a Creative Commons license. A state-of-the-art Wav2Vec ASR model was used for automatic transcription of the content, using different n-gram language models (LMs). The results were compared to the human ground truth annotation. Findings indicated that the ASR performed well on spontaneous lecture speech. Furthermore, LMs trained on large amounts of data with fewer out-of-vocabulary words were outperformed by much smaller LMs estimated over in-domain language. Annotated lecture recordings were deemed helpful for the creation of task-specific ASR solutions as well as their validation against a human ground truth.
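
As an illustration of the comparison described in the abstract, the following minimal Python sketch scores an ASR hypothesis against a human reference transcript and measures how many reference words fall outside a language model's vocabulary. The abstract does not name the metric used; word error rate (WER) is assumed here because it is a common choice for this kind of evaluation. The function names, the lowercase/whitespace tokenisation, and the example sentences are hypothetical and not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from reference prefix to empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing in hypothesis)
                curr[j - 1] + 1,     # insertion (extra word in hypothesis)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length.

    Assumption: simple lowercasing and whitespace tokenisation.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def oov_rate(reference, vocabulary):
    """Fraction of reference tokens that are not in the LM vocabulary."""
    ref = reference.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    missing = sum(1 for word in ref if word not in vocabulary)
    return missing / len(ref)


if __name__ == "__main__":
    # Hypothetical transcript pair, for illustration only.
    reference = "the network learns a feature representation"
    hypothesis = "the network learns feature representations"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
    print(f"OOV rate: {oov_rate(reference, {'the', 'network', 'learns', 'a'}):.3f}")
```

Both rates are normalised by the length of the reference, so transcripts of different lengths can be compared directly. A low out-of-vocabulary rate does not by itself imply a low WER, which is consistent with the finding reported in the abstract.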

How to cite

APA:

Hernandez, A., Klumpp, P., Das, B.K., Maier, A., & Yang, S.H. (2022). Autoblog 2021: The Importance of Language Models for Spontaneous Lecture Speech. In Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala (Eds.), Text, Speech, and Dialogue: 25th International Conference, TSD 2022, Brno, Czech Republic, September 6–9, 2022, Proceedings (pp. 291-300). Cham: Springer Nature Switzerland AG.

MLA:

Hernandez, Abner, et al. "Autoblog 2021: The Importance of Language Models for Spontaneous Lecture Speech." Proceedings of the 25th International Conference on Text, Speech and Dialogue, Brno, Czech Republic. Ed. Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala. Springer Nature Switzerland AG: Springer Cham, 2022. 291-300.