Hernandez A, Klumpp P, Das BK, Maier A, Yang SH (2022)
Publication Language: English
Publication Type: Conference contribution, Original article
Publication year: 2022
Publisher: Springer Nature Switzerland AG
Series: Lecture Notes in Computer Science
City/Town: Cham
Book Volume: 13502
Pages Range: 291-300
Conference Proceedings Title: Text, Speech, and Dialogue: 25th International Conference, TSD 2022, Brno, Czech Republic, September 6–9, 2022, Proceedings
Event location: Brno, Czech Republic
ISBN: 978-3-031-16270-1
URI: https://link.springer.com/chapter/10.1007/978-3-031-16270-1_24
DOI: 10.1007/978-3-031-16270-1_24
The demand for both quantity and quality of online educational resources has skyrocketed during the pandemic of the last two years. Entire course series have since been recorded and distributed online. To reach a broader audience, videos can be transcribed, combined with supplementary material (e.g. lecture slides), and published in the style of blog posts. This was done previously for Autoblog 2020, a corpus of lecture recordings that were converted to blog posts using automatic speech recognition (ASR) for subtitle creation. This work introduces a second series of recorded and manually transcribed lecture videos. The corresponding data includes lecture videos, slides, and blog posts/transcripts with aligned slide images, and is published under a Creative Commons license. A state-of-the-art Wav2Vec ASR model was used to transcribe the content automatically, in combination with different n-gram language models (LMs). The results were compared to the human ground truth annotation. The findings indicate that ASR performs well on spontaneous lecture speech. Furthermore, LMs trained on large amounts of data, and therefore with fewer out-of-vocabulary words, are outperformed by much smaller LMs estimated on in-domain language. Annotated lecture recordings are thus helpful both for creating task-specific ASR solutions and for validating them against a human ground truth.
APA:
Hernandez, A., Klumpp, P., Das, B.K., Maier, A., & Yang, S.H. (2022). Autoblog 2021: The Importance of Language Models for Spontaneous Lecture Speech. In Petr Sojka, Aleš Horák, Ivan Kopeček, & Karel Pala (Eds.), Text, Speech, and Dialogue: 25th International Conference, TSD 2022, Brno, Czech Republic, September 6–9, 2022, Proceedings (pp. 291-300). Cham: Springer Nature Switzerland AG.
MLA:
Hernandez, Abner, et al. "Autoblog 2021: The Importance of Language Models for Spontaneous Lecture Speech." Proceedings of the 25th International Conference on Text, Speech and Dialogue, Brno, Czech Republic. Ed. Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala. Cham: Springer Nature Switzerland AG, 2022. 291-300.