Baumann I, Wagner D, Riedhammer K, Nöth E, Bocklet T (2023)
Publication Type: Conference contribution
Publication year: 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Conference Proceedings Title: 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023
Event location: Taipei, TWN
ISBN: 9798350306897
DOI: 10.1109/ASRU57964.2023.10389704
The analysis of phonological processes is crucial in evaluating speech development disorders in children, but encounters challenges due to limited children's audio data. This work focuses on automatic vowel error detection using a two-stage pipeline. The first stage uses a fine-tuned cross-lingual phone recognizer (wav2vec 2.0) to extract phone sequences from audio. The second stage employs a language model (BERT) for classification from a phone sequence, entirely trained on synthetic transcripts, to counteract the very broad range of potential mistakes. We evaluate the system on nonword audio recordings recited by preschool children from a speech development test. The results show that the classifier trained on synthetic data performs well, but its efficacy relies on the quality of the phone recognizer. The best classifier achieves a 94.7% F1 score when evaluated against phonetic ground truths, whereas the F1 score is 76.2% when using automatically recognized phone sequences.
APA:
Baumann, I., Wagner, D., Riedhammer, K., Nöth, E., & Bocklet, T. (2023). Detection of Vowel Errors in Children's Speech using Synthetic Phonetic Transcripts. In 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023. Taipei, TWN: Institute of Electrical and Electronics Engineers Inc.
MLA:
Baumann, Ilja, et al. "Detection of Vowel Errors in Children's Speech using Synthetic Phonetic Transcripts." Proceedings of the 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023, Taipei, TWN: Institute of Electrical and Electronics Engineers Inc., 2023.