Yurt M, Kantharaju P, Disch S, Niedermeier A, Escalante-B AN, Morgenshtern V (2021)
Publication Type: Conference contribution
Publication year: 2021
Publisher: International Speech Communication Association
Book Volume: 4
Pages Range: 3171-3175
Conference Proceedings Title: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Event location: Brno, CZE
ISBN: 9781713836902
DOI: 10.21437/Interspeech.2021-645
Accurate phoneme detection and processing can enhance speech intelligibility in hearing aids and in audio and speech codecs. Because fricative phonemes have an important part of their energy concentrated in high-frequency bands, frequency-lowering algorithms are used in hearing aids to improve fricative intelligibility for people with high-frequency hearing loss. In traditional audio codecs, when speech is processed in blocks, spectral smearing around fricative phoneme borders results in pre- and post-echo artifacts. Hence, detecting the fricative borders and adapting the processing accordingly could enhance the quality of speech. Until recently, phoneme detection and analysis were mostly done by extracting features specific to the class of phonemes. In this paper, we present a deep-learning-based fricative phoneme detection algorithm that exceeds the state-of-the-art fricative phoneme detection accuracy on the TIMIT speech corpus. Moreover, we compare our method to other approaches that employ classical signal processing for fricative detection and also evaluate it on TIMIT files coded with the AAC codec followed by bandwidth limitation. The reported results of our deep learning approach on the original TIMIT files are reproducible and come with easy-to-use code that could serve as a baseline for future research on this topic.
APA:
Yurt, M., Kantharaju, P., Disch, S., Niedermeier, A., Escalante-B, A.N., & Morgenshtern, V. (2021). Fricative phoneme detection using deep neural networks and its comparison to traditional methods. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 3171-3175). Brno, CZE: International Speech Communication Association.
MLA:
Yurt, Metehan, et al. "Fricative phoneme detection using deep neural networks and its comparison to traditional methods." Proceedings of the 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, Brno, CZE: International Speech Communication Association, 2021. 3171-3175.