Fricative phoneme detection using deep neural networks and its comparison to traditional methods

Yurt M, Kantharaju P, Disch S, Niedermeier A, Escalante-B AN, Morgenshtern V (2021)


Publication Type: Conference contribution

Publication year: 2021

Publisher: International Speech Communication Association

Book Volume: 4

Page Range: 3171-3175

Conference Proceedings Title: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Event location: Brno, CZE

ISBN: 9781713836902

DOI: 10.21437/Interspeech.2021-645

Abstract

Accurate phoneme detection and processing can enhance speech intelligibility in hearing aids and in audio and speech codecs. Because fricative phonemes have a substantial part of their energy concentrated in high-frequency bands, frequency-lowering algorithms are used in hearing aids to improve fricative intelligibility for people with high-frequency hearing loss. In traditional audio codecs, which process speech in blocks, spectral smearing around fricative phoneme borders results in pre- and post-echo artifacts. Hence, detecting the fricative borders and adapting the processing accordingly could enhance the quality of speech. Until recently, phoneme detection and analysis were mostly done by extracting features specific to the class of phonemes. In this paper, we present a deep-learning-based fricative phoneme detection algorithm that exceeds the state-of-the-art fricative phoneme detection accuracy on the TIMIT speech corpus. Moreover, we compare our method to other approaches that employ classical signal processing for fricative detection, and we also evaluate it on TIMIT files coded with the AAC codec followed by bandwidth limitation. The reported results of our deep learning approach on the original TIMIT files are reproducible and come with easy-to-use code that could serve as a baseline for future research on this topic.

How to cite

APA:

Yurt, M., Kantharaju, P., Disch, S., Niedermeier, A., Escalante-B, A.N., & Morgenshtern, V. (2021). Fricative phoneme detection using deep neural networks and its comparison to traditional methods. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 3171-3175). Brno, CZE: International Speech Communication Association.

MLA:

Yurt, Metehan, et al. "Fricative phoneme detection using deep neural networks and its comparison to traditional methods." Proceedings of the 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, Brno, CZE: International Speech Communication Association, 2021. 3171-3175.
