Phonet: A tool based on gated recurrent neural networks to extract phonological posteriors from speech

Vasquez Correa J, Klumpp P, Orozco Arroyave JR, Nöth E (2019)


Publication Type: Conference contribution

Publication year: 2019

Publisher: International Speech Communication Association

Book Volume: 2019-September

Page Range: 549-553

Conference Proceedings Title: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Event location: Graz, AT

DOI: 10.21437/Interspeech.2019-1405

Abstract

Many features can be extracted from speech signals for applications such as automatic speech recognition or speaker verification. For pathological speech processing, however, the features must capture the presence of the disease or the state of the patient and be comprehensible to clinical experts. Phonological posteriors are a group of features that clinicians can interpret and that also carry suitable information about the patient's speech. This paper presents a tool to extract phonological posteriors directly from speech signals. The proposed method consists of a bank of parallel bidirectional recurrent neural networks that estimate the posterior probabilities of the occurrence of different phonological classes. The proposed models detect the phonological classes with accuracies over 90%. In addition, the trained models are available to the research community.
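
The architecture described in the abstract, a bank of parallel bidirectional gated recurrent networks that each emit a per-frame posterior for one phonological class, can be illustrated with a minimal sketch. This is not the Phonet implementation: the weights are random and untrained, the input is random data standing in for acoustic features, and the class names, feature dimension, hidden size, and all function and class names (GRULayer, BidirectionalDetector, extract_posteriors) are assumptions made only for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRULayer:
    """Minimal GRU with random, untrained weights (illustration only)."""

    def __init__(self, n_in, n_hidden, rng):
        # Index 0: update gate, 1: reset gate, 2: candidate state.
        self.W = rng.normal(0.0, 0.1, (3, n_in, n_hidden))
        self.U = rng.normal(0.0, 0.1, (3, n_hidden, n_hidden))
        self.b = np.zeros((3, n_hidden))
        self.n_hidden = n_hidden

    def run(self, frames):
        """Return the hidden state after each frame, shape (T, n_hidden)."""
        h = np.zeros(self.n_hidden)
        states = []
        for x in frames:
            z = sigmoid(x @ self.W[0] + h @ self.U[0] + self.b[0])
            r = sigmoid(x @ self.W[1] + h @ self.U[1] + self.b[1])
            cand = np.tanh(x @ self.W[2] + (r * h) @ self.U[2] + self.b[2])
            h = (1.0 - z) * h + z * cand
            states.append(h)
        return np.stack(states)


class BidirectionalDetector:
    """One member of the bank: a binary detector for one phonological class."""

    def __init__(self, n_in, n_hidden, rng):
        self.fwd = GRULayer(n_in, n_hidden, rng)
        self.bwd = GRULayer(n_in, n_hidden, rng)
        self.w_out = rng.normal(0.0, 0.1, 2 * n_hidden)
        self.b_out = 0.0

    def posteriors(self, frames):
        h_fwd = self.fwd.run(frames)
        # Run the backward layer on reversed time, then restore frame order.
        h_bwd = self.bwd.run(frames[::-1])[::-1]
        h = np.concatenate([h_fwd, h_bwd], axis=1)
        # A sigmoid output gives a per-frame value in (0, 1).
        return sigmoid(h @ self.w_out + self.b_out)


def extract_posteriors(frames, bank):
    """Apply every detector in the bank to the same frame sequence."""
    return {name: det.posteriors(frames) for name, det in bank.items()}


rng = np.random.default_rng(0)
CLASSES = ["vocalic", "nasal", "stop"]  # hypothetical class names
frames = rng.normal(size=(50, 13))      # 50 frames of 13 placeholder features
bank = {name: BidirectionalDetector(13, 8, rng) for name in CLASSES}
result = extract_posteriors(frames, bank)
```

Here result maps each class name to an array with one value per frame. In a trained system those values would be read as posterior probabilities; with the random weights used here they carry no meaning.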

How to cite

APA:

Vasquez Correa, J., Klumpp, P., Orozco Arroyave, J.R., & Nöth, E. (2019). Phonet: A tool based on gated recurrent neural networks to extract phonological posteriors from speech. In Gernot Kubin, Thomas Hain, Bjorn Schuller, Dina El Zarka, Petra Hodl (Eds.), Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 549-553). Graz, AT: International Speech Communication Association.

MLA:

Vasquez Correa, Juan, et al. "Phonet: A tool based on gated recurrent neural networks to extract phonological posteriors from speech." Proceedings of the 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, Graz. Ed. Gernot Kubin, Thomas Hain, Bjorn Schuller, Dina El Zarka, Petra Hodl. International Speech Communication Association, 2019. 549-553.
