Emotion Recognition Using Imperfect Speech Recognition

Metze F, Batliner A, Eyben F, Polzehl T, Schuller B, Steidl S (2010)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2010

Original Authors: Metze Florian, Batliner Anton, Eyben Florian, Polzehl Tim, Schuller Björn, Steidl Stefan

Page Range: 478-481

Conference Proceedings Title: Proceedings of Interspeech

Event location: Makuhari, JP

URI: http://www5.informatik.uni-erlangen.de/Forschung/Publikationen/2010/Metze10-ERU.pdf

Abstract

This paper investigates the use of speech-to-text methods for assigning an emotion class to a given speech utterance. Previous work shows that an emotion extracted from text can convey complementary evidence to the information extracted by classifiers based on spectral or other non-linguistic features. As speech-to-text usually requires significantly more computational effort, in this study we investigate the degree of speech-to-text accuracy needed for reliable detection of emotions from an automatically generated transcription of an utterance. We evaluate the use of hypotheses in both training and testing, and compare several classification approaches on the same task. Our results show that emotion recognition performance stays roughly constant as long as word accuracy doesn’t fall below a reasonable value, making the use of speech-to-text viable for training of emotion classifiers based on linguistics.

How to cite

APA:

Metze, F., Batliner, A., Eyben, F., Polzehl, T., Schuller, B., & Steidl, S. (2010). Emotion Recognition Using Imperfect Speech Recognition. In ISCA (Eds.), Proceedings of Interspeech (pp. 478-481). Makuhari, JP.

MLA:

Metze, Florian, et al. "Emotion Recognition Using Imperfect Speech Recognition." Proceedings of the INTERSPEECH 2010 - ICSLP, 11th International Conference on Spoken Language Processing, Makuhari. Ed. ISCA, 2010. 478-481.
