Steidl S (2009)
Publication Language: English
Publication Type: Authored book, Volume of book series
Publication year: 2009
Original Authors: Steidl Stefan
Publisher: Logos Verlag
Series: Studien zur Mustererkennung
City/Town: Berlin
Book Volume: 28
Pages Range: 260
ISBN: 978-3832521455
URI: http://www5.informatik.uni-erlangen.de/Forschung/Publikationen/2009/Steidl09-ACO.pdf
The recognition of the user's emotion-related state is an important step in making human-machine communication more natural. This work focuses on mono-modal systems with speech as the only input channel. Current research has to shift from emotion portrayals to the states that actually appear in application-oriented scenarios. These are mainly weak emotion-related states and mixtures of different states. The presented FAU Aibo Emotion Corpus is a major contribution in this area. It is a corpus of spontaneous, emotionally colored speech of children aged 10 to 13 interacting with the Sony robot Aibo. Eleven emotion-related states are labeled on the word level. Experiments are conducted on three subsets of the corpus on the word level, the turn level, and the intermediate chunk level. The best results have been obtained on the chunk level, where a classwise averaged recognition rate of almost 70 % has been achieved for the 4-class problem Anger, Emphatic, Neutral, and Motherese. According to the proposed entropy-based measure for the evaluation of decoders, the performance of the machine classifier on the word level is even slightly better than that of the average human labeler. The presented feature set covers both acoustic and linguistic features. The linguistic features perform slightly worse than the acoustic features, and an improvement can be achieved by combining both knowledge sources. The acoustic features are categorized into prosodic, spectral, and voice quality features. The energy-based and duration-based prosodic features and the spectral MFCC features are the most relevant acoustic features in this scenario. Unigram models and bag-of-words features are the most relevant linguistic features.
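The classwise averaged recognition rate reported in the abstract can be illustrated with a small sketch. It assumes the usual reading of the term, namely the unweighted mean of the per-class recall, so that rare classes such as Motherese count as much as Neutral; the exact definition used in the book is not given in this record. The function name and the toy labels below are invented for illustration and are not taken from the FAU Aibo Emotion Corpus.

```python
from collections import defaultdict


def classwise_averaged_recall(y_true, y_pred):
    """Return (unweighted mean of per-class recall, dict of per-class recall).

    Assumption: "classwise averaged recognition rate" means that every class
    contributes equally, regardless of how many samples it has.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("no labels given")

    total = defaultdict(int)    # number of reference samples per class
    correct = defaultdict(int)  # number of correctly recognized samples per class
    for reference, hypothesis in zip(y_true, y_pred):
        total[reference] += 1
        if reference == hypothesis:
            correct[reference] += 1

    recalls = {label: correct[label] / total[label] for label in total}
    return sum(recalls.values()) / len(recalls), recalls


# Toy data (hypothetical): an imbalanced 4-class problem.
y_true = ["Neutral"] * 6 + ["Anger"] * 2 + ["Emphatic"] * 2 + ["Motherese"] * 2
y_pred = (
    ["Neutral"] * 5 + ["Anger"]      # 5 of 6 Neutral correct
    + ["Anger", "Neutral"]           # 1 of 2 Anger correct
    + ["Emphatic", "Emphatic"]       # 2 of 2 Emphatic correct
    + ["Motherese", "Neutral"]       # 1 of 2 Motherese correct
)

cl_rate, per_class = classwise_averaged_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"classwise averaged recall: {cl_rate:.3f}")
print(f"overall accuracy:          {accuracy:.3f}")
```

On this toy data the overall accuracy (9 of 12 correct) is higher than the classwise average, because the frequent Neutral class dominates the plain accuracy. This is the usual reason for reporting a class-averaged measure on imbalanced emotion data.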
APA:
Steidl, S. (2009). Automatic Classification of Emotion-Related User States in Spontaneous Children's Speech. Berlin: Logos Verlag.
MLA:
Steidl, Stefan. Automatic Classification of Emotion-Related User States in Spontaneous Children's Speech. Berlin: Logos Verlag, 2009.