Steidl S, Batliner A, Nöth E, Hornegger J (2008)
Publication Type: Authored book, Volume of book series
Publication year: 2008
Original Authors: Steidl S., Batliner A., Nöth E., Hornegger J.
Publisher: Springer-Verlag
City/Town: Berlin
Page Range: 525-534
Event location: Brno
DOI: 10.1007/978-3-540-87391-4_67
Prosodic features modelling pitch, energy, and duration play a major role in speech emotion recognition. Our word-level features, especially duration and pitch features, rely on correct word segmentation and F0 extraction. For the FAU Aibo Emotion Corpus, the automatic segmentation of a forced alignment of the spoken word sequence and the automatically extracted F0 values have been manually corrected. Frequencies of different types of segmentation and F0 errors are given, and their influence on emotion recognition using different groups of prosodic features is evaluated. The classification results show that the impact of these errors on emotion recognition is small. © 2008 Springer-Verlag Berlin Heidelberg.
APA:
Steidl, S., Batliner, A., Nöth, E., & Hornegger, J. (2008). Quantification of segmentation and F0 errors and their effect on emotion recognition. Berlin: Springer-Verlag.
MLA:
Steidl, Stefan, et al. Quantification of segmentation and F0 errors and their effect on emotion recognition. Berlin: Springer-Verlag, 2008.