Emotion Recognition from Speech under Environmental Noise Conditions using Wavelet Decomposition

Conference contribution

Publication Details

Author(s): Vasquez-Correa JC, Garcia N, Orozco-Arroyave JR, Arias-Londono JD, Vargas-Bonilla JF, Nöth E
Publication year: 2015
Page range: 247-252
ISSN: 1071-6572


Automatic emotion recognition from speech signals has attracted the attention of the research community in recent years. One of the main challenges is to find suitable features to represent the affective state of the speaker. In this paper, a new set of features derived from the wavelet packet transform is proposed to classify different negative emotions such as anger, fear, and disgust, and to differentiate between those negative emotions and the neutral state or positive emotions such as happiness. Different wavelet decompositions are considered for both voiced and unvoiced segments in order to determine a frequency band where the emotions are concentrated. Several measures are calculated on the wavelet-decomposed signals, including log-energy, entropy measures, mel-frequency cepstral coefficients, and the Lempel-Ziv complexity. The experiments consider two databases extensively used in emotion recognition: the Berlin emotional database and the enterface05 database. In addition, to approximate real-world conditions in terms of the quality of recorded speech, these databases are degraded using different kinds of environmental noise, such as cafeteria babble and street noise. The noise is added at several signal-to-noise ratio levels ranging from -3 to 6 dB. Finally, the effect of two different speech enhancement methods is evaluated. According to the results, the features calculated from the lower-frequency wavelet decomposition coefficients are able to recognize the fear-type emotions in speech. Also, one of the speech enhancement algorithms proved useful for improving the accuracy for speech signals affected by high levels of background noise.
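
Two steps named in the abstract, degrading speech with noise at a chosen signal-to-noise ratio and computing a log-energy per wavelet packet band, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `mix_at_snr` and `haar_packet_log_energies` are hypothetical, the Haar wavelet is an assumption (the abstract does not name the wavelet family used), and the log-energy shown is one common definition (logarithm of the summed squared coefficients) that may differ from the one in the paper.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the result has the requested SNR in dB.

    Hypothetical helper: illustrates the general idea, not the paper's procedure.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # SNR_dB = 10*log10(p_speech / (gain^2 * p_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def haar_packet_log_energies(signal, levels):
    """Log-energy of every leaf of a full Haar wavelet packet tree.

    Assumption: Haar filters and log(sum of squared coefficients).
    Leaves are returned in natural (tree) order, not sorted by frequency.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            pairs = range(0, len(node) - 1, 2)
            # Orthonormal Haar step: scaled pairwise sums and differences.
            approx = [(node[i] + node[i + 1]) / math.sqrt(2) for i in pairs]
            detail = [(node[i] - node[i + 1]) / math.sqrt(2) for i in pairs]
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    eps = 1e-12  # avoids log(0) for empty or silent bands
    return [math.log(sum(c * c for c in node) + eps) for node in nodes]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for a speech frame and a noise recording.
    speech = [math.sin(2 * math.pi * 5 * t / 256) for t in range(256)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(256)]
    for snr_db in (-3, 0, 3, 6):  # the range mentioned in the abstract
        noisy = mix_at_snr(speech, noise, snr_db)
        print(snr_db, haar_packet_log_energies(noisy, levels=3))
```

In practice a library such as PyWavelets would normally replace the hand-written Haar step, and the band features would be computed separately on voiced and unvoiced segments, as the abstract describes.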

FAU Authors / FAU Editors

Nöth, Elmar Prof. Dr.-Ing.
Professur für Informatik (Mustererkennung)

How to cite

Vasquez-Correa, J.C., Garcia, N., Orozco-Arroyave, J.R., Arias-Londono, J.D., Vargas-Bonilla, J.F., & Nöth, E. (2015). Emotion Recognition from Speech under Environmental Noise Conditions using Wavelet Decomposition. (pp. 247-252).

Vasquez-Correa, J. C., et al. "Emotion Recognition from Speech under Environmental Noise Conditions using Wavelet Decomposition." 2015. 247-252.

