Multi-style training of HMMs with stereo data for reverberation-robust speech recognition

Conference contribution

Publication Details

Author(s): Sehr A, Hofmann C, Maas R, Kellermann W
Publication year: 2011
Page range: 196-199
ISBN: 9781457709999


A novel training algorithm using data pairs of clean and reverberant feature vectors for estimating robust Hidden Markov Models (HMMs), introduced in [1] for matched training, is employed in this paper for multi-style training. The multi-style HMMs are derived from well-trained clean-speech HMMs by aligning the clean data to the clean-speech HMM and using the resulting state-frame alignment to estimate the Gaussian mixture densities from the reverberant data of several different rooms. Thus, the temporal alignment is fixed for all reverberation conditions contained in the multi-style training set so that the model mismatch between the different rooms is reduced. Therefore, this training approach is particularly suitable for multi-style training. Multi-style HMMs trained by the proposed approach and adapted to the current room condition using maximum likelihood linear regression significantly outperform the corresponding adapted multi-style HMMs trained by the conventional Baum-Welch algorithm. In strongly reverberant rooms, the proposed adapted multi-style HMMs even outperform Baum-Welch HMMs trained on matched data. © 2011 IEEE.
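
The alignment-reuse idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the state-frame alignment is taken as already computed from the clean data (for example by a Viterbi pass, which is not shown), each Gaussian mixture density is simplified to a single diagonal Gaussian per state, and the names `estimate_state_gaussians`, `pool_rooms`, and `var_floor` are hypothetical.

```python
from collections import defaultdict


def pool_rooms(alignment, rooms):
    """Pair the same clean-data alignment with every room's reverberant frames.

    `rooms` is a list of frame sequences, one per room, each assumed to be
    frame-synchronous with the clean utterance that produced `alignment`.
    """
    pooled_alignment, pooled_frames = [], []
    for room_frames in rooms:
        if len(room_frames) != len(alignment):
            raise ValueError("stereo data must be frame-synchronous")
        pooled_alignment.extend(alignment)
        pooled_frames.extend(room_frames)
    return pooled_alignment, pooled_frames


def estimate_state_gaussians(alignment, reverb_frames, var_floor=1e-3):
    """Estimate one diagonal Gaussian per state from reverberant frames.

    alignment[t] is the state index assigned to frame t by aligning the
    clean data to the clean-speech HMM (assumed given); reverb_frames[t]
    is the corresponding reverberant feature vector.
    Simplification: a single Gaussian stands in for the mixture density.
    """
    if len(alignment) != len(reverb_frames):
        raise ValueError("alignment and frames must have equal length")

    # Group reverberant frames by the state fixed by the clean alignment.
    frames_by_state = defaultdict(list)
    for state, frame in zip(alignment, reverb_frames):
        frames_by_state[state].append(frame)

    params = {}
    for state, frames in frames_by_state.items():
        n = len(frames)
        dim = len(frames[0])
        mean = [sum(f[d] for f in frames) / n for d in range(dim)]
        # Maximum-likelihood variance, floored to avoid degenerate densities.
        var = [
            max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, var_floor)
            for d in range(dim)
        ]
        params[state] = (mean, var)
    return params


# Toy usage: one utterance with 4 frames, recorded in two hypothetical rooms.
clean_alignment = [0, 0, 1, 1]
room_a = [[1.0], [3.0], [10.0], [12.0]]
room_b = [[2.0], [4.0], [11.0], [13.0]]
pooled_alignment, pooled_frames = pool_rooms(clean_alignment, [room_a, room_b])
multi_style_params = estimate_state_gaussians(pooled_alignment, pooled_frames)
```

Because every room reuses the same alignment, each state collects frames from the same speech segments under all reverberation conditions, which is the property the abstract credits with reducing the mismatch between rooms.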

FAU Authors / FAU Editors

Hofmann, Christian
Professur für Nachrichtentechnik
Kellermann, Walter Prof. Dr.-Ing.
Professur für Nachrichtentechnik
Maas, Roland
Lehrstuhl für Multimediakommunikation und Signalverarbeitung
Sehr, Armin Dr.-Ing.
Professur für Nachrichtentechnik

How to cite

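Sehr, A., Hofmann, C., Maas, R., & Kellermann, W. (2011). Multi-style training of HMMs with stereo data for reverberation-robust speech recognition. In Proceedings of the 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11 (pp. 196-199). Edinburgh, GB.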

Sehr, Armin, et al. "Multi-style training of HMMs with stereo data for reverberation-robust speech recognition." Proceedings of the 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11, Edinburgh 2011. 196-199.


Last updated on 2018-12-14 at 13:50