Efficient training of acoustic models for reverberation-robust medium-vocabulary automatic speech recognition

Conference contribution


Publication Details

Author(s): Sehr A, Barfuß H, Hofmann C, Maas R, Kellermann W
Publisher: IEEE Computer Society
Publication year: 2014
Pages range: 177-181
ISBN: 978-1-4799-3109-5
Language: English


Abstract


A recently proposed concept for training reverberation-robust acoustic models for automatic speech recognition using pairs of clean and reverberant data is extended from word models to tied-state triphone models in this paper. The key idea of the concept, termed ICEWIND, is to use the clean data for the temporal alignment and the reverberant data for the estimation of the emission densities. Experiments with the 5000-word Wall Street Journal corpus confirm the benefits of ICEWIND with tied-state triphones: While the training time is reduced by more than 90%, the word accuracy is improved at the same time, both for room-specific and multi-style hidden Markov models. Since the acoustic models trained with ICEWIND need fewer Gaussian components for the emission densities to achieve recognition rates comparable to those of Baum-Welch acoustic models, ICEWIND also allows for a reduced decoding complexity. © 2014 IEEE.
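
The key idea stated in the abstract (alignment from the clean data, emission densities from the reverberant data) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hard frame-to-state alignment that has already been computed on the clean utterance, a time-aligned reverberant feature sequence of the same length, and a single diagonal-covariance Gaussian per state instead of a Gaussian mixture. The function name `estimate_emissions` and the parameter `var_floor` are hypothetical.

```python
from collections import defaultdict


def estimate_emissions(alignment, reverberant_feats, var_floor=1e-3):
    """Estimate one diagonal Gaussian per HMM state (illustrative sketch).

    alignment:         state index per frame, assumed to come from aligning
                       the CLEAN utterance (hypothetical input).
    reverberant_feats: feature vector per frame of the REVERBERANT version
                       of the same utterance, frame-synchronous with it.
    Returns a dict mapping state -> (mean vector, variance vector).
    """
    if len(alignment) != len(reverberant_feats):
        raise ValueError("alignment and features must have the same number of frames")

    # Group reverberant frames by the state assigned on the clean data.
    frames_by_state = defaultdict(list)
    for state, frame in zip(alignment, reverberant_feats):
        frames_by_state[state].append(frame)

    params = {}
    for state, frames in frames_by_state.items():
        n = len(frames)
        dim = len(frames[0])
        mean = [sum(f[d] for f in frames) / n for d in range(dim)]
        # Maximum-likelihood variance, floored to avoid degenerate densities.
        var = [
            max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, var_floor)
            for d in range(dim)
        ]
        params[state] = (mean, var)
    return params


if __name__ == "__main__":
    clean_alignment = [0, 0, 1, 1]                   # toy alignment
    reverberant = [[1.0], [3.0], [10.0], [14.0]]     # toy 1-D features
    print(estimate_emissions(clean_alignment, reverberant))
```

Because the alignment is fixed in this sketch, the density parameters follow from a single pass over the reverberant frames. Whether this corresponds to the source of the training-time reduction reported in the abstract is not stated there.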


FAU Authors / FAU Editors

Barfuß, Hendrik
Professur für Nachrichtentechnik
Hofmann, Christian
Professur für Nachrichtentechnik
Kellermann, Walter Prof. Dr.-Ing.
Professur für Nachrichtentechnik
Maas, Roland
Lehrstuhl für Multimediakommunikation und Signalverarbeitung


External institutions
Hochschule für Technik und Wirtschaft Berlin (HTW)


How to cite

APA:
Sehr, A., Barfuß, H., Hofmann, C., Maas, R., & Kellermann, W. (2014). Efficient training of acoustic models for reverberation-robust medium-vocabulary automatic speech recognition. In Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA) (pp. 177-181). Nancy, FR: IEEE Computer Society.

MLA:
Sehr, Armin, et al. "Efficient training of acoustic models for reverberation-robust medium-vocabulary automatic speech recognition." Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA), Nancy: IEEE Computer Society, 2014. 177-181.


Last updated on 2019-03-06 at 07:14