Sehr A, Barfuß H, Hofmann C, Maas R, Kellermann W (2014)
Publication Language: English
Publication Status: Published
Publication Type: Conference contribution
Publication year: 2014
Publisher: IEEE Computer Society
Pages Range: 177-181
Article Number: 6843275
ISBN: 978-1-4799-3109-5
DOI: 10.1109/HSCMA.2014.6843275
A recently proposed concept for training reverberation-robust acoustic models for automatic speech recognition using pairs of clean and reverberant data is extended from word models to tied-state triphone models in this paper. The key idea of the concept, termed ICEWIND, is to use the clean data for the temporal alignment and the reverberant data for the estimation of the emission densities. Experiments with the 5000-word Wall Street Journal corpus confirm the benefits of ICEWIND with tied-state triphones: while the training time is reduced by more than 90%, the word accuracy is improved at the same time, both for room-specific and multi-style hidden Markov models. Since the acoustic models trained with ICEWIND need fewer Gaussian components for the emission densities to achieve recognition rates comparable to those of Baum-Welch acoustic models, ICEWIND also allows for a reduced decoding complexity. © 2014 IEEE.
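The key idea stated in the abstract (alignment from clean data, emission statistics from reverberant data) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `estimate_emissions`, its parameters, and the variance floor are hypothetical, and the sketch assumes that each clean/reverberant pair is frame-synchronous and that a state alignment of the clean utterance is already available. It also fits a single diagonal Gaussian per state, whereas the paper works with Gaussian mixtures and tied-state triphones.

```python
from collections import defaultdict


def estimate_emissions(clean_alignment, reverberant_frames, var_floor=1e-6):
    """Estimate one diagonal Gaussian per state (illustrative only).

    clean_alignment:    list of state labels, one per frame, assumed to come
                        from aligning the CLEAN utterance (hypothetical input).
    reverberant_frames: list of feature vectors from the paired REVERBERANT
                        utterance, assumed frame-synchronous with the clean one.
    var_floor:          small constant added to each variance (assumption).
    """
    if len(clean_alignment) != len(reverberant_frames):
        raise ValueError("alignment and features must have the same length")

    # Group reverberant frames by the state assigned on the clean data.
    frames_by_state = defaultdict(list)
    for state, frame in zip(clean_alignment, reverberant_frames):
        frames_by_state[state].append(frame)

    # Per-state sample mean and (biased) variance of the reverberant frames.
    params = {}
    for state, frames in frames_by_state.items():
        n = len(frames)
        dim = len(frames[0])
        mean = [sum(f[d] for f in frames) / n for d in range(dim)]
        var = [
            sum((f[d] - mean[d]) ** 2 for f in frames) / n + var_floor
            for d in range(dim)
        ]
        params[state] = (mean, var)
    return params
```

Because the alignment is fixed in advance, the statistics are gathered in a single pass over the data, with no iterative re-alignment in this simplified setting.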
APA:
Sehr, A., Barfuß, H., Hofmann, C., Maas, R., & Kellermann, W. (2014). Efficient training of acoustic models for reverberation-robust medium-vocabulary automatic speech recognition. In Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA) (pp. 177-181). Nancy, FR: IEEE Computer Society.
MLA:
Sehr, Armin, et al. "Efficient training of acoustic models for reverberation-robust medium-vocabulary automatic speech recognition." Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA), Nancy: IEEE Computer Society, 2014. 177-181.