A novel approach for matched reverberant training of HMMs using data pairs

Sehr A, Hofmann C, Maas R, Kellermann W (2010)


Publication Language: English

Publication Status: Published

Publication Type: Conference contribution

Publication year: 2010

Page Range: 566-569

Event location: Makuhari, Chiba JP

URI: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=79959848054&origin=inward

Abstract

For robust distant-talking speech recognition, a novel HMM training approach using data pairs is proposed. The data pairs of clean and reverberant feature vectors, also called stereo data, are used for deriving the HMM parameters of a matched-condition reverberant HMM from a well-trained clean-speech HMM in two steps. In the first step, the alignment of the frames to the states is determined from the clean data and the clean-speech HMM. This state-frame alignment (SFA) is then used in the second step to estimate the Gaussian mixture densities for each state of the reverberant HMM by applying the Expectation Maximization (EM) algorithm to the reverberant data. Thus, a more accurate temporal alignment is achieved than by standard matched condition training, and the discrimination capability of the HMMs is increased. Connected digit recognition experiments show that the proposed approach decreases the word error rate (WER) by up to 44% while substantially reducing the training complexity. These improvements will make reverberant training attractive for a wider range of applications. © 2010 ISCA.
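
The two steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a left-to-right clean-speech HMM with a single diagonal-covariance Gaussian per state, a plain Viterbi forced alignment for the SFA, and a toy reverberation model that smears each clean frame into the following ones. All function names, the synthetic data, and the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Log density of diagonal Gaussians; x: (T, D), mean/var: (K, D) -> (T, K)."""
    diff = x[:, None, :] - mean[None, :, :]
    return -0.5 * (np.sum(np.log(2 * np.pi * var), axis=1)[None, :]
                   + np.sum(diff ** 2 / var[None, :, :], axis=2))

def viterbi_align(clean, means, variances, log_trans, log_init):
    """Step 1: state-frame alignment from the clean data and the clean HMM."""
    T, S = clean.shape[0], means.shape[0]
    log_b = log_gauss(clean, means, variances)
    delta = log_init + log_b[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans      # scores[i, j]: move from i to j
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(S)] + log_b[t]
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(delta)
    for t in range(T - 1, 0, -1):                # backtrack the best path
        path[t - 1] = back[t, path[t]]
    return path

def fit_gmm_em(frames, n_mix=2, n_iter=20, seed=0, var_floor=1e-3):
    """Step 2 (one state): EM for a diagonal-covariance GMM on reverberant frames."""
    rng = np.random.default_rng(seed)
    n = frames.shape[0]
    means = frames[rng.choice(n, size=n_mix, replace=False)]
    variances = np.tile(frames.var(axis=0) + var_floor, (n_mix, 1))
    weights = np.full(n_mix, 1.0 / n_mix)
    for _ in range(n_iter):
        # E-step: component responsibilities for every frame
        log_p = log_gauss(frames, means, variances) + np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and floored variances
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ frames) / nk[:, None]
        second = (resp.T @ frames ** 2) / nk[:, None]
        variances = np.maximum(second - means ** 2, var_floor)
    return weights, means, variances

def train_reverberant_gmms(clean, reverb, means, variances, log_trans, log_init, n_mix=2):
    """Align on the clean frames, then estimate each state's GMM on the paired reverberant frames."""
    path = viterbi_align(clean, means, variances, log_trans, log_init)
    return path, {int(s): fit_gmm_em(reverb[path == s], n_mix) for s in np.unique(path)}

# Synthetic stereo data (assumption): 3 states, 30 frames each, 2-dim features.
rng = np.random.default_rng(1)
state_means = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
clean = np.vstack([m + rng.normal(size=(30, 2)) for m in state_means])
reverb = clean.copy()
reverb[1:] += 0.5 * clean[:-1]                   # toy temporal smearing, not a real room model
reverb[2:] += 0.25 * clean[:-2]
reverb += 0.3 * rng.normal(size=reverb.shape)

trans = np.array([[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.0]])
init = np.array([1.0, 0.0, 0.0])
with np.errstate(divide="ignore"):               # log(0) = -inf marks forbidden transitions
    log_trans, log_init = np.log(trans), np.log(init)

path, gmms = train_reverberant_gmms(clean, reverb, state_means,
                                    np.ones_like(state_means), log_trans, log_init)
```

In this sketch the reverberant frames never influence the alignment; they are only pooled per state according to the path found on the clean frames, which mirrors the division of labor between the two steps in the abstract.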

How to cite

APA:

Sehr, A., Hofmann, C., Maas, R., & Kellermann, W. (2010). A novel approach for matched reverberant training of HMMs using data pairs. In Proceedings of the 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 (pp. 566-569). Makuhari, Chiba, JP.

MLA:

Sehr, Armin, et al. "A novel approach for matched reverberant training of HMMs using data pairs." Proceedings of the 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010, Makuhari, Chiba 2010. 566-569.
