Hümmer C, Stadter P, Kellermann W (2016)
Publication Language: English
Publication Type: Conference contribution
Publication year: 2016
Pages Range: 362-366
ISBN: 978-3-8007-4275-2
Uncertainty decoding combines a probabilistic distortion model with the acoustic model of a speech recognition system. For DNN-based acoustic models, this can be realized by drawing feature samples from an estimated probability distribution and averaging the resulting set of posterior likelihoods at the output of the DNN. Following this principle, we consider a probabilistic feature description in the logmelspec domain to model the front-end estimation errors produced by a coherence-based Wiener filter. As the main innovation with respect to previous work, we employ a sampling strategy based on the eigenvalue decomposition to capture, rather than neglect, the cross-correlations between the acoustic features as part of the uncertainty decoding scheme. Experimental results for real recordings provided by the REVERB Challenge task highlight the effectiveness of this sampling strategy in improving the recognition accuracy of a DNN-HMM hybrid system.
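The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the feature uncertainty is a Gaussian with a full covariance matrix, and the names `sample_via_eig`, `averaged_posteriors`, and `toy_dnn` are hypothetical, with a single random softmax layer standing in for the DNN acoustic model.

```python
import numpy as np

def sample_via_eig(mean, cov, num_samples, rng):
    """Draw samples from N(mean, cov) using cov = U diag(lam) U^T."""
    eigvals, eigvecs = np.linalg.eigh(cov)      # symmetric eigendecomposition
    eigvals = np.clip(eigvals, 0.0, None)       # guard against tiny negative values
    z = rng.standard_normal((num_samples, mean.shape[0]))
    # Row form of x = mean + U sqrt(diag(lam)) z, which keeps cross-correlations.
    return mean + (z * np.sqrt(eigvals)) @ eigvecs.T

def softmax(a):
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def averaged_posteriors(mean, cov, dnn, num_samples=100, seed=0):
    """Average the model outputs over the drawn feature samples."""
    rng = np.random.default_rng(seed)
    samples = sample_via_eig(mean, cov, num_samples, rng)
    return dnn(samples).mean(axis=0)

# Toy setup (assumption): 3 features, 4 output classes, random weights.
rng = np.random.default_rng(1)
W = rng.standard_normal((3, 4))
toy_dnn = lambda x: softmax(x @ W)              # hypothetical stand-in for the DNN

mean = np.zeros(3)
A = rng.standard_normal((3, 3))
cov = A @ A.T                                   # full covariance with off-diagonal terms

post = averaged_posteriors(mean, cov, toy_dnn)
print(post)
```

Replacing `cov` with `np.diag(np.diag(cov))` in this sketch corresponds to neglecting the cross-correlations between features.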
APA:
Hümmer, C., Stadter, P., & Kellermann, W. (2016). Uncertainty decoding using a sampling strategy based on the eigenvalue decomposition. In Proceedings of the 12th ITG Symposium on Speech Communication (pp. 362-366). Paderborn, DE.
MLA:
Hümmer, Christian, Philipp Stadter, and Walter Kellermann. "Uncertainty decoding using a sampling strategy based on the eigenvalue decomposition." Proceedings of the 12th ITG Symposium on Speech Communication, Paderborn 2016. 362-366.