Real-time dereverberation for deep neural network speech recognition

Schwarz A, Hümmer C, Maas R, Kellermann W (2015)

Publication Language: English

Publication Type: Conference contribution

Publication year: 2015

City/Town: Nuremberg, Germany

Page range: 139-142

Event location: Nuremberg

DOI: 10.13140/RG.2.1.3660.0480

Abstract

We evaluate a real-time multi-channel dereverberation method for use in speech recognition with deep neural networks (DNNs). The method models the reverberant signal as a mixture of a fully coherent direct-path signal and a diffuse reverberation component, and estimates the coherent-to-diffuse power ratio (CDR) from the spatial coherence of the signals. It can operate in real time, i.e., without processing entire utterances. We compare “blind” CDR estimators, which do not require information about the direction of arrival (DOA) of the target signal, with estimators that make use of a DOA estimate. The impact of the dereverberation method on speech recognition accuracy with different DNN-based acoustic models is investigated using the REVERB challenge corpus and the Kaldi speech recognition toolkit.
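As a rough illustration of the signal model summarized in the abstract, the sketch below inverts the standard coherence mixture model for a two-microphone pair when the DOA (and hence the direct-path coherence) is known. It is not taken from the paper and is not one of the specific estimators it compares. The function names, the 343 m/s speed of sound, the omnidirectional microphones, and the ideal spherically isotropic diffuse field are all assumptions made for this example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value for this sketch


def diffuse_coherence(freq_hz, mic_distance_m):
    """Coherence of an ideal spherically isotropic field between two omni mics."""
    # np.sinc(x) computes sin(pi*x)/(pi*x), so the argument is 2*f*d/c.
    return np.sinc(2.0 * freq_hz * mic_distance_m / SPEED_OF_SOUND)


def direct_coherence(freq_hz, tdoa_s):
    """Coherence of a fully coherent plane wave with a known time difference of arrival."""
    return np.exp(1j * 2.0 * np.pi * freq_hz * tdoa_s)


def mixture_coherence(cdr, gamma_s, gamma_n):
    """Coherence of a mixture of uncorrelated direct and diffuse components."""
    return (cdr * gamma_s + gamma_n) / (cdr + 1.0)


def estimate_cdr(gamma_x, gamma_s, gamma_n):
    """Invert the mixture model; needs gamma_s, so this is a DOA-dependent sketch.

    With measured (noisy) coherence the ratio is generally complex, so the
    real part is taken and negative values are clipped to zero.
    The case gamma_x == gamma_s (infinite CDR) is not handled here.
    """
    ratio = (gamma_n - gamma_x) / (gamma_x - gamma_s)
    return np.maximum(np.real(ratio), 0.0)


# Synthetic check: build the mixture coherence from a known CDR and recover it.
freqs = np.array([500.0, 1000.0, 2000.0])
gamma_n = diffuse_coherence(freqs, mic_distance_m=0.08)
gamma_s = direct_coherence(freqs, tdoa_s=1e-4)
true_cdr = np.array([0.5, 3.0, 10.0])
gamma_x = mixture_coherence(true_cdr, gamma_s, gamma_n)
recovered_cdr = estimate_cdr(gamma_x, gamma_s, gamma_n)
```

In a real-time system the coherence `gamma_x` would instead be estimated per frequency bin from recursively averaged auto- and cross-power spectra, and the CDR estimate would typically drive a spectral gain; those steps are omitted here.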

Authors with CRIS profile

Roland Maas, Lehrstuhl für Multimediakommunikation und Signalverarbeitung (LMS)
Walter Kellermann, Professur für Signalverarbeitung

How to cite

APA:

Schwarz, A., Hümmer, C., Maas, R., & Kellermann, W. (2015). Real-time dereverberation for deep neural network speech recognition. In Proceedings of the Jahrestagung für Akustik (DAGA) (pp. 139-142). Nuremberg, Germany.

MLA:

Schwarz, Andreas, et al. "Real-time dereverberation for deep neural network speech recognition." Proceedings of the Jahrestagung für Akustik (DAGA), Nuremberg, Germany, 2015. 139-142.
