Single-channel blind direct-to-reverberation ratio estimation using masking

Mack W, Deng S, Habets E (2020)


Publication Type: Conference contribution

Publication year: 2020

Publisher: International Speech Communication Association

Book Volume: 2020-October

Page Range: 5066-5070

Conference Proceedings Title: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Event location: Shanghai, CHN

DOI: 10.21437/Interspeech.2020-2171

Abstract

Acoustic parameters such as the direct-to-reverberation ratio (DRR) can be used in audio processing algorithms, e.g., for dereverberation or in audio augmented reality. Often, the DRR is not available and has to be estimated blindly from recorded audio signals. State-of-the-art DRR estimation is achieved by deep neural networks (DNNs), which directly map a feature representation of the acquired signals to the DRR. Motivated by the equality of the signal-to-reverberation ratio and the (channel-based) DRR under certain conditions, we formulate single-channel DRR estimation as an extraction task of two signal components from the recorded audio. The DRR can be obtained by inserting the estimated signals in the definition of the DRR. The extraction is performed using time-frequency masks. The masks are estimated by a DNN trained end-to-end to minimize the mean-squared error between the estimated and the oracle DRR. We conduct experiments with different preprocessing and mask estimation schemes. The proposed method outperforms state-of-the-art single- and multi-channel methods on the ACE challenge data corpus.
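
As a rough illustration of the masking idea in the abstract, the following Python sketch applies two time-frequency masks to an STFT and inserts the resulting component energies into an energy-ratio form of the DRR. The function name `drr_from_masks`, the mask names, the STFT-domain energy computation, and the `eps` regularizer are assumptions made for illustration. They are not taken from the paper, and the DNN that would estimate the masks is omitted.

```python
import numpy as np


def drr_from_masks(spec, mask_direct, mask_reverb, eps=1e-12):
    """Return an energy-ratio DRR estimate in dB (illustrative sketch only).

    spec        : complex STFT of the recorded signal, shape (freq, time)
    mask_direct : real-valued mask for the direct component (hypothetical name)
    mask_reverb : real-valued mask for the reverberant component (hypothetical name)
    eps         : small constant to avoid division by zero (assumption)
    """
    # Extract the two signal components by element-wise masking.
    direct = mask_direct * spec
    reverb = mask_reverb * spec

    # Sum the energy of each component over all time-frequency bins.
    energy_direct = np.sum(np.abs(direct) ** 2)
    energy_reverb = np.sum(np.abs(reverb) ** 2)

    # Insert the energies into a ratio and convert to decibels.
    return 10.0 * np.log10((energy_direct + eps) / (energy_reverb + eps))


if __name__ == "__main__":
    # Toy input: a flat spectrum with constant masks standing in for DNN outputs.
    spec = np.ones((4, 5), dtype=complex)
    print(drr_from_masks(spec, np.ones((4, 5)), 0.1 * np.ones((4, 5))))
```

In this toy case the reverberant mask scales amplitudes by 0.1, so the energy ratio is 100 and the result is approximately 20 dB. In an end-to-end setup such as the one the abstract describes, the masks would come from a network and the loss would compare this scalar with the oracle DRR.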

How to cite

APA:

Mack, W., Deng, S., & Habets, E. (2020). Single-channel blind direct-to-reverberation ratio estimation using masking. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 5066-5070). Shanghai, CHN: International Speech Communication Association.

MLA:

Mack, Wolfgang, Shuwen Deng, and Emanuël Habets. "Single-channel blind direct-to-reverberation ratio estimation using masking." Proceedings of the 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020, Shanghai, CHN: International Speech Communication Association, 2020. 5066-5070.