Online environmental adaptation of CNN-based acoustic models using spatial diffuseness features

Hümmer C, Delcroix M, Ogawa A, Kinoshita K, Nakatani T, Kellermann W (2017)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2017

Page range: 4875-4879

Event location: New Orleans, US

ISBN: 978-1-5090-4117-6

DOI: 10.1109/ICASSP.2017.7953083

Abstract

We propose a new concept for adapting CNN-based acoustic models using spatial diffuseness features as auxiliary information about the acoustic environment. The spatial diffuseness features serve both as acoustic-model input features and as the basis for estimating environmental cues for context adaptation, where one convolutional layer is factorized into several sub-layers that represent different acoustic conditions. This context-adaptive CNN-based acoustic model facilitates online environmental adaptation and is experimentally verified on the real-world recordings provided by the CHiME-3 task. The best-performing setup reduces the average word error rate of the baseline system (which does not use spatial diffuseness features) from 19.4% to 15.9% and from 12.2% to 10.7% for two experimental setups with and without front-end signal enhancement, respectively.
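
The factorized layer described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the sub-layers are combined, so this sketch assumes their outputs are mixed by softmax-normalized context weights computed from the auxiliary features through a single linear map. All function names, the 1-D setting, and the toy values are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Normalize scores into non-negative weights that sum to one."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def conv1d_valid(signal, kernel):
    """Sliding dot product without padding (cross-correlation, as is usual in CNNs)."""
    n = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ]


def context_weights(aux_features, weight_rows, biases):
    """Assumed context estimator: one linear map followed by a softmax.

    aux_features stands in for the auxiliary (e.g. diffuseness) features;
    weight_rows holds one row of linear weights per sub-layer.
    """
    scores = [
        sum(w * a for w, a in zip(row, aux_features)) + b
        for row, b in zip(weight_rows, biases)
    ]
    return softmax(scores)


def context_adaptive_conv1d(signal, sub_kernels, alphas):
    """Assumed combination rule: weighted sum of the sub-layer outputs."""
    outputs = [conv1d_valid(signal, k) for k in sub_kernels]
    return [
        sum(a * out[i] for a, out in zip(alphas, outputs))
        for i in range(len(outputs[0]))
    ]


# Toy usage: two sub-layers, one auxiliary feature value.
signal = [1.0, 2.0, 3.0, 4.0]
sub_kernels = [[1.0, 0.0], [0.0, 1.0]]
alphas = context_weights([0.5], [[1.0], [-1.0]], [0.0, 0.0])
adapted = context_adaptive_conv1d(signal, sub_kernels, alphas)
```

Because the context weights are recomputed from the auxiliary features of the incoming signal, the effective kernel changes with the acoustic condition without any retraining step, which is one way to read the "online" adaptation mentioned in the abstract.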

How to cite

APA:

Hümmer, C., Delcroix, M., Ogawa, A., Kinoshita, K., Nakatani, T., & Kellermann, W. (2017). Online environmental adaptation of CNN-based acoustic models using spatial diffuseness features. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 4875-4879). New Orleans, US.

MLA:

Hümmer, Christian, et al. "Online environmental adaptation of CNN-based acoustic models using spatial diffuseness features." Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans 2017. 4875-4879.