Efficient target activity detection based on recurrent neural networks

Gerber D, Meier S, Kellermann W (2017)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2017

Page Range: 46-50

Event location: San Francisco, CA, US

ISBN: 978-1-5090-5925-6

URI: http://ieeexplore.ieee.org/document/7895559/

DOI: 10.1109/HSCMA.2017.7895559

Abstract

This paper addresses the problem of Target Activity Detection (TAD) for binaural listening devices. TAD denotes the problem of robustly detecting the activity of a target speaker in a harsh acoustic environment, which comprises interfering speakers and noise (‘cocktail party scenario’). In previous work, it has been shown that employing a Feed-forward Neural Network (FNN) for detecting the target speaker activity is a promising approach to combine the advantages of different TAD features (used as network inputs). In this contribution, we exploit a larger context window for TAD and compare the performance of FNNs and Recurrent Neural Networks (RNNs), with an explicit focus on small network topologies, as is desirable for embedded acoustic signal processing systems. More specifically, the investigations include a comparison between three different types of RNNs, namely plain RNNs, Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRUs). The results indicate that all versions of RNNs outperform FNNs for the task of TAD.
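
Because the abstract stresses small network topologies, it can help to see how the trainable-parameter count of a single layer differs across the architectures it names. The following is a minimal sketch and is not taken from the paper. It assumes the textbook formulations (one weight block and one bias vector per gate; some libraries add a second bias vector) and assumes that an FNN sees its context window as stacked input frames. The function names and the sizes in the usage note are hypothetical, chosen only for illustration.

```python
def recurrent_param_count(input_dim, hidden_dim, gate_blocks):
    """Parameters of one recurrent layer in the textbook formulation.

    Each gate block holds input weights, recurrent weights and one bias.
    """
    per_block = hidden_dim * (input_dim + hidden_dim) + hidden_dim
    return gate_blocks * per_block


def feedforward_param_count(input_dim, hidden_dim):
    """Parameters of one fully connected layer (weights plus biases)."""
    return hidden_dim * input_dim + hidden_dim


# Number of weight blocks per cell type: a plain RNN has a single update,
# a GRU has two gates plus a candidate state, an LSTM has three gates
# plus a candidate state.
GATE_BLOCKS = {"plain RNN": 1, "GRU": 3, "LSTM": 4}


def compare(features_per_frame, hidden_dim, context_frames):
    """Layer sizes when the FNN stacks `context_frames` frames as input
    (an assumption), while the recurrent layers read one frame per step."""
    counts = {
        "FNN": feedforward_param_count(
            features_per_frame * context_frames, hidden_dim
        )
    }
    for name, blocks in GATE_BLOCKS.items():
        counts[name] = recurrent_param_count(
            features_per_frame, hidden_dim, blocks
        )
    return counts
```

Calling `compare(4, 8, 5)`, for example, returns one count per architecture for a hypothetical layer with 4 features per frame, 8 hidden units and a 5-frame context. These counts say nothing about detection performance. They only illustrate why gated cells cost roughly three to four times the parameters of a plain RNN of the same width.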

How to cite

APA:

Gerber, D., Meier, S., & Kellermann, W. (2017). Efficient target activity detection based on recurrent neural networks. In Proceedings of the Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA) (pp. 46-50). San Francisco, CA, US.

MLA:

Gerber, Daniel, Stefan Meier, and Walter Kellermann. "Efficient target activity detection based on recurrent neural networks." Proceedings of the Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), San Francisco, CA 2017. 46-50.
