STFT bin selection for localization algorithms based on the sparsity of speech signal spectra

Conference contribution


Publication Details

Author(s): Brendel A, Huang C, Kellermann W
Publication year: 2018
Page range: 2561-2568
ISSN: 2226-5147
Language: English


Abstract

Many algorithms for localization, tracking, or Direction of Arrival (DOA) estimation of speech sources rely on the so-called W-disjoint orthogonality, i.e., only one speaker is assumed to be active in each time-frequency bin. Based on this assumption, bin-wise DOA estimates can be computed from the pairwise phase differences of each time-frequency bin and clustered afterwards. Averaging the estimates of each cluster, i.e., computing the cluster centroids, increases the robustness of the localization estimate. However, clustering can be computationally demanding due to the large number of DOA estimates, and at the same time highly sensitive to errors, as many of the estimates may be unreliable due to noise and reverberation. Therefore, an efficient algorithm for selecting reliable Short-Time Fourier Transform (STFT) bins is desirable, one that increases the accuracy of the estimate while simultaneously reducing the computational complexity. In this contribution, we investigate different STFT bin selection methods suitable for speech source localization algorithms that are based on the W-disjoint orthogonality. The selection methods exploit bin-wise speech signal power, Coherent-to-Diffuse Power Ratio (CDR), and Speech Presence Probability (SPP). The effectiveness of the selection processes is studied for different localization algorithms.
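
As an illustration of the processing chain outlined in the abstract, the following minimal Python sketch selects high-power STFT bins and computes bin-wise DOA estimates from the phase difference of a single microphone pair. It is not the authors' implementation: the function names, the far-field two-microphone model, the sign convention (microphone 2 receives the delayed signal for positive angles), the quantile-based power threshold, and the plain mean used in place of clustering are all assumptions made for this example. The CDR- and SPP-based selection criteria are not covered.

```python
import numpy as np


def select_bins_by_power(X, keep_fraction=0.1):
    """Return a boolean mask marking the highest-power time-frequency bins.

    X: complex STFT of one channel, shape (n_freqs, n_frames).
    keep_fraction: hypothetical tuning parameter, fraction of bins to keep.
    """
    power = np.abs(X) ** 2
    threshold = np.quantile(power, 1.0 - keep_fraction)
    return power >= threshold


def binwise_doa(X1, X2, freqs, mic_distance, c=343.0):
    """Bin-wise DOA in radians from the inter-microphone phase difference.

    Assumes a far-field source, strictly positive frequencies, and no
    spatial aliasing (freqs < c / (2 * mic_distance)).
    """
    phase = np.angle(X1 * np.conj(X2))            # shape (n_freqs, n_frames)
    tdoa = phase / (2.0 * np.pi * freqs[:, None])  # time difference of arrival
    sin_theta = np.clip(c * tdoa / mic_distance, -1.0, 1.0)
    return np.arcsin(sin_theta)


def centroid_doa(doa, mask):
    """Average the selected bin-wise estimates (single-source stand-in
    for the cluster centroid described in the abstract)."""
    return float(np.mean(doa[mask]))
```

With several simultaneous speakers, the mean in `centroid_doa` would have to be replaced by a clustering step over the selected estimates. The mask only reduces the number of estimates passed to that step.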


FAU Authors / FAU Editors

Huang, Chengyu
Professur für Nachrichtentechnik
Kellermann, Walter Prof. Dr.-Ing.
Professur für Nachrichtentechnik


How to cite

APA:
Brendel, A., Huang, C., & Kellermann, W. (2018). STFT bin selection for localization algorithms based on the sparsity of speech signal spectra. In Proceedings of the EURONOISE 2018 (pp. 2561-2568). Heraklion, Crete, GR.

MLA:
Brendel, Andreas, Chengyu Huang, and Walter Kellermann. "STFT bin selection for localization algorithms based on the sparsity of speech signal spectra." Proceedings of the EURONOISE 2018, Heraklion, Crete 2018. 2561-2568.

Last updated on 2019-03-06 at 07:11