USC | Jun 15, 2026 | arXiv:2606.17263

Direction of arrival estimation from distant microphone data using single frequency filtering

Sushmita Thakallapalli, Sudarsana Reddy Kadiri, Nilesh Madhu, Suryakanth V Gangashetty

AI Summary

This study proposes a method for improving the robustness of narrowband (NB) direction-of-arrival (DoA) estimation to spatial aliasing in distant microphone data. The method cross-correlates speech-present time-frequency regions obtained by single frequency filtering (SFF) of the microphone signals. Across the tested reverberation and noise conditions, on both simulated and real-world data, the SFF-based NB estimator outperforms the state-of-the-art NB estimator used for comparison and performs better than some of the three broadband (BB) estimators. The approach thus addresses the susceptibility of NB methods to spatial aliasing while retaining their ability to exploit frequency sparsity to estimate the DoAs of multiple speakers in a single time frame.

Key Contribution

The SFF-based narrowband DoA estimator is more robust to spatial aliasing than the state-of-the-art narrowband estimator and outperforms some broadband estimators across different reverberation and noise conditions.

Abstract

For distant microphones, broadband (BB) methods for direction-of-arrival (DoA) estimation are more suitable than narrowband (NB) methods. Because their optimization function is aggregated across all frequency bands, BB estimators are robust to spatial aliasing, a known problem in processing distant microphone data. In NB methods, DoA estimation uses local information in each frequency band, and hence the estimation is affected by spatial aliasing. However, unlike BB methods, NB methods exploit frequency sparsity to estimate the DoAs of multiple speakers in a single time frame. In this article, a method to improve the robustness of an NB DoA estimator to spatial aliasing is developed. The proposed method is based on cross-correlation of speech-present time-frequency regions obtained by single frequency filtering (SFF) of the microphone signals. The SFF spectrum is chosen because SFF components have regions of high signal-to-noise ratio in both time and frequency, and because speech and non-speech discrimination is robust to degradations in the SFF domain. The proposed NB estimator is compared to four state-of-the-art estimators (one NB and three BB) using detection and accuracy metrics on simulated and real-world data in different reverberation and noise conditions. The results show that, in all the environments, the SFF-based NB approach outperforms the state-of-the-art NB approach. Furthermore, the SFF-based approach performs better than some of the BB estimators.
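To make the SFF step more concrete, here is a minimal sketch of how SFF envelopes are commonly computed: the signal is frequency-shifted so that the frequency of interest lands at half the sampling rate, then passed through a single-pole filter with its pole close to the unit circle at z = -r, and the magnitude of the output is taken as the envelope. The function name `sff_envelopes`, the pole radius `r = 0.995`, and the test tone are illustrative assumptions; the abstract does not state the authors' exact configuration, and the cross-correlation and speech-presence stages of the proposed estimator are not shown.

```python
import numpy as np


def sff_envelopes(x, fs, freqs_hz, r=0.995):
    """Sketch of single frequency filtering (SFF) envelopes.

    Returns an array of shape (len(freqs_hz), len(x)).
    r is an assumed pole radius, not a value taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    # Normalized angular frequency of each analysis band.
    w_k = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float) / fs
    # Shift each band of interest to pi (half the sampling rate).
    w_shift = np.pi - w_k
    shifted = x[None, :] * np.exp(-1j * w_shift[:, None] * n[None, :])
    # Single-pole filter H(z) = 1 / (1 + r z^-1), applied per band:
    # y[n] = shifted[n] - r * y[n-1]
    y = np.zeros_like(shifted)
    prev = np.zeros(len(w_k), dtype=complex)
    for i in range(len(x)):
        prev = shifted[:, i] - r * prev
        y[:, i] = prev
    # The envelope is the magnitude of the complex filter output.
    return np.abs(y)


# Illustrative usage: a 1 kHz tone analysed at 1 kHz and 3 kHz.
fs = 16000
t = np.arange(4000) / fs
x = np.cos(2.0 * np.pi * 1000.0 * t)
env = sff_envelopes(x, fs, [1000.0, 3000.0])
print(env[:, -1000:].mean(axis=1))  # first band is far larger than second
```

Because the filter's gain at half the sampling rate is 1/(1 - r), the band that contains the tone produces a much larger envelope than the band that does not, which illustrates the high signal-to-noise regions the abstract refers to.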

Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
