An Analysis of Joint Nonlinear Spatial Filtering for Spatial Aliasing Reduction

Alina Mannanova, Jakob Kienegger, Timo Gerkmann

Published: 2025/09/30

Abstract

The performance of traditional linear spatial filters for speech enhancement is constrained by the physical size and number of channels of a microphone array. For instance, for large microphone distances and high frequencies, spatial aliasing may occur, leading to unwanted enhancement of signals from non-target directions. Recently, it has been proposed to replace linear beamformers with nonlinear deep neural networks for joint spatial-spectral processing. While such approaches have been shown to achieve higher scores on instrumental quality metrics, in this work we highlight their ability to efficiently handle spatial aliasing. In particular, we show that joint spatial and tempo-spectral processing is more robust to spatial aliasing than traditional approaches that perform spatial processing alone or separately from tempo-spectral filtering. The results provide another strong motivation for using deep nonlinear networks in multichannel speech enhancement, beyond their known benefits in handling non-Gaussian noise and multiple speakers, especially when microphone arrays with rather large inter-microphone distances are used.
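
Editor's note (illustration, not part of the paper): the aliasing condition described in the abstract can be made concrete with a short Python sketch. Assuming a uniform linear array with microphone spacing d and speed of sound c = 343 m/s, aliasing-free beamforming is limited to frequencies below the spatial Nyquist limit f = c/(2d); above that limit, even a correctly steered delay-and-sum beamformer develops a grating lobe with full gain far from the target direction, i.e., the unwanted enhancement of non-target signals the abstract refers to. All function names and parameter values below are assumptions chosen for illustration.

    import numpy as np

    C = 343.0  # assumed speed of sound in air [m/s]

    def spatial_aliasing_limit(mic_distance_m: float) -> float:
        """Highest alias-free frequency for a uniform linear array:
        the spatial Nyquist limit f_max = c / (2 * d)."""
        return C / (2.0 * mic_distance_m)

    def delay_and_sum_beampattern(n_mics: int, d: float, f: float,
                                  steer_deg: float,
                                  look_deg: np.ndarray) -> np.ndarray:
        """Magnitude response of a delay-and-sum beamformer (uniform
        linear array) steered to steer_deg, evaluated at look_deg."""
        k = 2.0 * np.pi * f / C                    # wavenumber
        m = np.arange(n_mics)[:, None]             # microphone indices
        # Array manifold vectors for the steering and look directions
        a_steer = np.exp(1j * k * d * m * np.cos(np.deg2rad(steer_deg)))
        a_look = np.exp(1j * k * d * m
                        * np.cos(np.deg2rad(look_deg))[None, :])
        # Delay-and-sum weights are the normalized steering vector
        return np.abs(a_steer.conj().T @ a_look).ravel() / n_mics

    d = 0.10                                       # 10 cm spacing (assumed)
    print(f"Alias-free up to {spatial_aliasing_limit(d):.0f} Hz "
          f"for d = {d * 100:.0f} cm")             # ~1715 Hz

    # At 4 kHz (above the limit), a grating lobe with gain close to 1
    # appears roughly 60 degrees away from the broadside target:
    angles = np.linspace(0.0, 180.0, 721)
    bp = delay_and_sum_beampattern(n_mics=4, d=d, f=4000.0,
                                   steer_deg=90.0, look_deg=angles)
    off_target = np.abs(angles - 90.0) > 20.0
    print(f"Max gain >20 deg off-target at 4 kHz: "
          f"{bp[off_target].max():.2f}")

A purely linear spatial filter cannot distinguish the grating lobe from the main lobe at such frequencies, which is why the paper's joint nonlinear spatial and tempo-spectral processing, able to exploit spectral and temporal cues in addition to phase differences, can remain robust where linear beamforming fails.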
