10 Time–Frequency Domain Spatial Audio Enhancement

Symeon Delikaris-Manias¹ and Pasi Pertilä²

¹Department of Signal Processing and Acoustics, Aalto University, Finland

²Department of Signal Processing, Tampere University of Technology, Finland

10.1 Introduction

Enhancing signals corrupted by noise and interference has been an active research problem for several decades. Research follows two main approaches, single-channel and multichannel enhancement, with the latter gaining popularity as computational power has increased and most modern devices are now equipped with multiple sensors. Monophonic speech enhancement has nevertheless also attracted increased research effort in recent years because it places few requirements on microphone sensors; approaches based on deep neural network (DNN) learning are the most common, for example in Narayanan and Wang (2013), Wang et al. (2013), Wang and Wang (2013), Weninger et al. (2014), Erdogan et al. (2015), and Williamson et al. (2016). While the monophonic case is arguably the most general and challenging scenario, the availability of multiple microphones broadens the range of options for enhancement.

Multi-microphone devices enable flexible recording of sound sources in the presence of interferers, noise, and reverberation. The most common enhancement techniques for microphone arrays are based on the design of directional filters or beamforming. Directional filtering with microphone ...
