This item is available under a Creative Commons License for non-commercial use only
Electrical and electronic engineering, Applied mathematics
This paper proposes the use of a synchronized linear transform, the synchronized short-time-Fourier-transform (sSTFT), for time-frequency analysis of anechoic mixtures. We address the shortcomings of the classical short-time-Fourier-transform (cSTFT), the linear time-frequency transform most commonly used in multichannel settings. We propose a series of desirable properties for the linear transform used in a multichannel source separation scenario: stationary invertibility, relative delay, relative attenuation, and finally delay-invariant relative windowed-disjoint orthogonality (DIRWDO). Multisensor source separation techniques that operate in the time-frequency domain have an inherent error unless consideration is given to the multichannel properties proposed in this paper. The sSTFT preserves these relationships for multichannel data. The crucial innovation of the sSTFT is to synchronize the analysis locally to the observations rather than to a global clock. Separation performance can improve because the assumed properties of the time-frequency transform are satisfied when it is appropriately synchronized. Numerical experiments show that the sSTFT improves instantaneous subsample relative parameter estimation in low-noise conditions and achieves good synthesis.
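The relative delay and relative attenuation properties named in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's sSTFT definition: it assumes a single broadband source, an integer-sample delay (the paper itself concerns subsample delays), and a Hann window, and the names `frame_dft`, `delay` and `atten` are hypothetical. It compares the cross-channel ratio of one analysis frame when both channels are framed on a global clock with the ratio obtained when the second channel's frame is shifted by the delay and the shift is reinstated as a phase term.

```python
import numpy as np

def frame_dft(x, start, window):
    """DFT of one windowed frame of x beginning at sample `start`."""
    return np.fft.rfft(x[start:start + len(window)] * window)

rng = np.random.default_rng(0)
n, frame_len, start = 4096, 256, 1000
delay, atten = 5, 0.7  # assumed integer delay (samples) and attenuation

s = rng.standard_normal(n)      # assumed broadband source
x1 = s                          # sensor 1 observes the source directly
x2 = atten * np.roll(s, delay)  # sensor 2: attenuated, delayed copy
                                # (circular shift; the frame is far from the edges)

window = np.hanning(frame_len)
omega = 2 * np.pi * np.arange(frame_len // 2 + 1) / frame_len

# Ratio an anechoic model assumes: attenuation times a linear phase.
ideal_ratio = atten * np.exp(-1j * omega * delay)

X1 = frame_dft(x1, start, window)

# Global clock: both channels are framed at the same sample index,
# so the window sits at a different place on the delayed source.
X2_global = frame_dft(x2, start, window)
err_global = np.max(np.abs(X2_global / X1 - ideal_ratio))

# Locally synchronized: the frame of channel 2 is shifted by the delay,
# and the shift is reinstated as a phase term.
X2_sync = frame_dft(x2, start + delay, window) * np.exp(-1j * omega * delay)
err_sync = np.max(np.abs(X2_sync / X1 - ideal_ratio))

print("max ratio error, global clock:", err_global)
print("max ratio error, synchronized:", err_sync)
```

In this simplified setting the synchronized frames contain the same windowed source samples up to the attenuation, so the ratio matches the assumed model to within floating-point error, whereas the globally clocked frames do not.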
de Fréin, R. & Rickard, S. T. (2011) "The Synchronized Short-Time-Fourier-Transform: Properties and Definitions for Multichannel Source Separation," in IEEE Transactions on Signal Processing, vol. 59, no. 1, pp. 91-103, Jan. 2011. doi: 10.1109/TSP.2010.2088392