This item is available under a Creative Commons License for non-commercial use only
Electrical and electronic engineering
Many sound source separation algorithms, such as NMF and related approaches, disregard phase information and operate only on magnitude or power spectrograms. In this context, generalised Wiener filters have been widely used to generate masks, which are applied to the original complex-valued spectrogram before inversion to the time domain, because these masks have been shown to give good results. However, these masks may not be optimal from a perceptual point of view. To this end, we propose new families of masks and compare their performance to generalised Wiener filter masks using three different factorisation-based separation algorithms. Further, to date there has been no analysis of how the performance of masking varies with the number of iterations performed when estimating the separated sources. We perform such an analysis and show that, when using these masks, running to convergence may not be required in order to obtain good separation performance.
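As an illustration of the masking step described above, the following is a minimal sketch of a generalised Wiener filter mask applied to a complex-valued mixture spectrogram. It is not taken from the paper: the function names, the exponent parameter `p`, the small constant `eps`, and the random test data are all assumptions made for the example, and the new mask families proposed by the authors are not reproduced here. The source magnitude estimates would normally come from a factorisation such as NMF, and the final inversion to the time domain (an inverse STFT) is omitted.

```python
import numpy as np

def generalised_wiener_masks(source_mags, p=2.0, eps=1e-12):
    """Return one soft mask per source from estimated magnitude spectrograms.

    source_mags: array of shape (n_sources, n_freq, n_frames), non-negative.
    p: exponent (assumed parameter name); p=2 corresponds to the usual
       power-based Wiener filter, p=1 to a magnitude ratio mask.
    eps: small constant guarding against division by zero (an assumption).
    """
    powered = np.asarray(source_mags, dtype=float) ** p
    return powered / (powered.sum(axis=0, keepdims=True) + eps)

def apply_masks(mixture_spec, masks):
    """Multiply each real-valued mask with the complex mixture spectrogram.

    The mixture phase is reused for every separated source.
    """
    return masks * mixture_spec[np.newaxis, ...]

# Hypothetical data standing in for an STFT and for factorisation output.
rng = np.random.default_rng(0)
mixture_spec = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))
source_mags = rng.random((2, 5, 4)) + 0.1

masks = generalised_wiener_masks(source_mags, p=2.0)
separated = apply_masks(mixture_spec, masks)
```

Because the masks in this sketch sum to one in every time-frequency bin, the separated spectrograms add back up to the original mixture, which is one reason this kind of mask is a common baseline.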
Fitzgerald, D. & Jaiswal, R. (2012) On the use of masking filters in sound source separation. Proc. of the 15th International Conference on Digital Audio Effects (DAFx-12), York, UK, September 17-21, 2012.