Musical Sound Source Separation using Extended Tensor Decompositions

Derry Fitzgerald, Dublin Institute of Technology

Document Type: Conference Paper

International Symposium on Nonlinear Theory and its Applications, Sapporo, Japan, 2009


Tensor decompositions have recently found use in sound source separation. In particular, non-negative tensor decompositions have received considerable attention because they can decompose audio spectrograms into meaningful "parts" such as individual notes. Extensions to the basic non-negative tensor factorisation framework allow the incorporation of additional constraints, such as shift-invariance in both frequency and time. This enables the factorisations to capture more complex structures than individual notes, such as individual sources playing different pitches and time-evolving instrument timbres. Further music-specific constraints, such as harmonicity and source-filter modelling, have been shown to improve separation performance for musical signals. Other recent advances allow Bayesian priors to be incorporated into these models, further improving the separations obtained.
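
As a rough illustration of the "parts" idea, the sketch below factorises a toy magnitude spectrogram into spectral templates and time activations. It is only the simplest two-way (matrix) case with no shift-invariance, harmonicity or prior terms, and it is not the extended tensor model described above. The generalised Kullback-Leibler cost with multiplicative updates is an assumption, as are the function name `nmf_kl` and the toy data.

```python
import numpy as np

def nmf_kl(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (frequency x time) as W @ H.

    Uses multiplicative updates for the generalised KL divergence
    (an assumed cost function; the abstract does not name one).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral templates ("parts")
    H = rng.random((rank, n_time)) + eps   # activations over time
    for _ in range(n_iter):
        # Update activations: H <- H * (W^T (V / WH)) / (W^T 1)
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)
        # Update templates: W <- W * ((V / WH) H^T) / (1 H^T)
        R = V / (W @ H + eps)
        W *= (R @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

# Hypothetical toy spectrogram: two "notes" with disjoint spectra.
templates = np.array([[1.0, 0.0],
                      [0.5, 0.0],
                      [0.0, 1.0],
                      [0.0, 0.7]])                      # 4 frequency bins x 2 notes
activations = np.array([[1, 1, 0, 0, 1, 0],
                        [0, 0, 1, 1, 1, 0]], dtype=float)  # 2 notes x 6 frames
V = templates @ activations

W, H = nmf_kl(V, rank=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("relative reconstruction error:", rel_error)
```

Each column of `W` can be read as the spectrum of one part and each row of `H` as when that part is active. The extensions mentioned in the abstract would replace this fixed-template model with ones that tie templates together across pitch or let them evolve in time.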