Document Type

Conference Paper

Rights

Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence

Publication Details

2017 IEEE 13th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), Rome, Italy, 9-11 October 2017

Abstract

We propose APHONIC (AdaPtive tHreshOlding for NoIse Cancellation), a single-channel, adaptive threshold selection technique for binary mask construction in smart mobile environments. Using this mask, we introduce two noise cancellation techniques that perform robustly in the presence of real-world interfering signals typically encountered by mobile users: a violin busker, a subway, and a busy city square. We demonstrate that when the power of the time-frequency components of a mobile user's voice does not significantly overlap with the components of the interference signal, the threshold learning and noise cancellation techniques significantly improve the Signal-to-Interference Ratio (SIR) and the Signal-to-Distortion Ratio (SDR) of the recovered voice. When a mobile user's speech is mixed with music or with the sounds of a city square or subway station, the speech energy is captured by a few large-magnitude coefficients, and APHONIC improves the SIR by more than 20 dB and the SDR by up to 5 dB. The robustness of the threshold selection step and the noise cancellation algorithms is evaluated in environments typically experienced by mobile phone users. Listening tests indicate that the interference signal is no longer audible in the denoised signals. We outline how this approach could be used in many mobile voice-driven applications.
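
The idea of binary masking summarised in the abstract can be illustrated with a minimal sketch. It is not the APHONIC algorithm: the abstract does not describe the adaptive threshold rule, so a fixed percentile threshold stands in for it here. The synthetic coefficients, the function names `binary_mask` and `sir_db`, and the 98th-percentile choice are all assumptions made for illustration only.

```python
import numpy as np

def binary_mask(coeffs, threshold):
    """Keep coefficients whose magnitude exceeds the threshold."""
    return np.abs(coeffs) > threshold

def sir_db(target, interference):
    """Signal-to-Interference Ratio in dB from component energies."""
    return 10.0 * np.log10(np.sum(np.abs(target) ** 2)
                           / np.sum(np.abs(interference) ** 2))

rng = np.random.default_rng(0)
n_coeffs = 1000

# Hypothetical "speech": energy concentrated in a few large coefficients.
speech = np.zeros(n_coeffs)
active = rng.choice(n_coeffs, size=20, replace=False)
speech[active] = 10.0 * rng.choice([-1.0, 1.0], size=20)

# Hypothetical interference: low-magnitude energy spread over all coefficients.
interference = rng.normal(0.0, 1.0, size=n_coeffs)
mixture = speech + interference

# Placeholder threshold (NOT the paper's adaptive rule): 98th percentile
# of the mixture magnitudes, which keeps roughly the largest 2 percent.
threshold = np.percentile(np.abs(mixture), 98)
mask = binary_mask(mixture, threshold)

sir_before = sir_db(speech, interference)
sir_after = sir_db(speech[mask], interference[mask])
print(f"SIR before: {sir_before:.1f} dB, after masking: {sir_after:.1f} dB")
```

In this toy setting the mask retains the few coefficients dominated by speech and discards those dominated by interference, so the SIR of the masked mixture rises. In practice the coefficients would come from a time-frequency transform of a recorded mixture, and the threshold would be learned rather than fixed.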

DOI

ieeecomputersociety.org/10.1109/WiMOB.2017.8115847

