On-line NMF-based Stereo Up-Mixing of Speech Improves Perceived Reduction of Non-Stationary Noise

Abstract

Speech de-noising algorithms often introduce artifacts, either by removing parts of the speech signal or by reducing the noise imperfectly, so that the remaining noise sounds unnatural and disturbing. This contribution proposes to spatially distribute monaural noisy speech signals based on single-channel source separation in order to improve the perceived speech quality. Instead of simply suppressing the noise, stereo up-mixing is applied to the estimated speech and noise sources. This paper investigates the case of non-negative matrix factorization (NMF) speech enhancement applied to high levels of non-stationary noise. NMF-based and spectral subtraction speech enhancement algorithms are evaluated in a listening test in terms of speech intelligibility, presence of interfering noises, and overall quality with respect to the unprocessed signal. The listening test provides evidence for superior noise reduction by NMF, but also for a drop in perceived speech quality that is not captured by the employed set of common objective metrics. In contrast, stereo up-mixing of NMF-separated speech and noise delivers high subjective noise reduction while preserving the perceived speech quality.
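As a rough illustration of the idea summarized above, the following minimal sketch separates a magnitude spectrogram into speech and noise estimates with semi-supervised NMF and then pans the two estimates to different stereo positions instead of discarding the noise. It is not the implementation evaluated in the paper: the KL-divergence multiplicative updates, the pre-trained speech basis `W_speech`, the number of noise components, the soft-mask reconstruction, the constant-power panning law, and all function names are assumptions made for this example only.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def nmf_kl(V, rank, W_fixed=None, n_iter=200, seed=0):
    """KL-divergence NMF via multiplicative updates (illustrative).

    If W_fixed is given, it fills the first columns of W and is not updated.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_fixed = 0 if W_fixed is None else W_fixed.shape[1]
    W = rng.random((n_freq, rank)) + EPS
    if W_fixed is not None:
        W[:, :k_fixed] = W_fixed
    H = rng.random((rank, n_frames)) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        WH = W @ H + EPS
        H *= (W.T @ (V / WH)) / (W.T @ ones + EPS)
        WH = W @ H + EPS
        W_new = W * ((V / WH) @ H.T) / (ones @ H.T + EPS)
        W[:, k_fixed:] = W_new[:, k_fixed:]  # only the free columns adapt
    return W, H


def separate(V, W_speech, n_noise=4):
    """Split spectrogram V into speech and noise estimates with a soft mask."""
    k_s = W_speech.shape[1]
    W, H = nmf_kl(V, k_s + n_noise, W_fixed=W_speech)
    S_hat = W[:, :k_s] @ H[:k_s]
    N_hat = W[:, k_s:] @ H[k_s:]
    mask = S_hat / (S_hat + N_hat + EPS)
    return mask * V, (1.0 - mask) * V


def pan_gains(pan):
    """Constant-power gains (left, right) for pan in [-1, 1]; 0 is centre."""
    theta = (pan + 1.0) * np.pi / 4.0
    return np.cos(theta), np.sin(theta)


def upmix(speech, noise, speech_pan=0.0, noise_pan=0.8):
    """Place the speech and noise estimates at different stereo positions."""
    sl, sr = pan_gains(speech_pan)
    nl, nr = pan_gains(noise_pan)
    return sl * speech + nl * noise, sr * speech + nr * noise
```

In a complete system the mask would typically be applied to the complex STFT of the noisy signal and the two estimates resynthesized to the time domain before panning; `upmix` is purely linear, so it can be applied to time-domain signals in the same way.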

Publication
Audio Engineering Society Conference: 53rd International Conference: Semantic Audio
