Archives of Acoustics, 36, 3, pp. 519–532, 2011

Speech Enhancement Based on the Multi-Scales and Multi-Thresholds of the Auditory Perception Wavelet Transform

Zhi TAO
School of Electronic Information; School of Physical Science and Technology

He-Ming ZHAO
School of Electronic Information

Xiao-Jun ZHANG
School of Physical Science and Technology

Di WU
School of Electronic Information; School of Physical Science and Technology

This paper proposes a speech enhancement method based on the multi-scale, multi-threshold auditory perception wavelet transform, suited to low SNR (signal-to-noise ratio) environments. The method reduces noise by thresholding the auditory perception wavelet transform coefficients of the speech signal, with the thresholds derived from the human ear's auditory masking effect. To prevent the loss of high-frequency components during noise suppression, we first make a voiced/unvoiced decision on the speech signal and then process the unvoiced and voiced segments with different thresholds and decision rules. Finally, we evaluate the enhanced speech with objective and subjective tests. The results show that, compared with spectral subtraction methods, our method keeps the unvoiced components intact while suppressing both residual noise and background noise, so the enhanced speech has better clarity and intelligibility.
Keywords: speech enhancement; low SNR; auditory perception wavelet transform; unvoiced enhancement; masking effect
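The sketch below illustrates the overall pipeline the abstract describes (voicing decision followed by segment-dependent wavelet thresholding), not the paper's actual method: it substitutes a standard discrete wavelet transform (PyWavelets 'db4') for the auditory perception wavelet transform, a simple energy/zero-crossing test for the paper's voicing decision, and a noise-scaled universal threshold for the masking-effect thresholds. The frame length, wavelet choice, and threshold scaling factors are illustrative assumptions only.

```python
# Minimal sketch: voiced/unvoiced-dependent wavelet-threshold denoising.
# Assumptions: standard DWT ('db4') instead of the auditory perception wavelet
# transform, and a universal threshold instead of masking-derived thresholds.
import numpy as np
import pywt

def is_voiced(frame, energy_thresh=0.01, zcr_thresh=0.25):
    """Crude voicing decision from short-time energy and zero-crossing rate."""
    energy = np.mean(frame ** 2)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0
    return energy > energy_thresh and zcr < zcr_thresh

def denoise_frame(frame, voiced, wavelet="db4", level=4):
    """Soft-threshold the detail coefficients; go easier on unvoiced frames."""
    coeffs = pywt.wavedec(frame, wavelet, level=level)
    # Noise level estimated from the finest detail band (median absolute deviation).
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    base = sigma * np.sqrt(2.0 * np.log(len(frame)))
    # Voiced frames tolerate stronger thresholding; unvoiced frames use a smaller
    # threshold so high-frequency consonant energy is not stripped away.
    scale = 1.0 if voiced else 0.4
    denoised = [coeffs[0]] + [pywt.threshold(c, scale * base, mode="soft")
                              for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[: len(frame)]

def enhance(signal, frame_len=512):
    """Frame-by-frame enhancement; any trailing partial frame is left untouched."""
    out = np.array(signal, dtype=float)
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = out[start:start + frame_len]
        out[start:start + frame_len] = denoise_frame(frame, is_voiced(frame))
    return out
```

Using a smaller threshold scale for unvoiced frames is one plausible way to realize the abstract's goal of preserving unvoiced components while still suppressing noise in voiced regions; the paper itself derives its thresholds from the auditory masking model rather than from a fixed scale factor.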