This is an Open Access publication published under CSC-OpenAccess Policy.
Teager Energy Operation on Wavelet Packet Coefficients for Enhancing Noisy Speech Using a Hard Thresholding Function
Tahsina Farah Sanam, Celia Shahnaz
Pages - 22 - 43     |    Revised - 15-03-2012     |    Published - 16-04-2012
Volume - 6   Issue - 2    |    Publication Date - April 2012
KEYWORDS
Teager Energy Operator, Statistical Modeling, Thresholding Function, Wavelet Packet Transform
ABSTRACT
This paper presents a new thresholding-based speech enhancement approach in which the threshold is determined statistically by applying the Teager energy operation to the wavelet packet (WP) coefficients of noisy speech. The resulting threshold is applied to the WP coefficients of the noisy speech through a hard thresholding function to obtain the enhanced speech. Detailed simulations in the presence of white, car, pink, and babble noise evaluate the performance of the proposed method. Standard objective measures, spectrogram representations, and subjective listening tests show that the proposed method outperforms existing state-of-the-art thresholding-based speech enhancement approaches for noisy speech from high to low SNR levels.
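The two building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1], and a conventional hard thresholding rule. The abstract does not state how the threshold is derived statistically from the Teager energy, so `placeholder_threshold` is a hypothetical stand-in, and the `subband` values are made-up numbers rather than real WP coefficients.

```python
from typing import List, Sequence


def teager_energy(x: Sequence[float]) -> List[float]:
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Assumption: samples outside the sequence are treated as zero,
    so the two edge values reduce to x[n]^2.
    """
    n_samples = len(x)
    energy = []
    for n in range(n_samples):
        prev_sample = x[n - 1] if n > 0 else 0.0
        next_sample = x[n + 1] if n < n_samples - 1 else 0.0
        energy.append(x[n] * x[n] - prev_sample * next_sample)
    return energy


def hard_threshold(coeffs: Sequence[float], threshold: float) -> List[float]:
    """Keep coefficients whose magnitude exceeds the threshold; zero the rest."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]


# Hypothetical coefficients of one WP subband (made-up numbers).
subband = [0.05, -0.9, 1.4, -0.1, 0.02, 0.7]

# Teager energy of the subband; in the paper this feeds the statistical
# threshold determination, which is not reproduced here.
energy = teager_energy(subband)

# Assumption: a fixed placeholder instead of the statistically derived value.
placeholder_threshold = 0.5
enhanced = hard_threshold(subband, placeholder_threshold)
print(enhanced)
```

In a full pipeline the coefficients would come from a wavelet packet decomposition of each noisy speech frame, and the thresholded coefficients would be passed through the inverse transform to reconstruct the enhanced signal.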
CITED BY (2)  
1 Singh, S., Tripathy, M., & Anand, R. S. (2015). Binary mask based method for enhancement of mixed noise speech of low SNR input. International Journal of Speech Technology, 18(4), 609-617.
2 Singh, S., Tripathy, M., & Anand, R. S. (2013, September). Noise removal in single channel Hindi speech patterns by using binary mask thresholding function in various mother wavelets. In Signal Processing, Computing and Control (ISPCC), 2013 IEEE International Conference on (pp. 1-4). IEEE.
Miss Tahsina Farah Sanam
Institute of Appropriate Technology, Bangladesh University of Engineering and Technology - Bangladesh
tahsina@iat.buet.ac.bd
Miss Celia Shahnaz
Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology - Bangladesh