This is an Open Access publication published under CSC-OpenAccess Policy.
Wavelet Based Noise Robust Features for Speaker Recognition
Vibha Tiwari, Jyoti Singhai
Pages - 52 - 64     |    Revised - 01-05-2011     |    Published - 31-05-2011
Volume - 5   Issue - 2    |    Publication Date - May / June 2011
KEYWORDS
Speaker Recognition, Mel Frequency Cepstral Coefficients (MFCC), Amplitude Modulation (AM), Wavelet Filterbank.
ABSTRACT
Extracting and selecting the best parametric representation of the acoustic signal is the most important task in designing any speaker recognition system. A wide range of possibilities exists for parametrically representing the speech signal, such as Linear Prediction Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC), and others. MFCC are currently the most popular choice for speaker recognition systems. One shortcoming of MFCC is that the signal is assumed to be stationary within each analysis frame, so the representation cannot capture non-stationary behaviour and is therefore poorly suited to noisy speech signals. To overcome this problem, several researchers have used different types of AM-FM modulation/demodulation techniques to extract features from the speech signal. Other approaches propose using wavelet filterbanks for feature extraction. This paper proposes a technique that combines the two approaches: features are extracted from the envelope of the signal, which is then passed through a wavelet filterbank. It is found that the proposed method outperforms the existing feature extraction techniques.
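The two stages named in the abstract (envelope extraction followed by a wavelet filterbank) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the envelope detector, the wavelet, the number of decomposition levels, or the final feature form. The sketch assumes a rectify-and-smooth envelope, a Haar wavelet, three dyadic levels, and log subband energies as features. All function names and parameters are hypothetical.

```python
import math

def envelope(signal, window=8):
    """Crude AM envelope (assumed detector): full-wave rectify, then
    smooth with a causal moving average of up to `window` samples."""
    rectified = [abs(x) for x in signal]
    out = []
    acc = 0.0
    for i, v in enumerate(rectified):
        acc += v
        if i >= window:
            acc -= rectified[i - window]  # drop the sample leaving the window
        out.append(acc / min(i + 1, window))
    return out

def haar_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_subbands(x, levels=3):
    """Dyadic filterbank: returns [detail_1, ..., detail_L, approx_L]."""
    bands = []
    current = list(x)
    for _ in range(levels):
        current, detail = haar_step(current)
        bands.append(detail)
    bands.append(current)
    return bands

def frame_features(frame, levels=3, eps=1e-12):
    """Log energy of each wavelet subband of the frame's envelope."""
    env = envelope(frame)
    bands = wavelet_subbands(env, levels)
    return [math.log(sum(c * c for c in b) + eps) for b in bands]

# Usage: one 256-sample frame of a synthetic amplitude-modulated tone
# (1 kHz carrier, 50 Hz modulator, 8 kHz sampling rate; all values are
# illustrative assumptions, not taken from the paper).
fs = 8000
frame = [
    (1 + 0.5 * math.sin(2 * math.pi * 50 * n / fs))
    * math.sin(2 * math.pi * 1000 * n / fs)
    for n in range(256)
]
features = frame_features(frame)  # one value per subband: levels + 1 entries
```

In a complete system the per-frame vectors would feed a speaker model. A practical implementation would likely use a Hilbert-transform or energy-operator envelope and a longer wavelet than Haar. Those choices are left open here because the abstract does not state them.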
CITED BY (5)  
1 Faek, F. K. (2015). Objective Gender and Age Recognition from Speech Sentences.
2 Farouk, M. H. (2014). Speaker Recognition. In Application of Wavelets in Speech Processing (pp. 33-35). Springer International Publishing.
3 Vignolo, L. D., Milone, D. H., & Rufiner, H. L. (2013). Genetic wavelet packets for speech recognition. Expert Systems with Applications, 40(6), 2350-2359.
4 Karamangala, N., & Kumaraswamy, R. (2013). Speaker Recognition in Uncontrolled Environment: A Review. Journal of Intelligent Systems, 22(1), 49-65.
5 Faek, F. K., & Al-Talabani, A. K. (2013). Speaker Recognition from Noisy Spoken Sentences. International Journal of Computer Applications, 70(20), 11-14.
Mr. Vibha Tiwari
Gyan Ganga Institute of Technology and Management, Bhopal, India
vibhatiwari19@gmail.com
Dr. Jyoti Singhai
Maulana Azad National Institute of Technology, Bhopal, India