Wavelet Based Noise Robust Features for Speaker Recognition
Vibha Tiwari, Jyoti Singhai
Pages - 52 - 64     |    Revised - 01-05-2011     |    Published - 31-05-2011
Volume - 5   Issue - 2    |    Publication Date - May / June 2011
KEYWORDS
Speaker Recognition, Mel Frequency Cepstral Coefficients (MFCC), Amplitude Modulation (AM), Wavelet Filterbank.
ABSTRACT
Extracting and selecting the best parametric representation of the acoustic signal is the most important task in designing any speaker recognition system. Many parametric representations of the speech signal exist, such as Linear Prediction Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC), and others. MFCC are currently the most popular choice for speaker recognition systems. One of their shortcomings, however, is that the signal is assumed to be stationary within a given time frame, so MFCC cannot analyze non-stationary signals and are therefore not well suited to noisy speech. To overcome this problem, several researchers have used different types of AM-FM modulation/demodulation techniques to extract features from the speech signal, and some approaches propose wavelet filterbanks for feature extraction. This paper proposes a technique that combines these two approaches: features are extracted from the envelope of the signal and then passed through a wavelet filterbank. The proposed method is found to outperform existing feature extraction techniques.
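The pipeline described in the abstract (signal envelope, then wavelet filterbank) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the demodulation method, the wavelet family, or the final feature form, so the rectify-and-smooth envelope detector, the Haar wavelet, the log subband energies, and all function names below are assumptions chosen for illustration.

```python
import math

def envelope(signal, window=8):
    """Amplitude envelope by full-wave rectification and a moving average.
    Assumption: a simple stand-in for the paper's AM demodulation step."""
    rectified = [abs(x) for x in signal]
    out = []
    running = 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            running -= rectified[i - window]  # drop the sample leaving the window
        out.append(running / min(i + 1, window))
    return out

def haar_step(samples):
    """One level of the Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(samples) - 1, 2)
    approx = [(samples[i] + samples[i + 1]) * s for i in pairs]
    detail = [(samples[i] - samples[i + 1]) * s for i in pairs]
    return approx, detail

def wavelet_filterbank(samples, levels=3):
    """Dyadic filterbank: detail bands from fine to coarse, then the
    final approximation band. Assumption: Haar wavelet, dyadic tree."""
    bands = []
    current = list(samples)
    for _ in range(levels):
        current, detail = haar_step(current)
        bands.append(detail)
    bands.append(current)
    return bands

def envelope_wavelet_features(frame, levels=3, window=8):
    """Log energy of each wavelet subband of the frame's envelope."""
    bands = wavelet_filterbank(envelope(frame, window), levels)
    # A small constant guards against log(0) for silent subbands.
    return [math.log(sum(x * x for x in band) + 1e-12) for band in bands]

# Usage: a 64-sample amplitude-modulated test tone (synthetic, not speech).
frame = [(1 + 0.5 * math.sin(2 * math.pi * i / 32)) * math.sin(2 * math.pi * i / 4)
         for i in range(64)]
features = envelope_wavelet_features(frame)  # one value per subband (levels + 1)
```

With `levels=3` the sketch yields four values per frame. A real system would likely use a smoother wavelet, a perceptually motivated band layout, and a decorrelating transform before classification.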
CITED BY (5)  
1 Faek, F. K. (2015). Objective Gender and Age Recognition from Speech Sentences.
2 Farouk, M. H. (2014). Speaker Recognition. In Application of Wavelets in Speech Processing (pp. 33-35). Springer International Publishing.
3 Vignolo, L. D., Milone, D. H., & Rufiner, H. L. (2013). Genetic wavelet packets for speech recognition. Expert Systems with Applications, 40(6), 2350-2359.
4 Karamangala, N., & Kumaraswamy, R. (2013). Speaker Recognition in Uncontrolled Environment: A Review. Journal of Intelligent Systems, 22(1), 49-65.
5 Faek, F. K., & Al-Talabani, A. K. (2013). Speaker Recognition from Noisy Spoken Sentences. International Journal of Computer Applications, 70(20), 11-14.
Mr. Vibha Tiwari
Gyan Ganga Institute of Technology and Management, Bhopal, India
vibhatiwari19@gmail.com
Dr. Jyoti Singhai
Maulana Azad National Institute of Technology, Bhopal, India