This is an Open Access publication published under CSC-OpenAccess Policy.
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
Prateek Srivastava, Reena Panda, Sankarsan Rauta
Pages - 128 - 139     |    Revised - 15-09-2012     |    Published - 24-10-2012
Volume - 6   Issue - 4    |    Publication Date - October 2012
KEYWORDS
Speaker Recognition, Gaussian Mixture Model, Cepstral Mean Subtraction, Mel Frequency Cepstral Coefficients, Gender Classification
ABSTRACT
An automatic speaker recognition system identifies an unknown speaker among several reference speakers by using speaker-specific information in their speech. In this paper, we introduce a novel, hierarchical, text-independent speaker recognition technique. Our baseline speaker recognition system, built using statistical modeling techniques, achieves an accuracy of 81% on the standard MIT database, and our baseline gender recognition system achieves an accuracy of 93.795%. We then propose and implement a novel state-space pruning technique that performs gender recognition before speaker recognition, so as to improve the accuracy and timeliness of the baseline speaker recognition system. Based on experiments conducted on the MIT database, we demonstrate that the proposed system improves accuracy over the baseline system by approximately 2%, while reducing computational time by more than 30%.
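The gender-first pruning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Gaussian mixture models have already been trained, skips feature extraction (MFCC, cepstral mean subtraction) and EM training entirely, and uses invented 2-D toy parameters. All names (`gmm_log_likelihood`, `identify_hierarchical`, `speaker_gender`, the speaker labels) are hypothetical.

```python
import math


def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of feature frames under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples, one per component.
    """
    total = 0.0
    for x in frames:
        comp_logs = []
        for weight, means, variances in gmm:
            log_p = math.log(weight)
            for xi, mu, var in zip(x, means, variances):
                log_p -= 0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
            comp_logs.append(log_p)
        # Log-sum-exp over components for numerical stability.
        peak = max(comp_logs)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp_logs))
    return total


def identify_hierarchical(frames, gender_models, speaker_models, speaker_gender):
    """Stage 1 picks a gender; stage 2 scores only speakers of that gender."""
    gender = max(gender_models,
                 key=lambda g: gmm_log_likelihood(frames, gender_models[g]))
    candidates = [s for s in speaker_models if speaker_gender[s] == gender]
    speaker = max(candidates,
                  key=lambda s: gmm_log_likelihood(frames, speaker_models[s]))
    return gender, speaker, len(candidates)


# Toy, hand-picked parameters (assumption: illustration only, not from the paper).
unit = (1.0, 1.0)
speaker_models = {
    "f1": [(1.0, (2.0, 3.0), unit)],
    "f2": [(1.0, (3.0, 1.5), unit)],
    "m1": [(1.0, (-2.0, -3.0), unit)],
    "m2": [(1.0, (-3.0, -1.0), unit)],
}
speaker_gender = {"f1": "female", "f2": "female", "m1": "male", "m2": "male"}
gender_models = {
    "female": [(0.5, (2.0, 3.0), unit), (0.5, (3.0, 1.5), unit)],
    "male": [(0.5, (-2.0, -3.0), unit), (0.5, (-3.0, -1.0), unit)],
}

test_frames = [(2.1, 2.9), (1.8, 3.2), (2.0, 3.1)]
result = identify_hierarchical(test_frames, gender_models, speaker_models, speaker_gender)
print(result)
```

In this sketch the second stage scores only the speakers whose gender label matches the first-stage decision, which is where a reduction in computation would come from. A wrong gender decision cannot be recovered in the second stage, so the accuracy of the gender classifier bounds the accuracy of the overall system.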
CITED BY (1)  
1 Sunitha, K. V., & Sharada, A. (2012). Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor. International Journal of Human Computer Interaction (IJHCI), 3(4), 83.
Mr. Prateek Srivastava
Advanced Micro Devices - India
prateek.k.srivastava@gmail.com
Miss Reena Panda
National Institute of Technology - India
Mr. Sankarsan Rauta
National Institute of Technology - India