Performance Comparison Of 2-D DCT On Full/Block Spectrogram And 1-D DCT on Row Mean of Spectrogram for Speaker Identification
H. B. Kekre, Tanuja Kiran Sarode, Shachi J. Natu, Prachi J. Natu
Pages - 100 - 112     |    Revised - 30-06-2010     |    Published - 10-08-2010
Volume - 4   Issue - 3    |    Publication Date - July 2010
KEYWORDS
Speaker Identification, Speaker Recognition, Spectrograms, DCT, Row Mean
ABSTRACT
This paper presents a simple approach to text-dependent speaker identification that combines spectrograms with the well-known Discrete Cosine Transform (DCT). The approach uses the DCT to find similarities between spectrograms obtained from speech samples. The set of spectrograms, rather than the raw speech samples, forms the database for our experiments. Performance is compared for different numbers of DCT coefficients in three configurations: DCT applied to the entire spectrogram, DCT applied to the spectrogram divided into blocks, and DCT applied to the Row Mean of the spectrogram. The comparison shows that the Row Mean method requires drastically fewer mathematical computations than the other two methods while achieving an almost equal identification rate.
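The three feature-extraction variants named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the orthonormal DCT-II scaling, the choice of keeping a `keep` x `keep` low-frequency corner, the block size, and all function names are assumptions made here, and the spectrogram is taken to be a plain list of rows whose dimensions are multiples of the block size.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence, computed directly in O(N^2)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def dct_2d(m):
    """Separable 2-D DCT: 1-D DCT on every row, then on every column."""
    rows = [dct_1d(r) for r in m]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def row_mean(m):
    """Average of each row, giving one value per spectrogram row."""
    return [sum(r) / len(r) for r in m]

def full_features(spec, keep):
    """Assumed selection: keep x keep low-frequency corner of the full 2-D DCT."""
    t = dct_2d(spec)
    return [t[r][c] for r in range(keep) for c in range(keep)]

def block_features(spec, block, keep):
    """2-D DCT per block x block tile, keeping keep x keep coefficients of each."""
    feats = []
    for r0 in range(0, len(spec), block):
        for c0 in range(0, len(spec[0]), block):
            tile = [row[c0:c0 + block] for row in spec[r0:r0 + block]]
            t = dct_2d(tile)
            feats.extend(t[r][c] for r in range(keep) for c in range(keep))
    return feats

def row_mean_features(spec, keep):
    """1-D DCT of the Row Mean vector, keeping the first `keep` coefficients."""
    return dct_1d(row_mean(spec))[:keep]

# Toy 4 x 4 "spectrogram" (made-up values, for illustration only).
spec = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 2.0, 2.0, 2.0],
    [0.0, 1.0, 0.0, 1.0],
    [4.0, 3.0, 2.0, 1.0],
]
```

For an N x N spectrogram, the separable 2-D transform above runs 2N one-dimensional DCTs of length N, whereas the Row Mean variant runs a single one after N row averages, which is consistent with the computational saving the abstract reports.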
CITED BY (18)  
1 Larcher, A., Lee, K. A., Ma, B., & Li, H. (2014). Text-dependent speaker verification: Classifiers, databases and RSR2015. Speech Communication, 60, 56-77.
2 Srinivas, V., Rani, C. S., & Madhu, T. (2013). Investigation of Decision Tree Induction, Probabilistic Technique and SVM For Speaker Identification.
3 Kekre, H. B., Sarode, T., Natu, S., & Natu, P. (2012). Performance Evaluation of Speaker Identification using Spectrogram and Hartley Transform with Cropped Feature Vector. International Journal of Computer Science and Information Security, 10(2), 102.
4 Kekre, H. B., Sarode, T. K., Natu, P. J., & Natu, S. J. (2012). Performance Evaluation of Face Recognition Technique using Hartley and DCT with Fractional Feature Vector. International Journal of Computer Applications, 41(8).
5 Kekre, H. B., Sarode, T. K., & Ansari, F. (2012, October). Performance evaluation of DCT, Walsh, Haar and Hartley transforms on whole images and partial coefficients in Image Classification. In Communication, Information & Computing Technology (ICCICT), 2012 International Conference on (pp. 1-6). IEEE.
6 Sidram, M. H., & Bhajantri, N. U. (2012). Normalized cross-correlation for tracking object and updating the template: exploration with extensive dataset. International Journal of Computer Science and Information Security, 10(2), 106.
7 Dey, N., Das, A., & Chaudhuri, S. S. (2012). Wavelet based normal and abnormal heart sound identification using spectrogram analysis. arXiv preprint arXiv:1209.1224.
8 Kekre, H. B., Kulkarni, V., Gaikar, P., & Gupta, N. (2012). Speaker Identification using Spectrograms of Varying Frame Sizes. International Journal of Computer Applications, 50(20).
9 Kekre, H. B., Sarode, T. K., Natu, P. J., & Natu, S. J. (2011, February). Transform based face recognition with partial and full feature vector using dct and walsh transform. In Proceedings of the International Conference & Workshop on Emerging Trends in Technology (pp. 1295-1300). ACM.
10 Kekre, H. B., Sarode, T. K., Natu, P. J., & Natu, S. J. (2011). Performance Comparison of Face Recognition using DCT and Walsh Transform with Full and Partial Feature Vector against KFCG VQ Algorithm. threshold, 4, 29.
11 Patil, S. G., & Mashette, G. S. Iris recognition by using partial coefficients of.
12 Dr. H. B. Kekre , Dr. T. K. Sarode , P. Bhatia , S. N. Nayak and D. J. Nagpal, “Iris Recognition using Partial Coefficients by applying Discrete Cosine Transform, Haar Wavelet and DCT Wavelet Transform” International Journal of Computer Applications 32(6), pp. 39-43, October 2011.
13 H.B. Kekre , T. K. Sarode and M. S. Ugale, “Performance Comparison of Image Classifier using Discrete Cosine Transform and Walsh Transform” in IJCA Proceedings on International Conference and workshop on Emerging Trends in Technology (ICWET) (4), 2011, pp. 14-20.
14 H. B. Kekre, T. K. Sarode and M. S. Ugale, “An Efficient Image Classifier Using Discrete Cosine Transform”, in Proceedings of the International Conference & Workshop on Emerging Trends in Technology, New York, NY, USA 2011.
15 Kekre, H. B., Sarode, T., Natu, P., & Natu, S. (2010, September). Performance Comparison of Face Recognition Using DCT Against Face Recognition Using Vector Quantization Algorithms LBG, KPE, KMCG, KFCG. International Journal of Image Processing (IJIP).
16 Kekre, H. B., Sarode, T. K., Natu, P. J., & Natu, S. J. LBG, KPE, KMCG, KFCG. International Journal Of Image Processing (IJIP), 4(4), 377.
17 Kekre, H. B., Sarode, T. K., Natu, S. J., & Natu, P. J. (2010). Speaker Identification Using 2-D DCT, Walsh And Haar On Full And Block Spectrogram. International Journal on Computer Science and Engineering, 2(5), 1733.
18 Dr. H. B. Kekre, Dr. T. K. Sarode, S. J. Natu and P. J. Natu, “Performance Comparison of Speaker Identification Using DCT, Walsh, Haar on Full and Row Mean of Spectrogram”, International Journal of Computer Applications, 5(6), pp. 30–37, August 2010.
Dr. H. B. Kekre
- India
Mr. Tanuja Kiran Sarode
Thadomal Shahani Engineering College - India
tanuja_0123@yahoo.com
Mr. Shachi J. Natu
- India
Mr. Prachi J. Natu
- India

