Parameters Optimization for Improving ASR Performance in Adverse Real World Noisy Environmental Conditions

Urmila Shrawankar, Vilas Thakare

Pages - 58 - 70 | Revised - 15-09-2012 | Published - 25-10-2012

Published in International Journal of Recent Trends in Human Computer Interaction (IJHCI)

Volume - 3 Issue - 3 | Publication Date - October 2012

KEYWORDS

ASR Performance, ASR Parameters Optimization, Multi-Environmental Training, Fuzzy Inference System for ASR, ubiquitous ASR system, Human Computer Interaction

ABSTRACT

Existing research offers many techniques and methodologies for every step of an Automatic Speech Recognition (ASR) system, but the performance of a methodology (minimizing the Word Error Rate, WER, and maximizing the Word Accuracy Rate, WAR) does not depend only on the technique applied. The research indicates that performance depends mainly on the category of the noise, the level of the noise, and the sizes of the window, frame, and frame overlap used in existing methods. The main aim of this paper is to vary parameters such as window size, frame size, and frame overlap percentage, to observe how algorithms perform for various categories of noise at different levels, and to train the system across all parameter sizes and categories of real-world noisy environments in order to improve the performance of the speech recognition system. The paper presents the results of Signal-to-Noise Ratio (SNR) and accuracy tests obtained with variable parameter sizes. It is observed that evaluating the test results and choosing the parameter sizes that optimize ASR performance is very hard. Hence, the study further suggests feasible and optimum parameter sizes using a Fuzzy Inference System (FIS) to enhance accuracy in adverse real-world noisy environmental conditions. This work will help in giving discriminative training to a ubiquitous ASR system for better Human Computer Interaction (HCI).
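The parameters named in the abstract (frame size, frame overlap percentage, window, and SNR) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of a Hamming window, and the SNR definition (clean-signal power over residual-noise power, in dB) are all assumptions made for illustration only.

```python
import math


def frame_signal(signal, frame_size, overlap_pct):
    """Split a list of samples into frames of `frame_size` samples.

    `overlap_pct` is the percentage of each frame shared with the next one.
    (Hypothetical helper; the paper does not specify its framing code.)
    """
    if frame_size < 1:
        raise ValueError("frame_size must be at least 1")
    if not 0 <= overlap_pct < 100:
        raise ValueError("overlap_pct must be in [0, 100)")
    # Hop size: how many samples the frame start advances each step.
    hop = max(1, int(round(frame_size * (1 - overlap_pct / 100.0))))
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frames.append(signal[start:start + frame_size])
    return frames


def hamming(n):
    """Return an n-point symmetric Hamming window (assumed window type)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def apply_window(frame, window):
    """Multiply a frame by a window, sample by sample."""
    return [x * w for x, w in zip(frame, window)]


def snr_db(clean, noisy):
    """SNR in dB, assuming noise = noisy - clean (an illustrative definition)."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((y - c) ** 2 for c, y in zip(clean, noisy))
    if noise_power == 0:
        return math.inf
    return 10 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    samples = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
    # Sweep a few illustrative parameter sizes and report the frame count.
    for frame_size in (32, 64):
        for overlap_pct in (25, 50):
            frames = frame_signal(samples, frame_size, overlap_pct)
            windowed = [apply_window(f, hamming(frame_size)) for f in frames]
            print(frame_size, overlap_pct, len(windowed))
```

Sweeping `frame_size` and `overlap_pct` in this way shows how quickly the number of candidate configurations grows, which is the kind of selection problem the abstract proposes to resolve with a Fuzzy Inference System.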

CITED BY (4)

1	Smruti, S., Sahoo, J., Dash, M., & Mohanty, M. N. (2015, January). An Approach to Design an Intelligent Parametric Synthesizer for Emotional Speech. In Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014 (pp. 367-374). Springer International Publishing.

2	Lanzola, G., Parimbelli, E., Micieli, G., Cavallini, A., & Quaglini, S. (2014). Data quality and completeness in a web stroke registry as the basis for data and process mining. Journal of healthcare engineering, 5(2), 163-184.

3	Shrawankar, U., & Thakare, V. (2013). An Adaptive Methodology for Ubiquitous ASR System. arXiv preprint arXiv:1303.3948.

4	Sunitha, K. V., & Sharada, A. (2012). Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor. International Journal of Human Computer Interaction (IJHCI), 3(4), 83.

ABSTRACTING & INDEXING

1	Google Scholar

2	CiteSeerX

3	refSeek

4	Scribd

5	SlideShare

6	PdfSR

MANUSCRIPT AUTHORS

Miss Urmila Shrawankar

G H Raisoni College of Engg., Nagpur - India

urmilas@rediffmail.com

Dr. Vilas Thakare

SGB Amravati University, Amravati - India
