A Comparative Study: Gammachirp Wavelets and Auditory Filter Using Prosodic Features of Speech Recognition In Noisy Environment

Hajer Rahali; Zied Hajaiej; Noureddine Ellouze

Pages - 25 - 37 | Revised - 31-03-2014 | Published - 30-04-2014

Published in International Journal of Computer Science and Security (IJCSS)

Volume - 8 Issue - 2 | Publication Date - April 2014

KEYWORDS

Gammachirp Filter, Wavelet Packet, MFCC, Impulsive Noise.

ABSTRACT

Modern automatic speech recognition (ASR) systems typically use a bank of linear filters as the first step in the frequency analysis of speech. The cochlea, which performs frequency analysis in the human auditory system, is instead known to have a compressive, non-linear frequency response that depends on the input stimulus level. This paper presents a new method that uses the gammachirp auditory filter within a continuous wavelet analysis. The essential characteristic of the model is that it analyzes the signal by a wavelet packet transform over frequency bands that approximate the critical bands of the ear, unlike the existing model, which relies on a short-term Fourier transform (STFT). Prosodic features such as pitch, formant frequency, jitter and shimmer are extracted from the fundamental frequency contour and added to baseline spectral features, specifically Mel Frequency Cepstral Coefficients (MFCC) for human speech, Gammachirp Filterbank Cepstral Coefficients (GFCC) and Gammachirp Wavelet Frequency Cepstral Coefficients (GWFCC). The results show that the gammachirp wavelet (GW) performs comparably to MFCC and GFCC, and the experiments indicate that this architecture gives the best performance. The paper implements the GW and examines its application to a specific speech example. Implications for noise-robust speech analysis are also discussed using the AURORA databases.
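
For readers unfamiliar with the gammachirp filter named in the abstract, the following is a minimal sketch of its impulse response in the standard form g(t) = t^(n-1) exp(-2π b ERB(fc) t) cos(2π fc t + c ln t + φ), with ERB(fc) = 24.7 + 0.108 fc. It is not the authors' implementation: the function names and the default values of n, b, c and the 25 ms duration are illustrative assumptions, and the wavelet packet stage described in the abstract is not reproduced here.

```python
import math

def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 + 0.108 * fc_hz

def gammachirp(fc_hz, fs_hz, duration_s=0.025, n=4, b=1.019, c=-2.0, phase=0.0):
    """Sampled gammachirp impulse response, normalised to unit peak amplitude.

    The defaults for n, b, c and duration_s are assumed, illustrative values,
    not parameters taken from the paper.
    """
    num_samples = round(duration_s * fs_hz)
    bandwidth = b * erb(fc_hz)
    samples = []
    # Start at the first sample after t = 0 because ln(t) is undefined at 0.
    for k in range(1, num_samples + 1):
        t = k / fs_hz
        envelope = t ** (n - 1) * math.exp(-2.0 * math.pi * bandwidth * t)
        carrier = math.cos(2.0 * math.pi * fc_hz * t + c * math.log(t) + phase)
        samples.append(envelope * carrier)
    peak = max(abs(v) for v in samples)
    return [v / peak for v in samples]

# Example: a 1 kHz channel sampled at 16 kHz.
response = gammachirp(1000.0, 16000.0)
```

Setting c = 0 removes the chirp term and reduces the response to a gammatone filter, which is one way to compare the two auditory filter shapes under the same envelope.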

ABSTRACTING & INDEXING

1	Google Scholar

2	CiteSeerX

3	refSeek

4	TechRepublic

5	Scribd

6	SlideShare

7	PdfSR

MANUSCRIPT AUTHORS

Miss Hajer Rahali

National Engineering School of Tunis (ENIT) Laboratory of Systems and Signal Processing (LSTS) BP 37, Le Belvédère, 1002 Tunis - Tunisia

hajer.rahali@enit.rnu.tn

Mr. Zied Hajaiej

National Engineering School of Tunis (ENIT) Laboratory of Systems and Signal Processing (LSTS) BP 37, Le Belvédère, 1002 Tunis - Tunisia

Dr. Noureddine Ellouze

National Engineering School of Tunis (ENIT) Laboratory of Systems and Signal Processing (LSTS) BP 37, Le Belvédère, 1002 Tunis - Tunisia
