A New Method for Pitch Tracking and Voicing Decision Based on Spectral Multi-Scale Analysis
Mohamed Anouar Ben Messaoud, Aicha Bouzid
Pages - 144 - 152     |    Revised - 30-10-2009     |    Published - 30-11-2009
Volume - 3   Issue - 5    |    Publication Date - November 2009
KEYWORDS
Speech, Wavelet transforms, Multi-scale, Pitch, Voicing detection
ABSTRACT
This paper proposes a new voicing detection and pitch estimation method that is particularly robust for noisy speech. The method is based on spectral analysis of the multi-scale product (MP) of the speech signal. The MP is the product of the signal's wavelet transform coefficients taken at several scales. The wavelet used is the quadratic spline function. We argue that spectral analysis of the MP reveals the pitch harmonics more accurately, even in heavily noisy conditions. We evaluate our approach on the Keele database. The experimental results show that the method is robust for noisy speech and performs well on clean speech in comparison with state-of-the-art algorithms.
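The two building blocks named in the abstract, the multi-scale product and its spectrum, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the filter taps are a low-pass/difference pair commonly associated with the quadratic spline wavelet in the undecimated scheme, the default of three scales is a guess, and the function names `multiscale_product` and `mp_spectrum` are hypothetical. The abstract does not describe how the voicing decision or the pitch value is derived from the spectrum, so that step is left out.

```python
import numpy as np

# Assumed taps (not given in the abstract): a smoothing filter and a
# first-difference filter often paired with the quadratic spline wavelet.
LOWPASS = np.array([1.0, 3.0, 3.0, 1.0]) / 8.0
DIFF = np.array([2.0, -2.0])


def _dilate(taps, level):
    """Insert 2**level - 1 zeros between taps (level 0 leaves them as-is)."""
    step = 2 ** level
    out = np.zeros((len(taps) - 1) * step + 1)
    out[::step] = taps
    return out


def multiscale_product(x, levels=3):
    """Sample-wise product of undecimated wavelet details over `levels` scales.

    Assumes the signal is longer than the widest dilated filter, so that
    np.convolve(..., mode="same") keeps the signal length.
    """
    approx = np.asarray(x, dtype=float)
    product = np.ones_like(approx)
    for level in range(levels):
        # Detail at this scale: dilated difference of the current smoothing.
        detail = np.convolve(approx, _dilate(DIFF, level), mode="same")
        product *= detail
        # Smooth further before moving to the next, coarser scale.
        approx = np.convolve(approx, _dilate(LOWPASS, level), mode="same")
    return product


def mp_spectrum(mp_frame, fs):
    """Magnitude spectrum of one Hann-windowed frame of the product."""
    frame = np.asarray(mp_frame, dtype=float)
    windowed = frame * np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    return freqs, magnitude
```

Calling `multiscale_product` on a frame of speech and passing the result to `mp_spectrum` with the sampling rate gives the frequency axis and magnitudes in which harmonic peaks would be searched. Because a sharp transition produces same-signed details at every scale while noise generally does not, the product tends to keep the transitions and attenuate the rest, which is the intuition behind using it for noisy speech.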
CITED BY (7)  
1 Bahja, F., Martino, J., Elhaj, E. I., & Aboutajdine, D. (2016). A corroborative study on improving pitch determination by time–frequency cepstrum decomposition using wavelets. SpringerPlus, 5(1), 1-17.
2 Acharya, P. Speech enhancement using unbiased normalized adaptive filtering technique.
3 Prameela, K., Kumar, M. A., Zia-Ur-Rahman, M., & Rao, B. R. M. (2011). Non Stationary Noise Removal from Speech Signals using Variable Step Size Strategy. International Journal of Computer Science & Communication Networks, 1(1).
4 Rahman, M. Z. U., Mohedden, S. K., Rao, B. R. M., Reddy, Y. J., & Karthik, G. V. S. (2011). Filtering Non-Stationary Noise in Speech Signals using Computationally Efficient Unbiased and Normalized Algorithm. International Journal on Computer Science and Engineering, ISSN, 0975-3397.
5 Karthik, G. V. S., Kumar, M. A., & Rahman, M. Z. U. (2011). Speech Enhancement Using Gradient Based Variable Step Size Adaptive Filtering Techniques. International Journal of Computer Science & Emerging Technologies (E-ISSN: 2044-6004), 2(1), 168-177.
6 Mohedden, S. K., Zia-Ur-Rahman, M., Krishna, K. M., & Rao, B. R. M. Battle Field Speech Enhancement using an Efficient Unbiased Adaptive Filtering Technique.
7 Messaoud, M. A. B., Bouzid, A., & Ellouze, N. (2010). Autocorrelation of the Speech Multi-Scale Product for Voicing Decision and Pitch Estimation. Cognitive Computation, 2(3), 151-159.
Mr. Mohamed Anouar Ben Messaoud
National School of Engineers of Tunis - Tunisia
anouar.benmessaoud@yahoo.fr
Associate Professor Aicha Bouzid
National School of Engineers of Tunis - Tunisia