Computational Model and Simulation of Articulatory Mechanism in Yoruba Voiced Speech

ADEGBITE Adewuyi Adetayo, ODEJOBI, Ajadi Odetunji, LAYENI, Olawanle P.

Pages - 40 - 57 | Revised - 30-11-2020 | Published - 31-12-2020

Published in International Journal of Software Engineering (IJSE)

Volume - 8 Issue - 4 | Publication Date - December 2020

KEYWORDS

Yoruba Vowels, Speech Production Process, Oral Cavity, Nasal Cavity, Articulatory Mechanism.

ABSTRACT

This study examined the physical, electrical and mathematical models used to describe the dynamics of voiced sound production. It formulated, designed and implemented a computational model for Standard Yoruba voiced sounds, with a view to developing a Yoruba speech recognition and text-to-speech model. The articulatory mechanism of human speech production documented in the existing literature was examined and analysed. The mechanical coupling in the vocal cords was studied for both the oral and the nasal cavity. A computational model defining real variables over the Standard Yoruba voiced speech production process was then formulated and designed as an algorithm. The design was implemented in Matlab, a numerical computing tool. The implemented novel model established that a greater volume of air over time is needed for nasal vowels than for oral vowels in the production of Yoruba voiced speech: a minimum of 238 cm³ of air is needed for the nasal cavity, while a maximum of 171 cm³ is needed for the oral cavity. It was deduced that a change in the damping coefficient along the vocal cord does not affect the response rate of the speech organ, whose rise time remains 0.9 ms, while a change in the spring constant changes the response rate parameters. When the damping coefficient is held constant (at either 0.1 or 0.2) over the points where the masses are positioned, speech production remains the same and is observed as normal, even if the vocal cords of different individuals are assigned different mass values. The study concluded by establishing a computational model for nasalized Yoruba vowels. This model can serve as a resource for Yoruba speech recognition and text-to-speech applications.
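The rise-time and spring-constant findings above refer to a mass-spring-damper description of the vocal cords. As a rough illustration only, the Python sketch below integrates a single-mass oscillator, m·x'' + b·x' + k·x = F, and measures its 10%-90% rise time. It is not the authors' multi-mass Matlab model: the function names, the single-mass simplification, the choice of rise-time definition and every parameter value are assumptions made here for illustration, so it is not expected to reproduce the 0.9 ms figure or the reported damping behaviour.

```python
def step_response(mass, damping, stiffness, force=1.0, dt=1e-6, t_end=0.05):
    """Integrate m*x'' + b*x' + k*x = F from rest using semi-implicit Euler.

    Returns two lists: sample times (s) and displacements (m).
    """
    x, v = 0.0, 0.0
    times, xs = [], []
    steps = int(t_end / dt)
    for i in range(steps):
        # Acceleration from Newton's second law for the driven oscillator.
        a = (force - damping * v - stiffness * x) / mass
        v += a * dt          # update velocity first ...
        x += v * dt          # ... then position with the new velocity
        times.append((i + 1) * dt)
        xs.append(x)
    return times, xs


def rise_time(times, xs, final_value):
    """Return the 10%-90% rise time in seconds, or None if never reached."""
    t10 = next((t for t, x in zip(times, xs) if x >= 0.1 * final_value), None)
    t90 = next((t for t, x in zip(times, xs) if x >= 0.9 * final_value), None)
    if t10 is None or t90 is None:
        return None
    return t90 - t10


if __name__ == "__main__":
    # Hypothetical values, NOT taken from the paper.
    mass = 1e-4      # kg
    damping = 0.02   # N*s/m
    force = 1.0      # N, constant driving force
    for stiffness in (80.0, 160.0):  # N/m
        t, x = step_response(mass, damping, stiffness, force)
        # The steady-state displacement of the driven oscillator is F / k.
        rt = rise_time(t, x, force / stiffness)
        print(f"k = {stiffness:6.1f} N/m -> rise time = {rt * 1e3:.3f} ms")
```

In this simplified setting a stiffer spring raises the natural frequency and shortens the rise time, which is the kind of spring-constant sensitivity the abstract describes. Any statement about the effect of damping would need the paper's full coupled model.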

MANUSCRIPT AUTHORS

Mr. ADEGBITE Adewuyi Adetayo

Department of Computer Science, Adekunle Ajasin University, Akungba-Akoko - Nigeria

adewuyi.adegbite@gmail.com

Professor ODEJOBI, Ajadi Odetunji

Department of Computer Science and Engineering, Obafemi Awolowo University, Ile-Ife - Nigeria

Dr. LAYENI, Olawanle P.

Department of Mathematics, Obafemi Awolowo University, Ile-Ife - Nigeria
