Hybrid Phonemic and Graphemic Modeling for Arabic Speech Recognition
Mohamed Elmahdy, Mark Hasegawa-Johnson, Eiman Mustafawi
Pages - 88 - 96     |    Revised - 15-11-2012     |    Published - 31-12-2012
Volume - 3   Issue - 1    |    Publication Date - October 2012
Keywords: Arabic, Acoustic Modeling, Pronunciation Modeling, Speech Recognition
In this research, we propose a hybrid approach to acoustic and pronunciation modeling for Arabic speech recognition. The hybrid approach benefits from both vocalized and non-vocalized Arabic resources, based on the fact that non-vocalized resources are always more plentiful than vocalized ones. Two baseline speech recognition systems were built: phonemic and graphemic. The two baseline acoustic models were trained independently and then fused to create a hybrid acoustic model. Pronunciation modeling was also hybrid: graphemic pronunciation variants were generated alongside phonemic variants. Different pronunciation modeling techniques are proposed to reduce model complexity. Experiments were conducted in the large-vocabulary broadcast news speech domain. The proposed hybrid approach achieved a relative reduction in word error rate (WER) of 8.8% to 12.6%, depending on the pronunciation modeling settings and the supervision in the baseline systems.
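The idea of a hybrid pronunciation lexicon can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `g:` prefix used to keep grapheme units apart from phoneme units, and the toy transliterated entries are all assumptions made for illustration. Each word receives a graphemic variant (one unit per letter), and words found in a vocalized dictionary also receive their phonemic variants.

```python
# Illustrative sketch only: names, the "g:" prefix, and the toy data are
# assumptions, not taken from the paper.

def graphemic_pronunciation(word):
    """Return one unit per letter, prefixed so grapheme units stay
    distinct from phoneme units in a combined unit inventory."""
    return tuple("g:" + letter for letter in word)


def hybrid_lexicon(vocabulary, phonemic_dict):
    """Build a lexicon mapping each word to its pronunciation variants.

    Every word gets a graphemic variant. Words present in phonemic_dict
    (a hypothetical vocalized dictionary) also get their phonemic variants.
    """
    lexicon = {}
    for word in vocabulary:
        variants = [graphemic_pronunciation(word)]
        for pron in phonemic_dict.get(word, []):
            pron = tuple(pron)
            if pron not in variants:  # skip duplicate variants
                variants.append(pron)
        lexicon[word] = variants
    return lexicon


# Toy example with transliterated, non-vocalized word forms.
phonemic_dict = {
    "ktb": [["k", "a", "t", "a", "b", "a"], ["k", "u", "t", "i", "b", "a"]],
}
lexicon = hybrid_lexicon(["ktb", "qtr"], phonemic_dict)
for word, variants in lexicon.items():
    for variant in variants:
        print(word, " ".join(variant))
```

In this sketch a word missing from the vocalized dictionary (`qtr` here) is still covered by its graphemic variant, which is one way a lexicon could draw on non-vocalized resources.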
