Rule-Based Standard Arabic Phonetization at Phoneme, Allophone, and Syllable Level
Fadi Sindran, Firas Mualla, Tino Haderlein, Khaled Daqrouq, Elmar Nöth
Pages - 23 - 37     |    Revised - 31-10-2016     |    Published - 01-12-2016
Volume - 7   Issue - 2    |    Publication Date - December 2016
Phonetization, Standard Arabic, Phonetic Transcription, Pronunciation Dictionaries, Transcription Rules.
Phonetization is the transcription from written text into sounds. It is used in many natural language processing tasks, such as speech processing, speech synthesis, and computer-aided pronunciation assessment. A common phonetization approach is the use of letter-to-sound rules developed by linguists for the transcription from grapheme to sound. In this paper, we address the problem of rule-based phonetization of standard Arabic. The contributions of the paper can be summarized as follows: 1) We discuss the transcription rules of standard Arabic that have been used in the literature at the phonemic and phonetic levels. 2) We suggest improvements to existing rules and introduce new rules. Moreover, we propose a comprehensive algorithm covering the phenomenon of pharyngealization in standard Arabic. The resulting rule set has been tested on large datasets. 3) We present a reliable automatic phonetic transcription of standard Arabic at five levels: phoneme, allophone, syllable, word, and sentence. An encoding that covers all sounds of standard Arabic is proposed, and several pronunciation dictionaries have been automatically generated. These dictionaries have been manually verified, yielding an accuracy higher than 99% for standard Arabic texts that do not contain dates, numbers, acronyms, abbreviations, or special symbols. The dictionaries are available for research purposes.
Mr. Fadi Sindran
Friedrich-Alexander-Universität Erlangen-Nürnberg/Department of Computer Science 5 - Germany
Mr. Firas Mualla
Faculty of Engineering/Department of Computer Science/Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg - Germany
Dr. Tino Haderlein
Faculty of Engineering/Department of Computer Science/Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg - Germany
Professor Khaled Daqrouq
Department of Electrical and Computer Engineering, King Abdulaziz University - Saudi Arabia
Professor Elmar Nöth
Faculty of Engineering/Department of Computer Science/Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg - Germany