Automatic Arabic Dialect Identification Systems for Written Texts: A Survey
Maha Jarallah Althobaiti
Pages - 61 - 89     |    Revised - 31-10-2020     |    Published - 01-12-2020
Volume - 11   Issue - 3    |    Publication Date - December 2020
KEYWORDS
Arabic Dialect Identification, Traditional Machine Learning, Deep Learning, Feature Engineering Techniques, Benchmark Corpora, Arabic Natural Language Processing.
ABSTRACT
Arabic dialect identification is a natural language processing task that aims to automatically predict the Arabic dialect of a given text. It is the first step in various natural language processing applications, such as machine translation, multilingual text-to-speech synthesis, and cross-language text generation. Consequently, interest in the problem has increased over the last decade. In this paper, we present a comprehensive survey of research on Arabic dialect identification in written texts. We first define the problem and its challenges. The survey then critically and extensively discusses many aspects of the Arabic dialect identification task. We review the traditional machine learning methods, deep learning architectures, and complex learning approaches applied to Arabic dialect identification. We also detail the features and feature representation techniques used to train the proposed systems. Moreover, we illustrate the taxonomy of Arabic dialects studied in the literature, the various levels of text processing at which Arabic dialect identification is conducted (e.g., token, sentence, and document level), and the available annotated resources, including evaluation benchmark corpora. Open challenges and issues are discussed at the end of the survey.
Dr. Maha Jarallah Althobaiti
Department of Computer Science, Taif University, Taif, 21944-11099 - Saudi Arabia
mahajk2011@hotmail.com, maha.j@tu.edu.sa