This is an Open Access publication published under CSC-OpenAccess Policy.
Automatic Arabic Dialect Identification Systems for Written Texts: A Survey
Maha Jarallah Althobaiti
Pages - 61 - 89     |    Revised - 31-10-2020     |    Published - 01-12-2020
Volume - 11   Issue - 3    |    Publication Date - December 2020
KEYWORDS
Arabic Dialect Identification, Traditional Machine Learning, Deep Learning, Feature Engineering Techniques, Benchmark Corpora, Arabic Natural Language Processing.
ABSTRACT
Arabic dialect identification is a natural language processing task that aims to automatically predict the Arabic dialect of a given text. It is the first step in various natural language processing applications, such as machine translation, multilingual text-to-speech synthesis, and cross-language text generation. Consequently, interest in the problem has increased over the last decade. In this paper, we present a comprehensive survey of research on Arabic dialect identification in written texts. We first define the problem and its challenges. The survey then critically discusses many aspects of the Arabic dialect identification task. We review the traditional machine learning methods, deep learning architectures, and complex learning approaches applied to Arabic dialect identification. We also detail the features and feature representation techniques used to train the proposed systems. Moreover, we present the taxonomy of Arabic dialects studied in the literature, the levels of text processing at which Arabic dialect identification is conducted (e.g., token, sentence, and document level), and the available annotated resources, including evaluation benchmark corpora. Open challenges and issues are discussed at the end of the survey.
Dr. Maha Jarallah Althobaiti
Department of Computer Science, Taif University, Taif, 21944-11099 - Saudi Arabia
mahajk2011@hotmail.com, maha.j@tu.edu.sa