Exploring Twitter as a Source of an Arabic Dialect Corpus
Areej Odah Alshutayri, Eric Atwell
Pages - 37 - 44     |    Revised - 30-04-2017     |    Published - 01-06-2017
Published in International Journal of Computational Linguistics (IJCL)
Volume - 8   Issue - 2    |    Publication Date - June 2017
KEYWORDS
Dialectal Arabic, Phonological Variations, Social Media, Multi Dialect, Twitter, Tweet.
ABSTRACT
Arabic dialect text corpora are scarce compared with those available for dialects of English and other languages, so there is a need to create dialect text corpora for use in Arabic natural language processing. Moreover, Arabic dialects are increasingly used in social media, which makes social media text an appropriate source for such a corpus. We collected 210,915K tweets from five groups of Arabic dialects: Gulf, Iraqi, Egyptian, Levantine, and North African. This paper explores Twitter as a source and describes the methods we used to extract tweets and classify them according to the geographic location of the sender. We classified Arabic dialects using the Waikato Environment for Knowledge Analysis (WEKA) data analytics tool, which contains many alternative filters and classifiers for machine learning. Our approach to classifying tweets achieved an accuracy of 79%.
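The location-based labelling step described in the abstract might be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the `text` and `country` field names, the country-to-dialect table, and both helper functions are hypothetical, and the classification reported in the paper was carried out in WEKA rather than with this code.

```python
# Hypothetical mapping from sender country to one of the five dialect groups
# named in the abstract. The country assignments are an illustrative assumption.
DIALECT_BY_COUNTRY = {
    "saudi arabia": "Gulf", "kuwait": "Gulf", "qatar": "Gulf",
    "bahrain": "Gulf", "united arab emirates": "Gulf", "oman": "Gulf",
    "iraq": "Iraqi",
    "egypt": "Egyptian",
    "syria": "Levantine", "lebanon": "Levantine",
    "jordan": "Levantine", "palestine": "Levantine",
    "morocco": "North African", "algeria": "North African",
    "tunisia": "North African", "libya": "North African",
}


def label_by_location(tweets):
    """Return copies of the tweets whose country maps to a dialect group,
    each with an added "dialect" key. Tweets without a known country are dropped."""
    labelled = []
    for tweet in tweets:
        # Normalise the (assumed) country field; tolerate missing or None values.
        country = (tweet.get("country") or "").strip().lower()
        dialect = DIALECT_BY_COUNTRY.get(country)
        if dialect is not None:
            labelled.append({**tweet, "dialect": dialect})
    return labelled


def accuracy(predicted, gold):
    """Fraction of predicted labels that match the gold labels."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Placeholder records standing in for collected tweets (not real data).
sample_tweets = [
    {"text": "tweet one", "country": "Egypt"},
    {"text": "tweet two", "country": " Iraq "},
    {"text": "tweet three", "country": "France"},
    {"text": "tweet four"},
]
labelled_tweets = label_by_location(sample_tweets)
```

Labels produced this way could serve as the class attribute when exporting the tweets for a classifier, and `accuracy` shows how a figure such as the reported 79% is computed from predicted and reference labels.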
ABSTRACTING & INDEXING
1 Google Scholar 
2 BibSonomy 
3 ResearchGate 
4 White Rose Research Online 
5 Scribd 
6 SlideShare 
MANUSCRIPT AUTHORS
Mrs. Areej Odah Alshutayri
Faculty of Computing and Information Technology, King Abdul Aziz University, Jeddah, Saudi Arabia, and School of Computing, University of Leeds, Leeds, LS2 9JT, United Kingdom
aalshetary@kau.edu.sa
Associate Professor Eric Atwell
School of Computing, University of Leeds, Leeds, LS2 9JT, United Kingdom

