This is an Open Access publication published under CSC-OpenAccess Policy.
A New Concept Extraction Method for Ontology Construction From Arabic Text
Abeer Alarfaj, Abdulmalik Alsalman
Pages - 1 - 21     |    Revised - 31-12-2019     |    Published - 01-02-2020
Volume - 9   Issue - 1    |    Publication Date - February 2020
KEYWORDS
Ontology Construction, Arabic Ontology, Arabic Language Processing, Concept Extraction, Arabic Term Extraction, Specific Domain Corpus.
ABSTRACT
Ontology is one of the most popular models for knowledge representation, sharing, and reuse. The Arabic language has complex morphological, grammatical, and semantic aspects, and this complexity makes automatic Arabic terminology extraction difficult. Concept extraction from Arabic documents is a further challenge because, unlike term extraction, concept extraction is more domain related and more selective. In this paper, we present a new concept extraction method for Arabic ontology construction, which is part of our ontology construction framework. The method, which extracts domain-relevant single-word and multi-word concepts, has been proposed, implemented, and evaluated. It combines linguistic information, statistical information, and domain knowledge. It first uses linguistic patterns based on POS tags to extract concept candidates, and then applies a stop-word filter to remove unwanted strings. To determine the relevance of these candidates within the domain, different statistical measures and a new domain relevance measure are implemented for the first time for the Arabic language. To enhance the performance of concept extraction, domain knowledge is integrated into the module. Concept scores are calculated from their statistical values and domain knowledge values. To evaluate the performance of the method, precision scores were calculated. The results show that the proposed approach is highly effective at extracting concepts for Arabic ontology construction.
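The pipeline outlined in the abstract (POS-pattern candidate extraction, stop-word filtering, domain relevance scoring, and precision-based evaluation) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The POS patterns, the stop-word list, and the domain relevance formula (relative frequency in a domain corpus contrasted with a general corpus) are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical POS patterns; the paper's actual patterns are not given in the abstract.
PATTERNS = {("NOUN",), ("NOUN", "NOUN"), ("NOUN", "ADJ")}

# Illustrative stop-word list only (Arabic prepositions "in", "from", "on").
STOP_WORDS = {"في", "من", "على"}


def extract_candidates(tagged_tokens, patterns=PATTERNS, stop_words=STOP_WORDS):
    """Return every n-gram whose POS sequence matches a pattern
    and which contains no stop word.

    tagged_tokens is a list of (word, pos_tag) pairs.
    """
    candidates = []
    max_len = max(len(p) for p in patterns)
    for start in range(len(tagged_tokens)):
        for length in range(1, max_len + 1):
            window = tagged_tokens[start:start + length]
            if len(window) < length:
                break  # ran past the end of the sentence
            words = tuple(word for word, _ in window)
            tags = tuple(tag for _, tag in window)
            if tags in patterns and not any(w in stop_words for w in words):
                candidates.append(" ".join(words))
    return candidates


def domain_relevance(term, domain_counts, contrast_counts):
    """Assumed measure: the term's relative frequency in the domain corpus
    divided by the sum of its relative frequencies in both corpora.
    The result lies in [0, 1]; higher means more domain specific.
    """
    domain_total = sum(domain_counts.values()) or 1
    contrast_total = sum(contrast_counts.values()) or 1
    d = domain_counts.get(term, 0) / domain_total
    c = contrast_counts.get(term, 0) / contrast_total
    return d / (d + c) if (d + c) > 0 else 0.0


def precision(extracted, gold):
    """Fraction of distinct extracted concepts that appear in the gold set."""
    extracted = set(extracted)
    if not extracted:
        return 0.0
    return len(extracted & set(gold)) / len(extracted)
```

In use, a POS-tagged sentence is passed to `extract_candidates`, each candidate is scored with `domain_relevance` using frequency counts (for example `Counter` objects) from a domain corpus and a contrast corpus, and the top-ranked candidates are compared with an expert-validated list via `precision`. The abstract also mentions combining statistical values with domain knowledge values; how those are weighted is not stated there, so it is left out of this sketch.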
Dr. Abeer Alarfaj
Department of Computer Sciences, College of Computer and Information Sciences, Princess Nora Bint AbdulRahman University - Saudi Arabia
aaalarfaj@pnu.edu.sa
Dr. Abdulmalik Alsalman
Department of Computer Science, College of Computer and Information Sciences, King Saud University - Saudi Arabia