|
| Ontology Based Approach for Classifying Biomedical Text Abstracts
|
|
|
|
Source |
International Journal of Data Engineering (IJDE) |
|
|
Volume: 2 Issue: 1 |
| |
Pages: 1-26 |
|
Publication
Date: March / April 2011 |
|
ISSN
(Online): 2180-1274 |
|
|
|
|
|
Pages |
1 - 15 |
|
Author(s) |
|
|
|
Published
Date |
04-04-2011 |
|
Publisher |
CSC
Journals, Kuala Lumpur,
Malaysia |
|
| |
|
| |
KEYWORDS: Biomedical Literature , Feature Selection, Hierarchical Text Classification, Ontology Alignment, Text Mining |
|
|
| |
|
|
| This Manuscript is indexed in the following databases/websites:- |
|
| 1. Scribd |
| 2. Docstoc |
| 3. Google Scholar |
| |
|
| |
|
|
| Classifying biomedical literature is a challenging task, especially when a large number of biomedical articles must be organized into a hierarchical structure. To address this problem, researchers have proposed various classification methods to help users find relevant articles on the web. In this paper, we propose a new approach to classifying a collection of biomedical text abstracts using an ontology alignment algorithm that we have developed. To accomplish our goal, we construct the OHSUMED disease hierarchy as the initial training hierarchy and the Medline abstract disease hierarchies as our testing hierarchy. To enrich our training hierarchy, we use the relevant features extracted from selected categories in the OHSUMED dataset as feature vectors. These feature vectors are then mapped to each node or concept in the OHSUMED disease hierarchy according to their specific category. Afterward, we align and match the concepts in both hierarchies using our ontology alignment algorithm to find probable concepts or categories. Subsequently, we compute the cosine similarity score between the feature vectors of the probable concepts in the "enriched" OHSUMED disease hierarchy and in the Medline abstract disease hierarchy. Finally, we assign a category to each new Medline abstract based on the highest cosine similarity score. The experimental results demonstrate that our proposed approach for hierarchical classification performs slightly better than multi-class flat classification. |
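The final two steps of the abstract (scoring by cosine similarity, then choosing the highest-scoring category) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine_similarity` and `predict_category`, the sparse dictionary representation of feature vectors, and the toy category names and term weights are all assumptions made for illustration. The ontology alignment step that produces the candidate concepts is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def predict_category(abstract_vec, candidate_concepts):
    """Return (category, score) for the best-scoring candidate concept.

    candidate_concepts maps a category name to its feature vector; in the
    paper's setting these would be the probable concepts found by alignment
    (hypothetical input here).
    """
    best_category, best_score = None, -1.0
    for category, concept_vec in candidate_concepts.items():
        score = cosine_similarity(abstract_vec, concept_vec)
        if score > best_score:
            best_category, best_score = category, score
    return best_category, best_score


# Toy data, invented for illustration only.
candidates = {
    "Neoplasms": {"tumor": 2.0, "cancer": 3.0},
    "Heart Diseases": {"cardiac": 2.0, "heart": 3.0},
}
abstract = {"cancer": 1.0, "therapy": 1.0}

category, score = predict_category(abstract, candidates)
print(category, round(score, 3))
```

Restricting the comparison to candidate concepts, rather than scoring every node in the hierarchy, is what distinguishes this step from a flat nearest-centroid classifier.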
| |
|
| |
|
| |
| 1 |
A. M. Cohen. An effective general purpose approach for automated biomedical document classification. AMIA Annual Symposium Proceeding, 2006:161-165, 2006 |
|
|
| 2 |
A. Sun and E. Lim. Hierarchical text classification and evaluation. In Proceeding of the IEEE International Conference on Data Mining. Washington DC, USA, 2001 |
|
|
| 3 |
F. M. Couto, B. Martins and M. J. Silva. Classifying biological articles using web sources. In Proceedings of the ACM Symposium on Applied Computing. Nicosia, Cyprus, 2004 |
|
|
| 4 |
A. Singh and K. Nakata. Hierarchical classification of web search results using personalized ontologies. In Proceedings of the 3rd International Conference on Universal Access in Human-Computer Interaction. Las Vegas, NV, 2005 |
|
|
| 5 |
A. Pulijala and S. Gauch. Hierarchical text classification. In Proceedings of the International Conference on Cybernetics and Information Technologies (CITSA). Orlando, FL, 2004 |
|
|
| 6 |
S. Gauch, A. Chandramouli and S. Ranganathan. Training a hierarchical classifier using inter-document relationships. Technical Report, ITTC-FY2007-TR-31020-01, August 2006 |
|
|
| 7 |
M. E. Ruiz and P. Srinivasan. Hierarchical text categorization using neural networks. Information Retrieval, 5(1):87-118, 2002 |
|
|
| 8 |
T. Li, S. Zhu and M. Ogihara. Hierarchical document classification using automatically generated hierarchy. Journal of Intelligent Information Systems, 29(2):211-230, 2007 |
|
|
| 9 |
K. Deschacht and M. F. Moens. Efficient hierarchical entity classifier using conditional random fields. In Proceedings of the 2nd Workshop on Ontology Learning and Population. Sydney, Australia, 2006 |
|
|
| 10 |
G. R. Xue, D. Xing, Q. Yang and Y. Yu. Deep classification in large-scale text hierarchies. In Proceeding of the 31st Annual International ACM SIGIR Conference. Singapore, 2008 |
|
|
| 11 |
G. Nenadic, S. Rice, I. Spasic, S. Ananiadou and B. Stapley. Selecting text features for gene name classification: from documents to terms. In Proceedings of the ACL 2003 workshop on Natural language processing in biomedicine, PA, USA, 2003 |
|
|
| 12 |
Y. Wang and Z. Gong. Hierarchical classification of web pages using support vector machine. Lecture Notes in Computer Science, Springer, 5362/2008:12-21, 2008 |
|
|
| 13 |
S. Dumais and H. Chen. Hierarchical classification of web content. In Proceedings of 23rd ACM International Conference on Research and Development in Information Retrieval. Athens, Greece, 2000 |
|
|
| 14 |
T. Y. Liu, Y. Yang, H. Wan, H. J. Zeng, Z. Chen and W. Y. Ma. Support vector machines classification with a very large-scale taxonomy. ACM SIGKDD Explorations Newsletter Natural language processing and text mining, 7(1):36-43, 2005 |
|
|
| 15 |
G. Nenadic and S. Ananiadou. Mining semantically related terms from biomedical literature. Journal of ACM Transactions on Asian Language Information Processing, 5(1):22-43, 2006 |
|
|
| 16 |
M.H. Seddiqui and M. Aono. An efficient and scalable algorithm for segmented alignment of ontologies of arbitrary size. Web Semantics: Science, Services and Agents on the World Wide Web, 7:344-356, 2009 |
|
|
| 17 |
OHSUMED dataset. Dataset available at http://davis.wpi.edu/xmdv/datasets/ohsumed.html, 2005 |
|
|
| 18 |
Medical Subject Heading (MeSH) tree structures. Available at http://www.nlm.nih.gov/mesh/trees.html, 2010 |
|
|
| 19 |
C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2007 |
|
|
|
| Rozilawati Binti Dollah
|
|
| Masaki Aono
|
|