This is an Open Access publication published under CSC-OpenAccess Policy.
Ontology Based Approach for Classifying Biomedical Text Abstracts
Rozilawati Binti Dollah, Masaki Aono
Pages - 1 - 15     |    Revised - 31-03-2011     |    Published - 04-04-2011
Volume - 2   Issue - 1    |    Publication Date - March / April 2011
KEYWORDS
Biomedical Literature, Feature Selection, Hierarchical Text Classification, Ontology Alignment, Text Mining
ABSTRACT
Classifying biomedical literature is a challenging task, especially when a large number of biomedical articles must be organized into a hierarchical structure. To address this problem, many researchers have proposed classification methods that help users find relevant articles on the web. In this paper, we propose a new approach to classifying a collection of biomedical text abstracts using an ontology alignment algorithm that we have developed. To accomplish our goal, we construct the OHSUMED disease hierarchy as the initial training hierarchy and the Medline abstract disease hierarchy as our testing hierarchy. To enrich our training hierarchy, we use the relevant features extracted from selected categories in the OHSUMED dataset as feature vectors. These feature vectors are then mapped to each node or concept in the OHSUMED disease hierarchy according to their specific category. Afterward, we align and match the concepts in both hierarchies using our ontology alignment algorithm to find probable concepts or categories. Subsequently, we compute the cosine similarity score between the feature vectors of the probable concepts in the "enriched" OHSUMED disease hierarchy and in the Medline abstract disease hierarchy. Finally, we assign each new Medline abstract to the category with the highest cosine similarity score. The results obtained from the experiments demonstrate that our proposed approach for hierarchical classification performs slightly better than multi-class flat classification.
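The final two steps of the abstract (scoring by cosine similarity and choosing the highest-scoring category) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sparse term-weight dictionaries, the category labels, and the toy weights below are all assumptions made for illustration, and the ontology alignment step that produces the list of probable concepts is taken as given.

```python
import math
from typing import Dict, Iterable, Optional, Tuple

# A sparse feature vector: term -> weight (hypothetical representation).
Vector = Dict[str, float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def predict_category(
    abstract_vec: Vector,
    concept_vectors: Dict[str, Vector],
    probable_concepts: Iterable[str],
) -> Tuple[Optional[str], float]:
    """Return the probable concept whose vector scores highest, with its score."""
    best_concept: Optional[str] = None
    best_score = -1.0
    for concept in probable_concepts:
        score = cosine_similarity(abstract_vec, concept_vectors[concept])
        if score > best_score:
            best_concept, best_score = concept, score
    return best_concept, best_score


# Toy data (invented): feature vectors attached to two training concepts.
concept_vectors = {
    "Neoplasms": {"tumor": 0.9, "cancer": 0.8, "cell": 0.3},
    "Virus Diseases": {"virus": 0.9, "infection": 0.7, "cell": 0.2},
}
# Feature vector of a new abstract (invented weights).
new_abstract = {"tumor": 1.0, "cell": 1.0}

# Assume the alignment step proposed both concepts as probable.
category, score = predict_category(
    new_abstract, concept_vectors, ["Neoplasms", "Virus Diseases"]
)
```

Restricting the comparison to the probable concepts, rather than scoring every node in the hierarchy, is the design choice the abstract attributes to the alignment step; in this sketch that restriction is simply the list passed as the third argument.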
CITED BY (4)  
1 Parlak, B., & Uysal, A. K. (2015, May). Classification of medical documents according to diseases. In Signal Processing and Communications Applications Conference (SIU), 2015 23th (pp. 1635-1638). IEEE.
2 Lim, J. H., & Lee, K. C. (2015). Classifying Biomedical Literature Providing Protein Function Evidence. ETRI Journal, 37(4), 813-823.
3 binti Dollah, R., & Aono, M. (2014). Employing Ontology Enrichment Algorithm in Classifying Biomedical Text Abstracts.
4 Su, Y. R., Wang, R. J., Chen, P., Wei, Y. Y., Li, C. X., & Hu, Y. M. (2012). Agricultural ontology based feature optimization for agricultural text clustering. Journal of Integrative Agriculture, 11(5), 752-759.
Mr. Rozilawati Binti Dollah
Toyohashi University of Technology - Japan
rozeela@kde.cs.tut.ac.jp
Mr. Masaki Aono
- Japan