This is an Open Access publication published under CSC-OpenAccess Policy.
Named Entity Recognition for Telugu Using Conditional Random Field
G.V.S.Raju, B.Srinivasu, S. Viswanadha Raju, Allam Balaram
Pages: 36-44 | Revised: 30-11-2010 | Published: 20-12-2010
Volume: 1 | Issue: 3 | Publication Date: December 2010
KEYWORDS
Named entity, Conditional Random Field, NER, Telugu
ABSTRACT
Named Entity (NE) recognition is the task of extracting proper nouns and numerical information from documents and classifying them into predefined categories such as person names, organization names, location names, and miscellaneous (dates and others). It is a key technology for Information Extraction, Question Answering, Machine Translation, Information Retrieval, and related applications. This paper reports on the development of an NER system for Telugu using Conditional Random Fields (CRF). Although this state-of-the-art machine learning technique has been widely applied to NER in several well-studied languages, its application to Telugu is very new. The system uses contextual information about the words, along with a variety of features that help predict the four NE classes: person name, location name, organization name, and miscellaneous (dates and others).
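The contextual information mentioned in the abstract can be illustrated with a minimal sketch of window-based feature extraction, the kind of per-token input a CRF tagger typically consumes. Everything here is an assumption made for illustration: the function names `token_features` and `sentence_features`, the window size, the prefix/suffix lengths, the `<BOS>`/`<EOS>` padding markers, and the romanized example tokens are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int, window: int = 2) -> Dict[str, str]:
    """Build a feature dict for tokens[i] from the word and its neighbours.

    Hypothetical feature set for illustration only; the paper's actual
    features are not listed on this page.
    """
    word = tokens[i]
    feats = {
        "word": word,
        "prefix3": word[:3],               # first three characters
        "suffix3": word[-3:],              # last three characters
        "is_digit": str(word.isdigit()),   # crude cue for dates and numbers
    }
    # Contextual features: surrounding words within the window.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        key = f"word[{offset:+d}]"
        if j < 0:
            feats[key] = "<BOS>"           # position before sentence start
        elif j >= len(tokens):
            feats[key] = "<EOS>"           # position after sentence end
        else:
            feats[key] = tokens[j]
    return feats


def sentence_features(tokens: List[str], window: int = 2) -> List[Dict[str, str]]:
    """Feature dicts for every token in one sentence."""
    return [token_features(tokens, i, window) for i in range(len(tokens))]


if __name__ == "__main__":
    # Romanized placeholder sentence, not data from the paper.
    example = ["Raju", "Hyderabad", "lo", "unnaru"]
    for feats in sentence_features(example, window=1):
        print(feats)
```

Each sentence becomes a list of feature dicts, one per token, which a CRF toolkit would pair with a label sequence over the four NE classes plus a non-entity label during training.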
CITED BY (2)  
1 Kulkarni, S., & Sagar, B. M. (2014). A Survey on Named Entity Recognition for South Indian Languages.
2 Althobaiti, M., Kruschwitz, U., & Poesio, M. (2012, September). Identifying named entities on a university intranet. In Computer Science and Electronic Engineering Conference (CEEC), 2012 4th (pp. 94-99). IEEE.
Professor G.V.S.Raju
IIET - India
letter2raju@gmail.com
Associate Professor B.Srinivasu
IIET - India
S. Viswanadha Raju
India
Assistant Professor Allam Balaram
India