Named Entity Recognition System for Hindi Language: A Hybrid Approach
Shilpi Srivastava, Mukund Sanglikar, D.C Kothari
Pages - 10 - 23     |    Revised - 01-07-2011     |    Published - 05-08-2011
Volume - 2   Issue - 1    |    Publication Date - July / August 2011
Named Entity Recognition (NER) is an important early step in Natural Language Processing (NLP) tasks such as machine translation, text-to-speech synthesis, and natural language understanding. It seeks to classify words that represent names in text into predefined categories such as location, person name, organization, date, and time. This paper introduces a hybrid approach to NER for Hindi that combines machine learning with a rule-based approach. We have experimented with statistical approaches, namely Conditional Random Fields (CRF) and Maximum Entropy (MaxEnt), and with a rule-based approach built on a set of linguistic rules. The linguistic approach plays a vital role in overcoming the limitations of statistical models for a morphologically rich language like Hindi. The system also uses a voting method to improve NER performance.

Keywords: NER, MaxEnt, CRF, Rule base, Voting, Hybrid approach
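
The abstract does not say how the voting method combines the CRF, MaxEnt, and rule-based outputs. As an illustration only, the sketch below assumes simple per-token majority voting, with ties broken by a caller-supplied priority order. The function name `vote_tags`, the tagger names, and the tag labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def vote_tags(predictions, priority):
    """Combine per-token tags from several taggers by majority vote.

    predictions: dict mapping a tagger name to its list of tags,
                 one tag per token (all lists must be the same length).
    priority:    list of tagger names; on a tie, the tag proposed by the
                 earliest tagger in this list wins (an assumed tie-break).
    """
    lengths = {len(tags) for tags in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all taggers must label the same number of tokens")
    n_tokens = lengths.pop()

    voted = []
    for i in range(n_tokens):
        # Count how many taggers proposed each tag for token i.
        counts = Counter(predictions[name][i] for name in priority)
        best = max(counts.values())
        # Pick the first tagger (in priority order) whose tag has the top count.
        for name in priority:
            tag = predictions[name][i]
            if counts[tag] == best:
                voted.append(tag)
                break
    return voted


# Hypothetical outputs of three taggers for a four-token sentence.
example = {
    "crf":    ["PER", "O", "LOC", "O"],
    "maxent": ["PER", "O", "O",   "O"],
    "rules":  ["O",   "O", "LOC", "DATE"],
}
combined = vote_tags(example, priority=["crf", "maxent", "rules"])
```

In this sketch the priority list doubles as the tie-break policy, so preferring the rule-based tagger for a given category would only require reordering that list. Whether the paper's system votes per token, per entity, or with weights is not stated in the abstract.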
