Morphological Reconstruction for Word Level Script Identification.
B. V. Dhandra, Mallikarjun Hangarge
Pages - 41 - 51     |    Revised - 15-06-2007     |    Published - 30-06-2007
Volume - 1   Issue - 1    |    Publication Date - June 2007
Script identification, Bilingual documents, OCR, Morphological reconstruction, Regional descriptors
A line of a bilingual document page may contain words in a regional language alongside numerals in English. For Optical Character Recognition (OCR) of such a page, the different scripts must be identified before the appropriate script-specific OCR system is run. In this paper, we use morphological opening by reconstruction of the image in different directions, together with regional descriptors, for script identification at the word level, based on the observation that every script has a distinct visual appearance. The proposed system is developed for three major types of Indian bilingual documents: Kannada, Telugu and Devnagari, each containing English numerals. The nearest neighbour and k-nearest neighbour algorithms are applied to classify new word images. The proposed algorithm is tested on 2625 words with various font styles and sizes. The results obtained are quite encouraging.
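As a rough illustration of the two ingredients named in the abstract, the following minimal sketch applies binary opening by reconstruction along four directions to a word image and classifies the resulting feature vector with a k-nearest neighbour vote. It is not the authors' implementation: the abstract does not give the structuring element lengths, the exact directions, or the regional descriptors, so the line length of 3, the four directions, the retained-pixel ratio used as a feature, and all function names here are assumptions made for illustration.

```python
from collections import Counter, deque

# Assumed stroke directions as (row, col) steps: horizontal, vertical, two diagonals.
DIRECTIONS = {"h": (0, 1), "v": (1, 0), "d1": (1, 1), "d2": (1, -1)}


def erode_line(img, step, length):
    """Binary erosion by a line of `length` pixels along `step`.

    A pixel survives only if the whole line starting at it lies on foreground.
    """
    rows, cols = len(img), len(img[0])
    dr, dc = step

    def fits(r, c):
        for k in range(length):
            rr, cc = r + k * dr, c + k * dc
            if not (0 <= rr < rows and 0 <= cc < cols and img[rr][cc]):
                return False
        return True

    return [[1 if fits(r, c) else 0 for c in range(cols)] for r in range(rows)]


def reconstruct(marker, mask):
    """Reconstruction by dilation: grow the marker inside the mask (8-connectivity)."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if marker[r][c] and mask[r][c]:
                out[r][c] = 1
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc] and not out[rr][cc]:
                    out[rr][cc] = 1
                    queue.append((rr, cc))
    return out


def opening_by_reconstruction(img, step, length):
    """Keep every connected component that contains a line of `length` along `step`."""
    return reconstruct(erode_line(img, step, length), img)


def directional_features(img, length=3):
    """One feature per direction: fraction of foreground pixels that survive."""
    total = sum(map(sum, img))
    feats = []
    for step in DIRECTIONS.values():
        kept = opening_by_reconstruction(img, step, length)
        feats.append(sum(map(sum, kept)) / total if total else 0.0)
    return feats


def knn_classify(sample, training, k=1):
    """Majority vote among the k training vectors closest in squared Euclidean distance."""
    def dist(item):
        return sum((a - b) ** 2 for a, b in zip(sample, item[0]))

    nearest = sorted(training, key=dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

With k=1 the classifier reduces to the nearest neighbour rule, so a single function covers both classifiers mentioned above. In practice the word images would first be binarised and segmented, and the feature vector would likely be extended with further regional descriptors.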
Mr. B. V. Dhandra
- India
Mr. Mallikarjun Hangarge
- India