This is an Open Access publication published under CSC-OpenAccess Policy.
Morphological Reconstruction for Word Level Script Identification.
B. V. Dhandra, Mallikarjun Hangarge
Pages - 41 - 51     |    Revised - 15-06-2007     |    Published - 30-06-2007
Volume - 1     |    Issue - 1     |    Publication Date - June 2007
KEYWORDS
Script identification, Bilingual documents, OCR, Morphological reconstruction, regional descriptors
ABSTRACT
A line in a bilingual document page may contain words in a regional language alongside numerals in English. For Optical Character Recognition (OCR) of such a page, the different scripts must be identified before the appropriate script-specific OCR system is run. In this paper, we use morphological opening by reconstruction of an image in different directions, together with regional descriptors, for script identification at the word level, based on the observation that every script has a distinct visual appearance. The proposed system is developed for three major Indian bilingual document types: Kannada, Telugu and Devnagari, each containing English numerals. The nearest neighbour and k-nearest neighbour algorithms are applied to classify new word images. The proposed algorithm is tested on 2625 words with various font styles and sizes. The results obtained are quite encouraging.
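The two ingredients named in the abstract, directional opening by reconstruction and nearest-neighbour classification, can be illustrated with a minimal sketch. This is not the authors' implementation: the binary word image is assumed to be a list of 0/1 rows, the line structuring elements, their length, and the "fraction of pixels retained" feature are illustrative assumptions standing in for the paper's regional descriptors, and the function names (`erode_line`, `reconstruct`, `directional_features`, `knn_predict`) are hypothetical.

```python
from collections import Counter, deque

# Assumed line structuring elements, given as (row step, column step).
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal_down": (1, 1),
    "diagonal_up": (-1, 1),
}


def erode_line(img, step, length):
    """Mark pixels where a run of `length` foreground pixels starts along `step`."""
    rows, cols = len(img), len(img[0])
    dr, dc = step
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ok = True
            for i in range(length):
                rr, cc = r + i * dr, c + i * dc
                if not (0 <= rr < rows and 0 <= cc < cols and img[rr][cc]):
                    ok = False
                    break
            out[r][c] = 1 if ok else 0
    return out


def reconstruct(marker, mask):
    """Binary reconstruction: grow the marker inside the mask (8-connectivity)."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if marker[r][c] and mask[r][c]:
                out[r][c] = 1
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and mask[rr][cc] and not out[rr][cc]):
                    out[rr][cc] = 1
                    queue.append((rr, cc))
    return out


def directional_features(img, length):
    """Fraction of foreground pixels kept by opening by reconstruction, per direction."""
    total = sum(map(sum, img))
    feats = []
    for step in DIRECTIONS.values():
        opened = reconstruct(erode_line(img, step, length), img)
        feats.append(sum(map(sum, opened)) / total if total else 0.0)
    return feats


def knn_predict(train, query, k=1):
    """Majority label among the k training vectors closest to `query`.

    `train` is a list of (feature_vector, label) pairs; k=1 gives the
    nearest neighbour rule.
    """
    ranked = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In use, each training word image would be turned into a feature vector with `directional_features(word, length)` and stored with its script label, and a new word would be labelled with `knn_predict(train, directional_features(new_word, length), k)`. Opening by reconstruction keeps whole connected components that contain a stroke of at least `length` pixels in the given direction, which is why the retained fraction differs between scripts with different dominant stroke orientations.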
CITED BY (5)  
1 Aparna, R. R., & Radha, R. (2014). Script Identification In Trilingual Indian Documents. International Journal of Image Processing (IJIP), 8(4), 178.
2 Singh, S., Kumar, A., Shaw, D. K., & Ghosh, D. (2014, February). Script separation in machine printed bilingual (Devnagari and Gurumukhi) documents using morphological approach. In Communications (NCC), 2014 Twentieth National Conference on (pp. 1-5). IEEE.
3 Abel, K. (2013). Benefits of shifting freight delivery to night time, considering routing and environmental effects for Addis Ababa city (Doctoral dissertation, AAU).
4 Abebayehu, S. (2012). Amharic-English Script Identification in Real-Life Document Images (Doctoral dissertation, AAU).
5 Pal, U., Jayadevan, R., & Sharma, N. (2012). Handwriting recognition in Indian regional scripts: a survey of offline techniques. ACM Transactions on Asian Language Information Processing (TALIP), 11(1), 1.
1 Google Scholar
2 Academic Journals Database
3 ScientificCommons
4 Academic Index
5 CiteSeerX
6 refSeek
7 iSEEK
8 Socol@r
9 ResearchGATE
10 Libsearch
11 Bielefeld Academic Search Engine (BASE)
12 Scribd
13 WorldCat
14 SlideShare
15 PDFCAST
16 PdfSR
17 Google Books
Mr. B. V. Dhandra
- India
dhandra_b_v@yahoo.co.in
Mr. Mallikarjun Hangarge
- India