Morphological Reconstruction for Word Level Script Identification
Source: International Journal of Computer Science and Security (IJCSS)
Volume: 1, Issue: 1, Pages: 1-96
Publication Date: June 2007
ISSN (Online): 1985-1553
Pages: 41-51
Author(s): B. V. Dhandra, Mallikarjun Hangarge
Published Date: 30-06-2007
Publisher: CSC Journals, Kuala Lumpur, Malaysia
KEYWORDS: Script identification, Bilingual documents, OCR, Morphological reconstruction, Regional descriptors
This manuscript is indexed in the following databases/websites:
1. Directory of Open Access Journals (DOAJ)
2. CiteSeerX
3. Docstoc
4. Scribd
5. PDFCAST
6. Google Scholar
7. WorldCat
8. ScientificCommons
9. Bielefeld Academic Search Engine (BASE)
10. Academic Index
11. refSeek
12. ResearchGATE
13. Microsoft Academic Search
14. Socol@r
15. iSEEK
16. Academic Journals Database
17. Libsearch
18. slideshare
19. Google Books
ABSTRACT: A line of a bilingual document page may contain text words in a regional language and numerals in English. For Optical Character Recognition (OCR) of such a page, the script of each word must be identified before the word is passed to the OCR system for that script. In this paper, we use morphological opening by reconstruction of the word image in different directions, together with regional descriptors, as features for script identification at the word level, based on the observation that the text of each script has a distinct visual appearance. The proposed system is developed for bilingual documents in three major Indian scripts, Kannada, Telugu and Devnagari, each containing English numerals. The nearest neighbour and k-nearest neighbour algorithms are applied to classify new word images. The proposed algorithm is tested on 2625 words with various font styles and sizes. The results obtained are quite encouraging.
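The pipeline described in the abstract can be illustrated with a minimal sketch in pure Python. This is not the authors' implementation. The abstract does not give the structuring elements, the set of directions, the regional descriptors, or the distance measure, so everything below is an assumption made for illustration: a binary word image stored as a list of lists, line structuring elements in four directions, one feature per direction (the fraction of foreground pixels that survive opening by reconstruction), and a Euclidean k-nearest neighbour vote. The names `DIRECTIONS`, `erode_line`, `reconstruct`, `directional_features` and `knn_predict` are hypothetical.

```python
from collections import deque

# Assumed directions (row step, column step) for the line structuring element.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diag_down": (1, 1),
    "diag_up": (-1, 1),
}


def erode_line(img, length, step):
    """Binary erosion by a line of `length` pixels along `step`.

    A pixel is kept only if the whole line starting at it lies on foreground.
    """
    rows, cols = len(img), len(img[0])
    dr, dc = step
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(all(
                0 <= r + k * dr < rows
                and 0 <= c + k * dc < cols
                and img[r + k * dr][c + k * dc]
                for k in range(length)
            ))
    return out


def reconstruct(marker, mask):
    """Binary reconstruction by dilation: grow `marker` inside `mask`
    using 8-connectivity until no more pixels can be added."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            if marker[r][c] and mask[r][c]:
                out[r][c] = 1
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and mask[rr][cc] and not out[rr][cc]):
                    out[rr][cc] = 1
                    queue.append((rr, cc))
    return out


def directional_features(img, length):
    """One feature per direction: the fraction of foreground pixels that
    survive opening by reconstruction (erosion followed by reconstruction)."""
    total = sum(map(sum, img))
    feats = []
    for step in DIRECTIONS.values():
        opened = reconstruct(erode_line(img, length, step), img)
        feats.append(sum(map(sum, opened)) / total if total else 0.0)
    return feats


def knn_predict(train, query, k=1):
    """Majority vote among the k training vectors nearest to `query`
    (squared Euclidean distance). `train` is a list of (features, label)."""
    ranked = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

With `k=1` the vote reduces to the nearest neighbour rule. In a fuller system, the feature vector of each segmented word image would also include the regional descriptors, and `knn_predict` would be called against labelled training words of each script.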
CITED BY:
1. S. Mansor, R. bin Din and A. Samsudin, "Analysis of Natural Language Steganography", International Journal of Computer Science and Security (IJCSS), 3(2), pp. 113-125, 2009
B. V. Dhandra : Colleagues
Mallikarjun Hangarge : Colleagues