This is an Open Access publication published under CSC-OpenAccess Policy.
A Novel Approach for Bilingual (English - Oriya) Script Identification and Recognition in a Printed Document
Sanghamitra Mohanty, Himadri Nandini Das Bebartta
Pages: 175-191 | Revised: 30-4-2010 | Published: 10-06-2010
Volume: 4 | Issue: 2 | Publication Date: May 2010
KEYWORDS
Script separation, Indian script, Bilingual (English-Oriya) OCR, Horizontal profiles
ABSTRACT
English words are often interspersed with Indian-language text in official papers and school textbooks. An Optical Character Recognition (OCR) system is therefore needed that can recognize such bilingual documents and store them for future use. In this paper we present an OCR system developed for printed documents that contain an Indian script, Oriya, together with Roman script. For this purpose, the scripts must be separated before they are fed to their individual OCR systems. First, the skew is corrected, and then the document is segmented. We propose line-wise script differentiation that relies on the upper and lower matras, which are associated with Oriya and absent in English. A horizontal histogram is used to distinguish lines belonging to different scripts. After separation, the scripts are sent to their individual recognition engines.
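The line-wise idea in the abstract can be illustrated with a minimal Python sketch. It assumes a page that is already binarized and deskewed, builds a horizontal histogram (ink pixels per row), cuts the page into text lines at empty rows, and measures how much of each line's ink falls in its upper and lower zones, where matras would appear. The function names, the equal-thirds zoning, and the `outer_threshold` value are illustrative assumptions, not the rule used in the paper.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary page image: 1 = ink pixel, 0 = background


def horizontal_profile(image: Image) -> List[int]:
    """Horizontal histogram: number of ink pixels in every pixel row."""
    return [sum(row) for row in image]


def segment_lines(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges of text lines; bottom is exclusive.

    A text line is a maximal run of rows holding at least `min_ink` ink pixels.
    """
    lines: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = y                      # a text line begins
        elif count < min_ink and start is not None:
            lines.append((start, y))       # the text line ends at a blank row
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


def zone_ink_fractions(
    profile: List[int], top: int, bottom: int
) -> Tuple[float, float, float]:
    """Share of a line's ink in its upper, middle and lower zones.

    Assumption: the zones are equal thirds of the line height; a real system
    would locate the zone boundaries from the profile itself.
    """
    height = bottom - top
    upper_end = top + height // 3
    lower_start = bottom - height // 3
    upper = sum(profile[top:upper_end])
    middle = sum(profile[upper_end:lower_start])
    lower = sum(profile[lower_start:bottom])
    total = upper + middle + lower
    if total == 0:
        return (0.0, 0.0, 0.0)
    return (upper / total, middle / total, lower / total)


def guess_script(
    fractions: Tuple[float, float, float], outer_threshold: float = 0.35
) -> str:
    """Hypothetical rule: heavy ink in the outer zones suggests matras (Oriya).

    `outer_threshold` is an illustrative value, not a figure from the paper.
    """
    upper, _, lower = fractions
    return "oriya" if upper + lower >= outer_threshold else "english"
```

A caller would compute `horizontal_profile(image)` once, pass it to `segment_lines`, and then call `zone_ink_fractions` and `guess_script` for each returned line range. English ascenders and descenders also place ink in the outer zones, so any threshold of this kind would need tuning on real scans.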
Mr. Sanghamitra Mohanty
- India
sangham1@rediffmail.com
Himadri Nandini Das Bebartta