Header Based Classification of Journals Using Document Image Segmentation and Extreme Learning Machine
Kalpana S, Vijaya MS
Pages - 245 - 254     |    Revised - 10-07-2014     |    Published - 10-08-2014
Volume - 8   Issue - 5    |    Publication Date - September / October 2014
KEYWORDS
Classification, Document Segmentation, Feature Extraction, Extreme Learning Machine.
ABSTRACT
Document image segmentation plays an important role in the classification of journals, magazines, newspapers, and similar publications. It is the process of splitting a document into distinct regions. Document layout analysis is a key process for identifying and categorizing the regions of interest in the scanned image of a text document. A reading system requires text zones to be separated from non-textual ones and arranged in their correct reading order. Detected text zones are then labelled according to the logical roles they play inside the document, such as titles, captions, and footnotes. This research work proposes a new approach that segments the document and classifies journals based on the header block. Documents collected from different journals are used as input images. Each image is segmented into blocks such as heading, header, author name, and footer using the Particle Swarm Optimization algorithm, and features are extracted from the header block using the Gray Level Co-occurrence Matrix. An Extreme Learning Machine is used for classification based on the header blocks and obtains 82.3% accuracy.
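The last two stages named in the abstract, Gray Level Co-occurrence Matrix features followed by an Extreme Learning Machine, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function and class names (`glcm`, `glcm_features`, `ELM`), the choice of four texture features, the number of gray levels, the pixel offset, and the hidden-layer size are all assumptions made for illustration. The Particle Swarm Optimization segmentation step is omitted, so the input is assumed to be an already cropped header block quantized to integer gray levels.

```python
import numpy as np

def glcm(image, levels=8, dx=1, dy=0):
    """Normalized co-occurrence matrix for one non-negative offset (dx, dy).

    `image` must hold integer gray levels in the range [0, levels).
    """
    img = np.asarray(image, dtype=int)
    h, w = img.shape
    # Pair each pixel with its neighbour at (row + dy, col + dx).
    src = img[0:h - dy, 0:w - dx]
    dst = img[dy:h, dx:w]
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (src.ravel(), dst.ravel()), 1.0)
    return counts / counts.sum()

def glcm_features(p):
    """Contrast, energy, homogeneity and entropy of a normalized GLCM."""
    i, j = np.indices(p.shape)
    contrast = np.sum(p * (i - j) ** 2)
    energy = np.sum(p ** 2)
    homogeneity = np.sum(p / (1.0 + (i - j) ** 2))
    nonzero = p[p > 0]
    entropy = -np.sum(nonzero * np.log2(nonzero))
    return np.array([contrast, energy, homogeneity, entropy])

class ELM:
    """Single-hidden-layer ELM: random hidden weights, least-squares output."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the fixed random projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        self.classes_ = np.unique(y)
        # One-hot targets, one column per journal class.
        T = (np.asarray(y)[:, None] == self.classes_[None, :]).astype(float)
        # Hidden weights are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        # Output weights come from the Moore-Penrose pseudoinverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]
```

In use, each header block would be mapped to a feature vector with `glcm_features(glcm(block))`, the vectors stacked into a matrix `X` with one journal label per row in `y`, and the classifier trained with `ELM().fit(X, y)`. Only the output weights are solved for, which is what makes ELM training a single linear least-squares step rather than an iterative optimization.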
Miss Kalpana S
Research Scholar, PSGR Krishnammal College for Women, Coimbatore, India
kalpana.msccs@gmail.com
Miss Vijaya MS
Associate Professor, PSGR Krishnammal College for Women, Coimbatore, India

