Separation of mixed Document Images in Farsi Scanned Documents Using Blind Source Separation
Hossein Ghanbarloo, Farbod Razzazi, Shahpur Alirezaei
Pages - 421 - 435     |    Revised - 30-08-2010     |    Published - 30-10-2010
Volume - 4   Issue - 4    |    Publication Date - October 2010
Blind Source Separation, Independent Component Analysis, Show-through, Feed-through, Background Removal, Scanned Document Processing
In the field of mixed scanned document separation, various studies have been carried out to reduce one or more unwanted artifacts in the document. Most of the approaches are based on a comparison of the front and back sides of the document. In some cases, analysis of the color images has been suggested; however, because of their computational complexity, these approaches are of limited use in practical applications. Furthermore, none of them has been tested on Farsi/Arabic documents. In this paper, an approach applicable to large images is presented, based on image block segmentation (mosaicing). The advantages of this approach are lower memory usage, a combination of simultaneous and ordinal blind source separation methods to increase the efficiency of the algorithm, a reduction of the computational complexity to about twenty percent of that of the basic algorithm, and high stability on noisy images. In noiseless conditions, the average signal-to-noise ratio of the output images reaches up to 28.75 dB. Furthermore, all of these cases have been tested on official Farsi documents. By applying the suggested ideas, considerable accuracy is achieved in the results with minimal processing time. In addition, various parameters of the proposed algorithm (e.g. the size of each block, the appropriate initial point, and the number of iterations) were optimized.
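As a rough illustration of two ideas mentioned in the abstract, the sketch below tiles a large image into blocks and computes a signal-to-noise ratio in decibels. It is only a sketch under stated assumptions: the function names `block_bounds` and `snr_db` are hypothetical, the SNR formula is the common energy-ratio definition (the abstract does not state which definition the authors use), and the blind source separation step itself is not shown.

```python
import math


def block_bounds(height, width, block_size):
    """Yield (top, bottom, left, right) for tiles covering a height x width image.

    Assumption: plain non-overlapping tiles; edge tiles may be smaller.
    """
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            yield (top, min(top + block_size, height),
                   left, min(left + block_size, width))


def snr_db(reference, estimate):
    """SNR in dB between a reference image and an estimate (flat pixel sequences).

    Assumption: SNR = 10 * log10(signal energy / error energy).
    """
    signal = sum(r * r for r in reference)
    noise = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise == 0:
        return math.inf  # the estimate matches the reference exactly
    return 10.0 * math.log10(signal / noise)


# Tile a 5 x 7 image with blocks of side 4; edge blocks are clipped.
blocks = list(block_bounds(5, 7, 4))

# Compare a flat reference patch with a slightly perturbed estimate.
example_snr = snr_db([10, 10, 10, 10], [9, 11, 10, 10])
```

Processing one block at a time keeps only a small part of the image in memory, which is consistent with the lower memory usage the abstract attributes to block segmentation.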
