1. Introduction
Document image analysis proceeds in two stages: text segmentation and text recognition. The main aim of segmentation is therefore to localize text regions so that they can subsequently be recognized (Sahare & Dhok, 2018a, 2019a). Printed document images (e.g. bank drafts, stamp papers, etc.) contain text with uniform inter- and intra-spacing, and their characters have definite height and width. Consequently, parameters such as centroids and character spacing help in text segmentation (Louloudis et al., 2008, 2009). Handwritten and freestyle handwritten documents (e.g. historical documents, examination answer sheets, etc.), on the other hand, exhibit non-uniform skew and non-uniform character and line spacing. As a result, text segmentation becomes more complex. The problem becomes worse when printed and handwritten text are intermixed (Sahare & Dhok, 2017b, 2018b, 2019b). Approaches that use Connected Components (CCs) (Louloudis et al., 2008, 2009) and projection profiles (Manmatha & Rothfeder, 2005) for text-line segmentation are observed to be script-dependent, and they find it hard to handle handwritten documents and skew. In addition, very few works, such as (Y. Li et al., 2008) and (Soora & Deshpande, 2018), have addressed noise- and skew-related problems. To tackle these issues, a text-line segmentation algorithm is proposed using the fast marching method, which does not depend on the structural properties of the text. Although a number of papers address text-line segmentation, none of them uses the fast marching method for this task. To the best of our knowledge, this is one of the first works to study in detail and implement the fast marching method for text-line segmentation of document images. The motivation for using the fast marching method is that a document image generally consists of two regions, namely text and background.
For text-line segmentation, these regions can be considered as wave fronts that move outward. Each particle of a wave front corresponds to a black pixel of the text region, which is treated as a node of a graph. These black pixels move towards other nodes according to the cost function described by the fast marching method. The fast marching method therefore segments text-lines more precisely, in the form of regions growing within the document image. The algorithm extracts text-lines from documents without prior knowledge of the script geometry, which is an advantage of the proposed text-line segmentation algorithm over script-dependent algorithms. Further, a word segmentation algorithm is designed using the wavelet transform and CCs analysis (Sahare & Dhok, 2018b) and is applied to each segmented text-line. An energy map is calculated using the wavelet transform and then smoothed by Gaussian filtering. This is followed by CCs analysis to segment words in the form of text-blocks.
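The wave-front idea can be illustrated with a minimal sketch of a first-order fast marching solver on a pixel grid. This is not the cost function used in the paper, which this section does not specify. The speed map, the seed position and the function name `fast_marching` are assumptions made only for illustration. The sketch computes the arrival time T of a front that satisfies |grad T| * speed = 1, so the front advances quickly through high-speed (text-like) pixels and slowly through the background.

```python
import heapq
import math


def fast_marching(speed, seeds):
    """Arrival times of a front started at `seeds` (first-order upwind scheme).

    `speed` is a 2-D list of positive values (hypothetical speed map);
    `seeds` is a list of (row, col) pixels where the front starts at time 0.
    """
    rows, cols = len(speed), len(speed[0])
    times = [[math.inf] * cols for _ in range(rows)]
    frozen = [[False] * cols for _ in range(rows)]
    heap = []
    for r, c in seeds:
        times[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))

    def known(r, c):
        # Only already accepted (frozen) pixels contribute to an update.
        if 0 <= r < rows and 0 <= c < cols and frozen[r][c]:
            return times[r][c]
        return math.inf

    def solve(r, c):
        # Smallest accepted neighbour time along each axis.
        a = min(known(r - 1, c), known(r + 1, c))
        b = min(known(r, c - 1), known(r, c + 1))
        h = 1.0 / speed[r][c]  # time needed to cross one pixel
        if math.isinf(a) or math.isinf(b) or abs(a - b) >= h:
            return min(a, b) + h  # one-sided update
        # Two-sided update: solve (T - a)^2 + (T - b)^2 = h^2 for T.
        return 0.5 * (a + b + math.sqrt(2.0 * h * h - (a - b) ** 2))

    while heap:
        _, r, c = heapq.heappop(heap)
        if frozen[r][c]:
            continue  # stale heap entry
        frozen[r][c] = True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not frozen[nr][nc]:
                t_new = solve(nr, nc)
                if t_new < times[nr][nc]:
                    times[nr][nc] = t_new
                    heapq.heappush(heap, (t_new, nr, nc))
    return times


# Toy example: the middle row plays the role of a text-line (fast),
# the rows above and below play the role of background (slow).
speed = [[0.1] * 6, [1.0] * 6, [0.1] * 6]
times = fast_marching(speed, seeds=[(1, 0)])
```

In this toy grid the front reaches the far end of the middle row before it reaches any background pixel, which is the behaviour that lets a growing region follow a text-line.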
1.1. Contribution
In this paper, the following contributions are made:
- (i) Based on the prior understanding that text-lines are horizontal in nature, a guiding map is formed using a state-of-the-art Gaussian low-pass filter in asymmetric form, which helps to determine the text-line boundaries.
- (ii) Unlike state-of-the-art approaches, a closed curve is estimated in the form of a two-dimensional propagating interface, which represents the growing text-line region.
- (iii) The proposed text-line segmentation approach can process noisy, complex-layout and oriented documents, and it is script-independent.
- (iv) To capture information that differentiates text from background regions, an energy map computed through the wavelet transform is used during word segmentation.
- (v) Focus-of-attention regions (words within a text-line) are grown into a text-block representation by convolving with a Gaussian low-pass filter with carefully chosen standard deviation values.
- (vi) The word segmentation framework can be applied to noisy documents and also directly to the document image.
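The role of the asymmetric Gaussian filter in contribution (i) can be illustrated with a small sketch. This section does not give the filter parameters, so the standard deviations, the zero-padded borders and the function names (`gaussian_kernel`, `convolve_1d`, `guiding_map`) are assumptions for illustration only. The sketch applies a separable Gaussian with a larger standard deviation horizontally than vertically, so that separated characters on the same line merge into one band while neighbouring lines stay apart.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_1d(line, kernel):
    """Convolve one row or column with `kernel`, treating outside pixels as 0."""
    radius = len(kernel) // 2
    n = len(line)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < n:
                acc += w * line[j]
        out.append(acc)
    return out


def guiding_map(image, sigma_x, sigma_y):
    """Separable anisotropic Gaussian smoothing (sigma_x horizontal, sigma_y vertical)."""
    kx, ky = gaussian_kernel(sigma_x), gaussian_kernel(sigma_y)
    smoothed_rows = [convolve_1d(row, kx) for row in image]
    smoothed_cols = [convolve_1d(col, ky) for col in zip(*smoothed_rows)]
    return [list(row) for row in zip(*smoothed_cols)]


# Toy binary image: one "text-line" of separated characters in the middle row.
image = [[0] * 9 for _ in range(5)]
image[2] = [1, 0, 1, 0, 1, 0, 1, 0, 1]

# Assumed values: strong horizontal smoothing, weak vertical smoothing.
guide = guiding_map(image, sigma_x=2.0, sigma_y=0.5)
```

After filtering, the gaps between characters in the middle row receive a clearly higher response than pixels two rows away, so the band of the text-line stands out from the background.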