1. Introduction
Arabic word recognition is one of the approaches to recognizing Arabic script. This global approach performs text recognition by recognizing whole words or sub-words after the text has been segmented into words or sub-words (Lawgali et al., 2001; Miled et al., 1997).
Many contributions, approaches, and feature extraction techniques have been proposed for Arabic word recognition (Al-Hashim et al.; Alsaif et al., 2011; Menasri et al.; Abanda et al., 2009; Hachour, 2004). They have shown more encouraging results than analytic approaches, because word recognition suffers less from the problems that affect other approaches, such as the strong resemblance between the units to be recognized.
In this paper we propose a system for recognizing printed Arabic names of Moroccan towns and villages; samples of these names are shown in Figure 1. The proposed system is shown in Figure 2.
Figure 1. Samples of the Moroccan town and village names used
Figure 2. Process of the classification system
The proposed system consists of four steps: acquisition, preprocessing, feature extraction, and classification. In the first step the image is scanned. In the next step it is converted into a binary image and resized to 96x96; unnecessary pixels that lie outside the name area are then deleted, and the resulting image is resized to 96 rows and 96 columns. The density weight and zigzag sequence method is used to extract 144 features that represent each name in the data set. For the classification phase, a data set was created with 16000 samples for training and 6000 for testing. In the classification step, K nearest neighbor with the consensus rule and SVM are used as classification methods.
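The cropping and final resizing described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `crop_to_name` and `resize_nearest` are hypothetical, name pixels are assumed to have the value 1, and nearest-neighbor sampling is assumed for resizing, since the text does not specify the interpolation method.

```python
import numpy as np

def crop_to_name(binary, ink=1):
    """Keep only the bounding box of the name pixels (assumed to equal `ink`)."""
    rows = np.flatnonzero(np.any(binary == ink, axis=1))
    cols = np.flatnonzero(np.any(binary == ink, axis=0))
    if rows.size == 0:
        # No name pixels found: return the image unchanged.
        return binary
    return binary[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def resize_nearest(img, size=96):
    """Resize to size x size with nearest-neighbor sampling (an assumption)."""
    h, w = img.shape
    row_idx = np.arange(size) * h // size
    col_idx = np.arange(size) * w // size
    return img[np.ix_(row_idx, col_idx)]

# Toy image: a 3x6 block of name pixels inside a 10x20 background.
image = np.zeros((10, 20), dtype=np.uint8)
image[2:5, 3:9] = 1
cropped = crop_to_name(image)
normalized = resize_nearest(cropped)
```

Here `cropped` contains only the block of name pixels, and `normalized` has 96 rows and 96 columns regardless of the size of the scanned name.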
2. Preprocessing
The preprocessing step prepares the scanned image for the next steps: unnecessary information is removed and the quality of the meaningful information is improved.
After the image of the name of a Moroccan town or village has been scanned at an adequate resolution, binarization, noise removal, and localization are performed on the scanned image.
2.1. Binarization
The binarization technique (Kong et al., 1996; Pratt, 1991) transforms the matrix of the scanned image into a binary matrix. The scanned image is represented by a matrix whose values vary between 0 and 255. This matrix is divided by 255 to obtain a matrix with values between 0 and 1. The binary matrix is then obtained by thresholding with a threshold equal to 0.3.
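A minimal sketch of this binarization, assuming an 8-bit grayscale input stored as a NumPy array. The function name `binarize` is hypothetical, and the polarity is an assumption: the text does not say which side of the threshold maps to 1, so here values strictly above 0.3 become 1 and all others become 0.

```python
import numpy as np

def binarize(gray, threshold=0.3):
    """Scale 0..255 values to 0..1, then threshold to a 0/1 matrix."""
    scaled = np.asarray(gray, dtype=np.float64) / 255.0
    # Assumption: values strictly above the threshold map to 1.
    return (scaled > threshold).astype(np.uint8)

# 76/255 is just below 0.3 and 77/255 is just above it.
sample = np.array([[0, 76, 77],
                   [128, 200, 255]], dtype=np.uint8)
binary = binarize(sample)
```

If the name pixels are dark on a light background, the result can be inverted (`1 - binary`) so that the name pixels carry the value 1.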