Introduction
Handwritten text detection is an important research topic in computer vision and pattern recognition. It refers to the task of locating every text region or character in an input image and marking each one with a colored text box. Variation in writing style, outline, and shape makes handwritten text very difficult to detect accurately, so the task remains a hard challenge. Handwritten text detection has a wide range of applications, such as document recognition, translation of historical documents, and robot vision. Continued in-depth research on detection methods is therefore important for improving detection performance.
In handwritten character detection, images are often slanted and suffer from defects, ambiguities, and excessive background noise; historical document images have additional problems such as stains and breakage. Text detection in historical documents is a special case of handwritten text detection. As fewer and fewer experts and scholars devote attention to translating and interpreting historical documents, the importance of an automatic recognition system for them is self-evident. Such a system has the following advantages. First, all historical documents are stored as digital images, which guards against gradual loss such as the fading of paper or oracle inscriptions. Second, it can quickly and automatically detect and recognize text in the input image, which is more efficient and accurate than manual processing. Third, an efficient detection and recognition system makes it easier for researchers to study historical documents. Detection plays an important role in a historical document recognition system, because an effective system must accurately detect the text boxes before their contents can be recognized.
In recent years, especially with the popularity of deep learning, text detection has attracted extensive attention from computer science researchers. However, most researchers focus only on popular areas such as scene text detection, document text detection, and handwritten text detection. Text detection in historical document images is a special case: the text is hard to detect because of complex backgrounds and incomplete, blurred characters, and its initial application value is small, so it has received little attention from the academic community.
In the past 20 years, researchers have proposed many algorithms for detecting handwritten text. Especially in the past 10 years, the literature dedicated to handwritten text detection (Shin H C, Roth H R, & Gao M, 2017) has defined text detection as a two-step task: candidate text area extraction, followed by text/non-text area classification. These algorithms can generally be divided into two categories: traditional algorithms and deep learning based algorithms (Chen Shanxiong, Han Xu, & Mo Bofeng, 2017; Chen Shanxiong, Wang Xiaolong, & Wang Minggui, 2019). Because text detection in historical documents is a special case of handwritten text detection, the methods used for handwritten text detection are, in theory, also suitable for historical documents. However, the additional degradation of historical document images makes accurate detection harder than for ordinary handwritten text. Therefore, it is very important to improve the existing methods to achieve higher accuracy.
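The two-step formulation mentioned above (candidate text area extraction, then text/non-text classification) can be illustrated with a minimal sketch. This is not the method of any cited work: the function names, the use of 4-connected components for candidate extraction, and the `min_pixels` and `max_fill` thresholds are all assumptions chosen for illustration, and the hand-written rule in `is_text` merely stands in for the classifier that a traditional or deep learning based approach would use.

```python
from collections import deque


def extract_candidates(binary):
    """Step 1 (assumed approach): group foreground pixels (value 1) into
    4-connected components and return each component's bounding box
    (top, left, bottom, right) together with its pixel count."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    candidates = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from this unvisited pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            pixels = 0
            while queue:
                y, x = queue.popleft()
                pixels += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            candidates.append({"box": (top, left, bottom, right),
                               "pixels": pixels})
    return candidates


def is_text(candidate, min_pixels=3, max_fill=0.9):
    """Step 2 (toy stand-in for a classifier): reject tiny specks and
    regions that fill their bounding box almost completely.
    Both thresholds are hypothetical, not taken from the article."""
    top, left, bottom, right = candidate["box"]
    area = (bottom - top + 1) * (right - left + 1)
    fill = candidate["pixels"] / area
    return candidate["pixels"] >= min_pixels and fill <= max_fill


def detect_text(binary):
    """Run both steps and return the boxes kept as text."""
    return [c["box"] for c in extract_candidates(binary) if is_text(c)]


# Tiny binarized "page": a C-shaped stroke, an isolated noise pixel,
# and a solid horizontal bar (for example, a ruled line or a stain).
sample = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
print(detect_text(sample))
```

On this sample, step 1 yields three candidates and step 2 keeps only the C-shaped stroke. The fill-ratio rule would also reject a perfectly straight stroke, which is one reason real systems replace such rules with a trained classifier, and why stains and breakage in historical documents make the second step harder.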