Reading text or searching for key words within a historical document is a very challenging task. one of the first steps of the complete task is binarization, where we separate foreground such as text, figures and drawings from the background. Successful results of this important step in many cases can determine next steps to success or failure, therefore it is very vital to the success of the complete task of reading and analyzing the content of a document image. Generally, historical documents images are of poor quality due to their storage condition and degradation over time, which mostly cause to varying contrasts, stains, dirt and seeping ink from reverse side. In this paper, we use banks of anisotropic predefined filters in different scales and orientations to develop a binarization method for degraded documents and manuscripts. Using the fact, that handwritten strokes may follow different scales and orientations, we use predefined sets of filter banks having various scales, weights, and orientations to seek a compact set of filters and weights in order to generate different layers of foregrounds and background. Results of convolving these filters on the gray level image locally, weighted and accumulated to enhance the original image. Based on the different layers, seeds of components in the gray level image and a learning process, we present an improved binarization algorithm to separate the background from layers of foreground. Different layers of foreground which may be caused by seeping ink, degradation or other factors are also separated from the real foreground in a second phase. Promising experimental results were obtained on the DIBCO2011 , DIBCO2013 and H-DIBCO2016 data sets and a collection of images taken from real historical documents.
Text Localization and extraction is an important issue in modern applications of computer vision. Applications such as reading and translating texts in the wild or from videos are among the many applications that can benefit results of this field. In this work, we adopt the well-known Viola-Jones algorithm to enable text extraction and localization from images in the wild. The Viola-Jones is an efficient, and a fast image-processing algorithm originally used for face detection. Based on some resemblance between text and face detection tasks in the wild, we have modified the viola-jones to detect regions of interest where text may be localized. In the proposed approach, some modification to the HAAR like features and a semi-automatic process of data set generating and manipulation were presented to train the algorithm. A process of sliding windows with different sizes have been used to scan the image for individual letters and letter clusters existence. A post processing step is used in order to combine the detected letters into words and to remove false positives. The novelty of the presented approach is using the strengths of a modified Viola-Jones algorithm to identify many different objects representing different letters and clusters of similar letters and later combine them into words of varying lengths. Impressive results were obtained on the ICDAR contest data sets.
A system is presented for spotting and searching keywords in handwritten Arabic documents. A slightly modified dynamic time warping algorithm is used to measure similarities between words. Two sets of features are generated from the outer contour of the words/word-parts. The first set is based on the angles between nodes on the contour and the second set is based on the shape context features taken from the outer contour. To recognize a given word, the segmentation-free approach is partially adopted, i.e., continuous word parts are used as the basic alphabet, instead of individual characters or complete words. Additional strokes, such as dots and detached short segments, are classified and used in a postprocessing step to determine the final comparison decision. The search for a keyword is performed by the search for its word parts given in the correct order. The performance of the presented system was very encouraging in terms of efficiency and match rates. To evaluate the presented system its performance is compared to three different systems. Unfortunately, there are no publicly available standard datasets with ground truth for testing Arabic key word searching systems. Therefore, a private set of images partially taken from Juma’a Al-Majid Center in Dubai for evaluation is used, while using a slightly modified version of the IFN/ENIT database for training.
A large amount of handwritten historical documents are located in libraries around the world. The desire to
access, search, and explore these documents paves the way for a new age of knowledge sharing and promotes
collaboration and understanding between human societies. Currently, the indexes for these documents are
generated manually, which is very tedious and time consuming. Results produced by state of the art techniques,
for converting complete images of handwritten documents into textual representations, are not yet sufficient.
Therefore, word-spotting methods have been developed to archive and index images of handwritten documents
in order to enable efficient searching within documents. In this paper, we present a new matching algorithm to be
used in word-spotting tasks for historical Arabic documents. We present a novel algorithm based on the Chamfer
Distance to compute the similarity between shapes of word-parts. Matching results are used to cluster images of
Arabic word-parts into different classes using the Nearest Neighbor rule. To compute the distance between two
word-part images, the algorithm subdivides each image into equal-sized slices (windows). A modified version
of the Chamfer Distance, incorporating geometric gradient features and distance transform data, is used as a
similarity distance between the different slices. Finally, the Dynamic Time Warping (DTW) algorithm is used
to measure the distance between two images of word-parts. By using the DTW we enabled our system to cluster
similar word-parts, even though they are transformed non-linearly due to the nature of handwriting. We tested
our implementation of the presented methods using various documents in different writing styles, taken from
Juma'a Al Majid Center - Dubai, and obtained encouraging results.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users─please
sign in
to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on
SPIE.org.
INSTITUTIONAL Select your institution to access the SPIE Digital Library.
PERSONAL Sign in with your SPIE account to access your personal subscriptions or to use specific features such as save to my library, sign up for alerts, save searches, etc.