This PDF file contains the front matter associated with SPIE-IS&T Proceedings Volume 7247, including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
A combination of x-ray fluorescence and image processing has been shown to recover text characters written in iron gall
ink on parchment, even when obscured by gold paint. Several leaves of the Archimedes Palimpsest were imaged using
rapid-scan x-ray fluorescence imaging performed at the Stanford Synchrotron Radiation Lightsource of the SLAC National Accelerator Laboratory. A simple linear show-through model is shown to successfully separate the different layers of text in the x-ray images, making the text easier for scholars to read.
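A linear show-through model of this kind amounts to solving a small linear unmixing problem per pixel. The sketch below, with hypothetical variable names and a synthetic 2x2 mixing matrix (the paper does not give its exact coefficients), shows the basic idea in Python:

```python
import numpy as np

def separate_layers(channel_a, channel_b, mixing):
    """Unmix two registered scan channels under a linear show-through model.

    channel_a, channel_b : 2-D arrays of the same shape (e.g. two x-ray
        fluorescence maps of the same leaf).
    mixing : 2x2 array; mixing[i, j] is the assumed contribution of source
        layer j to observed channel i (illustrative values only).
    Returns the two estimated source layers.
    """
    h, w = channel_a.shape
    observed = np.stack([channel_a.ravel(), channel_b.ravel()])   # 2 x N pixel matrix
    sources = np.linalg.solve(mixing, observed)                   # invert the linear mix
    return sources[0].reshape(h, w), sources[1].reshape(h, w)

# Example with a synthetic mix: each layer shows through into both channels.
rng = np.random.default_rng(0)
layer_text, layer_gold = rng.random((2, 64, 64))
M = np.array([[1.0, 0.4],     # channel 1 = text + 0.4 * gold
              [0.3, 1.0]])    # channel 2 = 0.3 * text + gold
ch1 = M[0, 0] * layer_text + M[0, 1] * layer_gold
ch2 = M[1, 0] * layer_text + M[1, 1] * layer_gold
est_text, est_gold = separate_layers(ch1, ch2, M)
assert np.allclose(est_text, layer_text)
```

In practice the mixing coefficients would be estimated from regions where only one text layer is present.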
We describe our work on text-image alignment in the context of building a historical document retrieval system. We aim to align images of words in handwritten lines with their text transcriptions. The images of handwritten lines are automatically segmented from the scanned pages of historical documents and then manually transcribed. To train automatic routines to detect words in an image of handwritten text, we need a training set: images of words with their transcriptions. We present our results on aligning words from the images of handwritten lines with their corresponding text transcriptions. Alignment based on the longest spaces between portions of handwriting serves as a baseline. We then show that relative lengths, i.e. the proportions of words in their lines, can be used to improve the alignment results considerably. To take the relative word lengths into account, we define the cost function that has to be minimized to align text words with their images. We apply right-to-left alignment as well as alignment based on exhaustive search. Quality assessment of these alignments shows correct results for 69% of words from 100 lines, and 90% when correct and partially correct alignments are combined.
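One way to minimize a cost based on relative word lengths is a dynamic program over candidate inter-word gaps. The following sketch uses a squared-error cost between the relative character counts of the transcription and the relative pixel widths of the resulting image segments; the function and variable names are illustrative, not the paper's exact formulation:

```python
import numpy as np

def align_words(word_texts, gap_positions, line_width):
    """Align text words to image segments split at a subset of candidate gaps.

    word_texts    : transcribed words of one handwritten line.
    gap_positions : sorted x-coordinates of candidate inter-word gaps (pixels).
    line_width    : width of the line image in pixels.

    Chooses len(word_texts) - 1 gaps so that the relative widths of the
    resulting image segments best match the relative character counts of
    the words (squared-error cost), via dynamic programming.
    """
    n = len(word_texts)
    lengths = np.array([len(w) for w in word_texts], dtype=float)
    target = lengths / lengths.sum()                     # relative word lengths
    cuts = [0.0] + list(gap_positions) + [float(line_width)]
    m = len(cuts)
    INF = float("inf")
    # cost[i][j]: best cost of matching the first i words to the span ending at cut j
    cost = np.full((n + 1, m), INF)
    back = np.zeros((n + 1, m), dtype=int)
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(i, m):
            for k in range(i - 1, j):                    # previous cut index
                width = (cuts[j] - cuts[k]) / line_width
                c = cost[i - 1][k] + (width - target[i - 1]) ** 2
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, k
    # Recover the chosen segment boundaries.
    chosen, j = [], m - 1
    for i in range(n, 0, -1):
        chosen.append((cuts[back[i][j]], cuts[j]))
        j = back[i][j]
    return list(zip(word_texts, reversed(chosen)))

print(align_words(["the", "quick", "fox"], [95, 180, 260], 300))
```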
Document boundary determination is the process of identifying individual documents in a stack of papers. In this paper, we report on a classification system for automating this process. The system employs
features based on document structure and lexical content. We also report on experimental results to support the
effectiveness of this system.
This paper describes a segmentation method for a continuous document flow. A document flow is a list of successive scanned pages, placed in a production chain, representing several documents without any explicit separation mark between them. To separate the documents for recognition, the content of the successive pages must be analyzed to identify the boundary pages of each document. The method proposed here is similar to the variable horizon models (VHM) or multi-grams used in speech recognition. It consists of maximizing the flow likelihood given the Markov models of all the constituent elements. As computing this likelihood over the whole flow is NP-complete, the solution is to study it over windows of reduced observations. The first results, obtained on homogeneous flows of invoices, reach more than 75% precision and 90% recall.
Word spotting is not easy to implement, and it becomes even harder for historical documents, since it requires character recognition and indexing of the document images. We present a general word-spotting technique, independent of OCR, that automatically represents the user's text queries as word images and compares them with the word images extracted from the document images. The proposed system does not require training. The only required preprocessing task is determining the alphabet. Global shape features are used to describe the words. They are deliberately general, in order to capture the form of the word, and are appropriately normalized to cope with the usual variations in resolution, word width and font. A novel technique that makes use of an interpolation method is presented. In our experiments, we analyze the system's dependence on its parameters and show that its performance is similar to that of trainable systems.
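The idea of rendering the text query as a word image and comparing normalized global shape features can be sketched as follows. The rendering font, the projection-profile feature and the fixed number of bins are illustrative assumptions; the paper's actual feature set and interpolation scheme are not reproduced here:

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def render_query(text, height=32):
    """Render a text query as a synthetic word image (dark text, white background)."""
    font = ImageFont.load_default()
    tmp = Image.new("L", (1, 1), 255)
    x0, y0, x1, y1 = ImageDraw.Draw(tmp).textbbox((0, 0), text, font=font)
    img = Image.new("L", (x1 - x0 + 4, y1 - y0 + 4), 255)
    ImageDraw.Draw(img).text((2 - x0, 2 - y0), text, fill=0, font=font)
    return img.resize((4 * height, height))              # size normalization

def shape_features(word_img, n_bins=40):
    """Global shape feature: resampled vertical projection profile of the ink."""
    arr = 1.0 - np.asarray(word_img.convert("L"), dtype=float) / 255.0
    profile = arr.sum(axis=0)
    xs = np.linspace(0, len(profile) - 1, n_bins)         # width normalization by resampling
    feat = np.interp(xs, np.arange(len(profile)), profile)
    return feat / (np.linalg.norm(feat) + 1e-9)

def spot(query_text, candidate_images, top_k=5):
    """Rank candidate word images (PIL images) by similarity to the rendered query."""
    q = shape_features(render_query(query_text))
    scores = [float(q @ shape_features(img)) for img in candidate_images]
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:top_k]
```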
Modern digital libraries offer all the hyperlinking possibilities of the World Wide Web: when a reader finds a
citation of interest, in many cases she can now click on a link to be taken to the cited work. This paper presents
work aimed at providing the same ease of navigation for legacy PDF document collections that were created
before the possibility of integrating hyperlinks into documents was ever considered. To achieve our goal, we need
to carry out two tasks: first, we need to identify and link citations and references in the text with high reliability;
and second, we need the ability to determine physical PDF page locations for these elements. We demonstrate the
use of a high-accuracy citation extraction algorithm which significantly improves on earlier reported techniques,
and a technique for integrating PDF processing with a conventional text-stream based information extraction
pipeline. We demonstrate these techniques in the context of a particular document collection, this being the ACL
Anthology; but the same approach can be applied to other document sets.
Bibliographical references that appear in journal articles can provide valuable hints for subsequent information
extraction. We describe our statistical machine learning algorithms for locating and parsing such references from HTML
medical journal articles. Reference locating identifies the reference sections and then decomposes them into individual
references. We formulate reference locating as a two-class classification problem based on text and geometric features.
An evaluation conducted on 500 articles from 100 journals achieves near perfect precision and recall rates for locating
references. Reference parsing identifies components, e.g. author, article title and journal title, within each individual reference. We implement and compare two reference parsing algorithms. One relies on sequence statistics and trains a
Conditional Random Field. The other focuses on local feature statistics and trains a Support Vector Machine to classify
each individual word, and then a search algorithm systematically corrects low confidence labels if the label sequence
violates a set of predefined rules. The overall performance of these two reference parsing algorithms is about the same:
above 99% accuracy at the word level, and over 97% accuracy at the chunk level.
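The SVM-based parser classifies each word using local feature statistics. A minimal sketch of that idea with scikit-learn (toy features and a toy training set; the rule-based correction search described above is omitted):

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

def token_features(tokens, i):
    """Local features for one token of a reference string (illustrative set)."""
    t = tokens[i]
    return {
        "lower": t.lower(),
        "is_digit": t.isdigit(),
        "is_capitalized": t[:1].isupper(),
        "has_period": "." in t,
        "position": i / max(len(tokens) - 1, 1),          # relative position in the reference
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }

# Tiny toy training set: each reference is a token list with word-level labels.
refs = [
    (["Smith", "J.", "Cancer", "risk", "factors.", "Lancet.", "2001;358:12-15."],
     ["author", "author", "title", "title", "title", "journal", "pagination"]),
    (["Jones", "A.", "Gene", "therapy", "trials.", "Nature.", "1999;400:5-9."],
     ["author", "author", "title", "title", "title", "journal", "pagination"]),
]
X = [token_features(toks, i) for toks, _ in refs for i in range(len(toks))]
y = [lab for _, labs in refs for lab in labs]

model = make_pipeline(DictVectorizer(), LinearSVC())
model.fit(X, y)

test = ["Brown", "K.", "Heart", "disease.", "BMJ.", "2003;326:1-4."]
print(list(zip(test, model.predict([token_features(test, i) for i in range(len(test))]))))
```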
As new devices that accept or produce on-line documents emerge, facilities for managing these kinds of documents, such as topic spotting, are required. This means that we should be able to perform text categorization of on-line documents. The textual data available in on-line documents can be extracted through on-line recognition, a process which introduces noise, i.e. errors, into the resulting text. This work reports experiments on the categorization of on-line handwritten documents based on their textual contents. We analyze the effect of the word recognition rate on categorization performance, by comparing the performance of a categorization
system over the texts obtained through on-line handwriting recognition and the same texts available as ground
truth. Two categorization algorithms (kNN and SVM) are compared in this work. A subset of the Reuters-21578
corpus consisting of more than 2000 handwritten documents has been collected for this study. Results show that
accuracy loss is not significant, and precision loss is only significant for recall values of 60%-80% depending on
the noise levels.
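A generic version of such an experiment can be set up with a bag-of-words pipeline. The sketch below trains kNN and SVM categorizers on clean transcriptions and applies them to (hypothetical) noisy recognition output; it illustrates the comparison protocol only, not the authors' Reuters-21578 setup:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Hypothetical inputs: noisy texts from on-line handwriting recognition and
# their ground-truth transcriptions, with topic labels.
recognized_texts = ["grain shipmnt delayed at port", "intrest rates cut by central bank"]
ground_truth_texts = ["grain shipment delayed at port", "interest rates cut by central bank"]
labels = ["grain", "interest"]

for name, clf in [("kNN", KNeighborsClassifier(n_neighbors=1)), ("SVM", LinearSVC())]:
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(ground_truth_texts, labels)            # train on clean transcriptions
    noisy_pred = model.predict(recognized_texts)     # categorize the noisy recognition output
    print(name, list(noisy_pred))
```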
In this paper, we present a comparison between two different combination schemes for the improvement of the
performance of Arabic handwriting recognition systems. Several recognition systems (here considered as black
box systems) are used from the participating systems of the Arabic handwriting recognition competition at
ICDAR 2007. The outputs of these systems provide the input to our combination schemes. The first combination scheme is based on fixed fusion using logical rules, while the second is based on trainable rules. After normalizing the recognition confidences and combining the outputs, the improvement is evaluated in terms of the recognition rate of a multi-classifier system with and without rejection. The participating systems use sets a to e of the IfN/ENIT database for training, and we use set f for testing. Applying the combination rules, the results show a high recognition rate of about 95% without rejection, which corresponds to an improvement in recognition rate of between 8% and 15% compared to the results of the ICDAR 2007 competition.
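A minimal sketch of score normalization followed by fixed (sum or max) and weighted fusion is shown below; the recognizer scores are hypothetical and the trainable variant is reduced to externally supplied weights:

```python
import numpy as np

def min_max_normalize(scores):
    """Rescale one recognizer's confidences to [0, 1] so systems are comparable."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo) if hi > lo else np.zeros_like(scores)

def combine(score_lists, weights=None, rule="sum"):
    """Combine normalized class scores from several recognizers.

    rule="sum" : (weighted) sum rule, a fixed fusion rule.
    rule="max" : max rule over recognizers.
    weights    : per-recognizer weights, e.g. learned on a validation set
                 (a simple trainable variant); uniform if None.
    """
    S = np.array([min_max_normalize(s) for s in score_lists])   # systems x classes
    if rule == "max":
        return S.max(axis=0)
    w = np.ones(len(S)) / len(S) if weights is None else np.asarray(weights)
    return w @ S

# Three hypothetical recognizers scoring the same 4 word classes.
sys_scores = [[0.2, 0.9, 0.4, 0.1], [10, 80, 60, 5], [0.5, 0.6, 0.55, 0.1]]
fused = combine(sys_scores, rule="sum")
print("best class:", int(np.argmax(fused)), "fused scores:", np.round(fused, 3))
```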
This paper describes a robust model for on-line handwritten Japanese text recognition. The method evaluates the
likelihood of candidate segmentation paths by combining scores of character pattern size, inner gap, character
recognition, single-character position, pair-character position, likelihood of candidate segmentation point and linguistic
context. The path score is insensitive to the number of candidate patterns and the optimal path can be found by the
Viterbi search. In experiments on handwritten Japanese sentence recognition, the proposed method yielded superior
performance.
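The path search over candidate segmentations can be illustrated with a small Viterbi-style dynamic program. The span score below is a toy stand-in for the combined score of character size, inner gap, recognition, position and context described above:

```python
def best_segmentation(n_primitives, span_score, max_span=4):
    """Viterbi-style search over candidate segmentation paths.

    n_primitives     : number of primitive segments in the handwritten line.
    span_score(i, j) : combined log-score of grouping primitives i..j-1 into one character.
    Returns the best total score and the list of (start, end) character spans.
    """
    NEG = float("-inf")
    best = [NEG] * (n_primitives + 1)
    back = [0] * (n_primitives + 1)
    best[0] = 0.0
    for j in range(1, n_primitives + 1):
        for i in range(max(0, j - max_span), j):
            s = best[i] + span_score(i, j)
            if s > best[j]:
                best[j], back[j] = s, i
    # Trace the optimal path backwards.
    spans, j = [], n_primitives
    while j > 0:
        spans.append((back[j], j))
        j = back[j]
    return best[n_primitives], spans[::-1]

# Toy score that prefers two-primitive characters (stands in for the real combined score).
score, spans = best_segmentation(6, lambda i, j: 1.0 if j - i == 2 else 0.2)
print(score, spans)
```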
We are interested in the problem of curve identification, motivated by problems in handwriting recognition.
Various geometric approaches have been proposed, with one of the most popular being "elastic matching." We
examine the problem using distances defined by inner products on functional spaces. In particular we examine the
Legendre and Legendre-Sobolev inner products. We show that both of these can be computed in online constant
time. We compare both with elastic matching and conclude that the Legendre-Sobolev distance measure provides
a competitive alternative to elastic matching, being almost as accurate and much faster.
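A sketch of a Legendre-Sobolev distance between two sampled pen traces is given below, using NumPy's Legendre utilities. The truncation degree and the Sobolev weight mu are illustrative choices, and the coefficients are obtained by batch fitting rather than the online constant-time update the paper describes:

```python
import numpy as np
from numpy.polynomial import legendre as L

def legendre_coeffs(samples, degree=12):
    """Fit Legendre coefficients to one coordinate of a pen trace.

    samples : values of x(t) or y(t) sampled along the stroke; the parameter
        is rescaled to [-1, 1], the natural domain of Legendre polynomials.
    """
    t = np.linspace(-1.0, 1.0, len(samples))
    return L.legfit(t, samples, degree)

def ls_inner(a, b, mu=0.1):
    """Legendre-Sobolev inner product <f, g> + mu * <f', g'> in coefficient space.

    For Legendre coefficients a_k, b_k on [-1, 1], <P_k, P_k> = 2 / (2k + 1),
    so the L2 part is a weighted dot product; the derivative part reuses the
    same formula on the differentiated coefficient vectors (L.legder).
    """
    def l2(u, v):
        k = np.arange(len(u))
        return float(np.sum(u * v * 2.0 / (2.0 * k + 1.0)))
    return l2(a, b) + mu * l2(L.legder(a), L.legder(b))

def ls_distance(curve_a, curve_b, mu=0.1):
    """Distance between two (x(t), y(t)) traces under the Legendre-Sobolev norm."""
    d2 = 0.0
    for coord_a, coord_b in zip(curve_a, curve_b):       # iterate over x and y channels
        diff = legendre_coeffs(coord_a) - legendre_coeffs(coord_b)
        d2 += ls_inner(diff, diff, mu)
    return np.sqrt(d2)

t = np.linspace(0, 1, 100)
stroke1 = (np.cos(2 * np.pi * t), np.sin(2 * np.pi * t))         # a loop
stroke2 = (np.cos(2 * np.pi * t) + 0.05, np.sin(2 * np.pi * t))  # slightly shifted loop
print(round(ls_distance(stroke1, stroke2), 4))
```

Because Legendre polynomials are orthogonal on [-1, 1], the inner product reduces to a weighted dot product of short coefficient vectors, which is what makes the comparison fast.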
Character recognition and document retrieval are still very active research areas, even though great progress in performance has been made over the last decades. This paper introduces advanced research topics in character recognition and document analysis, including further research at Tsinghua University on handwritten Chinese character recognition, multilingual character recognition and writer identification. For handwritten Chinese character recognition, a special cascade MQDF classifier for unconstrained cursive handwritten Chinese character recognition is discussed and an optimum handwritten strip recognition algorithm is introduced. For writer identification, content-dependent and content-independent algorithms are discussed. For multilingual character recognition, the THOCR multilingual document recognition system, covering Japanese, Korean, Tibetan, Mongolian, Uyghur and Arabic, is introduced.
A novel statistical model for determining whether a pair of documents, a known and a questioned, were written
by the same individual is proposed. The goal of this formulation is to learn the specific uniqueness of style in a
particular author's writing, given the known document. Since there are often insufficient samples to extrapolate a generalized model of a writer's handwriting based solely on that document, we instead generalize over the differences between the author and a large population of known different writers. This contrasts with an earlier model in which the probability distributions were set a priori, without learning. We report the performance of the model and compare it with the older, non-learning model, showing significant improvement.
Writer identification is a topic of much renewed interest today because of its importance in applications such as writer
adaptation, routing of documents and forensic document analysis. Various algorithms have been proposed to handle such
tasks. Of particular interest are the approaches that use allographic features [1-3] to perform a comparison of the
documents in question. The allographic features are used to define prototypes that model the unique handwriting styles
of the individual writers. This paper investigates a novel perspective that takes alphabetic information into consideration
when the allographic features are clustered into prototypes at the character level. We hypothesize that alphabetic
information provides additional clues which help in the clustering of allographic prototypes. An alphabet information
coefficient (AIC) has been introduced in our study and the effect of this coefficient is presented. Our experiments
showed an increase of writer identification accuracy from 66.0% to 87.0% when alphabetic information was used in
conjunction with allographic features on a database of 200 reference writers.
When is it safe to use synthetic training data in supervised classification? Trainable classifier technologies require
large representative training sets consisting of samples labeled with their true class. Acquiring such training sets
is difficult and costly. One way to alleviate this problem is to enlarge training sets by generating artificial,
synthetic samples. Of course this immediately raises many questions, perhaps the first being "Why should we trust artificially generated data to be an accurate representative of the real distributions?" Other questions include "When will training on synthetic data work as well as, or better than, training on real data?"
We distinguish between sample space (the set of real samples), parameter space (all samples that can be
generated synthetically), and finally, feature space (the set of samples in terms of finite numerical values). In
this paper, we discuss a series of experiments, in which we produced synthetic data in parameter space, that is,
by convex interpolation among the generating parameters for samples and showed we could amplify real data
to produce a classifier that is as accurate as a classifier trained on real data. Specifically, we have explored the
feasibility of varying the generating parameters for Knuth's Metafont system to see if previously unseen fonts
could also be recognized. We also varied parameters for an image quality model.
We have found that training on interpolated data is for the most part safe, that is to say, it never produced more errors. Furthermore, classifiers trained on interpolated data often improved class accuracy.
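Convex interpolation in parameter space is simple to state in code. The parameter names below are hypothetical stand-ins for Metafont and image-quality parameters; rendering the interpolated parameters into images is outside this sketch:

```python
import numpy as np

def interpolate_parameters(param_a, param_b, n_new, rng=None):
    """Generate synthetic samples by convex interpolation in parameter space.

    param_a, param_b : generating-parameter vectors of two real samples.
    Returns n_new parameter vectors lying on the segment between them.
    """
    rng = rng or np.random.default_rng()
    a, b = np.asarray(param_a, float), np.asarray(param_b, float)
    lams = rng.uniform(0.0, 1.0, size=n_new)
    return np.array([lam * a + (1.0 - lam) * b for lam in lams])

# Hypothetical font/image-quality parameters: [stroke_weight, slant, x_height, blur]
font_1 = [0.8, 0.00, 5.0, 0.2]
font_2 = [1.2, 0.15, 5.5, 0.6]
synthetic = interpolate_parameters(font_1, font_2, n_new=5, rng=np.random.default_rng(1))
print(np.round(synthetic, 3))
# Each row would then be rendered (e.g. through Metafont and an image quality
# model) to produce a labelled synthetic training image.
```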
We investigate in this paper the combination of DBN (Dynamic Bayesian Network) classifiers, either
independent or coupled, for the recognition of degraded characters. The independent classifiers are a
vertical HMM and a horizontal HMM whose observable outputs are the image columns and the image
rows respectively. The coupled classifiers, presented in a previous study, associate the vertical and
horizontal observation streams into single DBNs. The scores of the independent and coupled classifiers
are then combined linearly at the decision level. We compare the different classifiers (independent, coupled, or linearly combined) on two tasks: the recognition of artificially degraded handwritten digits
and the recognition of real degraded old printed characters. Our results show that coupled DBNs
perform better on degraded characters than the linear combination of independent HMM scores. Our
results also show that the best classifier is obtained by linearly combining the scores of the best coupled
DBN and the best independent HMM.
Many documents contain (free-hand) underlining, "COPY" stamps, crossed out text, doodling and other "clutter" that
occlude the text. In many cases, it is not possible to separate the text from the clutter. Commercial OCR solutions
typically fail for cluttered text. We present a new method for finding the clutter using path analysis of points on the
skeleton of the clutter/text connected component. This method can separate the clutter from the text even for fairly
complex clutter shapes.
Even with good localization of occluding clutter, it is difficult to use feature-based recognition for occluded characters,
simply because the clutter affects the features in various ways. We propose a new algorithm that uses adapted templates of the document's font and can handle all forms of occlusion of the character. The method simulates the localization of the corresponding clutter in the templates and compares the unaffected parts of the templates with those of the character. The method has proved highly successful even when much of the character is occluded. We present examples
of clutter localization and character recognition with occluded characters.
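Comparing only the unoccluded parts of a character against templates can be sketched as a masked template match; the toy shapes and the agreement score below are illustrative, not the paper's adapted-template procedure:

```python
import numpy as np

def masked_template_score(char_img, template, clutter_mask):
    """Compare a character with a font template, ignoring cluttered pixels.

    char_img, template : binary arrays (1 = ink), same size after normalization.
    clutter_mask       : binary array, 1 where clutter was localized; the same
                         mask is applied to the template, so only unaffected
                         pixels contribute to the score.
    Returns the fraction of unoccluded pixels on which character and template agree.
    """
    valid = clutter_mask == 0
    if valid.sum() == 0:
        return 0.0
    return float(np.mean(char_img[valid] == template[valid]))

def recognize(char_img, clutter_mask, templates):
    """Pick the template that best matches the unoccluded part of the character."""
    scores = {label: masked_template_score(char_img, tpl, clutter_mask)
              for label, tpl in templates.items()}
    return max(scores, key=scores.get), scores

# Toy example: an "L"-shaped character with its right half occluded by clutter.
tpl_L = np.zeros((6, 6), int); tpl_L[:, 0] = 1; tpl_L[-1, :] = 1
tpl_I = np.zeros((6, 6), int); tpl_I[:, 2] = 1
mask = np.zeros((6, 6), int); mask[:, 3:] = 1
occluded = tpl_L.copy(); occluded[mask == 1] = 1          # clutter covers the right half
print(recognize(occluded, mask, {"L": tpl_L, "I": tpl_I})[0])   # -> "L"
```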
Mongolian is one of the major ethnic languages in China. Large amounts of printed Mongolian documents need to be digitized for digital libraries and various other applications. Traditional Mongolian script has a unique writing style and multi-font-type variations, which bring challenges to Mongolian OCR research. Because traditional Mongolian script has particular characteristics, for example that one character may be part of another character, we define the character set for recognition according to the segmented components, and the components are combined into characters by a rule-based post-processing module. For character recognition, a method based on visual directional features and multi-level classifiers is presented. For character segmentation, a scheme is used to find the segmentation points by analyzing the properties of projections and connected components. As Mongolian has different font types, which are categorized into two major groups, the segmentation parameters are adjusted for each group. A font-type classification method for the two font-type groups is
introduced. For recognition of Mongolian text mixed with Chinese and English, language identification and relevant
character recognition kernels are integrated. Experiments show that the presented methods are effective. The text
recognition rate is 96.9% on the test samples from practical documents with multi-font-types and mixed scripts.
In large scale scanning applications, orientation detection of the digitized page is necessary for the following
procedures to work correctly. Several existing methods for orientation detection use the fact that in Roman
script text, ascenders are more likely to occur than descenders. In this paper, we propose a different approach
for page orientation detection that uses this information. The main advantage of our method is that it is more accurate than the widely used methods it is compared with, while being independent of scan resolution. Another interesting
aspect of our method is that it can be combined with our previously published method for skew detection to
have a single-step skew and orientation estimate of the page image. We demonstrate the effectiveness of our
approach on the UW-I dataset and show that our method achieves an accuracy of above 99% on this dataset. We
also show that our method is robust to different scanning resolutions and can reliably detect page orientations
for documents rendered at 150, 200, 300, and 400 dpi.
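For reference, the ascender/descender principle mentioned above can be sketched as a simple row-profile heuristic; this is an illustration of the underlying cue, not the authors' method:

```python
import numpy as np

def updown_score(line_img):
    """Score a binarized text-line image (1 = ink) for right-side-up orientation.

    The core (x-height) band is taken as the densest rows of the horizontal
    projection; ink above that band counts as ascenders, ink below as
    descenders.  In Roman script ascenders outnumber descenders, so a positive
    score suggests the line is upright and a negative one that it is rotated
    by 180 degrees.
    """
    profile = line_img.sum(axis=1).astype(float)
    core = profile >= 0.5 * profile.max()                 # x-height band: densest rows
    rows = np.nonzero(core)[0]
    top_of_core, bottom_of_core = rows.min(), rows.max()
    ascender_ink = profile[:top_of_core].sum()            # ink above the core band
    descender_ink = profile[bottom_of_core + 1:].sum()    # ink below the core band
    return float(ascender_ink - descender_ink)

def page_is_upright(line_images):
    """Majority vote over line scores; flip the page when the vote is negative."""
    return sum(np.sign(updown_score(l)) for l in line_images) >= 0
```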
This paper addresses text line extraction in free-style documents, such as business cards, envelopes and posters. In free-style documents, global properties such as character size and line direction can hardly be established, which reveals a grave limitation of traditional layout analysis.
The 'line' is the most prominent and highest-level structure in our bottom-up method. First, we apply a novel intensity function, based on gradient information, to locate text areas in which the gradients within a window have large magnitudes and varied directions, and we split such areas into text pieces. We build a probability model of lines consisting of text pieces from statistics on training data. For an input image, we group text pieces into lines using a simulated annealing algorithm with a cost function based on the probability model.
A page of a document is a set of small components that are grouped by a human reader into higher-level components, such as lines and text blocks. Document image analysis aims at detecting these components in document images. We
propose the encoding of local information by considering the properties that determine perceptual grouping. Each
connected component is labelled according to the location of its nearest neighbour connected component. These labelled
components constitute the input of a rule-based incremental process. Vertical and horizontal text lines are detected
without prior assumption on their direction. Touching characters belonging to different lines are detected early and
discarded from the grouping process to avoid line merging. The tolerance for grouping components increases in the
course of the process until the final decision. After each step of the grouping process, conflict resolution rules are
activated. This work was motivated by the automatic detection of Figure&Caption pairs in the documents of the
historical collection of the BIUM digital library (Bibliotheque InterUniversitaire Medicale). The images that were used
in this study belong to this collection.
In previous work we showed that Look Up Table (LUT) classifiers can be trained to learn patterns of degradation
and correction in historical document images. The effectiveness of the classifiers is directly proportional to the size of the pixel neighborhood they consider. However, the computational cost increases almost exponentially with
the neighborhood size. In this paper, we propose a novel algorithm that encodes the neighborhood information
efficiently using a shape descriptor. Using shape descriptor features, we are able to characterize the pixel
neighborhood of document images with much fewer bits and so obtain an efficient system with significantly
reduced computational cost. Experimental results demonstrate the effectiveness and efficiency of the proposed
approach.
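For context, the baseline LUT idea (and why its cost grows so quickly with neighborhood size) can be sketched for a 3x3 window, which already needs a 2^9-entry table; the shape-descriptor encoding proposed in the paper is not reproduced here:

```python
import numpy as np

def neighborhood_codes(img):
    """Encode the 3x3 binary neighborhood of every interior pixel as a 9-bit index."""
    img = img.astype(np.int32)
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.int32)
    for dy in range(3):
        for dx in range(3):
            codes = (codes << 1) | img[dy:dy + h - 2, dx:dx + w - 2]
    return codes

def train_lut(degraded, clean):
    """Fill a 512-entry LUT mapping each neighborhood code of the degraded image
    to the majority value of the corresponding clean (corrected) pixel."""
    codes = neighborhood_codes(degraded).ravel()
    targets = clean[1:-1, 1:-1].ravel().astype(float)
    ones = np.bincount(codes, weights=targets, minlength=512)
    total = np.bincount(codes, minlength=512)
    lut = np.zeros(512, dtype=np.uint8)
    seen = total > 0
    lut[seen] = (2 * ones[seen] >= total[seen]).astype(np.uint8)
    return lut

def apply_lut(degraded, lut):
    """Restore a degraded binary image by looking up every pixel's neighborhood code."""
    out = degraded.astype(np.uint8).copy()
    out[1:-1, 1:-1] = lut[neighborhood_codes(degraded)]
    return out
```

A 5x5 window would already require a 2^25-entry table, which is the growth the shape-descriptor encoding is designed to avoid.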
In this paper we propose a mosaicing method for camera-captured document images. Since document images captured using digital cameras suffer from perspective distortion, their alignment is a difficult task for previous methods. In the proposed method, correspondences of feature points are calculated using the image retrieval method LLAH. Document images are aligned using perspective transformation parameters estimated from the
correspondences. Since LLAH is invariant to perspective distortion, feature points can be matched without
compensation of perspective distortion. Experimental results show that document images captured by a digital
camera can be stitched using the proposed method.
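The alignment step, estimating a perspective transformation from point correspondences and warping one image onto the other, can be sketched with OpenCV. Purely for illustration, generic ORB matching stands in for the LLAH-based correspondences used in the paper, and the blending is a crude minimum over grayscale pages with dark text on a white background:

```python
import cv2
import numpy as np

def stitch_pair(img_a, img_b):
    """Align grayscale page img_b onto img_a from point correspondences and blend them."""
    orb = cv2.ORB_create(2000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_a, des_b)
    matches = sorted(matches, key=lambda m: m.distance)[:200]
    src = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)        # perspective transform
    h, w = img_a.shape[:2]
    canvas = cv2.warpPerspective(img_b, H, (2 * w, 2 * h), borderValue=255)
    canvas[:h, :w] = np.minimum(canvas[:h, :w], img_a)          # keep the darker (ink) pixel
    return canvas
```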
Analyzing paper-based election ballots requires finding all marks added to the base ballot. The position, size, shape,
rotation and shade of these marks are not known a priori. Scanned ballot images have additional differences from the
base ballot due to scanner noise. Different image processing techniques are evaluated to see under what conditions they
are able to detect what sorts of marks. Basing mark detection on the difference of raw images was found to be much
more sensitive to the mark darkness. Converting the raw images to foreground and background and then removing the
form produced better results.
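The foreground-based comparison can be sketched as follows: binarize both images, remove a slightly dilated form foreground from the scan foreground, and report the remaining connected components as candidate marks. The thresholding choices and the registration step are assumptions of this sketch:

```python
import cv2
import numpy as np

def detect_marks(scanned, blank_form, min_area=30):
    """Find voter marks by comparing foreground maps instead of raw pixel values.

    scanned, blank_form : grayscale images of a filled ballot and the base form,
        assumed already registered (aligned).
    Returns bounding boxes (x, y, w, h) of candidate marks.
    """
    def foreground(img):
        _, fg = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        return fg
    # Dilate the form foreground slightly so scanner jitter along form lines is tolerated.
    form = cv2.dilate(foreground(blank_form), np.ones((3, 3), np.uint8))
    marks = cv2.subtract(foreground(scanned), form)
    n, _, stats, _ = cv2.connectedComponentsWithStats(marks)
    return [tuple(stats[i, :4]) for i in range(1, n) if stats[i, cv2.CC_STAT_AREA] >= min_area]
```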
The semi-text-independent method of writer verification based on a linear framework can use all the characters of two handwriting samples to discriminate between writers when the text content is known. The handwriting samples may contain only small numbers of characters, which may even be entirely different. This fills the gap between the classical text-dependent and text-independent methods of writer verification. Moreover, in this paper the identity of each character is used by the semi-text-independent method. Two types of standard templates, generated from many writer-unknown handwritten samples and printed samples of each character, are introduced to represent the content information of each character. The difference vectors of the character samples are obtained by subtracting the standard templates from the original feature vectors and are used to replace the original vectors in the process of writer verification. By removing a large amount of content information while retaining the style information, the verification accuracy of the semi-text-independent method is improved. On a handwriting database of 30 writers, when the query handwriting and the reference handwriting each consist of 30 distinct characters, the average equal error rate (EER) of writer verification reaches 9.96%. When the handwriting samples contain 50 characters, the average EER falls to 6.34%, which is 23.9% lower than the EER obtained without the difference vectors.
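The difference-vector step, subtracting a per-character standard template from each character's feature vector, is easy to sketch; the final decision rule below is a simplified stand-in for the paper's linear framework:

```python
import numpy as np

def build_templates(samples_by_char):
    """Standard template per character: the mean feature vector over many
    writer-unknown samples of that character (printed samples could be added)."""
    return {ch: np.mean(np.stack(vecs), axis=0) for ch, vecs in samples_by_char.items()}

def difference_vectors(char_labels, feature_vectors, templates):
    """Replace each character's feature vector by (vector - template of that character),
    removing most of the content information while keeping the writing style."""
    return np.stack([fv - templates[ch] for ch, fv in zip(char_labels, feature_vectors)])

def verify(query_diffs, reference_diffs, threshold):
    """Toy decision rule: compare the mean difference vectors of the two handwritings;
    a small distance suggests the same writer."""
    dist = np.linalg.norm(query_diffs.mean(axis=0) - reference_diffs.mean(axis=0))
    return dist < threshold, dist
```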
In camera-based optical character recognition (OCR) applications, warping is a primary problem. Warped document images should be restored before they are recognized by traditional OCR algorithms. This paper presents a novel restoration approach that first estimates baselines and vertical character directions based on rough line and character segmentation, then selects several key points and determines their restoration mapping from the results of the estimation step, and finally performs Thin-Plate Spline (TPS) interpolation on the full page image using these key-point mappings. The restored document image is expected to have straight baselines and upright character directions. This method can restore arbitrary local warping while keeping the restoration result natural and smooth, and consequently improves the performance of the OCR application. Experiments on several camera-captured warped document images show the effectiveness of this approach.
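A TPS warp driven by key-point correspondences can be sketched with SciPy's thin-plate-spline radial basis interpolator; the key points and their target positions are assumed to come from the baseline and direction estimation step described above:

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.ndimage import map_coordinates

def tps_restore(warped, key_points_restored, key_points_warped, out_shape):
    """Flatten a warped page with a thin-plate-spline mapping defined by key points.

    key_points_restored : (n, 2) array of (row, col) positions the key points
        should occupy in the flattened image (e.g. on straight baselines).
    key_points_warped   : (n, 2) array of the same points in the warped image.
    The TPS interpolates the inverse mapping (restored -> warped) so that every
    output pixel can be sampled from the warped image.
    """
    tps = RBFInterpolator(key_points_restored, key_points_warped,
                          kernel="thin_plate_spline")
    rows, cols = np.mgrid[0:out_shape[0], 0:out_shape[1]]
    grid = np.column_stack([rows.ravel(), cols.ravel()]).astype(float)
    src = tps(grid)                                  # where each output pixel comes from
    coords = [src[:, 0].reshape(out_shape), src[:, 1].reshape(out_shape)]
    return map_coordinates(warped, coords, order=1, mode="nearest")
```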
Many governments have some form of "direct democracy" legislation procedure whereby individual citizens can
propose various measures creating or altering laws. Generally, such a process is started with the gathering of a
large number of signatures. There is interest in whether or not there are fraudulent signatures present in such a
petition, and if so what percentage of the signatures are indeed fraudulent. However, due to the large number
of signatures (tens of thousands), it is not feasible to have a document examiner verify the signatures directly.
Instead, there is interest in creating a subset of signatures where there is a high probability of fraud that can be
verified. We present a method by which a pairwise comparison of signatures can be performed and subsequent
sorting can generate such subsets.
In this paper, we propose a new approach to Arabic printed text analysis and recognition. This approach is based on
linguistic concepts of Arabic vocabulary. For the text, the words are categorized into decomposable words (derived from a root) and indecomposable words (not derived from a root), and morpho-syntactic characterization hypotheses are put forward for each word. For the decomposable words, we attempt to recognize the basic morphemes of the word: antefix, prefix, infix, suffix, postfix and root, in contrast to existing approaches, which are usually based on holistic recognition of the word as a whole.
In this paper, we present a new sliding-window based local thresholding technique, 'NICK', and give a detailed comparison of some existing sliding-window based thresholding algorithms with our method. The proposed method aims at achieving better binarization results, specifically for ancient document images. NICK is inspired by Niblack's binarization method and demonstrates its robustness and effectiveness when evaluated on low-quality ancient document images.
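For context, the sliding-window structure shared by this family of methods can be sketched with Niblack's original rule, T = m + k*s over a local window; NICK keeps this structure but uses its own spread term, for which the paper should be consulted:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def niblack_threshold(img, window=19, k=-0.2):
    """Sliding-window Niblack thresholding: T = m + k * s for each pixel,
    where m and s are the local mean and standard deviation.

    NICK (the method in this paper) keeps the same sliding-window structure
    but replaces the spread term with its own formulation based on the local
    second moment; see the paper for the exact expression.
    """
    img = img.astype(float)
    mean = uniform_filter(img, window)               # local mean m
    sq_mean = uniform_filter(img * img, window)      # local mean of p^2
    std = np.sqrt(np.clip(sq_mean - mean * mean, 0, None))
    T = mean + k * std
    return (img > T).astype(np.uint8) * 255          # 255 = background, 0 = ink
```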
Biomedical images are invaluable in medical education and in establishing clinical diagnoses. Clinical decision support (CDS) can be improved by combining biomedical text with automatically annotated images extracted from relevant biomedical publications. In a previous study on the feasibility of automatically classifying images by their usefulness in finding clinical evidence, we reported 76.6% accuracy using supervised machine learning on combined figure captions and image content. Image content extraction is traditionally applied to entire images or to pre-determined image regions. Figure images in articles vary greatly, which limits the benefit of whole-image extraction beyond gross categorization for CDS. However, text annotations and pointers on the images indicate regions of interest (ROI) that are then referenced in the caption or in the discussion in the article text. We previously reported 72.02% accuracy in text and symbol localization, but failed to take advantage of the referenced image locality.
In this work we combine article text analysis and figure image analysis to localize pointers (arrows, symbols) and extract the ROIs they point to, which can then be used to measure meaningful image content and associate it with the identified biomedical concepts for improved (text and image) content-based retrieval of biomedical articles. Biomedical concepts are identified using the National Library of Medicine's Unified Medical Language System (UMLS) Metathesaurus. Our methods achieve an average precision and recall of 92.3% and 75.3%, respectively, in identifying pointing symbols in images from a randomly selected image subset made available through the ImageCLEF 2008 campaign.
Traditional classifiers are trained from labeled data only. Labeled samples are often expensive to obtain, while unlabeled
data are abundant. Semi-supervised learning can therefore be of great value by using both labeled and unlabeled data for
training. We introduce a semi-supervised learning method named decision-directed approximation combined with
Support Vector Machines to detect zones containing information on grant support (a type of bibliographic data) from
online medical journal articles. We analyzed the performance of our model using different sizes of unlabeled samples,
and demonstrated that our proposed rules are effective in boosting classification accuracy. The experimental results show
that the decision-directed approximation method with SVM improves the classification accuracy when a small amount of
labeled data is used in conjunction with unlabeled data to train the SVM.
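Decision-directed learning is closely related to self-training. The sketch below shows a generic self-training loop with an SVM (the confidence threshold and round count are illustrative; the paper's specific rules are not reproduced):

```python
import numpy as np
from sklearn.svm import SVC

def decision_directed_svm(X_labeled, y_labeled, X_unlabeled,
                          confidence=0.9, max_rounds=5):
    """Generic self-training loop with an SVM, a simple stand-in for the
    decision-directed approximation rules described in the paper.

    At each round, the SVM trained on the current labeled pool labels the
    unlabeled zones; predictions above the confidence threshold are added to
    the pool and the SVM is retrained.
    """
    X_pool, y_pool = np.asarray(X_labeled, float), np.asarray(y_labeled)
    X_rest = np.asarray(X_unlabeled, float)
    clf = SVC(probability=True)
    for _ in range(max_rounds):
        clf.fit(X_pool, y_pool)
        if len(X_rest) == 0:
            break
        proba = clf.predict_proba(X_rest)
        take = proba.max(axis=1) >= confidence
        if not take.any():
            break
        X_pool = np.vstack([X_pool, X_rest[take]])
        y_pool = np.concatenate([y_pool, clf.classes_[proba[take].argmax(axis=1)]])
        X_rest = X_rest[~take]
    return clf
```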
For user convenience, the processing of document images captured by a digital camera has attracted much attention. However, most existing processing methods require an upright image, such as one captured by a scanner. Therefore, we have to cancel the perspective distortion of a camera-captured image before processing. Although rectification methods for this distortion exist, most of them work under certain assumptions on the layout: the borders of the document are available, the text lines are parallel, a stereo camera or a video sequence is required, and so on. In this paper, we propose a layout-free rectification method which requires none of the above assumptions. We confirm the effectiveness of the proposed method by experiments.
Ancient documents form an important part of our individual and collective memory. In addition to their preservation, the digitization of these documents may offer users a great number of services, such as remote look-up and browsing of rare documents. However, once in digital form, the documents are liable to be modified or pirated. Therefore, we need to develop techniques for protecting images derived from ancient documents. Watermarking appears to be one of the promising solutions. Nevertheless, the performance of a watermarking procedure depends on a balance between robustness and invisibility. Thus, the choice of the insertion domain or mode, as well as of the carrier points of the signature, is decisive.
In this work we propose a method for watermarking images derived from ancient documents based on wavelet packet decomposition. The insertion is carried out on the maximum amplitude ratio in the best decomposition basis, which is determined beforehand according to an entropy criterion. This work is part of a project to digitize ancient documents in cooperation with the National Library of Tunis (BNT).
This paper describes a system for script identification of handwritten
word images. The system is divided into two main
phases, training and testing. The training phase performs a
moment based feature extraction on the training word images
and generates their corresponding feature vectors. The testing
phase extracts moment features from a test word image
and classifies it into one of the candidate script classes using
information from the trained feature vectors. Experiments
are reported on handwritten word images from three scripts:
Latin, Devanagari and Arabic. Three different classifiers are
evaluated over a dataset consisting of 12,000 word images in the training set and 7,942 word images in the testing set. Results show
significant strength in the approach with all the classifiers having
a consistent accuracy of over 97%.
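The abstract does not specify which moments are used; as an illustration only, the sketch below extracts Hu's invariant moments with OpenCV and fits one of several possible classifiers:

```python
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def moment_features(word_img):
    """Moment-based features of a single-channel binarized word image.

    Hu's seven invariant moments are used here as a representative moment
    feature set; a signed log transform compresses their large dynamic range.
    """
    m = cv2.moments(word_img, binaryImage=True)
    hu = cv2.HuMoments(m).ravel()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)

def train_script_classifier(word_images, script_labels):
    """Fit a simple classifier (one of several that could be compared) on
    moment features extracted from training word images."""
    X = np.stack([moment_features(img) for img in word_images])
    clf = KNeighborsClassifier(n_neighbors=3)
    clf.fit(X, script_labels)
    return clf
```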