The paper describes the use of Conditional Random Fields (CRFs), which exploit contextual information, to automatically label extracted segments of scanned documents as machine-print, handwriting, or noise. Such labeling can serve as an indexing step for a content-based image retrieval system or a biometric signature verification system. A simple region-growing algorithm first segments the document into a number of patches, and a label for each segmented patch is then inferred using a CRF model. The model is flexible enough to include signatures as a type of handwriting and to isolate them from machine-print and noise. Its robustness stems from the CRF's inherent modeling of spatial dependencies among neighboring labels as well as the observed data. Maximum pseudo-likelihood estimates of the CRF parameters are learnt using conjugate gradient descent, and labels are inferred by computing their probability under the model with Gibbs sampling. Experimental results show that this approach assigns correct labels to 95.75% of the data. The CRF-based model is shown to be superior to neural networks and naive Bayes.
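The inference step described above can be illustrated with a minimal sketch. The potentials, the neighborhood structure, and all numbers below are invented for illustration, not the paper's actual feature functions; the sketch only shows the general pattern of Gibbs sampling over patch labels, where each patch's label is resampled from its conditional distribution given its spatial neighbors.

```python
import math
import random

# Hypothetical label set and Potts-style pairwise term; the paper's actual
# CRF potentials are learnt from data, which this sketch does not do.
LABELS = ["machine-print", "handwriting", "noise"]

def gibbs_sample_labels(unary, neighbors, pairwise_w=1.0, iters=100, seed=0):
    """unary[i][l]: log-score of label l for patch i (from observed features).
    neighbors[i]: indices of patches spatially adjacent to patch i.
    pairwise_w rewards neighboring patches that share a label."""
    rng = random.Random(seed)
    n = len(unary)
    labels = [rng.randrange(len(LABELS)) for _ in range(n)]
    for _ in range(iters):
        for i in range(n):
            # Conditional distribution of patch i's label given its neighbors.
            logp = []
            for l in range(len(LABELS)):
                agree = sum(1 for j in neighbors[i] if labels[j] == l)
                logp.append(unary[i][l] + pairwise_w * agree)
            m = max(logp)
            probs = [math.exp(v - m) for v in logp]
            z = sum(probs)
            r, acc = rng.random() * z, 0.0
            for l, p in enumerate(probs):
                acc += p
                if r <= acc:
                    labels[i] = l
                    break
    return [LABELS[l] for l in labels]

# Toy example: three patches in a row; patch 1's unary evidence is weak,
# but its neighbors pull its label toward agreement with theirs.
unary = [[2.0, 0.0, 0.0], [0.1, 0.0, 0.0], [2.0, 0.0, 0.0]]
neighbors = [[1], [0, 2], [1]]
print(gibbs_sample_labels(unary, neighbors))
```

The pairwise term is what distinguishes this from per-patch classification: an ambiguous patch surrounded by machine-print patches is nudged toward the machine-print label.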
The fingerprint verification task answers the question of whether or not two fingerprints belong to the same finger. The paper focuses on the classification aspect of fingerprint verification. Classification is the third and final step after the two earlier steps of feature extraction, where a known set of features (minutiae points) has been extracted from each fingerprint, and scoring, where a matcher has determined a degree of match between the two sets of features. Since this is a binary classification problem involving a single variable, the commonly used threshold method is related to the so-called receiver operating characteristics (ROC). In the ROC approach, the optimal threshold on the score is determined so as to decide match or non-match. Such a method works well when there is a well-registered fingerprint image. More sophisticated methods are needed, however, when only a partial imprint of a finger is available, as in the case of latent prints in forensics or due to limitations of the biometric device. In such situations it is useful to consider classification methods based on computing the likelihood ratio of match to non-match. Such methods are commonly used in biometric and forensic domains such as speaker verification, where there is a much higher degree of uncertainty. This paper compares the two approaches empirically for the fingerprint classification task as the number of available minutiae is varied. In both the ROC-based and likelihood-ratio methods, learning is from a general population of an ensemble of pairs, each of which is labeled as being from the same finger or from different fingers. In the ROC-based method, the best operating point is derived from the ROC curve. In the likelihood method, the distributions of same-finger and different-finger scores are modeled using Gaussian and Gamma distributions. The performances of the two methods are compared for varying numbers of available minutiae points.
Results show that the likelihood method performs better than the ROC-based method when fewer minutiae points are available. Both methods converge to the same accuracy as more minutiae points become available.
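The two decision rules can be contrasted in a short sketch. The Gaussian model for same-finger scores and the Gamma model for different-finger scores follow the abstract, but all parameter values and the example score below are invented for illustration, and the actual parameters would be fit to labeled score data.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of the (assumed) same-finger score model."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def gamma_pdf(x, k, theta):
    """Density of the (assumed) different-finger score model."""
    return x ** (k - 1) * math.exp(-x / theta) / (math.gamma(k) * theta ** k)

def roc_decision(score, threshold):
    """ROC-style rule: declare a match iff the score clears a tuned threshold."""
    return score >= threshold

def lr_decision(score, genuine=(0.8, 0.1), impostor=(2.0, 0.1)):
    """Likelihood-ratio rule: match iff P(score | same) / P(score | different) > 1.
    genuine = (mu, sigma) of the Gaussian; impostor = (k, theta) of the Gamma.
    The default parameters are illustrative, not fit to any data."""
    lr = gaussian_pdf(score, *genuine) / gamma_pdf(score, *impostor)
    return lr > 1.0

# A single illustrative score; both rules agree here, but they can diverge
# when the score distributions shift (e.g. with few minutiae available).
print(roc_decision(0.75, 0.5), lr_decision(0.75))
```

The key difference is that the ROC rule commits to one global operating point, while the likelihood-ratio rule re-weights the evidence through the two fitted score densities, which is what lets it adapt when fewer minutiae make the score distributions broader.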
New machine learning strategies are proposed for person identification which can be used in several biometric modalities such as friction ridges, handwriting, signatures, and speech. The biometric or forensic performance task answers the question of whether or not a sample belongs to a known person. Two different learning paradigms are discussed: person-independent (or general) learning and person-dependent (or person-specific) learning. In the first paradigm, learning is from a general population of an ensemble of pairs, each of which is labeled as being from the same person or from different persons; the learning process determines the range of variations for given persons and between different persons. In the second paradigm, the identity of a person is learnt when presented with multiple known samples of that person, so that the variations and similarities within a particular person are learnt. The person-specific learning strategy is seen to perform better than general learning (5% higher performance with signatures). Improvement of person-specific performance with an increasing number of samples is also observed.
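The contrast between the two paradigms can be sketched as two ways of setting a same-person decision threshold on a pairwise distance. All distances and the margin below are invented for illustration; the abstract does not specify the features or the decision rule, only the two learning setups.

```python
def general_threshold(same_dists, diff_dists):
    """General (person-independent) paradigm: one threshold learnt from a
    pooled population of labeled same-person / different-person pairs.
    Here, simply the midpoint of the two class means (an assumption)."""
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(same_dists) + mean(diff_dists)) / 2.0

def person_specific_threshold(enrolled_dists, margin=2.0):
    """Person-specific paradigm: learn one person's own range of variation
    from pairwise distances among their enrolled samples, and accept a new
    sample only if it falls within mean + margin * std of that range."""
    mean = sum(enrolled_dists) / len(enrolled_dists)
    var = sum((d - mean) ** 2 for d in enrolled_dists) / len(enrolled_dists)
    return mean + margin * var ** 0.5

# Toy data: a consistent writer has tight within-person distances, so the
# person-specific threshold comes out stricter than the population-wide one.
population_same = [0.2, 0.3, 0.25, 0.35]
population_diff = [0.8, 0.9, 0.7, 0.85]
writer_samples = [0.18, 0.22, 0.20]

t_general = general_threshold(population_same, population_diff)
t_personal = person_specific_threshold(writer_samples)
print(round(t_general, 3), round(t_personal, 3))
```

This also illustrates why person-specific performance improves with more samples: the per-person variance estimate, and hence the threshold, becomes more reliable as more enrolled samples are available.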