For digital imagery, face detection and identification are functions of great importance in wide-ranging applications, including full facial recognition systems. The development and evaluation of unique and existing face detection and face identification applications require a significant amount of data. Increased availability of such data volumes could benefit the formulation and advancement of many biometric algorithms. Here, we demonstrate the utility of synthetically generated face data for evaluating facial biometry methodologies to a precision that would be unrealistic with a parametrically uncontrolled dataset. Particular attention is given to similarity metrics, symmetry within and between recognition algorithms, discriminatory power and optimality of pan and/or tilt in reference images or libraries, susceptibilities to variations, identification confidence, meaningful identification mislabelings, sensitivity, specificity, and threshold values. The face identification results, in particular, could be generalized to address shortcomings in various applications and help to inform the design of future strategies.
An efficient parallel architecture design for the iris unwrapping process in a real-time iris recognition system using the Bresenham Circle Algorithm is presented in this paper. Based on the characteristics of the model parameters, this algorithm was chosen over the widely used polar conversion technique as the iris unwrapping model. The architecture design is parallelized to increase the throughput of the system and is suitable for processing an input image size of 320 × 240 pixels in real time using Field Programmable Gate Array (FPGA) technology. Quartus software is used to implement, verify, and analyze the design’s performance using the VHSIC Hardware Description Language. The system’s predicted processing time is faster than that of the modern iris unwrapping techniques in use today.
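As an illustrative aside (not taken from the paper), the integer-only midpoint/Bresenham circle walk that makes the unwrapping model hardware-friendly can be sketched in Python:

```python
def bresenham_circle(cx, cy, r):
    """Midpoint (Bresenham) circle algorithm: an integer-only walk that
    yields every pixel on a circle of radius r centred at (cx, cy).
    No trigonometry or floating point is needed, which is why the
    technique maps well onto FPGA logic."""
    points = set()
    x, y, d = 0, r, 3 - 2 * r
    while x <= y:
        # Mirror the first octant into all eight octants.
        for dx, dy in ((x, y), (y, x), (-x, y), (-y, x),
                       (x, -y), (y, -x), (-x, -y), (-y, -x)):
            points.add((cx + dx, cy + dy))
        if d < 0:
            d += 4 * x + 6
        else:
            d += 4 * (x - y) + 10
            y -= 1
        x += 1
    return points
```

Sampling successive radii with this walk yields the concentric iris rings that the unwrapping stage flattens into a rectangular template.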
Improvements in face detection performance would benefit many applications. The OpenCV library implements a standard solution, the Viola-Jones detector, with a statistically boosted rejection cascade of binary classifiers. Empirical evidence has shown that Viola-Jones underdetects in some instances. This research shows that a truncated cascade augmented by a neural network can recover these undetected faces. A hybrid framework is constructed: a truncated Viola-Jones cascade followed by an artificial neural network that refines the face decision. Optimally, a truncation stage is selected that captures all faces and allows the neural network to remove the false alarms. A feedforward backpropagation network with one hidden layer is trained to discriminate faces based upon the thresholding (detection) values of intermediate stages of the full rejection cascade. A clustering algorithm is used as a precursor to the neural network to group significant overlappings. Evaluated on the CMU/VASC Image Database, comparison with an unmodified OpenCV approach shows: (1) a 37% increase in detection rates if constrained by the requirement of no increase in false alarms, (2) a 48% increase in detection rates if some additional false alarms are tolerated, and (3) an 82% reduction in false alarms with no reduction in detection rates. These results demonstrate improved face detection and could address the need for such improvement in various applications.
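A minimal sketch of the hybrid decision rule described above, with made-up network weights and stage thresholds purely for illustration (the trained network and real cascade scores are not reproduced here):

```python
import math

def mlp_score(stage_scores, W1, b1, W2, b2):
    """One-hidden-layer feedforward pass over the per-stage detection
    scores; returns a sigmoid face/non-face confidence in (0, 1)."""
    hidden = [math.tanh(sum(w * s for w, s in zip(row, stage_scores)) + b)
              for row, b in zip(W1, b1)]
    z = sum(w * h for w, h in zip(W2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-z))

def hybrid_detect(stage_scores, stage_thresholds, truncate_at, net, accept=0.5):
    """Run only the first `truncate_at` cascade stages as usual; any
    sub-window that survives is handed to the neural network, which
    replaces the remaining rejection stages."""
    for score, thr in zip(stage_scores[:truncate_at], stage_thresholds):
        if score < thr:
            return False          # rejected early, as in plain Viola-Jones
    return mlp_score(stage_scores, *net) >= accept
```

The design choice being illustrated is that truncation keeps the cascade's early-rejection speed while the network, seeing all intermediate stage values at once, removes the false alarms the full cascade would have passed or the faces it would have dropped.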
KEYWORDS: Video surveillance, Video, Video processing, Field programmable gate arrays, Image processing, Prototyping, Embedded systems, Parallel processing, Digital signal processing, Standards development
FPGA devices with embedded DSP and memory blocks and high-speed interfaces are ideal for real-time video processing applications. In this work, a hardware-software co-design approach is proposed to effectively utilize FPGA features for a prototype of an automated video surveillance system. Time-critical steps of the video surveillance algorithm are designed and implemented in the FPGA's logic elements to maximize parallel processing. Other non-time-critical tasks are achieved by executing a high-level language program on an embedded Nios-II processor. Pre-tested and verified video and interface functions from a standard video framework are utilized to significantly reduce development and verification time. Custom and parallel processing modules are integrated into the video processing chain via Altera's Avalon Streaming video protocol. Other data control interfaces are achieved by connecting hardware controllers to the Nios-II processor using Altera's Avalon Memory-Mapped protocol.
In the past two years the processing power of video graphics cards has quadrupled and is approaching supercomputer levels. State-of-the-art graphics processing units (GPUs) boast theoretical computational performance in the range of 1.5 trillion floating-point operations per second (1.5 teraflops). This processing power is readily accessible to the scientific community at a relatively small cost. High-level programming languages are now available that give access to the internal architecture of the graphics card, allowing greater algorithm optimization. This research takes the memory-access-intensive portions of an image-based iris identification algorithm and hosts them on a GPU using the C++-compatible CUDA language. The selected segmentation algorithm uses basic image processing techniques such as image inversion, value squaring, thresholding, dilation, and erosion, as well as memory- and computationally intensive calculations such as the circular Hough transform. Portions of the iris segmentation algorithm were accelerated by a factor of 77 over the 2008 GPU results. Some parts of the algorithm ran over 1600 times faster than their CPU counterparts. Strengths and limitations of the GPU Single Instruction Multiple Data architecture are discussed. Memory access times, instruction execution times, programming details, and code samples are presented as part of the research.
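A toy, CPU-only Python sketch of the circular Hough voting step mentioned above (the paper's CUDA kernels are not reproduced; a single fixed radius is assumed for brevity):

```python
import math

def circular_hough(edge_points, radius, width, height):
    """Circular Hough transform at a fixed radius: every edge pixel
    votes for all candidate centres one radius away from it.  The true
    circle centre accumulates a vote from (nearly) every edge pixel."""
    acc = [[0] * width for _ in range(height)]
    for (ex, ey) in edge_points:
        for t in range(360):                     # 1-degree vote steps
            a = int(round(ex - radius * math.cos(math.radians(t))))
            b = int(round(ey - radius * math.sin(math.radians(t))))
            if 0 <= a < width and 0 <= b < height:
                acc[b][a] += 1
    # Return the centre with the most votes.
    best = max((acc[y][x], x, y) for y in range(height) for x in range(width))
    return best[1], best[2]
```

The nested per-pixel, per-angle voting loop is exactly the kind of independent, data-parallel work that maps well onto a SIMD GPU, which is why this stage benefits so strongly from CUDA acceleration.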
The iris is currently believed to be one of the most accurate biometrics for human identification. The majority of fielded iris identification systems use fractional Hamming distance to compare a new feature template to a stored database. Fractional Hamming distance is extremely fast, but mathematically weights all regions of the iris equally. Research has shown that different regions of the iris contain varying levels of discriminatory information when using circular boundary assumptions. This research evaluates four statistical metrics for accuracy improvements on low resolution and poor quality images. Each metric statistically weights iris regions in an attempt to use the iris information in a more intelligent manner. A similarity metric extracted from the output stage of an artificial neural network demonstrated the most promise. Experiments were performed using occluded, subsampled, and motion blurred images from the CASIA, University of Bath, and ICE 2005 databases. The neural network-based metric improved accuracy at nearly every operating point.
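Fractional Hamming distance itself is simple to state; a small illustrative Python version with occlusion masks (not the fielded implementation) makes the uniform weighting explicit:

```python
def fractional_hamming(code_a, code_b, mask_a, mask_b):
    """Fractional Hamming distance between two binary iris codes:
    the fraction of usable (unmasked in both codes) bits that disagree.
    Every region of the iris is weighted identically, which is the
    limitation the statistical metrics above try to address."""
    usable = [a & b for a, b in zip(mask_a, mask_b)]
    n = sum(usable)
    if n == 0:
        return 1.0  # no comparable bits; treat as maximally distant
    disagree = sum(m & (a ^ b) for a, b, m in zip(code_a, code_b, usable))
    return disagree / n
```

A learned similarity metric can instead weight each bit by the discriminatory value of its iris region, which is what the neural-network output-stage metric accomplishes.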
One of the basic challenges to robust iris recognition is iris segmentation. This paper proposes the use of an artificial neural network and a feature saliency algorithm to better localize boundary pixels of the iris. No circular boundary assumption is made. A neural network is used to near-optimally combine current iris segmentation methods to more accurately localize the iris boundary. A feature saliency technique is performed to determine which features contain the greatest discriminatory information. Both visual inspection and automated testing showed greater than 98 percent accuracy in determining which pixels in an image of the eye were iris pixels, compared to human-determined boundaries.
Iris recognition algorithms depend on image processing techniques for proper segmentation of the iris. In the Ridge Energy Direction (RED) iris recognition algorithm, the initial step in the segmentation process searches for the pupil by thresholding and then uses binary morphology functions to rectify artifacts obfuscating the pupil. These functions take substantial processing time in software, on the order of a few hundred million operations. Alternatively, a hardware version of the binary morphology functions is implemented to assist in the segmentation process. The hardware binary morphology functions have a negligible hardware footprint and power consumption while achieving a speedup of 200 times compared to the original software functions.
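A software sketch of the kind of binary-morphology cleanup described (a 3 × 3 structuring element is assumed for illustration; the paper's hardware implementation is not reproduced):

```python
def dilate(img):
    """Binary dilation with a 3x3 structuring element (out-of-bounds = 0)."""
    h, w = len(img), len(img[0])
    return [[int(any(img[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                     if 0 <= y + dy < h and 0 <= x + dx < w))
             for x in range(w)] for y in range(h)]

def erode(img):
    """Binary erosion with a 3x3 structuring element (out-of-bounds = 0)."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]

def close_holes(img):
    """Morphological closing (dilate then erode): fills small bright
    holes, e.g. specular highlights, inside the thresholded pupil blob."""
    return erode(dilate(img))
```

Each output pixel depends only on a fixed 3 × 3 neighbourhood, so every pixel can be computed in parallel in a single pass, which is what makes a hardware version so much faster than the sequential software loops.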
KEYWORDS: Facial recognition systems, Field programmable gate arrays, Video, Sensors, Detection and tracking algorithms, Statistical analysis, Image processing, Video surveillance, Digital signal processing, Computer simulations
The first step in a facial recognition system is to find and extract human faces in a static image or video frame. Most face detection methods are based on statistical models that can be trained and then used to classify faces. These methods are effective, but their main drawback is speed, because a massive number of sub-windows at different image scales are considered in the detection procedure. A robust face detection technique based on an encoded image known as an "integral image" has been proposed by Viola and Jones. The use of an integral image reduces the number of operations needed to access a sub-image to a relatively small and fixed number. Additional speedup is achieved by incorporating a cascade of simple classifiers to quickly eliminate non-face sub-windows. Even with the reduced number of accesses to image data to extract features in the Viola-Jones algorithm, the number of memory accesses is still too high to support real-time operation for high-resolution images or video frames. The proposed hardware design in this research work employs a modular approach to represent the "integral image" for this memory-intensive application. An efficient memory management strategy is also proposed to aggressively utilize embedded memory modules and reduce interaction with external memory chips. The proposed design is targeted at a low-cost FPGA prototype board for a cost-effective face detection/recognition system.
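The integral-image trick that makes the fixed number of accesses possible can be sketched as follows (illustrative Python, not the hardware design):

```python
def integral_image(img):
    """Summed-area table: ii[y][x] holds the sum of all pixels above and
    to the left of (x, y), with a zero row/column pad so that any
    rectangle sum needs only four table lookups."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row = 0
        for x in range(w):
            row += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y), in O(1)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]
```

Every Haar-like feature reduces to a handful of `rect_sum` calls, i.e. a small, fixed number of memory accesses per sub-window regardless of its size, which is precisely the property the hardware design exploits.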
Iris recognition systems have recently become an attractive identification method because of their extremely high accuracy. Most modern iris recognition systems are currently deployed on traditional sequential digital systems, such as a computer. However, modern advancements in configurable hardware, most notably Field-Programmable Gate Arrays (FPGAs), have provided an exciting opportunity to exploit the parallel nature of modern image processing algorithms. In this study, iris matching, a repeatedly executed portion of a modern iris recognition algorithm, is parallelized on an FPGA system. We demonstrate a 19 times speedup of the parallelized algorithm on the FPGA system when compared to a state-of-the-art CPU-based version.
Iris recognition is an increasingly popular biometric due to its relative ease of use and high reliability. However, commercially available systems typically require on-axis images for recognition, meaning the subject is looking in the direction of the camera. The feasibility of using off-axis images is an important area of investigation for iris systems with more flexible user interfaces. The authors present an analysis of two image transform processes for off-axis images and an analysis of the utility of correcting for cornea refraction effects. The performance is assessed on the U.S. Naval Academy iris image database using the Ridge Energy Direction recognition algorithm developed by the authors, as well as with a commercial implementation of the Daugman algorithm.
The iris contains fibrous structures of various sizes and orientations which can be used for human identification. Drawing from a directional energy iris identification technique, this paper investigates the size, orientation, and location of the iris structures that hold stable discriminatory information. Template height, template width, filter size, and the number of filter orientations were investigated for their individual and combined impact on identification accuracy. Further, the iris was segmented into annuli and radial sectors to determine in which portions of the iris the best discriminatory information is found. Over 2 billion template comparisons were performed to produce this analysis.
The iris is currently believed to be the most accurate biometric for human identification. The majority of fielded iris identification systems are based on the highly accurate wavelet-based Daugman algorithm. Another promising recognition algorithm, by Ives et al., uses Directional Energy features to create the iris template. Both algorithms use Hamming distance to compare a new template to a stored database. Hamming distance is an extremely fast computation, but it weights all regions of the iris equally. Work from multiple authors has shown that different regions of the iris contain varying levels of discriminatory information. This research evaluates four post-processing similarity metrics for their accuracy impact on the Directional Energy and wavelet-based algorithms. Each metric builds on the Hamming distance method in an attempt to use the template information in a more salient manner. A similarity metric extracted from the output stage of a feed-forward multi-layer perceptron artificial neural network demonstrated the most promise. Accuracy tables and ROC curves of tests performed on the publicly available Chinese Academy of Sciences Institute of Automation database show that the neural-network-based distance achieves greater accuracy than Hamming distance at every operating point, while adding less than one percent computational overhead.
The human iris is perhaps the most accurate biometric for use in identification. Commercial iris recognition systems can currently be found in several types of settings where a person's true identity is required: to allow passengers in some airports to be rapidly processed through security, for access to secure areas, and for secure access to computer networks. The growing employment of iris recognition systems and the associated research to develop new algorithms will require large databases of iris images. If the required storage space is not adequate for these databases, image compression is an alternative. Compression allows a reduction in the storage space needed to store these iris images. This may, however, come at a cost: some amount of information may be lost in the process. We investigate the effects of image compression on the performance of an iris recognition system. Compression is performed using JPEG-2000 and JPEG, and the iris recognition algorithm used is an implementation of the Daugman algorithm. The imagery used includes both the CASIA iris database and the iris database collected by the University of Bath. Results demonstrate that compression up to 50:1 can be used with minimal effects on recognition.
Video surveillance is ubiquitous in modern society, but surveillance cameras are severely limited in utility by their low resolution. With this in mind, we have developed a system that can autonomously take high-resolution still-frame images of moving objects. To do this, we combine a low-resolution video camera and a high-resolution still-frame camera mounted on a pan/tilt mount. To determine what should be photographed (objects of interest), we employ a hierarchical method which first separates foreground from background using a temporal median filtering technique. We then use a feed-forward neural network classifier on the foreground regions to determine whether they contain the objects of interest. This is done over several frames, and a motion vector is deduced for the object. The pan/tilt mount then focuses the high-resolution camera on the next predicted location of the object, and an image is acquired. All components are controlled through a single MATLAB graphical user interface (GUI). The final system we present is able to detect multiple moving objects simultaneously, track them, and acquire high-resolution images of them. Results demonstrate tracking and imaging performance for varying numbers of objects moving at different speeds.
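The temporal median-filtering step can be illustrated with a small Python sketch (toy grayscale frames; the window length and threshold are arbitrary choices, not the system's actual parameters):

```python
from statistics import median

def background_model(frames):
    """Per-pixel temporal median over a window of frames: a pixel's
    background value is the median of its recent history, so brief
    transients from moving objects do not contaminate the model."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(w)]
            for y in range(h)]

def foreground_mask(frame, background, threshold):
    """Pixels differing from the median background by more than
    `threshold` are flagged as foreground (a candidate moving object)."""
    return [[int(abs(p - b) > threshold) for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]
```

The resulting foreground regions are what the neural network classifier then inspects, so the median filter acts as a cheap first-pass attention mechanism.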
This paper investigates the use of a beta value derived from a receiver operating characteristic curve for target recognition. Using a physiologically motivated sensor-fusion algorithm, lower-level data is filtered and fused using a pulse-coupled neural network (PCNN) to represent the feature processing of the parvocellular and magnocellular pathways. High-level decision making includes feature association from the PCNN filter, information fusion, and selection of a signal-detection beta value that optimizes performance. A beta value represents bias, based on a likelihood ratio of Gaussian distributions, that can be used as a decision strategy to discriminate between targets. By employing a beta value as the output of the physiologically motivated sensor-fusion algorithm, targets are classified based on the fusion of feature data.
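A minimal illustration of the signal-detection beta as a likelihood ratio of two Gaussians (equal variance assumed for simplicity; not the paper's fused feature distributions):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

def beta_value(criterion, mu_noise, mu_signal, sigma):
    """Signal-detection beta: the likelihood ratio of the signal and
    noise distributions evaluated at the decision criterion.
    beta > 1 indicates a conservative bias, beta < 1 a liberal one."""
    return (gaussian_pdf(criterion, mu_signal, sigma)
            / gaussian_pdf(criterion, mu_noise, sigma))
```

With equal variances and a criterion placed midway between the two means, the ratio is exactly 1 (no bias); shifting the criterion toward the signal mean raises beta, trading misses for fewer false alarms.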
This paper explores using linear regression and artificial neural networks (ANNs) to model the performance of an ATR algorithm based on a given set of data. Here, a probability-of-detection response surface as a function of relevant parameters is simulated. It is then shown that this surface can be approximated using either linear regression or an ANN with good results. These regression surfaces can provide valuable information to the ATR developer/customer in terms of predicting ATR performance in untested areas. The application of this ATR performance modeling methodology becomes clear when we consider applying it to a common problem, such as air-to-ground target detection, where the changing parameters of the target provide a good set of data points from which to build the response curve.
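As a toy stand-in for the regression modeling described (one parameter rather than a full multi-parameter response surface), ordinary least squares in Python:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b: a one-dimensional
    stand-in for the regression surface fitted to ATR
    probability-of-detection measurements."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx   # slope, intercept
```

Once fitted, the model predicts performance at parameter values that were never tested, which is exactly the interpolation/extrapolation use case the abstract describes.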
A model-based vision (MBV) approach to automatic target cuing/recognition (ATC/R) using real infrared (IR) image data will be presented. The MBV-ATC/R comprises three parts: focus of attention (FOA), indexing, and match/search. The FOA module analyzes the IR image and extracts (segments) regions of interest that may contain targets. The focus of this article will be on the FOA portion of the MBV-ATC/R approach. In particular, three methods of FOA will be optimized, compared, and fused. The first FOA module is a Least Asymmetric Daubechies wavelet decomposition method. The second FOA module is a physiologically based Difference of Gaussians. The third FOA module is a morphological hit-and-miss transform. The three FOA algorithms are individually optimized using a genetic algorithm. Then an adaptive pulse-coupled neural network is used to fuse the results.
This paper uses a high-level vision model to describe the information passing and linking within the primate visual system. Information linking schemes, such as state-dependent modulation and temporal synchronization, are presented as methods the vision system uses to combine information, using expectation to fill in missing information and remove unneeded information. The possibility of using linking methods derived from physiologically based theoretical models to combine current image processing techniques for pattern recognition purposes is investigated. These image processing techniques are transforms such as (but not limited to) wavelet filters, hit-or-miss filters, morphological filters, and difference-of-Gaussians filters. These particular filters are chosen because they simulate functions that are performed in the primate visual system. To implement the physiologically motivated linking methods, the Pulse Coupled Neural Network (PCNN) is chosen as the basic building block of the vision model, which performs linking at the neuronal pulse level. Lastly, an image fusion network which incorporates information linking based on the PCNN is described, and initial results are presented.