Gender classification, a two-class problem (male or female), has been the subject of extensive recent research and has gained considerable attention due to its varied applications. The proposed work relies on individual facial features to train a convolutional neural network (CNN) for gender classification. In contrast with previously reported approaches that assume the facial features are independent, we treat the facial features as correlated by training a single CNN that jointly learns from all of them. In terms of accuracy, our results either outperform, or are on par with, other gender classification techniques applied to three different datasets, namely Specs on Faces, Groups, and Face Recognition Technology (FERET). The proposed CNN also has significantly fewer learnable parameters than other techniques reported in the literature, which makes the network less sensitive to over-fitting and easier to train than techniques that use a separate CNN for each facial feature.
The visibility and continuity of the inner segment/outer segment (ISOS) junction layer of the photoreceptors on spectral domain optical coherence tomography images is known to be related to visual acuity in patients with age-related macular degeneration (AMD). Automatic detection and segmentation of lesions and pathologies in retinal images is crucial for the screening, diagnosis, and follow-up of patients with retinal diseases. One of the challenges of using classical level-set algorithms for segmentation is the placement of the initial contour. Manually defining the contour or randomly placing it in the image may lead to segmentation of erroneous structures. It is therefore important to be able to define the contour automatically by using information provided by image features. We explored a level-set method that is based on the classical Chan-Vese model and that utilizes image feature information for automatic contour placement for the segmentation of pathologies in fluorescein angiograms and en face retinal images of the ISOS layer. This was accomplished by exploiting a priori knowledge of the shape and intensity distribution, which allows projection profiles to be used to detect pathologies that are characterized by intensity differences with surrounding areas in retinal images. We first tested our method by applying it to fluorescein angiograms. We then applied our method to en face retinal images of patients with AMD. The experimental results demonstrate that the proposed method provided a quicker and improved outcome compared with the classical Chan-Vese method in which the initial contour is randomly placed, indicating its potential to provide a more accurate and detailed view of changes in pathologies due to disease progression and treatment.
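As a rough illustration of how projection profiles can suggest an initial contour, the sketch below sums intensities along rows and columns of a small image and takes the span where each profile exceeds its mean as a bounding box. The thresholding rule and all function names are assumptions made for illustration; the abstract's method also relies on a priori shape and intensity knowledge that is not reproduced here.

```python
def projection_profiles(image):
    """Return (row_profile, col_profile): intensity sums along each row and column."""
    row_profile = [sum(row) for row in image]
    col_profile = [sum(col) for col in zip(*image)]
    return row_profile, col_profile


def span_above_mean(profile):
    """Return (first, last) indices where the profile exceeds its mean, or None."""
    mean = sum(profile) / len(profile)
    hits = [i for i, v in enumerate(profile) if v > mean]
    if not hits:
        return None
    return hits[0], hits[-1]


def initial_box(image):
    """Hypothetical rule: bounding box (top, bottom, left, right) of the bright region."""
    rows, cols = projection_profiles(image)
    r = span_above_mean(rows)
    c = span_above_mean(cols)
    if r is None or c is None:
        return None
    return r[0], r[1], c[0], c[1]


# Toy image with a bright 3x2 patch on a dark background.
image = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 9, 9, 1, 1],
    [1, 1, 9, 9, 1, 1],
    [1, 1, 9, 9, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
box = initial_box(image)
```

A box found this way could seed the level-set evolution in place of a randomly placed contour.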
In this paper, we examine the problem of using video analysis to assess pain, an important problem especially for
critically ill, non-communicative patients, and people with dementia. We propose and evaluate an automated method to
detect the presence of pain manifested in patient videos using a unique and large collection of cancer patient videos
captured in patient homes. The method is based on detecting pain-related facial action units defined in the Facial Action
Coding System (FACS) that is widely used for objective assessment in pain analysis. In our research, a person-specific
Active Appearance Model (AAM) based on Project-Out Inverse Compositional Method is trained for each patient
individually for the modeling purpose. A flexible representation of the shape model is used in a rule-based method that is
better suited than the more commonly used classifier-based methods for application to the cancer patient videos in
which pain-related facial actions occur infrequently and more subtly. The rule-based method relies on feature points
that provide facial action cues and are extracted from the shape vertices of the AAM, which have a natural correspondence to
face muscular movement. In this paper, we investigate the detection of a commonly used set of pain-related action units
in both the upper and lower face. Our detection results show good agreement with the results obtained by three trained
FACS coders who independently reviewed and scored the action units in the cancer patient videos.
KEYWORDS: Video, Antennas, Scalable video coding, Receivers, Signal to noise ratio, Transmitters, Telecommunications, Computer programming, Error analysis, Video processing
A novel cross-layer method is proposed for real-time transmission of standard-compliant scalable video over a power-limited
multiple-input multiple-output (MIMO) system with channel state feedback. In the MIMO system, adaptive
power allocation and antenna selection are utilized for creation of unequal bit error rate (BER) sub-channels. BER
across all the sub-channels can be improved by reducing the channel throughput. In the proposed method, the scalable
video is first divided into multiple video sub-streams of unequal importance by content-based partitioning and sorting of
video layers. A novel technique is utilized to select the sub-stream data to be sent over the available MIMO sub-channels
so as to match the importance of the video data to both the channel BER and data transmission delay. Video
packets that are delayed excessively are discarded at the transmitter. A trade-off exists between the losses in video peak
signal-to-noise ratio (PSNR) resulting from discarded video packets at the transmitter, and gains in video PSNR due to
lower channel BER. Simulation results show that the proposed method results in significantly improved performance
compared with video transmission over constant BER channels with throughput equal to the video bit rate.
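A minimal sketch of the matching idea: sub-streams sorted by importance are paired with sub-channels sorted by BER, so that the most important data uses the most reliable sub-channel. The function name, data layout, and numbers are hypothetical, and the delay-aware selection and packet discarding described in the abstract are not modelled.

```python
def match_substreams(importance, ber):
    """Pair sub-stream indices with sub-channel indices.

    importance: importance scores, higher means more important.
    ber: sub-channel bit error rates, lower means more reliable.
    Returns {substream_index: subchannel_index}; sub-streams beyond the
    number of available sub-channels are left unassigned.
    """
    streams = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    channels = sorted(range(len(ber)), key=lambda j: ber[j])
    return dict(zip(streams, channels))


# Hypothetical values: three video sub-streams, two sub-channels.
assignment = match_substreams([0.2, 0.9, 0.5], [1e-3, 1e-5])
```

Here the least important sub-stream is left without a sub-channel, which loosely mirrors the trade-off between discarded data and lower BER.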
KEYWORDS: Oxygen, Arteries, Veins, Blood vessels, Phosphorescence, Optic nerve, Monte Carlo methods, Medical imaging, Visualization, Information science
Phosphorescence lifetime measurement based on a frequency domain approach is used to estimate oxygen tension in
large retinal blood vessels. The classical least squares (LS) estimation was initially used to determine oxygen tension
indirectly from intermediate variables. A spatial regularized least squares (RLS) method was later proposed to reduce the
high variance of oxygen tension estimated by LS method. In this paper, we provide a solution using a modified RLS
(MRLS) approach that utilizes prior knowledge about retinal vessels oxygenation based on expected oxygen tension
values in retinal arteries and veins. The performance of MRLS method was evaluated in simulated and experimental
data by determining the bias, variance, and mean absolute error (MAE) of oxygen tension measurements and comparing
these parameters with those derived with the use of LS and RLS methods.
Automated counting of photoreceptor cells in high-resolution retinal images generated by adaptive optics (AO) imaging
systems is important due to its potential for screening and diagnosis of diseases that affect human vision. A drawback in
recently reported photoreceptor cell counting methods is that they require user input of cell structure parameters. This
paper introduces a method that overcomes this shortcoming by using content-adaptive filtering (CAF). In this method,
image frequency content is initially analyzed to design a customized filter with a passband to emphasize cell structures
suitable for subsequent processing. The McClellan transform is used to design a bandpass filter with a circularly
symmetric frequency response since retinal cells have no preferred orientation. The automated filter design eliminates
the need for manual determination of cell structure parameters, such as cell spacing. Following the preprocessing step,
cell counting is performed on the binarized filtered image by finding regional points of high intensity. Photoreceptor cell
count estimates using this automated procedure were found to be comparable to manual counts (gold standard). The new
counting method when applied to test images showed overall improved performance compared with previously reported
methods requiring user-supplied input. The performance of the method was also examined with retinal images with
variable cell spacing.
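The counting step can be sketched as a search for local intensity maxima in the filtered image. The code below assumes the bandpass-filtered image is already available as a list of lists; the McClellan-transform filter design is not shown, and the 8-neighbour rule and threshold are illustrative choices rather than the published procedure.

```python
def count_local_maxima(image, threshold):
    """Count pixels above `threshold` that are strictly greater than all their neighbours."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            v = image[r][c]
            if v <= threshold:
                continue
            # Collect the up-to-8 neighbours, clipping at the image border.
            neighbours = [
                image[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            if all(v > n for n in neighbours):
                count += 1
    return count


# Toy "filtered" image with two isolated peaks.
filtered = [
    [0, 0, 0, 0, 0],
    [0, 5, 0, 0, 0],
    [0, 0, 0, 7, 0],
    [0, 0, 0, 0, 0],
]
cells = count_local_maxima(filtered, threshold=2)
```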
We have developed a new method to segment and analyze retinal layers in optical coherence tomography (OCT) images
with the intent of monitoring changes in thickness of retinal layers due to disease. OCT is an imaging modality that
obtains cross-sectional images of the retina, which makes it possible to measure thickness of individual layers. In this
paper we present a method that identifies six key layers in OCT images. OCT images present challenges to conventional
edge detection algorithms, notably speckle noise, which significantly degrades the sharpness of inter-layer
boundaries. We use a directional filter bank, which has a wedge-shaped passband that helps reduce noise
while maintaining edge sharpness, in contrast to previous methods that use Gaussian or median filter variants, which
reduce edge sharpness and result in poor edge-detection performance. This filter is utilized in a spatially variant
setting which uses additional information from the intersecting scans. The validity of extracted edge cues is determined
according to the amount of gray-level transition across the edge, strength, continuity, relative location and polarity.
These cues are processed according to the retinal model that we have developed and the processing yields edge contours.
In this paper, we use a set-theoretic approach to provide an efficient and deterministic iterative solution for the
compensated signature embedding (CSE) scheme introduced in an earlier work [4]. In CSE, a fragile signature is
derived and embedded into the media using a robust watermarking technique. Since the embedding process leads
to altering the media, the media samples are iteratively adjusted to compensate for the embedding distortion.
Projections Onto Convex Sets (POCS) is an iterative set-theoretic approach that is known to be deterministic and effective
and has been used in many image processing applications. We propose to use POCS for providing a compensation
mechanism to address the CSE problem. We identify two convex constraint sets defined according to image
fidelity and signature-generation criteria, and use them in a POCS-based CSE image authentication system.
The system utilizes the wavelet transform domain for embedding and compensation. Simulation results are
presented to show that the proposed scheme is efficient and accurate in terms of both achieving high convergence
speed and maintaining image fidelity.
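To illustrate the POCS iteration itself, the sketch below alternates projections onto two convex sets: a box that bounds the deviation from the original samples (standing in for the fidelity constraint) and a hyperplane (standing in, purely as an assumption, for the signature-generation constraint). The actual constraint sets and the wavelet-domain processing of the proposed system are not reproduced.

```python
def project_box(x, center, delta):
    """Project x onto the set {y : |y_i - center_i| <= delta for all i}."""
    return [min(max(xi, ci - delta), ci + delta) for xi, ci in zip(x, center)]


def project_hyperplane(x, a, b):
    """Project x onto the hyperplane {y : sum(a_i * y_i) = b}."""
    dot = sum(ai * xi for ai, xi in zip(a, x))
    norm2 = sum(ai * ai for ai in a)
    step = (b - dot) / norm2
    return [xi + step * ai for ai, xi in zip(a, x)]


def pocs(x0, center, delta, a, b, iterations=100):
    """Alternate the two projections; the box is applied last, so it always holds."""
    x = list(x0)
    for _ in range(iterations):
        x = project_hyperplane(x, a, b)
        x = project_box(x, center, delta)
    return x


# Hypothetical data: three samples that may move by at most 2 and must sum to 63.
original = [10.0, 20.0, 30.0]
result = pocs(original, original, delta=2.0, a=[1.0, 1.0, 1.0], b=63.0)
```

When the two sets intersect, as they do here, the iterates approach a point that satisfies both constraints.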
Image decomposition using directional filter banks is useful in discovering and extracting edge orientation cues for
target detection in airborne surveillance images. Since images of interest are very large and the filtered images are not
downsampled in the application of interest, conventional filtering can be computationally extremely demanding and
there is a need to explore procedures to make the filtering efficient. In this paper a novel filter bank structure for
directional filtering of images is proposed and its design described. The design is carried out by imposing structural
constraints on the filters, which are implemented using a generalized notion of separable filtering. The structure uses
one-dimensional (1-D) filters as building blocks, which are employed in novel configurations to obtain filters with
narrow wedge-shaped passbands. Design procedures have been developed for constructing 16-band, 32-band, and 64-
band partitions starting with either built-in or user-specified 1-D prototypes. Implementations of filters using the
proposed method show significant improvement compared with conventional implementation, often by more than an order of
magnitude, which is also supported by a theoretical analysis of the filter complexity.
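The computational benefit of building 2-D filters from 1-D blocks can be seen in the ordinary separable case, sketched below: filtering rows and then columns with a length-N 1-D filter needs about 2N multiplications per output pixel, against N×N for the equivalent 2-D kernel. This shows only plain separable filtering; the generalized separable structure that yields wedge-shaped passbands is not shown, and all names are illustrative.

```python
def filter_rows(image, h):
    """Valid-mode 1-D correlation of every row with the filter h."""
    n = len(h)
    return [
        [sum(h[k] * row[c + k] for k in range(n)) for c in range(len(row) - n + 1)]
        for row in image
    ]


def transpose(image):
    return [list(col) for col in zip(*image)]


def filter_separable(image, h_row, h_col):
    """Filter the rows with h_row, then the columns with h_col."""
    tmp = filter_rows(image, h_row)
    return transpose(filter_rows(transpose(tmp), h_col))


def filter_2d(image, kernel):
    """Direct valid-mode 2-D correlation with a full kernel."""
    kr, kc = len(kernel), len(kernel[0])
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j] for i in range(kr) for j in range(kc))
            for c in range(cols - kc + 1)
        ]
        for r in range(rows - kr + 1)
    ]


h = [1, 2, 1]                               # 1-D prototype (illustrative)
kernel = [[a * b for b in h] for a in h]    # equivalent 2-D kernel (outer product)
image = [[(3 * r + 5 * c) % 7 for c in range(6)] for r in range(5)]

direct = filter_2d(image, kernel)
separable = filter_separable(image, h, h)
```

Both paths give identical outputs, so the saving comes entirely from the structure of the filter.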
Efficient processing of imagery derived from remote sensing systems has become ever more important due to increasing
data sizes, rates, and bit depths. This paper proposes a target detection method that uses a special class of wavelets based on
highly frequency-selective directional filter banks. The approach helps isolate object features in different directional filter
output components. These components lend themselves well to the application of powerful denoising and edge detection
procedures in the wavelet domain. Edge information is derived from directional wavelet decompositions to detect targets
of known dimension in electro-optical imagery. Results of successful detection of objects using the proposed method are
presented in the paper. The approach highlights many of the benefits of working with directional wavelet analysis for
image denoising and detection.
KEYWORDS: Digital watermarking, Sensors, Independent component analysis, Signal detection, Distortion, Data modeling, Signal processing, Sensor performance, Error analysis, Statistical analysis
This paper presents a novel scheme for detection of watermarks embedded in multimedia signals using spread spectrum (SS) techniques. The detection method is built on the model that the embedded watermark and the host signal are mutually independent. The proposed detector assumes that the host signal and the watermark obey non-Gaussian distributions. The proposed blind watermark detector employs underdetermined blind source separation (BSS) based on independent component analysis (ICA) to estimate the watermark from the watermarked image. A mean-field theory based underdetermined BSS scheme is used for watermark estimation. Analytical results are presented showing that the proposed detector performs significantly better than the existing correlation-based blind detectors traditionally used for SS-based image watermarking.
In the human visual system, the spatial resolution of a scene under view decreases uniformly at points of increasing distance from the point of gaze, also called the foveation point. This phenomenon is referred to as foveation and has been exploited in foveated imaging to allocate bits in image and video coding according to spatially varying perceived resolution. Several digital image processing techniques have been proposed in the past to realize foveated images and video. In most cases a single foveation point is assumed in a scene. Recently there has been significant interest in dynamic as well as multi-point foveation. The complexity involved in identifying foveation points is, however, high in the proposed approaches. In this paper, an adaptive multi-point foveation technique for video data based on the concept of regions of interest (ROIs) is proposed and its performance is investigated. The points of interest are assumed to be the centroids of moving objects and are dynamically determined by the proposed foveation algorithm. A fast algorithm for implementing region-based multi-foveation processing is proposed. The proposed adaptive multi-foveation integrates fully with existing video codec standards in both the spatial and DCT domains.
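A small sketch of two ingredients named above: the centroid of a moving-object mask as a foveation point, and a per-pixel distance to the nearest of several foveation points, which a coder could map to a resolution level. The mask, the function names, and the use of Euclidean distance are assumptions for illustration; the fast region-based algorithm of the paper is not reproduced.

```python
import math


def centroid(mask):
    """Centroid (row, col) of the nonzero pixels of a binary mask, or None if empty."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        return None
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)


def distance_to_nearest(shape, points):
    """For each pixel, the Euclidean distance to the closest foveation point."""
    rows, cols = shape
    return [
        [min(math.hypot(r - pr, c - pc) for pr, pc in points) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical moving-object mask (a 2x2 blob) and two foveation points.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
fovea = centroid(mask)
dist = distance_to_nearest((4, 4), [(0, 0), (3, 3)])
```

Pixels with larger distance values would be candidates for coarser quantization or stronger low-pass filtering.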
Methods of near-lossless image compression based on the criterion of maximum allowable deviation of pixel values are described in this paper. Predictive and multi-resolution techniques for performing near-lossless compression are investigated. A procedure for near-lossless compression using a modification of lossless predictive coding techniques to satisfy the specified tolerance is described. Simulation results with modified versions of two of the best lossless predictive coding techniques known, CALIC and JPEG-LS, are provided. It is shown that the application of lossless coding based on reversible transforms in conjunction with pre-quantization is inferior to predictive techniques for near-lossless compression. A partial embedding two-layer scheme is proposed in which an embedded multi-resolution coder generates a lossy base layer, and a simple but effective context-based lossless coder codes the difference between the original image and the lossy reconstruction. Simulation results show that this lossy-plus-lossless technique yields compression ratios very close to those obtained with predictive techniques, while providing the feature of a partially embedded bit-stream.
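The maximum-deviation criterion can be illustrated with a one-dimensional predictive coder whose prediction residual is quantized uniformly with step 2δ+1, which keeps every reconstructed sample within δ of the original. The previous-sample predictor and the function names are simplifying assumptions; CALIC and JPEG-LS use context-based predictors and entropy coding that are not shown, and the sketch does not clamp values to the pixel range.

```python
def quantize_residual(e, delta):
    """Quantizer index for residual e with step 2*delta+1 (error at most delta)."""
    step = 2 * delta + 1
    if e >= 0:
        return (e + delta) // step
    return -((delta - e) // step)


def encode(pixels, delta):
    """Return (indices, reconstruction) using the previous reconstructed sample as predictor."""
    step = 2 * delta + 1
    indices, recon = [], []
    prev = 0
    for x in pixels:
        q = quantize_residual(x - prev, delta)
        prev = prev + q * step      # same value the decoder will compute
        indices.append(q)
        recon.append(prev)
    return indices, recon


def decode(indices, delta):
    """Rebuild the samples from the quantizer indices alone."""
    step = 2 * delta + 1
    out, prev = [], 0
    for q in indices:
        prev = prev + q * step
        out.append(prev)
    return out


pixels = [10, 12, 15, 15, 9, 200]
indices, recon = encode(pixels, delta=2)
max_error = max(abs(x - y) for x, y in zip(pixels, recon))
```

Because the predictor uses reconstructed rather than original samples, the error does not accumulate; setting `delta=0` reduces the scheme to lossless coding.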
Computer-aided diagnosis will be an important feature of the next generation picture archiving and communication systems. In this paper, computer-aided detection of microcalcifications in mammograms using a nonlinear subband decomposition and outlier labeling is examined. The mammogram image is first decomposed into subimages using a nonlinear subband decomposition filter bank. A suitably identified subimage is divided into overlapping square regions in which skewness and kurtosis as measures of the asymmetry and impulsiveness of the distribution are estimated. A region with high positive skewness and kurtosis is marked as a region of interest. Finally, an outlier labeling method is used to find the locations of microcalcifications in these regions. Simulation studies are presented.
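The region-of-interest step can be sketched by computing sample skewness and excess kurtosis over overlapping square blocks and flagging blocks where both are high. The block size, step, thresholds, and function names below are hypothetical, and the nonlinear subband decomposition and outlier labeling stages are not shown.

```python
def skewness_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments; (0, 0) for constant data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return 0.0, 0.0
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def overlapping_blocks(image, size, step):
    """Yield ((row, col), pixel_list) for square blocks; blocks overlap when step < size."""
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - size + 1, step):
        for c in range(0, cols - size + 1, step):
            yield (r, c), [image[r + i][c + j] for i in range(size) for j in range(size)]


def regions_of_interest(image, size, step, skew_min, kurt_min):
    """Flag blocks whose skewness and excess kurtosis both exceed the given thresholds."""
    flagged = []
    for origin, pixels in overlapping_blocks(image, size, step):
        skew, kurt = skewness_kurtosis(pixels)
        if skew > skew_min and kurt > kurt_min:
            flagged.append(origin)
    return flagged


# Toy subimage: flat background with one bright, impulsive pixel.
subimage = [[0] * 6 for _ in range(6)]
subimage[1][1] = 10
rois = regions_of_interest(subimage, size=4, step=2, skew_min=1.0, kurt_min=1.0)
```

Only the block containing the bright pixel has a strongly asymmetric, heavy-tailed distribution, so it is the only one flagged.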
The transmission of digitized medical images over the existing telecommunications infrastructure presents a formidable challenge. To achieve delivery times on the order of seconds--as opposed to minutes or hours--for large-format high-resolution images, data compression on the order of 25:1 is necessary. This degree of data compression cannot be reached with lossless techniques. This paper reports on an adaptation of a standardized technique for lossy image compression (the JPEG approach), which provides high compression ratios for radiographic images with minimal apparent loss of diagnostic quality.
Filters with diamond shaped passbands and stopbands are used in image and video processing for several different tasks, one of which is the subband decomposition of signals. This subband decomposition can be carried out in a tree-structured manner by using diamond prefilters followed by downsampling on quincunx grids at each stage. In order to use such a decomposition in hierarchical coding it is desirable to choose the lowpass filter to be a halfband filter so as to limit the amount of aliasing in the low frequency component of the signal at each stage. This paper addresses the design and implementation of a two-channel filter bank for such an application. A special class of one-dimensional (1-D) prototype filters is used to derive the filter banks. The procedure is based on obtaining a halfband filter using a pair of lower order halfband filters. The resulting solutions preserve the exact reconstruction property even when the filter coefficients are quantized, a property which is useful in implementation. Examples of such filter banks are presented.