Significance: Automated understanding of human embryonic stem cell (hESC) videos is essential for the quantitative analysis and classification of the various states of hESCs and their health, with diverse applications in regenerative medicine.
Aim: This paper aims to develop an ensemble method based on bagging of deep learning classifiers for hESC classification on a video dataset collected with a phase-contrast microscope.
Approach: The paper describes a deep learning-based random network (RandNet) with an autoencoded feature extractor for the classification of hESCs into six different classes, namely, (1) cell clusters, (2) debris, (3) unattached cells, (4) attached cells, (5) dynamically blebbing cells, and (6) apoptotically blebbing cells. The approach uses unlabeled data to pre-train the autoencoder network and fine-tunes it using the available annotated data.
Results: The proposed approach achieves a classification accuracy of 97.23 ± 0.94% and outperforms the state-of-the-art methods. Additionally, the approach has a very low training cost compared with other deep-learning-based approaches, and it can be used as a tool for annotating new videos, saving many hours of manual labor.
Conclusions: RandNet is an efficient and effective method that uses a combination of subnetworks trained using both labeled and unlabeled data to classify hESC images.
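RandNet's actual subnetworks and autoencoded features are not reproduced here; as a minimal illustration of the bagging-and-majority-vote idea the abstract describes, the sketch below draws bootstrap samples for each hypothetical subnetwork and fuses their predicted class labels by voting. The class names and per-subnetwork predictions are illustrative assumptions, not values from the paper.

```python
import random
from collections import Counter

# The six hESC classes named in the abstract.
CLASSES = ["cell cluster", "debris", "unattached", "attached",
           "dynamically blebbing", "apoptotically blebbing"]

def bootstrap_sample(data, rng):
    """Draw a bootstrap sample (with replacement) to train one bagged subnetwork."""
    return [rng.choice(data) for _ in data]

def ensemble_predict(subnetwork_outputs):
    """Majority vote over the class labels predicted by each subnetwork."""
    return Counter(subnetwork_outputs).most_common(1)[0][0]

rng = random.Random(0)
training_patches = list(range(10))          # stand-in for labeled image patches
sample = bootstrap_sample(training_patches, rng)

# Hypothetical per-subnetwork predictions for one image patch:
preds = ["attached", "attached", "unattached", "attached", "debris"]
print(ensemble_predict(preds))  # attached
```

The vote aggregation is what gives bagging its variance reduction: individually noisy subnetworks disagree on hard patches, but the majority label is more stable.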
This paper explores the blending of model-based and deep learning approaches for target recognition in inverse synthetic aperture radar (ISAR) imagery. It evaluates five approaches: a model-based geometric hashing approach, a supervised deep learning approach, and three blending models that fuse the model-based and deep learning approaches. The model-based approach extracts scattering centers as features and requires domain experts to identify and characterize important features of a target, which makes the training process costly and hard to generalize when image quality degrades under low signal-to-interference-plus-noise-ratio conditions. Next, a deep learning algorithm using a convolutional neural network is considered to extract spatial features when raw ISAR data are used as input. This approach does not need an expert and requires only image labels for training. Finally, the model-based and deep learning approaches are blended at the feature level and at the decision level to benefit from the advantages of both, achieving higher performance. The results show that blending the two approaches achieves high performance while providing explainable inferences. The five approaches are evaluated under varying conditions of occlusion, clutter, masking of the target, and adversarial attacks. It is shown empirically that the model-based and deep learning approaches complement each other and achieve better classification accuracy when fused in an integrated approach.
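Decision-level fusion, one of the blending strategies the abstract mentions, can be sketched as a weighted average of per-class confidence scores from the two classifiers. The weight, class count, and score values below are illustrative assumptions, not the paper's fusion rule.

```python
def fuse_decisions(scores_model_based, scores_deep, weight=0.5):
    """Decision-level fusion: weighted average of per-class confidence scores
    from the model-based and deep-learning classifiers, then argmax.
    `weight` is a hypothetical mixing parameter."""
    assert len(scores_model_based) == len(scores_deep)
    fused = [weight * m + (1 - weight) * d
             for m, d in zip(scores_model_based, scores_deep)]
    return fused.index(max(fused))  # index of the winning target class

# Toy example: geometric hashing favors class 1, the CNN favors class 0;
# the fused scores [0.3, 0.525, 0.175] select class 1.
winner = fuse_decisions([0.2, 0.7, 0.1], [0.4, 0.35, 0.25])
print(winner)  # 1
```

Fusing at the decision level keeps the two pipelines independent, so the model-based branch can still supply an explainable inference for the fused answer.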
Video networks are an emerging interdisciplinary field with significant and exciting scientific and technological challenges. They hold great promise for solving many real-world problems and enabling a broad range of applications, including smart homes, video surveillance, environment and traffic monitoring, elderly care, intelligent environments, and entertainment in public and private spaces. This paper provides an overview of the design of a wireless video network as an experimental environment, covering camera selection, hand-off and control, and anomaly detection. It addresses challenging questions of individual identification using gait and face at a distance and presents new techniques and their comparison for robust identification.
Recognizing people at a distance is challenging for several reasons, including sensing, robust processing algorithms, changing environmental conditions, and the fusion of multiple modalities. This paper considers face, side face, gait, and ear, and their possible fusion for human recognition. It presents an overview of some of the techniques that we have developed for (a) super-resolution-based face recognition in video, (b) gait-based recognition in video, (c) fusion of super-resolved side face and gait in video, (d) ear recognition in color/range images, and (e) fusion performance prediction and validation. It presents various real-world examples to illustrate the ideas and points out the relative merits of the approaches discussed.
This paper focuses on a genetic-algorithm-based method that automates the construction of local-feature-based composite class models to capture the salient characteristics of configuration variants of vehicle targets in SAR imagery and increase the performance of SAR recognition systems. The recognition models are based on quasi-invariant local features: SAR scattering center locations and magnitudes. The approach uses an efficient SAR recognition system as an evaluation function to determine the fitness of class models. Experimental results are given on the fitness of the composite models and the similarity of both the original training model configurations and the synthesized composite models to the test configurations. In addition, results are presented to show SAR recognition performance on configuration variants of MSTAR vehicle targets.
This paper describes a novel evolutionary method for automatic induction of target recognition procedures from examples. The learning process starts with training data containing SAR images with labeled targets and consists of coevolving a population of feature extraction agents that cooperate to build an appropriate representation of the input image. Features extracted by a team of cooperating agents are used to induce a machine learning classifier that is responsible for making the final decision of recognizing a target in a SAR image. Each agent (individual) contains a feature extraction procedure encoded according to the principles of linear genetic programming (LGP). As in 'plain' genetic programming, in LGP an agent's genome encodes a program that is executed and tested on the set of training images during fitness calculation. The program is a sequence of calls to a library of parameterized operations, including, but not limited to, global and local image processing operations, elementary feature extraction, and logic and arithmetic operations. Particular calls operate on working variables that enable the program to store intermediate results and thereby design complex features. This paper contains a detailed description of the learning and recognition methodology outlined here. In the experimental part, we report and analyze the results obtained when testing the proposed approach for SAR target recognition using the MSTAR database.
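The LGP execution model described above (a genome as a sequence of library calls reading and writing working variables) can be sketched in a few lines. The operation library and register layout here are illustrative stand-ins, not the paper's actual operation set, which includes image-processing primitives.

```python
# Minimal linear-genetic-programming interpreter. A genome is a sequence of
# (op, dst, src1, src2) calls into a small operation library; calls read and
# write working registers, so later instructions can build complex features
# from intermediate results.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "thresh": lambda a, b: 1.0 if a > b else 0.0,  # toy stand-in for a filter
}

def run_genome(genome, inputs, n_registers=4):
    """Execute a genome on input values; registers are seeded with the inputs
    and zero-padded. By convention here, register 0 holds the final feature."""
    regs = (list(inputs) + [0.0] * n_registers)[:n_registers]
    for op, dst, src1, src2 in genome:
        regs[dst] = OPS[op](regs[src1], regs[src2])
    return regs[0]

# r2 = r0 * r1 = 6.0, then r0 = r2 + r1 = 9.0
genome = [("mul", 2, 0, 1), ("add", 0, 2, 1)]
print(run_genome(genome, [2.0, 3.0]))  # 9.0
```

During evolution, fitness would be computed by running each agent's genome over the training images and scoring the classifier induced from the resulting features.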
The focus of this paper is predicting bounds on the performance of a vote-based object recognition system when the test data features are distorted by uncertainty in both feature locations and magnitudes, by occlusion, and by clutter. An improved method is presented to calculate lower- and upper-bound predictions of the probability that objects with various levels of distorted features will be recognized correctly. The prediction method takes model similarity into account, so that when models of objects are more similar to each other, the probability of correct recognition is lower. The effectiveness of the prediction method is validated in a synthetic aperture radar (SAR) automatic target recognition (ATR) application using MSTAR public SAR data, which are obtained under different depression angles, object configurations, and object articulations. Experiments show the performance improvement that can be obtained by considering the feature magnitudes, compared to a previous performance prediction method that considered only the locations of features. In addition, the predicted performance is compared with the actual performance of a vote-based SAR recognition system using the same SAR scatterer location and magnitude features.
The focus of this paper is optimizing the recognition of vehicles in Synthetic Aperture Radar (SAR) imagery using multiple SAR recognizers at different look angles. The variance of SAR scattering center locations with target azimuth leads to recognition system results at different azimuths that are independent, even for small azimuth deltas. Extensive experimental recognition results are presented in terms of receiver operating characteristic (ROC) curves to show the effects of multiple look angles on recognition performance for MSTAR vehicle targets with configuration variants, articulation, and occlusion.
KEYWORDS: Synthetic aperture radar, Data modeling, Performance modeling, Information operations, Scattering, Systems modeling, Target recognition, Object recognition, Detection and tracking algorithms, Intelligence systems
The focus of this paper is optimizing recognition models for Synthetic Aperture Radar signatures of vehicles to improve the performance of a recognition algorithm under the extended operating conditions of target articulation, occlusion and configuration variants. The recognition models are based on quasi-invariant local features, scattering center locations and magnitudes. The approach determines the similarities and differences among the various vehicle models. Methods to penalize similar features or reward dissimilar features are used to increase the distinguishability of the recognition model instances. Extensive experimental recognition results are presented in terms of confusion matrices and receiver operating characteristic curves to show the improvements in recognition performance for MSTAR vehicle targets with articulation, configuration variants and occlusion.
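One way to picture the "penalize similar features" idea from the abstract is to down-weight each model scatterer that also appears (within the location-uncertainty tolerance) in other vehicle models, so distinctive scatterers dominate the match score. The `1/(1+shared)` weighting and the tolerance value below are illustrative assumptions, not the paper's optimization.

```python
def feature_weights(model_features, other_models, tol=1.0):
    """Weight each (x, y) scatterer of one model by how distinctive it is:
    scatterers shared with more of the other models get smaller weights.
    The weighting scheme here is illustrative."""
    weights = []
    for (fx, fy) in model_features:
        shared = sum(
            any(abs(fx - ox) <= tol and abs(fy - oy) <= tol
                for (ox, oy) in other)
            for other in other_models)
        weights.append(1.0 / (1.0 + shared))
    return weights

# Toy example: the scatterer at (0, 0) also occurs in another model and is
# penalized; the scatterer at (5, 5) is distinctive and keeps full weight.
print(feature_weights([(0, 0), (5, 5)], [[(0.5, 0)], [(9, 9)]]))  # [0.5, 1.0]
```

Votes weighted this way increase the separation between model instances that would otherwise be confused, which is the distinguishability improvement the abstract reports.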
Performance prediction of SAR ATR has been a challenging problem. In our previous work, we developed a statistical framework for predicting bounds on fundamental performance of vote-based SAR ATR using scattering centers. This framework considered data distortion factors such as uncertainty, occlusion and clutter, in addition to model similarity. In this paper, we present an initial study on learning the statistical distributions of these factors. We focus on the development of a method for learning the distribution of a parameter that encodes the combined effect of the occlusion and similarity factors on performance. The impact of incorporating such a distribution on the accuracy of the predicted bounds is demonstrated by comparing bounds obtained using it with those obtained assuming simplified distributions. The data used in the experiments are obtained from the MSTAR public domain under different configurations and depression angles.
This paper presents an approach for recognizing occluded vehicle targets in Synthetic Aperture Radar (SAR) images. Using quasi-invariant local features, SAR scattering center locations and magnitudes, a recognition algorithm is presented that successfully recognizes highly occluded versions of actual vehicles from the MSTAR public data. Extensive experimental results are presented to show the effect of occlusion on recognition performance in terms of Probability of Correct Identification, Receiver Operating Characteristic (ROC) curves and confusion matrices. The effect of occlusion on performance of this recognition algorithm is accurately predicted. Combined effects such as occlusion and measured positional noise, as well as occlusion and other observed extended operating conditions (e.g., articulation) are also addressed. Although excellent forced recognition results can be achieved at very high (70%) occlusion, practical limitations are found due to the similarity of unoccluded confuser vehicles to highly occluded targets.
The focus of this paper is recognizing articulated vehicles and actual vehicle configuration variants in real SAR images from the MSTAR public data. Using SAR scattering center locations and magnitudes as features, the invariance of these features is shown with articulation (i.e., turret rotation for the T72 tank and ZSU 23/4 gun), with configuration variants, and with a small change in depression angle. This scatterer location and magnitude quasi-invariance (e.g., location within one pixel, magnitude within about ten percent in radar cross-section) is used as a basis for development of a SAR recognition engine that successfully identified real articulated and non-standard configuration vehicles based on non-articulated, standard recognition models. Identification performance results are presented as vote space scatter plots and ROC curves for configuration variants, for articulated objects, and for a small change in depression angle with the MSTAR data.
Similarity between model targets plays a fundamental role in determining the performance of target recognition. We analyze the effect of model similarity on the performance of a vote-based approach for target recognition from SAR images. In such an approach, each model target is represented by a set of SAR views sampled at a variety of azimuth angles and a specific depression angle. Both model and data views are represented by locations of scattering centers, which are peak features. The model hypothesis (view of a specific target and associated location) corresponding to a given data view is chosen to be the one with the highest number of data-supported model features (votes). We address three issues in this paper. Firstly, we present a quantitative measure of the similarity between a pair of model views. Such a measure depends on the degree of structural overlap between the two views and the amount of uncertainty. Secondly, we describe a similarity-based framework for predicting an upper bound on recognition performance in the presence of uncertainty, occlusion, and clutter. Thirdly, we validate the proposed framework using MSTAR public data, which are obtained under different depression angles, configurations, and articulations.
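The vote-based matching used across these abstracts can be sketched concretely: a data scattering center casts a vote for a model view when it falls within the location-uncertainty bound of some model scatterer, and the hypothesis with the most votes wins. This sketch assumes pre-registered coordinates (it omits the search over relative translations) and uses a hypothetical one-pixel tolerance and toy scatterer sets.

```python
def votes_for_model(data_centers, model_centers, tol=1.0):
    """Count data scattering centers lying within `tol` pixels of some model
    scatterer (the location-uncertainty bound); each match is one vote."""
    count = 0
    for (dx, dy) in data_centers:
        if any(abs(dx - mx) <= tol and abs(dy - my) <= tol
               for (mx, my) in model_centers):
            count += 1
    return count

def recognize(data_centers, model_library, tol=1.0):
    """Pick the model hypothesis with the most data-supported features."""
    return max(model_library,
               key=lambda name: votes_for_model(data_centers,
                                                model_library[name], tol))

# Toy library of model views (scatterer locations are illustrative).
library = {"T72": [(0, 0), (2, 1), (5, 5)], "ZSU": [(9, 9), (8, 1)]}
data = [(0.5, 0.2), (2.0, 1.4), (9.5, 9.0)]
print(recognize(data, library))  # T72 (2 votes vs. 1)
```

Occlusion removes data-side matches and clutter adds spurious ones, which is why the papers' performance bounds are phrased in terms of vote statistics under those distortions.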
The focus of this paper is recognizing articulated objects and the pose of the articulated parts in SAR images. Using SAR scattering center locations as features, the invariance with articulation (i.e., turret rotation for the T72, T80, and M1a tanks, missile erect vs. down for the SCUD launcher) is shown as a function of object azimuth. Similar data are shown for configuration differences in the MSTAR (Public) Targets. The UCR model-based recognition engine (which uses non-articulated models to recognize articulated, occluded, and non-standard configuration objects) is described, and target identification performance results are given as confusion matrices and ROC curves for six-inch and one-foot resolution XPATCH images and the one-foot resolution MSTAR data. Separate body and turret models are developed that are independent of the relative positions between the body and the turret. These models are used with a subsequent matching technique to refine the pose of the body and determine the pose of the turret. An expression for the probability that a random match will occur is derived, and this function is used to set thresholds to minimize the probability of a random match for the recognition system. Results for identification, body pose, and turret pose are presented as a function of percent occlusion for articulated XPATCH data, and results are given for identification and body pose for articulated MSTAR data.
We present a novel method for modeling the performance of a vote-based approach for target classification in SAR imagery. In this approach, the geometric locations of the scattering centers are used to represent 2D model views of a 3D target for a specific sensor under a given viewing condition (azimuth, depression, and squint angles). Performance of such an approach is modeled in the presence of data uncertainty, occlusion, and clutter. The proposed method captures the structural similarity between model views, which plays an important role in determining the classification performance. In particular, performance would improve if the model views are dissimilar and vice versa. The method consists of the following steps. In the first step, given a bound on data uncertainty, model similarity is determined by finding feature correspondence in the space of relative translations between each pair of model views. In the second step, statistical analysis is carried out in the vote, occlusion, and clutter space in order to determine the probability of misclassifying each model view. In the third step, the misclassification probability is averaged over all model views to estimate the probability-of-correct-identification (PCI) plot as a function of occlusion and clutter rates. Validity of the method is demonstrated by comparing predicted PCI plots with ones that are obtained experimentally. Results are presented using both XPATCH and MSTAR SAR data.
The performance of a model-based automatic target recognition (ATR) engine with articulated and occluded objects in SAR imagery is characterized based on invariant properties of the objects. Using SAR scattering center locations as features, the invariance with articulation is shown as a function of object azimuth. The basic elements of our model-based recognition engine are described and performance results are given for various design parameters. The articulation invariant properties of the objects are used to characterize recognition engine performance, in terms of probability of correct identification as a function of percent invariance with articulation. Similar results are presented for object occlusion in the presence of noise, with percent unoccluded as the invariant measure. Finally, performance is characterized for occluded articulated objects as a function of number of features that are used. Results are presented using 4320 chips generated by XPATCH for 5 targets.
Building a hierarchical vision model of an object with multiple representations requires two steps: (1) decomposing the object into parts/subparts and obtaining appropriate representations, and (2) constructing relational links between decomposed parts/subparts obtained in step 1. In this paper, we describe volume-based decomposition and surface-based decomposition of 3-D objects into parts, where the objects are designed by a B-spline based geometric modeler called Alpha-1. Multiple-representation descriptions can be derived for each of these subparts using various techniques such as polygonal approximation, concave/convex edge detection, curvature extrema, and surface normals. For example, subparts of a hammer can be described by two generalized cylinders or one generalized cylinder and one polyhedron. Several examples are presented.