KEYWORDS: Video acceleration, Video, Video surveillance, Surveillance, Video processing, Cameras, Infrared cameras, Video coding, Neodymium, Target recognition
Digital video mosaicking from Unmanned Aircraft Systems (UAS) is being used for many military and
civilian applications, including surveillance, target recognition, border protection, forest fire monitoring,
traffic control on highways, and monitoring of transmission lines, among others. Additionally, NASA is using
digital video mosaicking to explore the moon and planets such as Mars. In order to compute a "good"
mosaic from video captured by a UAS, the algorithm must deal with motion blur, frame-to-frame jitter
associated with an imperfectly stabilized platform, perspective changes as the camera tilts in flight, as well
as a number of other factors. The most suitable algorithms use SIFT (Scale-Invariant Feature Transform) to
detect the features consistent between video frames. Utilizing these features, the next step is to estimate the
homography between two consecutive video frames, perform warping to properly register the image data,
and finally blend the video frames, resulting in a seamless video mosaic. All of this processing consumes a
great deal of CPU resources, so it is almost impossible to compute a real-time video mosaic
on a single processor. Modern graphics processing units (GPUs) offer computational performance that far
exceeds current CPU technology, allowing for real-time operation.
This paper presents the development of a GPU-accelerated digital video mosaicking implementation and
compares it with CPU performance. Our tests are based on two sets of real video captured by a small UAS
aircraft: one video from an Infrared (IR) camera and the other from an Electro-Optical (EO) camera. Our
results show that we can obtain a speed-up of more than 50 times using GPU technology, so real-time
operation at a video capture rate of 30 frames per second is feasible.
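To make the pipeline concrete, here is a minimal CPU-side sketch of the frame-to-frame registration step described above (SIFT detection, ratio-test matching, RANSAC homography estimation), written in Python with OpenCV. Function and variable names are ours, and the paper's CUDA implementation is not reproduced here; this only illustrates the algorithmic stages.

```python
# Minimal sketch of the SIFT -> homography stage of the mosaicking
# pipeline; CPU-only OpenCV, not the paper's GPU implementation.
import cv2
import numpy as np

def register_pair(prev_gray, curr_gray):
    """Estimate the homography that maps curr_gray onto prev_gray."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(prev_gray, None)
    kp2, des2 = sift.detectAndCompute(curr_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des2, des1, k=2)
    # Lowe's ratio test discards ambiguous feature matches
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]
    src = np.float32([kp2[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # RANSAC rejects outlier correspondences before fitting the homography
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H
```

The registered frame can then be composited into the mosaic with cv2.warpPerspective followed by blending; each of these stages is independently data-parallel, which is what makes the GPU port attractive.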
Free space laser communications provides wide bandwidth and high security capabilities to Unmanned Aircraft Systems
(UAS) in order to successfully accomplish Intelligence, Surveillance, Target Acquisition, and Reconnaissance (ISTAR)
missions. A practical implementation of a laser-based video communications payload flown by a small UAS aircraft is
described as a proof-of-concept. The two-axis gimbal pointing control algorithm calculates the line-of-sight vector in
real-time by using differential GPS and IMU information gathered from the UAS vehicle's autopilot so that the laser
transmitter in the airborne payload can accurately track a ground-based photodiode array receiver with a known GPS
location. One of the future goals of the project is to move from UAS-to-ground communications to UAS-to-UAS free
space laser communications.
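As a sketch of the pointing geometry (not the paper's flight code), the line-of-sight vector can be computed by converting both GPS fixes to ECEF coordinates and rotating the difference into the aircraft's local East-North-Up frame; gimbal azimuth and elevation follow directly. The WGS-84 constants are standard, while the interface below is illustrative, and attitude compensation from the IMU is omitted for brevity.

```python
# Hedged sketch of the line-of-sight computation for the gimbal
# pointing controller. lla tuples are (latitude_deg, longitude_deg,
# height_m); the IMU attitude correction is omitted.
import numpy as np

A = 6378137.0            # WGS-84 semi-major axis [m]
E2 = 6.69437999014e-3    # WGS-84 first eccentricity squared

def geodetic_to_ecef(lat, lon, h):
    lat, lon = np.radians(lat), np.radians(lon)
    n = A / np.sqrt(1 - E2 * np.sin(lat) ** 2)
    return np.array([(n + h) * np.cos(lat) * np.cos(lon),
                     (n + h) * np.cos(lat) * np.sin(lon),
                     (n * (1 - E2) + h) * np.sin(lat)])

def gimbal_angles(uav_lla, rx_lla):
    """Azimuth/elevation (deg) from the UAS to the ground receiver."""
    d = geodetic_to_ecef(*rx_lla) - geodetic_to_ecef(*uav_lla)
    lat, lon = np.radians(uav_lla[0]), np.radians(uav_lla[1])
    # Rotate the ECEF difference vector into the local East-North-Up frame
    enu = np.array([
        [-np.sin(lon),              np.cos(lon),             0.0],
        [-np.sin(lat)*np.cos(lon), -np.sin(lat)*np.sin(lon), np.cos(lat)],
        [ np.cos(lat)*np.cos(lon),  np.cos(lat)*np.sin(lon), np.sin(lat)],
    ]) @ d
    az = np.degrees(np.arctan2(enu[0], enu[1]))            # clockwise from north
    el = np.degrees(np.arctan2(enu[2], np.hypot(enu[0], enu[1])))
    return az, el
```

In the real system, the roll/pitch/yaw reported by the autopilot's IMU would be applied as an additional rotation before commanding the two-axis gimbal.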
Image de-noising is a widely-used technology in modern real-world surveillance systems. Methods can
seldom do both de-noising and texture preservation very well without a direct knowledge of the noise model.
Most of the neighborhood fusion-based de-noising methods tend to over-smooth the images, which causes
a significant loss of detail. Recently, a new non-local means method has been developed, which is based
on the similarities among the different pixels. This technique results in good preservation of the textures;
however, it also causes some artifacts. In this paper, we utilize the scale-invariant feature transform (SIFT)
[1] method to find the corresponding region between different images, and then reconstruct the de-noised
images by a weighted sum of these corresponding regions. Both hard and soft criteria are chosen in order to
minimize the artifacts. Experiments applied to real unmanned aerial vehicle thermal infrared surveillance
video show that our method is superior to popular methods in the literature.
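A non-authoritative sketch of the fusion step: once SIFT matching has located corresponding regions, each patch is reconstructed as a similarity-weighted sum of its matches, non-local-means style. The paper's hard and soft criteria are reduced here to a rejection threshold and an exponential weight; both constants and all names are illustrative.

```python
# Sketch of weighted fusion over SIFT-matched regions. The hard
# criterion rejects dissimilar matches outright; the soft criterion
# down-weights the rest by photometric distance (NLM-style).
import numpy as np

def weighted_patch_fusion(patch, matched_patches, h=10.0, hard_thresh=30.0):
    """Denoise `patch` as a similarity-weighted sum of matched regions."""
    ref = patch.astype(np.float64)
    acc, wsum = ref.copy(), 1.0
    for q in matched_patches:
        q = q.astype(np.float64)
        d2 = np.mean((ref - q) ** 2)
        if np.sqrt(d2) > hard_thresh:      # hard criterion: reject outliers
            continue
        w = np.exp(-d2 / (h * h))          # soft criterion: similarity weight
        acc += w * q
        wsum += w
    return (acc / wsum).astype(patch.dtype)
```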
In traditional super-resolution methods, researchers generally assume that accurate
subpixel image registration parameters are given a priori. In reality, accurate image registration on
a subpixel grid is the single most critically important step for the accuracy of super-resolution
image reconstruction. In this paper, we introduce affine invariant features to improve subpixel
image registration, which considerably reduces the number of mismatched points and hence makes
traditional image registration more efficient and more accurate for super-resolution video
enhancement. Affine invariant features are invariant to affine transformations, including scale,
rotation, and translation. They are extracted from the second moment matrix through the
integration and differentiation covariance matrices. The experimental results show that affine
invariant interest points are more robust to perspective distortion and present more accurate
matching than traditional Harris/SIFT corners. In our experiments, all matching affine invariant
interest points are found correctly. In addition, for the same super-resolution problem, we can use
far fewer affine invariant points than Harris/SIFT corners to obtain good super-resolution
results.
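For concreteness, here is a sketch of the second moment matrix the abstract refers to: image gradients computed at a differentiation scale, with their outer products smoothed at an integration scale. The full affine adaptation loop (e.g., Mikolajczyk-Schmid) is omitted; scale values are illustrative.

```python
# Second moment (autocorrelation) matrix at differentiation scale
# sigma_d and integration scale sigma_i; points where both eigenvalues
# are large are corner-like.
import numpy as np
from scipy.ndimage import gaussian_filter

def second_moment_matrix(img, sigma_d=1.0, sigma_i=2.0):
    img = img.astype(np.float64)
    ix = gaussian_filter(img, sigma_d, order=(0, 1))  # derivative along x
    iy = gaussian_filter(img, sigma_d, order=(1, 0))  # derivative along y
    # Integration-scale smoothing of the gradient outer products
    mxx = gaussian_filter(ix * ix, sigma_i)
    mxy = gaussian_filter(ix * iy, sigma_i)
    myy = gaussian_filter(iy * iy, sigma_i)
    return mxx, mxy, myy

def harris_response(img, k=0.04):
    mxx, mxy, myy = second_moment_matrix(img)
    det = mxx * myy - mxy ** 2
    trace = mxx + myy
    return det - k * trace ** 2   # large response -> corner candidate
```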
The concept surrounding super-resolution image reconstruction is to recover a highly-resolved
image from a series of low-resolution images via between-frame subpixel image
registration. In this paper, we propose a novel and efficient super-resolution algorithm, and then
apply it to the reconstruction of real video data captured by a small Unmanned Aircraft System
(UAS). Small UAS aircraft generally have a wingspan of less than four meters, so that these vehicles
and their payloads can be buffeted by even light winds, resulting in potentially unstable video. This
algorithm is based on a coarse-to-fine strategy, in which a coarsely super-resolved image sequence is
first built from the original video data by image registration and bi-cubic interpolation between a
fixed reference frame and every additional frame. It is well known that the median filter is robust to
outliers. If we calculate pixel-wise medians in the coarsely super-resolved image sequence, we can
restore a refined super-resolved image. The primary advantage is that this is a noniterative algorithm,
unlike traditional approaches based on highly-computational iterative algorithms. Experimental
results show that our coarse-to-fine super-resolution algorithm is not only robust, but also very
efficient. In comparison with five well-known super-resolution algorithms, namely the robust super-resolution
algorithm, bi-cubic interpolation, projection onto convex sets (POCS), the Papoulis-Gerchberg algorithm, and the iterated back projection algorithm, our proposed algorithm gives both
strong efficiency and robustness, as well as good visual performance. This is particularly useful for
the application of super-resolution to UAS surveillance video, where real-time processing is highly
desired.
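A minimal sketch of the coarse-to-fine procedure follows, assuming single-channel 8-bit frames and translation-only jitter; OpenCV's ECC alignment stands in for the paper's registration step. Each frame is aligned to the reference, bicubically upsampled into a coarse stack, and the pixel-wise median yields the refined still.

```python
# Coarse-to-fine SR sketch: register -> bicubic upsample -> pixel-wise
# median. ECC translation alignment is a stand-in for the paper's
# registration; `frames[0]` is the fixed reference frame.
import cv2
import numpy as np

def coarse_to_fine_sr(frames, scale=2):
    ref = frames[0]
    criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 50, 1e-6)
    stack = []
    for f in frames:
        warp = np.eye(2, 3, dtype=np.float32)
        _, warp = cv2.findTransformECC(ref, f, warp, cv2.MOTION_TRANSLATION,
                                       criteria, None, 5)
        aligned = cv2.warpAffine(f, warp, (ref.shape[1], ref.shape[0]),
                                 flags=cv2.INTER_CUBIC | cv2.WARP_INVERSE_MAP)
        # Bicubic upsampling builds the coarsely super-resolved stack
        stack.append(cv2.resize(aligned, None, fx=scale, fy=scale,
                                interpolation=cv2.INTER_CUBIC))
    # The pixel-wise median is robust to registration outliers
    return np.median(np.stack(stack), axis=0).astype(frames[0].dtype)
```

Because no iterative optimization is involved, the cost is one registration and one interpolation per frame plus a median, which is what makes the approach attractive for real-time UAS video.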
In traditional super-resolution methods, researchers generally assume that accurate subpixel image registration parameters are given a priori. In reality, accurate image registration on a subpixel grid is the single most critically important step for the accuracy of super-resolution image reconstruction. In this paper, we introduce affine invariant features to improve subpixel image registration, which considerably reduces the number of mismatched points and hence makes traditional image registration more efficient and more accurate for super-resolution video enhancement. Affine invariant interest points include those corners that are invariant to affine transformations, including scale, rotation, and translation. They are extracted from the second moment matrix through the integration and differentiation covariance matrices. Our tests are based on two sets of real video captured by a small Unmanned Aircraft System (UAS) aircraft, which is highly susceptible to vibration from even light winds. The experimental results from real UAS surveillance video show that affine invariant interest points are more robust to perspective distortion and present more accurate matching than traditional Harris/SIFT corners. In our experiments on real video, all matching affine invariant interest points are found correctly. In addition, for the same super-resolution problem, we can use many fewer affine invariant points than Harris/SIFT corners to obtain good super-resolution results.
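Once affine invariant interest points are matched, the registration parameters follow from an ordinary least squares fit. Below is a sketch under the assumption of a six-parameter affine model, with the subpixel shift appearing in the translation terms; array shapes and names are ours.

```python
# Least-squares affine registration from matched interest points:
# solve dst ~= src @ A.T + t for A (2x2) and t (2,). src_pts and
# dst_pts are (N, 2) arrays of matched coordinates.
import numpy as np

def affine_from_matches(src_pts, dst_pts):
    n = len(src_pts)
    m = np.zeros((2 * n, 6))
    m[0::2, 0:2] = src_pts   # rows for dst_x: a00, a01, tx
    m[0::2, 4] = 1.0
    m[1::2, 2:4] = src_pts   # rows for dst_y: a10, a11, ty
    m[1::2, 5] = 1.0
    b = dst_pts.reshape(-1)
    p, *_ = np.linalg.lstsq(m, b, rcond=None)
    a = np.array([[p[0], p[1]], [p[2], p[3]]])
    t = np.array([p[4], p[5]])   # subpixel translation component
    return a, t
```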
Polymerase chain reaction (PCR) and gel electrophoresis are two widely used techniques for genetic studies that require
the bench scientist to perform many tedious manual steps. Advances in automation are making these techniques more
accessible, but detection and image analysis still remain labor-intensive. Although several commercial software
packages are now available, DNA image analysis still requires some intervention by the user, and thus a certain level of
image processing expertise. To allow researchers to speed up their analyses and obtain more repeatable results, we
present a fully automated image analysis system for DNA or protein studies with high accuracy. The proposed system is
based mainly on four steps: automatic thresholding, shifting, filtering, and processing. The automatic thresholding that
is used to equalize the gray values of the gel electrophoresis image background is one of the key and novel operations
in this algorithm. An enhancement is also used to improve poor quality images that have faint DNA bands.
Experimental results show that the proposed method eliminates defects due to noise for good and average quality gel
electrophoresis images, while it also improves the appearance of poor quality images.
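As an illustration of the background-equalization idea, using Otsu's method as a stand-in for the paper's automatic thresholding, the uneven background of an 8-bit gel image can be flattened to a single gray level so that faint bands become easier to detect:

```python
# Hedged sketch: classify background pixels with an automatic (Otsu)
# threshold, then flatten the background to its mean gray value.
# Assumes bright bands on a darker background in a uint8 image.
import cv2
import numpy as np

def equalize_background(img):
    thresh, _ = cv2.threshold(img, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    background = img < thresh
    out = img.copy()
    # Replace the uneven background with one uniform gray level
    out[background] = int(np.mean(img[background]))
    return out, thresh
```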
A subpixel-resolution image registration algorithm based on the nonlinear projective transformation model is proposed to account for camera translation, rotation, zoom, pan, and tilt. Typically, parameter estimation techniques for rigid-body transformation require the user to manually select feature point pairs between the images undergoing registration. In this research, the block matching algorithm is used to automatically select correlated feature point pairs between two images; these features are then used to calculate an iterative least squares estimate of the nonlinear projective transformation parameters. Since block matching is only capable of estimating accurate displacement vectors in image regions containing a large number of edges, inaccurate feature point pairs are statistically eliminated prior to computing the least squares parameter estimate. Convergence of the registration algorithm is generally achieved in several iterations. Simulations show that the algorithm estimates accurate integer- and subpixel-resolution registration parameters for similar sensor data sets such as intensity image sequence frames, as well as for dissimilar sensor images such as multimodality slices from the Visible Human Project. Through subpixel-resolution registration, integrating the registered pixels from a short sequence of low-resolution video frames generates a high-resolution video still. Experimental results are also shown in which dissimilar data registration followed by vector quantization is used to segment tissues from multimodality Visible Human Project image slices.
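A sketch of the automatic correspondence step, assuming 8-bit grayscale images: block matching with normalized cross-correlation picks, for each sufficiently edge-rich template block in one image, the best match within a search window in the other, producing feature point pairs without user interaction. The block size, search radius, and gradient-energy gate below are illustrative.

```python
# Block matching for automatic feature point pairs. Flat (edge-poor)
# blocks are skipped, echoing the observation that block matching is
# only reliable in regions with many edges.
import cv2
import numpy as np

def block_match_pairs(img_a, img_b, block=32, search=8, min_grad=1e3):
    pairs = []
    h, w = img_a.shape
    for y in range(search, h - block - search, block):
        for x in range(search, w - block - search, block):
            tpl = img_a[y:y+block, x:x+block]
            gy, gx = np.gradient(tpl.astype(np.float64))
            if np.sum(gx**2 + gy**2) < min_grad:   # too flat: skip block
                continue
            win = img_b[y-search:y+block+search, x-search:x+block+search]
            res = cv2.matchTemplate(win, tpl, cv2.TM_CCOEFF_NORMED)
            _, _, _, (dx, dy) = cv2.minMaxLoc(res)  # best-match location
            pairs.append(((x + block/2, y + block/2),
                          (x - search + dx + block/2,
                           y - search + dy + block/2)))
    return pairs
```

The resulting pairs would then feed the iterative least squares estimate of the projective parameters, after statistical elimination of inaccurate pairs.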
Super-resolution enhancement algorithms are used to estimate a high-resolution video still (HRVS) from several low-resolution frames, provided that objects within the image sequence move with subpixel increments. However, estimating an accurate subpixel-resolution motion field between two low-resolution, noisy video frames has proven to be a formidable challenge. Up-sampling the image sequence frames followed by the application of block matching, optical flow estimation, or Bayesian motion estimation results in relatively poor subpixel-resolution motion fields, and consequently inaccurate regions within the super-resolution enhanced video still. This is particularly true for large interpolation factors (greater than or equal to 4). To improve the quality of the subpixel motion fields and the corresponding HRVS, motion can be estimated for each object within a segmented image sequence. First, a reference video frame is segmented into its constituent objects, and a mask is generated for each object which describes its spatial location. As described previously, subpixel-resolution motion estimation is then conducted by video frame up-sampling followed by the application of a motion estimation algorithm. Finally, the motion vectors are averaged over the region of each mask by applying an alpha-trimmed mean filter to the horizontal and vertical components separately. Since each object moves as a single entity, averaging eliminates many of the motion estimation errors and results in much more consistent subpixel motion fields. A substantial improvement is also visible within particular regions of the HRVS estimates. Subpixel-resolution motion fields and HRVS estimates are computed for interpolation factors of 2, 4, 8, and 16, to examine the benefits of object segmentation and motion field averaging.
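The per-object averaging step admits a compact sketch: within each object mask, replace the horizontal and vertical motion components with their alpha-trimmed means (sort the samples, drop the extreme fraction alpha at each end, and average the rest). Names and the default alpha are ours.

```python
# Alpha-trimmed mean smoothing of a motion field over an object mask,
# applied to the horizontal (u) and vertical (v) components separately.
import numpy as np

def alpha_trimmed_mean(values, alpha=0.1):
    v = np.sort(np.asarray(values).ravel())
    k = int(alpha * len(v))            # number trimmed from each end
    return np.mean(v[k:len(v) - k]) if len(v) > 2 * k else np.mean(v)

def smooth_object_motion(u, v, mask, alpha=0.1):
    """Since the object moves as one entity, one vector serves the mask."""
    u_s, v_s = u.copy(), v.copy()
    u_s[mask] = alpha_trimmed_mean(u[mask], alpha)
    v_s[mask] = alpha_trimmed_mean(v[mask], alpha)
    return u_s, v_s
```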
Multiframe resolution enhancement algorithms are used to estimate a high-resolution video still (HRVS) from several low-resolution frames, provided that objects within the image sequence move with subpixel increments. A Bayesian multiframe enhancement algorithm is presented to compute an HRVS using the spatial information present within each frame as well as the temporal information present due to object motion between frames. However, the required subpixel-resolution motion vectors must be estimated from low-resolution and noisy video frames, resulting in an inaccurate motion field which can adversely impact the quality of the enhanced image. Several subpixel motion estimation techniques are incorporated into the Bayesian multiframe enhancement algorithm to determine their efficacy in the presence of global data transformations between frames and independent object motion. Visual and quantitative comparisons of the resulting high-resolution video stills computed from two video frames and the corresponding estimated motion fields show that the eight-parameter projective motion model is appropriate for global scene changes, while block matching and Horn-Schunck optical flow estimation each have their own advantages and disadvantages when used to estimate independent object motion.
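Of the motion estimators compared, Horn-Schunck is the easiest to sketch. Below is a minimal implementation of the classic 1981 iteration, in which the flow field relaxes toward its local average subject to the brightness-constancy constraint; lam is the smoothness weight, and all constants are illustrative.

```python
# Minimal Horn-Schunck optical flow between two grayscale frames.
import numpy as np
from scipy.ndimage import convolve

AVG = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]], dtype=np.float64) / 12.0

def horn_schunck(f1, f2, lam=1.0, iters=100):
    f1, f2 = f1.astype(np.float64), f2.astype(np.float64)
    kx = np.array([[-1.0, 1.0], [-1.0, 1.0]]) * 0.25
    ky = np.array([[-1.0, -1.0], [1.0, 1.0]]) * 0.25
    fx = convolve(f1, kx) + convolve(f2, kx)      # spatial derivatives
    fy = convolve(f1, ky) + convolve(f2, ky)
    ft = convolve(f2 - f1, np.ones((2, 2)) * 0.25)  # temporal derivative
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    for _ in range(iters):
        u_bar, v_bar = convolve(u, AVG), convolve(v, AVG)
        # Jacobi update: pull toward the neighborhood average while
        # satisfying fx*u + fy*v + ft = 0 (brightness constancy)
        num = fx * u_bar + fy * v_bar + ft
        den = lam ** 2 + fx ** 2 + fy ** 2
        u = u_bar - fx * num / den
        v = v_bar - fy * num / den
    return u, v
```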
KEYWORDS: Video, Motion estimation, Motion models, Data conversion, Image enhancement, Cameras, Data modeling, Signal to noise ratio, Resolution enhancement technologies, Data analysis
When an interlaced image sequence is viewed at the rate of sixty frames per second, the human visual system interpolates the data so that the missing fields are not noticeable. However, if frames are viewed individually, interlacing artifacts are quite prominent. This paper addresses the problem of deinterlacing image sequences for the purposes of analyzing video stills and generating high-resolution hardcopy of individual frames. Multiple interlaced frames are temporally integrated to estimate a single progressively-scanned still image, with motion compensation used between frames. A video observation model is defined which incorporates temporal information via estimated interframe motion vectors. The resulting ill-posed inverse problem is regularized through Bayesian maximum a posteriori (MAP) estimation, utilizing a discontinuity-preserving prior model for the spatial data. Progressively-scanned estimates computed from interlaced image sequences are shown at several spatial interpolation factors, since the multiframe Bayesian scan conversion algorithm is capable of simultaneously deinterlacing the data and enhancing spatial resolution. Problems encountered in the estimation of motion vectors from interlaced frames are addressed.
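In standard notation (ours, not necessarily the paper's), a MAP estimate of this form can be written as below, where y_k are the observed interlaced fields, M_k applies the estimated interframe motion, A is the field subsampling operator, d_c are local finite-difference operators, and rho is a discontinuity-preserving penalty such as the Huber function:

```latex
\hat{\mathbf{z}} \;=\; \arg\min_{\mathbf{z}}
\left\{ \sum_{k=1}^{K} \bigl\lVert \mathbf{y}_k - \mathbf{A}\,\mathbf{M}_k\,\mathbf{z} \bigr\rVert^2
\;+\; \lambda \sum_{c} \rho\!\left( \mathbf{d}_c^{\mathsf{T}} \mathbf{z} \right) \right\}
```

The data term ties the progressively-scanned estimate to every observed field through motion compensation and subsampling, while the prior term smooths the result without penalizing genuine edges.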
An image sequence containing a missile in flight often contains a strong plume signature expelled by the missile along with a weak signal corresponding to the missile hardbody. Enhancement of the missile signal is necessary to accurately track the trajectory throughout the image sequence. A parametric motion estimation algorithm is proposed for the passive stabilization of the data set. By stabilizing the image sequence with respect to the missile vacuum core, a registered sequence is produced with the missile located in the same position within each frame. Missile contrast can then be enhanced by applying a novel technique known as product correlation to the stabilized data. Product correlation is presented in the context of higher order statistics, and it offers a computationally efficient means of obtaining sample moments of high orders collectively. Simulations with missile sequences acquired in the IR and visible portions of the spectrum show that image stabilization followed by product correlation successfully enhances missile signal contrast, particularly in image sequences characterized by low SNRs.
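As a heavily hedged illustration only: one reading of product correlation consistent with the description above is a pixel-wise product across the stabilized frames, which aggregates high-order sample moments collectively and suppresses uncorrelated noise relative to the persistent missile signal. The paper should be consulted for the exact definition; normalization choices here are ours.

```python
# Illustrative pixel-wise product fusion over registered frames; this
# is one plausible reading of product correlation, not a verbatim
# reproduction of the paper's operator.
import numpy as np

def product_correlation(stabilized_frames):
    stack = np.stack([f.astype(np.float64) for f in stabilized_frames])
    stack /= stack.max()                  # normalize intensities to [0, 1]
    prod = np.prod(stack, axis=0)         # pixel-wise product over frames
    return prod / max(prod.max(), 1e-12)  # rescale for display
```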
With the advent of High Definition Television, it will become desirable to convert existing video sequence data into higher-resolution formats. This conversion process already occurs within the human visual system to some extent, since the perceived spatial resolution of a sequence appears much higher than the actual spatial resolution of an individual frame. This paper addresses how to utilize both the spatial and temporal information present in a sequence in order to generate high-resolution video. A novel observation model based on motion compensated subsampling is proposed for a video sequence. Since the reconstruction problem is ill-posed, Bayesian restoration with a discontinuity-preserving prior image model is used to extract high-resolution frames from the observed data. Estimated high-resolution image sequences will be shown, with dramatic improvements provided over various single-frame interpolation methods.
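The observation model lends itself to a few lines of code: each low-resolution frame is modeled as the high-resolution image warped by that frame's motion, blurred by the sensor response, and decimated. The affine warp and Gaussian blur below are illustrative stand-ins; the Bayesian restoration inverts this forward model.

```python
# Sketch of a motion-compensated subsampling observation model:
# low-res frame = subsample(blur(warp(high-res image))).
import cv2
import numpy as np

def observe(hr_image, warp_matrix, factor=2, sigma=1.0):
    """Generate one simulated low-resolution frame from hr_image."""
    h, w = hr_image.shape
    # Motion compensation: warp the high-resolution scene for this frame
    warped = cv2.warpAffine(hr_image, warp_matrix, (w, h))
    # Sensor blur, then decimation by the resolution factor
    blurred = cv2.GaussianBlur(warped, (0, 0), sigma)
    return blurred[::factor, ::factor]
```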
Missile image sequences often contain a strong plume signature expelled by the missile, with a weak signal corresponding to the missile hardbody. Enhancement of the missile signal in low signal-to-noise ratio environments is necessary to accurately track the trajectory throughout the sequence. By stabilizing the image sequence a registered data set is produced in which the missile hardbody can be found in the same location within each frame. A statistical method known as product correlation (PC) may be applied to the stabilized data, enhancing missile contrast with respect to the background. An algorithm for the passive stabilization of video sequences consisting of missile imagery is described. PC is presented in the context of higher order statistics and then applied to stabilized video sequences to enhance the missile hardbody signal within the data.