In 2023, Richards and Hübner proposed the silux as a new standard unit of irradiance over the full 350-1100 nm band, specifically addressing the mismatch between the photopic response of the human eye and the spectral sensitivity of new low-light silicon CMOS sensors with enhanced NIR response. When the traditional lux unit is used, this spectral mismatch can lead to significant errors in measuring the magnitude of the signal available to a given camera system. In this correspondence, we demonstrate a per-pixel calibration of a camera to create the first imaging siluxmeter. To do this, we developed a comprehensive per-pixel model as well as the experimental and data reduction methods to estimate its parameters. These parameters are then combined into an updated NV-IPM measured-system component that provides the conversion factor from device units of DN to silux, lux, and other radiometric units. Additionally, the accuracy of the measurements and modeling is assessed through comparisons to field observations and by validating and transferring the calibration from one low-light camera to another. Following this process, other low-light cameras can be calibrated so that the scenes they image may be accurately characterized using silux as the standard unit.
The Jones detectivity metric, denoted D*, is commonly used to compare thermal camera focal plane arrays. D* projects the thermal noise back into time (1 second) and area (1 cm^2), thereby normalizing its bandwidth. This makes it easier to compare the sensitivity of different thermal detectors. Here we extend the basic idea of this bandwidth normalization to low-light cameras by using a signal-to-noise ratio (SNR), denoted SNR_D*. One goal of SNR_D* is to compare the performance of low-light sensors in the darkest of conditions, and therefore a dark version is defined using the absolute noise floor of the camera. The signal and noise are normalized by projecting them back through the optics to an angular space in the scene. It is argued that projecting the SNR back to the scene makes the metric capable of comparing complete low-light camera systems, including the lens. We also explore the SNR defined and specified for image intensifier tubes, and show why it is not a good predictor of the performance of low-light cameras.
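As an illustration of the classic Jones normalization this abstract builds on, a minimal sketch; all detector values below are hypothetical placeholders, not measurements:

```python
import numpy as np

# Classic Jones normalization: D* = sqrt(A_d * delta_f) / NEP removes
# detector area and noise bandwidth from the sensitivity comparison.
# All values are hypothetical, for illustration only.
area_cm2 = (15e-4) ** 2      # 15 um pixel pitch expressed in cm^2
delta_f_hz = 30.0 / 2.0      # noise bandwidth ~ frame rate / 2
nep_w = 2e-14                # noise-equivalent power in watts
d_star = np.sqrt(area_cm2 * delta_f_hz) / nep_w   # cm*sqrt(Hz)/W ("Jones")
print(f"D* = {d_star:.3e} Jones")
```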
Traditional helmet-mounted devices (HMDs), such as Night Vision Goggles, are direct-view systems where parallax, or image offset, is only along the line of sight, and the impact on user performance is minimal. As HMDs transition to adding digital cameras while maintaining direct-view capabilities, the sensor must be placed outside of the user's line of sight. These offsets create more significant parallax and can greatly impact a user's ability to navigate and to interact with objects at close distances. Parallax error can be easily corrected for a fixed distance to an object, but the error progressively increases when viewing objects that are closer or farther than the selected distance. More complicated methods can be employed, such as ground plane re-projection or correction based on a depth sensor, but those methods each have their own disadvantages. Factors such as alignment accuracy across the field of view and nauseating effects must also be considered. This paper describes the development of an image simulation representing parallax error in a virtual reality headset, with the ability to apply different correction techniques with varying parameters. This simulation was used with a group of observers who were asked to move around a scene and qualitatively evaluate the effectiveness of each correction method with different combinations of sensors. Questions focused on their ability to complete certain tasks and their subjective experiences while using each method. Results from this evaluation are presented and recommendations are made for optimal settings and future studies.
Sensitivity of a camera is most often measured by recording video segments while viewing a scene that is constant in space and time. This video, commonly referred to as a noise cube, provides information about how much the signals vary away from the average. In this work, we describe the systematic decomposition of noise cubes into components. First, the average of a noise cube (when combined with other cube measurements) is used to determine the camera's Signal Transfer Function (SiTF). Removing the average results in a cube that exhibits variations in both spatial and temporal directions. These variations also occur at different scales (spatial/temporal frequencies); therefore, we propose applying a 3-dimensional filter to separate fast and slow variation. Slowly varying temporal variation can indicate an artifact in the measurement, the camera signal, or the camera's response to the measurement. Slowly varying spatial variation can be considered non-uniformity, and conventional metrics applied. Fast-varying spatial/temporal noise is combined and evaluated through the conventional 3D noise model (providing seven independent noise measurements). In support of the reproducible research effort, the functions associated with this work can be found on the Mathworks file exchange.
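A minimal sketch of the conventional 3D noise decomposition referenced above, assuming a (time, rows, cols) cube and the standard seven-component model; the function name is illustrative and the published Mathworks functions may differ in detail:

```python
import numpy as np

def noise_3d(cube):
    """Split a (time, rows, cols) cube into the seven classic 3D-noise
    components (sigma: t, v, h, tv, th, vh, tvh), returned as RMS values."""
    m = cube.mean()
    Dt = cube.mean(axis=(1, 2)) - m                       # temporal
    Dv = cube.mean(axis=(0, 2)) - m                       # row fixed pattern
    Dh = cube.mean(axis=(0, 1)) - m                       # column fixed pattern
    Dtv = cube.mean(axis=2) - Dt[:, None] - Dv[None, :] - m
    Dth = cube.mean(axis=1) - Dt[:, None] - Dh[None, :] - m
    Dvh = cube.mean(axis=0) - Dv[:, None] - Dh[None, :] - m
    resid = (cube - m
             - Dt[:, None, None] - Dv[None, :, None] - Dh[None, None, :]
             - Dtv[:, :, None] - Dth[:, None, :] - Dvh[None, :, :])
    return [float(np.sqrt((c ** 2).mean()))
            for c in (Dt, Dv, Dh, Dtv, Dth, Dvh, resid)]

# Hypothetical usage on a synthetic white-noise cube:
cube = np.random.default_rng(0).normal(size=(32, 240, 320))
print(noise_3d(cube))
```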
Atmospheric turbulence can cause significant image quality degradation in long-range, ground-to-ground imagery. There is recent interest in characterizing the performance of machine learning algorithms for long-range imaging applications. However, such a task requires a database of realistic turbulence-degraded imagery. Modeling and simulation provides a reliable, repeatable means of generating long-range data at a substantial cost savings compared to live field collections. We present updates to the Night Vision Electronic Sensors Directorate (NVESD) turbulence simulation algorithm, which simulates the effect of turbulence on imagery by imposing realistic blur and distortion on pristine input imagery for a given range, turbulence condition, and set of optical parameters. Key improvements to the model are: (1) the incorporation of the exact short-exposure atmospheric modulation transfer function into the blurring routine; (2) a random walk algorithm that generates blur and distortion statistics on the fly at the characteristic frequency of turbulence degradations. The algorithm is fast and lightweight, computationally speaking, so as to be scalable to high-performance computing. We perform a qualitative assessment of the results with real field imagery, as well as a quantitative comparison using the structural similarity metric (SSIM).
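A toy sketch of the blur-plus-distortion idea (not the NVESD algorithm itself): an AR(1) update stands in for the random-walk statistics at a characteristic frequency, and a Gaussian blur stands in for the short-exposure atmospheric MTF. All function names and parameters are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def smooth_field(rng, shape, sigma_px, smooth=8):
    # Spatially correlated shift field, renormalized after smoothing
    f = gaussian_filter(rng.normal(size=shape), smooth)
    return f * (sigma_px / f.std())

def turbulent_frames(img, n_frames, sigma_px=1.0, corr=0.9, blur_sigma=1.2):
    rng = np.random.default_rng(0)
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]].astype(float)
    dx = smooth_field(rng, img.shape, sigma_px)
    dy = smooth_field(rng, img.shape, sigma_px)
    for _ in range(n_frames):
        # AR(1) update keeps the distortion temporally correlated
        dx = corr * dx + np.sqrt(1 - corr**2) * smooth_field(rng, img.shape, sigma_px)
        dy = corr * dy + np.sqrt(1 - corr**2) * smooth_field(rng, img.shape, sigma_px)
        warped = map_coordinates(img, [yy + dy, xx + dx], order=1, mode='nearest')
        yield gaussian_filter(warped, blur_sigma)   # stand-in for SE MTF blur
```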
KEYWORDS: Optical filters, Electronic filtering, Target detection, Image filtering, Cameras, 3D acquisition, Signal detection, Color difference, RGB color model, Signal to noise ratio
In this article, a method for applying matched filters to a three-dimensional hyperspectral data cube is discussed. In many applications, color visible cameras or hyperspectral cameras are used for target detection where the color or spectral optical properties of the imaged materials are partially known in advance. Therefore, matched filtering that uses spectral data along with shape data is an effective method for detecting certain targets. Since many methods for 2D image filtering have been researched, we propose a multi-layer filter in which ordinary spatially matched filters are used before the spectral filters. We discuss a way to layer the spectral filters for a 3D hyperspectral data cube, accompanied by a detectability metric for calculating the SNR of the filter. This method is appropriate for visible color cameras and hyperspectral cameras. We also demonstrate an analysis using the Night Vision Integrated Performance Model (NV-IPM) and a Monte Carlo simulation in order to confirm the effectiveness of the filtering in providing a higher output SNR and a lower false alarm rate.
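A minimal spectral matched filter sketch for an (rows, cols, bands) cube; the paper's multi-layer approach applies spatially matched filters before this spectral stage, which is omitted here, and the function name is illustrative:

```python
import numpy as np

def spectral_matched_filter(cube, s):
    """Score each pixel of an (rows, cols, bands) cube against a known
    target spectrum s using the background band covariance."""
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands)
    mu = X.mean(axis=0)
    Sigma = np.cov((X - mu).T)                # background band covariance
    w = np.linalg.solve(Sigma, s - mu)        # matched-filter weights
    scores = (X - mu) @ w                     # per-pixel filter output
    snr = np.sqrt((s - mu) @ w)               # detectability metric
    return scores.reshape(rows, cols), snr
```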
KEYWORDS: Imaging systems, Computational imaging, Computing systems, Systems modeling, Modulation transfer functions, Signal to noise ratio, Detection and tracking algorithms, Performance modeling, Optical filters, Linear filtering
In the modern tactical imaging environment, new computational imaging (CI) systems and algorithms are being used to improve speed and accuracy for detection tasks. Therefore, a measurement technique is needed to predict the performance of complex non-shift-invariant EO/IR imaging systems, including CI systems. Detection performance of traditional imaging systems can be modeled using current system metrics and measurements such as the Modulation Transfer Function (MTF), signal-to-noise ratio (SNR), and instantaneous field of view (iFOV). In this correspondence, we propose a technique to experimentally measure a detection sensitivity metric for non-traditional CI systems. The detection sensitivity metric predicts the upper bound of linear algorithm performance through evaluation of a matched filter. The experimental results are compared with theoretical expected values through the Night Vision Integrated Performance Model (NV-IPM). Additionally, we demonstrate the experimental results for a variety of imaging systems (IR, visible, and color), target sizes and orientations, as well as SNR values. Our results demonstrate how this detection sensitivity metric can be measured to provide additional insight into final system performance.
Linear system theory is employed to make target acquisition performance predictions for electro-optical/infrared imaging systems where the modulation transfer function (MTF) may be imposed by a nonlinear degradation process. Previous research relying on image quality metric (IQM) methods, which heuristically estimate a perceived MTF, has shown that an average perceived MTF can be used to model some types of degradation, such as image compression. Here, we discuss the validity of the IQM approach by mathematically analyzing the associated heuristics from the perspective of reliability, robustness, and tractability. Experiments with standard images compressed by x.264 encoding suggest that the compression degradation can be estimated by a perceived MTF within boundaries defined by well-behaved curves with marginal error. Our results confirm that the IQM linearizer methodology provides a credible tool for sensor performance modeling.
In the pursuit of fully automated display optimization, the US Army RDECOM CERDEC Night Vision and Electronic Sensors Directorate (NVESD) is evaluating a variety of approaches, including the effects of viewing distance and magnification on target acquisition performance. Two such approaches are the Targeting Task Performance (TTP) metric, which NVESD has developed to model target acquisition performance in a wide range of conditions, and a newer Detectivity metric, based on matched-filter analysis by the observer. While NVESD has previously evaluated the TTP metric for predicting the peak-performance viewing distance as a function of blur, no such study has been done for noise-limited conditions. In this paper, the authors present a study of human task performance for images with noise versus viewing distance using both metrics. Experimental results are compared to predictions using the Night Vision Integrated Performance Model (NV-IPM). The potential impact of the results on the development of automated display optimization is discussed, as well as the assumptions that must be made about the targets being displayed.
Due to the large quantity of low-cost, high-speed computational processing available today, computational imaging (CI) systems are expected to have a major role in next generation multifunctional cameras. The purpose of this work is to quantify the performance of these CI systems in a standardized manner. Given the diversity of CI system designs available today or proposed for the near future, there are significant challenges in modeling and calculating a standardized detection signal-to-noise ratio (SNR) to measure the performance of these systems. In this paper, we develop a path forward for a standardized detectivity metric for CI systems. The detectivity metric is designed to evaluate the performance of a CI system searching for a specific known target or signal of interest, and is defined as the optimal linear matched filter SNR, similar to the Hotelling SNR, calculated in computational space with special considerations for standardization. The detectivity metric is therefore designed to be flexible, in order to handle various types of CI systems and specific targets, while keeping the complexity and assumptions of the systems to a minimum.
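An illustration of an optimal linear (Hotelling-like) SNR computed in measurement space: the known target signal is pushed through a notional measurement operator, and the SNR of the best linear filter follows from the noise covariance. All sizes, masks, and noise levels below are invented, not taken from the paper:

```python
import numpy as np

# SNR = sqrt(d^T Sigma^-1 d) with d = H s: the optimal linear filter
# score for a known target s seen through measurement operator H.
rng = np.random.default_rng(1)
n_pix, n_meas = 4096, 512
H = rng.choice([0.0, 1.0], size=(n_meas, n_pix))   # e.g. binary CI masks
s = np.zeros(n_pix); s[2000:2032] = 0.05           # small known target
Sigma = 1e-2 * np.eye(n_meas)                      # measurement noise cov
d = H @ s
snr = np.sqrt(d @ np.linalg.solve(Sigma, d))
print(f"Hotelling SNR = {snr:.2f}")
```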
The desire to provide the warfighter with both ranging and reflected intensity information is increasing to meet expanding operational needs. LIDAR imaging systems can provide the user with intensity, range, and even velocity information for a scene. The ability to predict the performance of LIDAR systems is critical for the development of future designs without the need to conduct time-consuming and costly field studies. Performance modeling of a frequency modulated continuous wave (FMCW) LIDAR system is challenging due to the addition of the chirped laser source and waveform mixing. The FMCW LIDAR model is implemented in the NV-IPM framework using the custom component generation tool. This paper presents an overview of the FMCW LIDAR, the customized LIDAR components, and a series of trade studies using the LIDAR model.
An objective measure of reflective-band imaging system performance is required in order to provide the warfighter with the right technology for a specific task. Various methods to measure and model performance in the visible (Vis) spectral region have been proposed in the literature. This correspondence shows the influence of spectral-region averaging on the monochromatic modulation transfer function (MTF). This work shows unequivocally that the illumination source plays a crucial role in accurate predictive analysis of system performance. For accurate analysis, the illumination sources need to be carefully chosen for the atmospheric conditions. This work shows the possibility of using an LED configuration in system performance analysis. Such configurations need rigorous calibration in order to become a valuable asset in system characterization.
Conventional sensors measure the light incident at each pixel in a focal plane array. Compressive sensing (CS) involves capturing a smaller number of unconventional measurements from the scene, and then using a companion process to recover the image. CS has the potential to acquire imagery with equivalent information content to a large format array while using smaller, cheaper, and lower bandwidth components. However, the benefits of CS do not come without compromise. The CS architecture chosen must effectively balance physical considerations, reconstruction accuracy, and reconstruction speed to meet operational requirements. Performance modeling of CS imagers is challenging due to the complexity and nonlinearity of the system and reconstruction algorithm. To properly assess the value of such systems, it is necessary to fully characterize the image quality, including artifacts and sensitivity to noise. Imagery of a two-handheld object target set was collected using a shortwave infrared single-pixel CS camera for various ranges and numbers of processed measurements. Human perception experiments were performed to determine the identification performance within the trade space. The performance of the nonlinear CS camera was modeled by mapping the nonlinear degradations to an equivalent linear shift invariant model. Finally, the limitations of CS modeling techniques are discussed.
KEYWORDS: Targeting Task Performance metric, Signal to noise ratio, Received signal strength, Wavelets, Target acquisition, Performance modeling, Eye, Image quality, Wavelet transforms, Imaging systems
Target acquisition performance depends strongly on the contrast of the target. The Targeting Task Performance (TTP)
metric, within the Night Vision Integrated Performance Model (NV-IPM), uses a combination of resolution, signal-to-noise
ratio (SNR), and contrast to predict and model system performance. While the dependence on resolution and SNR
are well defined and understood, defining a robust and versatile contrast metric for a wide variety of acquisition tasks is
more difficult. In this correspondence, a wavelet contrast metric (WCM) is developed under the assumption that the
human eye processes spatial differences in a manner similar to a wavelet transform. The amount of perceivable
information, or useful wavelet coefficients, is used to predict the total viewable contrast to the human eye. The WCM is
intended to better match the measured performance of the human vision system for high-contrast, low-contrast, and low-observable
targets. After further validation, the new contrast metric can be incorporated using a modified TTP metric
into the latest Army target acquisition software suite, the NV-IPM.
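A toy reading of the WCM idea, assuming a threshold on wavelet detail coefficients stands in for "perceivable information"; the wavelet, level, threshold, and function name are guesses for illustration, not the metric's calibrated values:

```python
import numpy as np
import pywt

def wavelet_contrast(chip, wavelet="db2", levels=3, thresh=0.01):
    # Decompose a normalized target chip, keep detail coefficients
    # above a visibility threshold, and pool their energy as the
    # "perceivable contrast".
    coeffs = pywt.wavedec2(chip / chip.mean(), wavelet, level=levels)
    detail = np.concatenate([np.abs(a).ravel()
                             for lvl in coeffs[1:] for a in lvl])
    visible = detail[detail > thresh]          # "useful" coefficients
    return float(np.sqrt((visible ** 2).sum()))
```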
KEYWORDS: 3D metrology, 3D modeling, 3D image processing, Imaging systems, Sensors, Nonuniformity corrections, Data modeling, Convolution, Performance modeling, Image processing
When evaluated with a spatially uniform irradiance, an imaging sensor exhibits both spatial and temporal variations,
which can be described as a three-dimensional (3D) random process considered as noise. In the 1990s, NVESD
engineers developed an approximation to the 3D power spectral density (PSD) for noise in imaging systems known as
3D noise. In this correspondence, we describe how the confidence intervals for the 3D noise measurement allow for
determination of the sampling necessary to reach a desired precision. We then apply that knowledge to create a smaller
cube that can be evaluated spatially across the 2D image giving the noise as a function of position. The method
presented here allows for both defective pixel identification and implements the finite sampling correction matrix. In
support of the reproducible research effort, the MATLAB functions associated with this work can be found on the
Mathworks file exchange [1].
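A sketch of how sampling drives the precision of a noise estimate, assuming Gaussian noise so that the variance estimate follows a chi-square distribution; the paper's finite sampling correction matrix is not reproduced here, and the example numbers are hypothetical:

```python
import numpy as np
from scipy import stats

def sigma_ci(s, n, conf=0.95):
    # Two-sided confidence interval on sigma, using
    # (n-1) s^2 / sigma^2 ~ chi-square(n-1) for Gaussian noise.
    lo, hi = stats.chi2.interval(conf, df=n - 1)
    return s * np.sqrt((n - 1) / hi), s * np.sqrt((n - 1) / lo)

# Hypothetical example: sigma estimated as 1.0 from a 32-frame cube of
# 64x64 blocks, treated as independent samples:
print(sigma_ci(1.0, 32 * 64 * 64))
```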
KEYWORDS: Performance modeling, Visual process modeling, Monte Carlo methods, Systems modeling, Signal to noise ratio, Target detection, Imaging systems, Sensors, Modulation transfer functions, Signal detection
The purpose of this paper is to construct a robust modeling framework for imaging systems in order to predict the
performance of detecting small targets such as Unmanned Aerial Vehicles (UAVs). The underlying principle is to track
the flow of scene information and statistics, such as the energy spectra of the target and power spectra of the
background, through any number of imaging components. This information is then used to calculate a detectivity
metric. Each imaging component is treated as a single linear shift invariant (LSI) component with specified input and
output parameters. A component based approach enables the inclusion of existing component-level models and makes it
directly compatible with image modeling software such as the Night Vision Integrated Performance Model (NV-IPM).
The modeling framework also includes a parallel implementation of Monte Carlo simulations designed to verify the
analytic approach. However, the Monte Carlo simulations may also be used independently to accurately model nonlinear
processes where the analytic approach fails, allowing for even greater extensibility. A simple trade study is
conducted comparing the modeling framework to the simulation.
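A compact illustration of the information-flow idea: a toy one-dimensional target energy spectrum carried through the product of component transfer functions and scored with a prewhitened matched-filter SNR. All spectra and parameters here are notional, not model values:

```python
import numpy as np

xi = np.linspace(0.01, 10.0, 1000)            # spatial frequency (cyc/mrad)
target = np.sinc(xi * 0.5) ** 2               # toy target energy spectrum
mtf_chain = np.exp(-(xi / 4.0) ** 2) * np.abs(np.sinc(xi / 8.0))  # optics*detector
noise_psd = 1e-3 * np.ones_like(xi)           # flat output noise PSD
# Prewhitened matched-filter SNR accumulated over frequency
snr = np.sqrt(np.trapz((target * mtf_chain) ** 2 / noise_psd, xi))
print(f"detectivity SNR = {snr:.2f}")
```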
Conventional electro-optical and infrared (EO/IR) systems capture an image by measuring the light incident at each of
the millions of pixels in a focal plane array. Compressive sensing (CS) involves capturing a smaller number of
unconventional measurements from the scene, and then using a companion process known as sparse reconstruction to
recover the image as if a fully populated array that satisfies the Nyquist criterion were used. Therefore, CS operates under
the assumption that signal acquisition and data compression can be accomplished simultaneously. CS has the potential
to acquire an image with equivalent information content to a large format array while using smaller, cheaper, and lower
bandwidth components. However, the benefits of CS do not come without compromise. The CS architecture chosen
must effectively balance physical considerations (SWaP-C), reconstruction accuracy, and reconstruction speed
to meet operational requirements.
To properly assess the value of such systems, it is necessary to fully characterize the image quality, including artifacts
and sensitivity to noise. Imagery of the two-handheld object target set at range was collected using a passive SWIR
single-pixel CS camera for various ranges, mirror resolutions, and numbers of processed measurements. Human
perception experiments were performed to determine the identification performance within the trade space. The
performance of the nonlinear CS camera was modeled with the Night Vision Integrated Performance Model (NV-IPM)
by mapping the nonlinear degradations to an equivalent linear shift invariant model. Finally, the limitations of CS
modeling techniques will be discussed.
A critical step in creating an image using a Bayer pattern sampled color camera is demosaicing, the process of
combining the individual color channels using a post-processing algorithm to produce the final displayed image. The
demosaicing process can introduce degradations which reduce the quality of the final image. These degradations must be
accounted for in order to accurately predict the performance of color imaging systems. In this paper, we present
analytical derivations of transfer functions to allow description of the effects of demosaicing on the overall system blur
and noise. The effects of color balancing and the creation of the luminance channel image are also explored. The
methods presented are validated through Monte Carlo simulations, which can also be utilized to determine the transfer
functions of non-linear demosaicing methods. Together with this new treatment of demosaicing, the framework behind
the color detector component in NV-IPM is discussed.
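As a generic illustration of deriving a demosaicing transfer function analytically (not the paper's derivation), the MTF contributed by a separable one-dimensional bilinear interpolation kernel follows directly from the DFT of its taps:

```python
import numpy as np

taps = np.array([0.25, 0.5, 0.25])            # bilinear averaging kernel
xi = np.linspace(0.0, 0.5, 256)               # cycles per sample
n = np.arange(-1, 2)                          # tap positions
H = np.abs(taps @ np.exp(-2j * np.pi * np.outer(n, xi)))
# Closed form: H(xi) = 0.5 + 0.5*cos(2*pi*xi) = cos^2(pi*xi),
# i.e., the blur this demosaic step contributes to the system MTF.
```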
As the defense budget shrinks and we are asked to do more with less (a recurring theme now for over
ten years), multifunction systems are becoming critical to the future of military EO/IR systems.
The design of multifunction (MF) sensors is not a well-developed or well-understood discipline. In this
paper, we provide an example trade study of a ground combat system hyperhemispheric multifunction
system. In addition, we show how concept evaluation can be achieved using a virtual prototyping
environment.
Using the latest models from the U.S. Army Night Vision Electronic Sensors Directorate (NVESD), a survey of monochrome and color imaging systems at daylight and low light levels is conducted. Each camera system is evaluated and compared under several different assumptions, such as equivalent field of view with equal and variable f/#, common lens focal length and aperture, with high dynamic range comparisons, and over several light levels. The modeling is done by use of the Targeting Task Performance (TTP) metric using the latest version of the Night Vision Integrated Performance Model (NV-IPM). The comparison is performed over the V parameter, the main output of the TTP metric. Probability of identification (PID) versus range predictions are a direct non-linear mapping of the V parameter as a function of range. Finally, a comparison between the performance of a Bayer-filtered color camera, the Bayer-filtered color camera with the IR block filter removed, and a monochrome version of the same camera is also conducted.
Panoramic infrared imaging is relatively new and has many applications, including tower-mounted security systems, shipboard protection, and platform situational awareness. In this paper, we review metrics and methods that can be used to analyze requirements for an infrared panoramic imaging system for military vehicles. We begin with a broad view of general military requirements organized into three categories: survivability, mobility, and lethality. A few requirements for the sensor modes of operation across all categories are selected so that the panoramic system design can address as many needs as possible, but with affordability applied to the system design. Metrics and associated methods that can translate military operational requirements into panoramic imager requirements are discussed in detail in this paper.
The design and modeling of compressive sensing (CS) imagers is difficult due to the complexity and non-linearity of the system and reconstruction algorithm. The Night Vision Integrated Performance Model (NV-IPM) is a linear imaging system design tool that is very useful for complex system trade studies. The custom component generator, included in NV-IPM, will be used to include a recently published theory for CS that links measurement noise, easily calculated with NV-IPM, to the noise of the reconstructed CS image given the estimated sparsity of the scene and the number of measurements as input. As the sparsity will also depend on other factors such as the optical transfer function and the scene content, an empirical relationship will be developed between the linear model within NV-IPM and the non-linear reconstruction algorithm using measured test data. Using the theory, a CS imager varying the number of measurements will be compared to a notional traditional imager.
Differential polarimetry has shown the ability to enhance target signatures by reducing background signatures, thus effectively increasing the signal-to-noise ratio on target. This method has mainly been applied to resolved, high-contrast targets. For ground-to-air search and tracking of small, slow, airborne targets, the target at range can be sub-pixel and hard to detect against the background sky. Given the unpolarized nature of the thermal emission of the background sky, it should be possible to use differential polarimetry to “filter out” the background and thus enhance the ability to detect sub-pixel targets. The first step in testing this hypothesis is to devise a set of surrogate sample targets and measure their polarimetric properties in the thermal IR in both the lab and the field. The goal of this paper is to determine whether it is feasible to use differential polarimetry to search for, detect, and track small airborne objects.
Active imaging systems are currently being developed to increase the target acquisition and identification range performance of electro-optical systems. This paper reports on current efforts to extend the Night Vision Integrated Performance Model (NV-IPM) to include laser radar (LADAR) systems for unresolved targets. Combining this new LADAR modeling capability with existing sensor and environment capabilities already present in NV-IPM will enable modeling and trade studies for military relevant systems.
Compressive sensing (CS) can potentially form an image of equivalent quality to a large format, megapixel array using a smaller number of individual measurements. This has the potential to provide smaller, cheaper, and lower bandwidth imaging systems. To properly assess the value of such systems, it is necessary to fully characterize the image quality, including artifacts, sensitivity to noise, and CS limitations. Full-resolution imagery of a target set of eight tracked vehicles at range was used as an input for simulated single-pixel CS camera measurements. The CS algorithm then reconstructs images from the simulated single-pixel CS camera for various levels of compression and noise. For comparison, a traditional camera was also simulated, setting the number of pixels equal to the number of CS measurements in each case. Human perception experiments were performed to determine the identification performance within the trade space. The performance of the nonlinear CS camera was modeled with the Night Vision Integrated Performance Model (NV-IPM) by mapping the nonlinear degradations to an equivalent linear shift invariant model. Finally, the limitations of compressive sensing modeling are discussed.
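A notional single-pixel camera simulation in the spirit described above: each measurement sums the scene through one random binary mask. A least-squares inverse stands in for the sparse reconstruction algorithm, and all sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 32 * 32                                   # scene pixels
m = 400                                       # compressed measurements
Phi = rng.choice([0.0, 1.0], size=(m, n))     # DMD-style binary masks
x = np.zeros(n); x[300:340] = 1.0             # toy scene
y = Phi @ x + rng.normal(scale=0.1, size=m)   # noisy measurements
# Minimum-norm least squares as a crude stand-in for sparse recovery
x_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
```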
KEYWORDS: Polarization, Signal to noise ratio, Optical filters, Sensors, Systems modeling, Imaging systems, Cameras, Mid-IR, Long wavelength infrared, Reflectivity
Polarization filters are commonly used as a means of increasing the contrast of a scene thereby increasing sensor range
performance. The change in the signal to noise ratio (SNR) is a function of the polarization of the target and background, the
type and orientation of the polarization filter(s), and the overall transparency of the filter. However, in the mid-wave and long-wave
infrared bands (MWIR and LWIR), the noise equivalent temperature difference (NETD), which directly affects the SNR, is
a function of the filter’s re-emission and its reflected temperature radiance. This paper presents a model, by means of a Stokes
vector input, that can be incorporated into the Night Vision Integrated Performance Model (NV-IPM) in order to predict the
change in SNR, NETD, and noise equivalent irradiance (NEI) for infrared polarimeter imaging systems. The model is then used
to conduct a SNR trade study, using a modeled Stokes vector input, for a notional system looking at a reference target. Future
laboratory and field measurements conducted at Night Vision Electronic Sensors Directorate (NVESD) will be used to update,
validate, and mature the model of conventional infrared systems equipped with polarization filters.
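A minimal Stokes/Mueller sketch of how a linear polarizer changes target-to-background contrast; the Stokes vectors below are invented, and the filter re-emission and reflected-radiance terms the paper adds are omitted:

```python
import numpy as np

def polarizer(theta):
    # Mueller matrix of an ideal linear polarizer at angle theta
    c, s = np.cos(2 * theta), np.sin(2 * theta)
    return 0.5 * np.array([[1, c, s, 0],
                           [c, c * c, c * s, 0],
                           [s, c * s, s * s, 0],
                           [0, 0, 0, 0]])

target = np.array([1.0, 0.15, 0.0, 0.0])      # partially polarized target
backgnd = np.array([1.0, 0.0, 0.0, 0.0])      # unpolarized background
for deg in (0.0, 45.0, 90.0):
    It = (polarizer(np.deg2rad(deg)) @ target)[0]
    Ib = (polarizer(np.deg2rad(deg)) @ backgnd)[0]
    print(f"theta={deg:5.1f} deg  contrast={(It - Ib) / Ib:+.3f}")
```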
Image noise originating from a sensor system is often the limiting factor in target acquisition performance, especially when limited by atmospheric transmission or low-light conditions. To accurately predict target acquisition range performance for a wide variety of imaging systems, image degradation introduced by the sensor must be properly combined with the limitations of the human visual system (HVS). This crucial step of incorporating the HVS has been improved and updated within NVESD's latest imaging system performance model. The new noise model discussed here shows how an imaging system's noise and blur are combined with the contrast threshold function (CTF) to form the system CTF. Model calibration constants were found by presenting low-contrast sine gratings with additive noise in a two-alternative forced-choice experiment. One of the principal improvements comes from adding an eye photon noise term, allowing the noise CTF to be accurate over a wide range of luminance. The latest HVS noise model is then applied to the targeting task performance metric responsible for predicting system performance from the system CTF. To validate this model, human target acquisition performance was measured from a series of infrared and visible-band noise-limited imaging systems.
KEYWORDS: Eye, Image segmentation, Sensors, Imaging systems, Thermography, Black bodies, Video, Minimum resolvable temperature difference, Image processing, Human vision and color perception
The GStreamer architecture allows for simple modularized processing. Individual GStreamer elements have been
developed that allow for control, measurement, and ramping of a blackbody, for capturing continuous imagery
from a sensor, for segmenting out an MRTD target, for applying a blur equivalent to that of a human eye and a
display, and for thresholding a processed target contrast for "calling" it. A discussion of each of the components
will be followed by an analysis of its performance relative to that of human observers.
KEYWORDS: Targeting Task Performance metric, Modulation transfer functions, Interference (communication), Contrast transfer function, Image filtering, Performance modeling, Optical filters, Electronic filtering, Systems modeling, Signal to noise ratio
Using post-processing filters to enhance image detail, a process commonly referred to as boost, can significantly affect
the performance of an EO/IR system. The US Army's target acquisition models currently use the Targeting Task
Performance (TTP) metric to quantify sensor performance. The TTP metric accounts for each element in the system
including: blur and noise introduced by the imager, any additional post-processing steps, and the effects of the Human
Visual System (HVS). The current implementation of the TTP metric assumes spatial separability, which can introduce
significant errors when the TTP is applied to systems using non-separable filters. To accurately apply the TTP metric to
systems incorporating boost, we have implemented a two-dimensional (2D) version of the TTP metric. The accuracy of the
2D TTP metric was verified through a series of perception experiments involving various levels of boost. The 2D TTP
metric has been incorporated into the Night Vision Integrated Performance Model (NV-IPM) allowing accurate system
modeling of non-separable image filters.
Image noise, originating from a sensor system, is often the limiting factor in target acquisition performance. This is
especially true of reflective-band sensors operating in low-light conditions. To accurately predict target acquisition
range performance, image degradation introduced by the sensor must be properly combined with the limitations of
the human visual system. This is modeled by adding system noise and blur to the contrast threshold function (CTF)
of the human visual system, creating a combined system CTF. Current U.S. Army sensor performance models
(NVThermIP, SSCAMIP, IICAM, and IINVD) do not properly address how external noise is added to the CTF as a
function of display luminance. Historically, the noise calibration constant was fit from data using image intensifiers
operating at low display luminance, typically much less than one foot-Lambert. However, noise calibration
experiments with thermal imagery used a higher display luminance, on the order of ten foot-Lamberts, resulting in a
larger noise calibration constant. To address this discrepancy, hundreds of CTF measurements were taken as a
function of display luminance, apparent target angle, frame rate, noise intensity and filter shape. The experimental
results show that the noise calibration constant varies as a function of display luminance. To account for this
luminance dependence, a photon shot noise term representing an additional limitation in the performance of the
human visual system is added to the observer model. The new noise model will be incorporated in the new U.S.
Army Integrated Performance Model (NV-IPM), allowing accurate comparisons over a wide variety of sensor
modalities and display luminance levels.
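A heavily simplified sketch of the system CTF construction described above. The calibration constant alpha is a placeholder, and the beta/L term only gestures at the luminance-dependent eye photon-noise correction; none of these constants are the model's actual values:

```python
import numpy as np

def system_ctf(xi, ctf_eye, mtf_sys, sigma_noise, lum_fL,
               alpha=170.0, beta=0.1):
    # Naked-eye CTF divided by the system MTF and inflated by a noise
    # factor; the beta/lum term stands in for eye photon noise at low
    # display luminance (all constants notional).
    noise_factor = 1.0 + (alpha * sigma_noise / lum_fL) ** 2 + beta / lum_fL
    return (ctf_eye / np.clip(mtf_sys, 1e-6, None)) * np.sqrt(noise_factor)
```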
The applicability of two theories that account for aliasing artifacts, introduced by spatial sampling, on target
acquisition performance is addressed. Currently, the Army's imager performance model, the Targeting Task
Performance (TTP) metric, uses a parameterized model, based upon a fit to a number of perception experiments,
called MTF squeeze. MTF squeeze applies an additional degradation to the TTP metric based upon the amount of
spurious response in the final image. While this approach achieves satisfactory results for the data sets available, it is
not clear that these results extend to a wider variety of operating conditions. Other models treat the artifacts arising
from spurious response as a target-dependent noise. Modeling spurious response as noise allows proper treatment of
sampling artifacts across a wider variety of systems and post-processing techniques. Perception experiments are
used to assess the performance of both the MTF squeeze and aliasing as noise methods. The results demonstrate
that modeling all of the aliased frequencies as a target-dependent noise leads to erroneous predictions; however,
considering only aliased signals above the Nyquist rate as additive noise agrees with experimental observations.
The standard model used to describe the performance of infrared sensors is the U.S. Army thermal target acquisition
model, NVThermIP. The model is characterized by the apparent size and contrast of the target, and the resolution and
sensitivity of the sensor. Currently, manual gain and level determine optimal contrast for military targets. The Night
Vision models are calibrated to such images using a spatial average contrast consisting of the root sum squared of the
difference between the target and background means, and the standard deviation of the target internal contrast. This
definition of contrast, when applied to the model, shows an unrealistic increase in performance for saturated targets. This
paper presents a modified definition of target contrast for use in NVThermIP, including a threshold value for the
target-to-background mean difference and a means to remove saturated pixels from the standard deviation of the target. Human
perception experiments were performed and the measured results are compared with the predicted performance using the
modified target contrast definition in NVThermIP.
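A sketch of the modified contrast definition, assuming saturated pixels are excluded from the internal-contrast term and the mean difference is clipped at a threshold; the function name and threshold values are illustrative, not the paper's:

```python
import numpy as np

def rss_contrast(tgt, bkg, sat_level=254, max_mean_diff=None):
    # Root-sum-squared target contrast with saturation handling:
    # saturated pixels no longer inflate the internal contrast term.
    unsat = tgt[tgt < sat_level]              # drop saturated target pixels
    diff = tgt.mean() - bkg.mean()
    if max_mean_diff is not None:
        diff = np.clip(diff, -max_mean_diff, max_mean_diff)
    return np.sqrt(diff ** 2 + unsat.std() ** 2)
```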
KEYWORDS: 3D modeling, Video compression, Data modeling, Image compression, Performance modeling, Image quality, Sensors, Modulation transfer functions, Video, NVThermIP
The effect of video compression on image quality was investigated from the perspective of target acquisition
performance modeling. Human perception tests were conducted recently at the U.S. Army RDECOM CERDEC
NVESD, measuring identification (ID) performance on simulated military vehicle targets at various ranges. These videos
were compressed with different quality and/or quantization levels utilizing motion JPEG, motion JPEG2000, and
MPEG-4 encoding. To model the degradation on task performance, the loss in image quality is fit to an equivalent
Gaussian MTF scaled by the Structural Similarity Image Metric (SSIM). Residual compression artifacts are treated as
3-D spatio-temporal noise. This 3-D noise is found by taking the difference of the uncompressed frame, with the
estimated equivalent blur applied, and the corresponding compressed frame. Results show good agreement between the
experimental data and the model prediction. This method has led to a predictive performance model for video
compression by correlating various compression levels to particular blur and noise input parameters for the NVESD target
acquisition performance model suite.
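A sketch of the fitting idea described above, assuming scikit-image's SSIM implementation: find the Gaussian blur whose SSIM against the clean frame matches the compressed frame's SSIM, then treat the residual as the noise fed to the 3D noise terms. The function name and search grid are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.metrics import structural_similarity as ssim

def equivalent_blur(clean, compressed, sigmas=np.linspace(0.1, 3.0, 30)):
    # Match the compressed frame's SSIM with a Gaussian blur of the
    # clean frame, then take the residual as spatio-temporal noise.
    dr = float(np.ptp(clean))
    target = ssim(clean, compressed, data_range=dr)
    best = min(sigmas, key=lambda s: abs(
        ssim(clean, gaussian_filter(clean, s), data_range=dr) - target))
    residual = compressed - gaussian_filter(clean, best)
    return best, residual
```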
The bandwidth requirements of modern target acquisition systems continue to increase with larger sensor formats and
multi-spectral capabilities. To obviate this problem, still and moving imagery can be compressed, often resulting in
a greater than 100-fold decrease in required bandwidth. Compression, however, is generally not error-free, and the
generated artifacts can adversely affect task performance.
The U.S. Army RDECOM CERDEC Night Vision and Electronic Sensors Directorate recently performed an assessment
of various compression techniques on static imagery for tank identification. In this paper, we expand this initial
assessment by studying and quantifying the effect of various video compression algorithms and their impact on tank
identification performance. We perform a series of controlled human perception tests using three dynamic simulated
scenarios: target moving/sensor static, target static/sensor static, sensor tracking the target. Results of this study will
quantify the effect of video compression on target identification and provide a framework to evaluate video compression
on future sensor systems.