We have collected a large dataset of subjective image quality “*nesses,” such as sharpness or colorfulness. The dataset comes from seven studies and contains 39,415 quotations from 146 observers who evaluated 62 scenes either in print or on a display. We analyzed the subjective evaluations and formed a hierarchical image quality attribute lexicon for *nesses, which is visualized as an image quality wheel (IQ-Wheel). Similar wheel diagrams of attributes have become industry standards in other fields of sensory experience, such as flavor and fragrance sciences. The IQ-Wheel contains frequency information for 68 attributes relating to image quality. Only 20% of the attributes were positive, which agrees with previous findings showing a preference for negative attributes in image quality evaluation. Our results also show that, apart from physical paper attributes such as gloss, observers use similar terminology whether they evaluate printed images or images viewed on a display. The IQ-Wheel can be used to guide the selection of scenes and distortions when designing subjective experimental setups and creating image databases.
Evaluating algorithms used to assess image and video quality requires performance measures. Traditional performance measures (e.g., Pearson’s linear correlation coefficient, Spearman’s rank-order correlation coefficient, and root mean square error) compare the quality predictions of algorithms to subjective mean opinion scores (MOS) or differential mean opinion scores (DMOS). We propose a subjective root-mean-square error (SRMSE) performance measure for evaluating the accuracy of algorithms used to assess image and video quality. The SRMSE performance measure takes into account the dispersion between observers. Another important property of the SRMSE performance measure is its measurement scale, which is calibrated to units of the number of average observers. The results of the SRMSE performance measure indicate the extent to which the algorithm can replace the subjective experiment, expressed as a number of observers. Furthermore, we present the concept of target values, which define the performance level of the ideal algorithm. We have calculated the target values for all sample sets of the CID2013, CVD2014, and LIVE Multiply Distorted image quality databases. The target values and a MATLAB implementation of the SRMSE performance measure are available on the project page of this study.
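As a rough illustration of the idea (not the exact published formula), the following Python sketch expresses a prediction’s RMSE in units of observer dispersion and maps it to an equivalent number of averaged observers; the input arrays and the averaging of per-image standard deviations are assumptions made for this sketch.

import numpy as np

def srmse_sketch(pred, mos, per_image_std):
    """Illustrative subjective-RMSE-style measure (hedged sketch).

    pred          : algorithm predictions, shape (n_images,)
    mos           : mean opinion scores, shape (n_images,)
    per_image_std : standard deviation of the individual opinion
                    scores for each image, shape (n_images,)
    """
    pred, mos = np.asarray(pred, float), np.asarray(mos, float)
    rmse = np.sqrt(np.mean((pred - mos) ** 2))
    sigma = float(np.mean(per_image_std))   # typical inter-observer spread
    # A mean over n observers has standard error sigma / sqrt(n);
    # solving sigma / sqrt(n) = rmse gives the number of averaged
    # observers whose uncertainty matches the algorithm's error.
    n_equivalent_observers = (sigma / rmse) ** 2
    return rmse, n_equivalent_observers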
An established way of validating and testing new image quality assessment (IQA) algorithms has been to compare how well they correlate with subjective data on various image databases. One of the most common measures is to calculate the linear correlation coefficient (LCC) and Spearman’s rank-order correlation coefficient (SROCC) against the subjective mean opinion score (MOS). Recently, databases with multiply distorted images have emerged [1, 2]. However, with multidimensional stimuli there is more disagreement between observers, as the task is more a matter of preference than of distortion detection. This reduces the statistical differences between image pairs. If the subjects cannot distinguish a difference between some of the image pairs, should we demand any better performance from IQA algorithms? This paper proposes alternative performance measures for the evaluation of IQA algorithms on the CID2013 database. One proposed alternative performance measure is the root-mean-square error (RMSE) of the subjective data as a function of the number of observers. The other alternative performance measure is the number of statistically significant differences between image pairs. This study shows that after 12 subjects the RMSE value saturates around a level of three, meaning that a target RMSE value for an IQA algorithm on the CID2013 database should be three. In addition, this study shows that state-of-the-art IQA algorithms found the better image of an image pair with a probability of 0.85 when only the image pairs with statistically significant differences were taken into account.
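The first proposed measure can be illustrated with a simple resampling sketch in Python; the observer-by-image score matrix, the split into a k-observer panel and a held-out panel, and the number of resamples are assumptions for illustration rather than the exact analysis of the paper.

import numpy as np

def rmse_vs_observers(scores, n_resamples=500, rng=None):
    """Expected RMSE between a k-observer mean opinion score and the
    mean opinion score of the remaining observers, as a function of k.
    `scores` has shape (n_observers, n_images). Hedged sketch of the
    saturation behaviour described above, not the authors' procedure."""
    rng = np.random.default_rng(rng)
    n_obs, _ = scores.shape
    curve = []
    for k in range(1, n_obs):
        errs = []
        for _ in range(n_resamples):
            idx = rng.permutation(n_obs)
            sub = scores[idx[:k]].mean(axis=0)    # k-observer MOS
            rest = scores[idx[k:]].mean(axis=0)   # MOS of held-out observers
            errs.append(np.sqrt(np.mean((sub - rest) ** 2)))
        curve.append(np.mean(errs))
    return np.array(curve)  # curve[k-1] ~ expected RMSE with k observers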
Image-quality assessment measures are largely based on the assumption that an image is distorted by only one type of distortion at a time. These conventional measures perform poorly if an image contains more than one distortion. In consumer photography, captured images are subject to many sources of distortion and modification. We searched for feature subsets that predict the quality of photographs captured by different consumer cameras. For this, we used the new CID2013 image database, which includes photographs captured by a large number of consumer cameras. Principal component analysis showed that the features classified consumer camera images in terms of sharpness and noise energy. The sharpness dimension included lightness, detail reproduction, and contrast. A support vector regression model with the selected feature subset predicted the human observations well compared with state-of-the-art measures.
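A minimal sketch of this kind of analysis pipeline is given below, assuming a precomputed feature matrix and subjective scores (replaced here by random placeholders); the PCA dimensionality, SVR hyperparameters, and scoring are illustrative choices, not the settings used in the study.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Placeholder data standing in for precomputed image features and MOS.
rng = np.random.default_rng(0)
features = rng.random((120, 10))      # (n_images, n_features)
mos = rng.random(120) * 100.0         # subjective quality scores

# Inspect the dominant dimensions of the feature space.
pca = PCA(n_components=2).fit(StandardScaler().fit_transform(features))
print("explained variance ratio:", pca.explained_variance_ratio_)

# Fit a support vector regression quality model on the features.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=1.0))
print("cross-validated R^2:", cross_val_score(model, features, mos, cv=5).mean())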
The added value of stereoscopy is an important factor for stereoscopic product development and content production.
Previous studies have shown that 'image quality' does not encompass the added value of stereoscopy, and thus the
attributes naturalness and viewing experience have been used to evaluate stereoscopic content. The objective of this
study was to explore what the added value of stereoscopy may consist of and which content properties contribute to the magnitude of the added value. The hypothesis was that interestingness is a significant component of the added value. A subjective study was conducted in which the participants evaluated three attributes of stimuli from the consumer photography domain: viewing experience, naturalness of depth, and interestingness. In addition to the no-reference direct scaling method, a novel method, the recalled attention map, was introduced and used to study attention in stereoscopic images. In the second part of our study, we used eye tracking to compare the salient regions in monoscopic
and stereoscopic conditions. We conclude from the subjective results that viewing experience and naturalness of depth
do not cover the entire added value of stereoscopy, and that interestingness brings a new dimension into the added value
research. The eye tracking data analysis revealed that the fixation maps are more consistent between participants in
stereoscopic viewing than in monoscopic viewing and from this we conclude that stereoscopic imagery is more effective
in directing the viewer's attention.
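One common way to quantify the consistency of fixation maps across participants is a leave-one-out correlation over Gaussian-smoothed fixation maps; the sketch below illustrates that general idea and is not necessarily the analysis used in this study (the smoothing width and the fixation data format are assumptions).

import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_map(fixations, shape, sigma=30):
    """Accumulate fixation points (x, y), assumed to lie inside the image,
    into a Gaussian-smoothed, normalized map of the given shape."""
    m = np.zeros(shape)
    for x, y in fixations:
        m[int(y), int(x)] += 1
    m = gaussian_filter(m, sigma)
    return m / (m.sum() + 1e-12)

def inter_observer_consistency(per_observer_fixations, shape):
    """Leave-one-out Pearson correlation between one observer's fixation
    map and the map pooled over the remaining observers; higher values
    indicate more consistent gaze behaviour across participants."""
    maps = [fixation_map(f, shape) for f in per_observer_fixations]
    scores = []
    for i, m in enumerate(maps):
        rest = np.mean([maps[j] for j in range(len(maps)) if j != i], axis=0)
        scores.append(np.corrcoef(m.ravel(), rest.ravel())[0, 1])
    return float(np.mean(scores))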
We present a method to evaluate stereo camera depth accuracy in human-centered applications. It enables the comparison between stereo camera depth resolution and human depth resolution. Our method uses a multilevel test target that can be easily assembled and used in various studies. Binocular disparity enables humans to perceive relative depths accurately, which makes a multilevel test target suitable for evaluating stereo camera depth accuracy when the accuracy requirements come from stereoscopic vision.
The method for measuring stereo camera depth accuracy was validated with a stereo camera built from two SLR (single-lens reflex) cameras. The depth resolution of the SLR rig was better than normal stereo acuity at all measured distances, ranging from 0.7 m to 5.8 m. The method was then used to evaluate the accuracy of a lower-quality stereo camera. Two parameters, focal length and baseline, were varied. Focal length had a larger effect on the stereo camera's depth accuracy than the baseline. The tests showed that normal stereo acuity was achieved only with a telephoto lens.
However, a user's depth resolution in a video see-through system differs from that of direct naked-eye viewing. The same test target was used to evaluate this by randomly mixing the levels of the test target and asking users to sort the levels according to their depth. The comparison between stereo camera depth resolution and perceived depth resolution was done by calculating the maximum erroneous classification of levels.
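The comparison between camera depth resolution and human stereo acuity rests on standard disparity geometry; a hedged textbook sketch follows, where the one-pixel disparity step, the 20 arcsec stereo acuity, and the 65 mm interpupillary distance are illustrative assumptions rather than values from the paper.

import numpy as np

def camera_depth_resolution(z, focal_length_px, baseline_m, disparity_step_px=1.0):
    """Smallest resolvable depth difference (m) of a stereo camera at distance
    z (m), from the disparity relation d = f * b / z  =>  dz ~ z**2 * dd / (f * b)."""
    return z ** 2 * disparity_step_px / (focal_length_px * baseline_m)

def human_depth_resolution(z, stereo_acuity_arcsec=20.0, ipd_m=0.065):
    """Depth difference (m) corresponding to a given stereo acuity for an
    interpupillary distance ipd_m, via dz ~ z**2 * dtheta / ipd."""
    d_theta = np.deg2rad(stereo_acuity_arcsec / 3600.0)
    return z ** 2 * d_theta / ipd_m

# Example comparison at a 2 m distance (parameter values are illustrative).
z = 2.0
print(camera_depth_resolution(z, focal_length_px=3000, baseline_m=0.1))
print(human_depth_resolution(z))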
High dynamic range (HDR) imaging has developed to the point of soon becoming a standard feature in consumer cameras. This study was motivated by the need to evaluate tone mapping operators, especially for consumer imaging applications. A no-reference method based on the ISO 20462-2:2005 triplet comparison was created for evaluating tone mapping operators. Multiple HDR test images were photographed, and the method was validated by evaluating 25 tone mapping operators with five test images. The tone mapping operators were evaluated in terms of image naturalness and pleasantness. The results indicate that the method successfully ranked the tone mapping operators in terms of naturalness and pleasantness. The test image set could be improved, for example, based on an imaging photo space for HDR photography. The test images of this study are available for non-commercial research purposes.
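Comparative judgments of this kind are typically converted to an interval quality scale with Thurstonian scaling; the paired-comparison sketch below (Thurstone Case V) only illustrates that conversion and is not the ISO 20462-2 triplet procedure used in the study.

import numpy as np
from scipy.stats import norm

def thurstone_case_v(wins):
    """Scale values from a paired-comparison win matrix, where wins[i, j]
    counts how often stimulus i was preferred over stimulus j."""
    totals = wins + wins.T
    with np.errstate(invalid="ignore", divide="ignore"):
        p = np.where(totals > 0, wins / totals, 0.5)
    p = np.clip(p, 0.01, 0.99)   # avoid infinite z-scores at 0 or 1
    np.fill_diagonal(p, 0.5)
    z = norm.ppf(p)              # preference proportions -> z-scores
    return z.mean(axis=1)        # one interval-scale value per stimulus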
Objective image quality metrics can be based on test targets or algorithms. Traditionally, the image quality of digital
cameras has been measured using test targets. Test-target measurements are tedious and require a controlled laboratory
environment. Algorithm metrics can be divided into three groups: full-reference (FR), reduced-reference (RR), and no-reference (NR). FR metrics cannot be applied to measuring the quality of images captured by digital cameras because
pixel-wise reference images are missing. NR metrics are applicable only when the distortion type is known and the
distortion space is low-dimensional. RR metrics provide a tradeoff between NR and FR metrics. An RR metric does not
require a pixel-wise reference image; it only requires a set of extracted features. With the aid of RR features, it is
possible to avoid problems related to NR metrics. In this study, we evaluate the applicability of RR metrics to measuring
the image quality of natural images captured by digital cameras. We propose a method in which reference images are
captured using a reference camera. The reference images represented natural reproductions of the views under study. We
tested our method using three RR metrics proposed in the literature. The results suggest that the proposed method is
promising for measuring the quality of natural images captured by digital cameras for the purpose of camera
benchmarking.
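To make the reduced-reference idea concrete, the sketch below compares a small set of gradient-statistics features extracted from the reference-camera image and the test image; the feature set and the distance measure are illustrative assumptions, not the RR metrics from the literature that were actually evaluated.

import numpy as np

def rr_features(img):
    """Small reduced-reference feature vector: mean, spread, and skewness of
    the gradient magnitude of a grayscale image (illustrative choice)."""
    gy, gx = np.gradient(img.astype(float))
    g = np.hypot(gx, gy)
    skew = np.mean((g - g.mean()) ** 3) / (g.std() ** 3 + 1e-12)
    return np.array([g.mean(), g.std(), skew])

def rr_distance(reference_img, test_img):
    """Quality degradation estimate as the distance between the feature
    vectors of the reference-camera image and the test image."""
    return float(np.linalg.norm(rr_features(reference_img) - rr_features(test_img)))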
Face detection techniques are used for many different applications. For example, face detection is a basic component in
many consumer still and video cameras. In this study, we compare the performance of face area data and freely selected
local area data for predicting the sharpness of photographs. The local values were collected systematically from images,
and for the analyses we selected only the values with the highest performance. The objective sharpness metric was based
on the statistics of the wavelet coefficients for the selected areas. We used three image contents whose subjective
sharpness values had been measured. The image contents were captured by 13 cameras, and the images were evaluated
by 25 subjects. The quality of the cameras ranged from low-end mobile phone cameras to low-end compact cameras.
The image contents simulated typical photos that consumers take with their mobile phones. The face area sizes in the images were approximately 0.4%, 1.0%, or 4.0% of the image area. Based on the results, the face area data proved to be valuable for measuring the sharpness of the photographs if the face size was large enough. When the face area size was 1.0% or 4.0%, the performance of the sharpness values measured from the face areas was equal to or better than that of the sharpness values measured from the best local areas. When the face area was too small (0.4%), the performance was low compared with the best local areas.
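A minimal sketch of a wavelet-statistics sharpness measure for a selected region (such as a detected face area) is shown below using PyWavelets; the wavelet, decomposition level, and energy-ratio formulation are assumptions made for illustration, not the exact published metric.

import numpy as np
import pywt

def wavelet_sharpness(region, wavelet="db2", level=2):
    """Sharpness proxy for a grayscale image region: energy of the wavelet
    detail subbands relative to the total energy of the region."""
    coeffs = pywt.wavedec2(region.astype(float), wavelet, level=level)
    detail_energy = sum(float(np.sum(band ** 2))
                        for detail in coeffs[1:] for band in detail)
    total_energy = float(np.sum(region.astype(float) ** 2)) + 1e-12
    return detail_energy / total_energy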
The goal of the study was to develop a method for the quality computation of digitally printed images. We wanted to use only attributes that are meaningful for the subjective visual quality experience of digitally printed images. Based on the subjective data and our assessments, the attributes chosen for quality calculation were sharpness, graininess, and color contrast. The proposed graininess metric divides the fine-detail image into blocks and uses the low-energy blocks for the graininess calculation. The proposed color contrast metric computes the contrast of dominant colors using the coarse-scale image. The proposed sharpness metric divides the coarse-scale image into blocks and uses the high-energy blocks for the sharpness calculation. The reduced-reference features of the sharpness and graininess metrics are the numbers of high- or low-energy blocks. The reduced-reference features of the color contrast metric are the directions of the dominant colors in the reference image. The overall image quality was calculated by combining these values. The performance of the proposed application-specific image quality metric was high compared with a state-of-the-art reduced-reference application-independent image quality metric. Linear correlation coefficients between subjective and predicted MOS were 0.88 for electrophotography and 0.98 for ink-jet printed samples, for sample sets of 21 prints each for electrophotography and ink-jet, subjectively evaluated by 28 observers.
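The block-selection idea behind the graininess and sharpness metrics can be sketched as follows; the block size, the energy measure (block variance), and the fraction of blocks kept are illustrative assumptions, and the fine-detail and coarse-scale images are assumed to be precomputed (e.g., from a wavelet or pyramid decomposition).

import numpy as np

def block_energies(img, block=32):
    """Split a grayscale image into non-overlapping blocks and return the
    variance ('energy') of each block."""
    h, w = img.shape
    img = img[:h - h % block, :w - w % block].astype(float)
    blocks = img.reshape(img.shape[0] // block, block,
                         img.shape[1] // block, block).swapaxes(1, 2)
    return blocks.reshape(-1, block, block).var(axis=(1, 2))

def block_based_features(fine_detail_img, coarse_scale_img, block=32, frac=0.1):
    """Graininess estimated from the lowest-energy blocks of the fine-detail
    image (smooth areas where grain is visible); sharpness from the
    highest-energy blocks of the coarse-scale image (edges and texture)."""
    grain_e = np.sort(block_energies(fine_detail_img, block))
    sharp_e = np.sort(block_energies(coarse_scale_img, block))[::-1]
    k_g = max(1, int(frac * grain_e.size))
    k_s = max(1, int(frac * sharp_e.size))
    return grain_e[:k_g].mean(), sharp_e[:k_s].mean()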
The aim of the study was to develop a test image for print quality evaluation to improve the current state of the art in testing the quality of digital printing. The image presented by the authors in EI09 portrayed a breakfast scene, the content of which could roughly be divided into four object categories: a woman, a table with objects, a landscape picture, and a gray wall. The image was considered to have four main areas of improvement: the busyness of the image, the control of the color world, the salience of the object categories, and the naturalness of the event and setting. To improve on the first image, another test image was developed. While several aspects were improved, the shortcomings of the new image found by visual testing and self-report were in the same four areas. To combine the insights of the two test images and to avoid their pitfalls, a third image was developed. The goodness of the three test images was measured in subjective tests. The third test image was found to efficiently address three of the four improvement areas; only the salience of the objects left something to be desired.
Digital cameras, printers and displays have their own established methods to measure their performance. Different
devices have their own special features and also different metrics and measuring methods. The real meaning of the measurement data is often not learned until hands-on experience is available. The goal of this study was to describe a
preliminary method and metrics for measuring the objective image quality of the TV-out function of mobile handsets.
The TV-out application was image browsing.
Image quality is often measured in terms of color reproduction, noise and sharpness and these attributes were also
applied in this study. The color reproduction attribute was studied with color depth, hue reproduction and color accuracy
metrics. The noise attribute was studied with the SNR (signal to noise ratio) and chroma noise metrics. The sharpness
attribute was studied with the SFR (spatial frequency response) and contrast modulation metrics. The measurement data were gathered with a method that digitized the analog signal of the TV-out device using a frame grabber card.
Based on the results, the quantization accuracy, chroma error and spatial reproduction of the signal were the three
fundamental factors which most strongly affected the performance of the TV-out device. The quantization accuracy of
the device affects the number of tones that can be reproduced in the image. The quantization accuracy also strongly
affects the correctness of hue reproduction. According to the results, the color depth metric was a good indicator of
quantization accuracy. The composite signal of TV-out devices transmits both chroma and luminance information in a
single signal. A change in the luminance value can therefore alter a nominally constant chroma value. Based on the results, the chroma
noise metric was a good indicator for measuring this phenomenon. There were differences between the spatial
reproductions of the devices studied. The contrast modulation was a clear metric for measuring these differences. The
signal sharpening of some TV-out devices hindered the interpretation of SFR data.
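Of the metrics above, contrast modulation is the simplest to state; a hedged sketch over a measured line-pair luminance profile is shown below, where the synthetic bar profile is only an illustration.

import numpy as np

def contrast_modulation(profile):
    """Contrast modulation of a measured line-pair profile,
    CM = (L_max - L_min) / (L_max + L_min). In practice the extrema would
    be taken as robust estimates over the bright and dark bars of the test
    pattern rather than single pixels."""
    profile = np.asarray(profile, dtype=float)
    l_max, l_min = profile.max(), profile.min()
    return (l_max - l_min) / (l_max + l_min + 1e-12)

# Example with a synthetic bar profile:
bars = np.tile(np.concatenate([np.full(10, 200.0), np.full(10, 60.0)]), 5)
print(contrast_modulation(bars))   # ~ (200-60)/(200+60) ~ 0.54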