This paper analyzes how an experimenter in subjective video quality tests can balance two kinds of error: failing to find an effect that is present (Type II error, i.e., low statistical power) and claiming an effect that is not present (Type I error). The risk of committing Type I errors increases with the number of comparisons performed in the statistical tests. We show that, when controlling for this while keeping the power of the experiment at a reasonably high level, the number of test subjects normally used and recommended by the International Telecommunication Union (ITU), i.e., 15, is unlikely to be sufficient, whereas the number used by the Video Quality Experts Group (VQEG), i.e., 24, is more likely to be sufficient. Examples are also given of the influence of Type I error on the statistical significance of comparing objective metrics by correlation. We also present a comparison between parametric and nonparametric statistics. The comparison addresses whether we would reach different conclusions about the statistical difference between the video quality ratings of different video clips in a subjective test depending on whether Student's t-test or the Mann–Whitney U-test is used. We found hardly any difference when only a few comparisons are compensated for; almost the same conclusions are reached. As the number of comparisons increases, larger and larger differences between the two methods are revealed. In these cases, the parametric t-test yields clearly more significant cases than the nonparametric test, which makes it more important to investigate whether the assumptions for performing a given test are met.
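The trade-off described in this abstract can be illustrated with a small sketch. It is not the paper's analysis: the Bonferroni correction, the independence assumption in the familywise error formula, the normal-approximation sample size formula for a two-sided two-sample comparison, and all numeric inputs (10 clips, effect size, standard deviation) are assumptions chosen for illustration only.

```python
import math
from statistics import NormalDist


def n_pairwise(k):
    """Number of pairwise comparisons among k conditions."""
    return k * (k - 1) // 2


def familywise_error(alpha, m):
    """Chance of at least one Type I error in m tests, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha, m):
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / m


def subjects_per_group(alpha, power, effect, sd):
    """Normal-approximation sample size per group (two-sided, two-sample).

    Assumed formula: n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / effect)^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    return math.ceil(2.0 * ((z_alpha + z_power) * sd / effect) ** 2)


if __name__ == "__main__":
    clips = 10                      # hypothetical number of video clips
    m = n_pairwise(clips)           # all pairwise comparisons
    print("comparisons:", m)
    print("uncorrected familywise error:", familywise_error(0.05, m))
    corrected = bonferroni_alpha(0.05, m)
    print("Bonferroni per-test alpha:", corrected)
    # Hypothetical effect of one scale unit with a standard deviation of one unit.
    print("subjects, uncorrected:", subjects_per_group(0.05, 0.8, 1.0, 1.0))
    print("subjects, corrected:", subjects_per_group(corrected, 0.8, 1.0, 1.0))
```

With these assumed inputs, the uncorrected familywise error is roughly 0.90, and tightening the per-test level increases the number of subjects needed to keep the same power.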
The perception of depth in images and video sequences is based on different depth cues. Studies have considered the depth perception threshold as a function of viewing distance (Cutting and Vishton, 1995), as well as the combination of different monocular depth cues, their quantitative relation to binocular depth cues, and the possible types of interaction between them (Landy, 1995). However, these studies only consider artificial stimuli, and none of them attempts to quantify the contribution of monocular and binocular depth cues relative to each other in the specific context of natural images. This study targets this particular application case. The strength of different depth cues relative to each other is evaluated using an image database carefully designed to cover as many combinations as possible of monocular (linear perspective, texture gradient, relative size and defocus blur) and binocular depth cues. The 200 images were evaluated in two distinct subjective experiments to assess perceived depth and the different monocular depth cues separately. The methodology and the definition of the different scales are detailed. The image database (DC3Dimg) is also released to the scientific community.
The pair comparison method is often recommended in subjective experiments because of the reliability of the obtained results. However, a drawback of this method is that the number of comparisons grows quadratically with the number of stimuli, which limits its usability for a large number of stimuli. Several design methods that aim to reduce the number of comparisons have been proposed in the literature. However, their performance in the context of 3DTV should be evaluated carefully, because the results obtained from a paired comparison experiment in 3DTV may be influenced by two important factors. The first is observation error caused by lapses in the observer's attentiveness, in particular inverting the vote. The second is the dependence on the context in which the evaluation takes place. In this study, three design methods, namely the Full Paired Comparison method (FPC), the Square Design method (SD) and the Adaptive Square Design method (ASD), were evaluated in a subjective visual discomfort experiment in 3DTV. The results from the FPC method were considered as the ground truth. Compared with the ground truth, the ASD method provided the most accurate results for a given number of trials. It also showed the highest robustness against observation errors and interdependence of comparisons. Due to the efficiency of the ASD method, paired comparison experiments become feasible with a reasonably large number of stimuli for measuring 3DTV visual discomfort.
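The saving offered by a reduced design can be sketched as follows. The abstract does not describe the square layout, so this sketch assumes the commonly described form: the stimuli are placed in a square matrix and only pairs in the same row or the same column are compared. The adaptive rearrangement used by ASD is not specified here and is not modelled; function names are hypothetical.

```python
from itertools import combinations


def full_design_pairs(n):
    """All unordered pairs among n stimuli (full paired comparison)."""
    return list(combinations(range(n), 2))


def square_design_pairs(side):
    """Pairs compared in an assumed square design with side*side stimuli.

    Stimuli are laid out row by row in a side x side grid; only pairs that
    share a row or a column are kept.
    """
    grid = [[r * side + c for c in range(side)] for r in range(side)]
    pairs = set()
    for r in range(side):
        pairs.update(combinations(grid[r], 2))        # same-row pairs
    for c in range(side):
        column = [grid[r][c] for r in range(side)]
        pairs.update(combinations(column, 2))         # same-column pairs
    return sorted(pairs)


if __name__ == "__main__":
    side = 4                                          # 16 hypothetical stimuli
    print("full design:", len(full_design_pairs(side * side)))
    print("square design:", len(square_design_pairs(side)))
```

For n stimuli the full design needs n(n-1)/2 pairs per observer, while the assumed square layout needs n(sqrt(n)-1), which is why the reduced designs scale to larger stimulus sets.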
KEYWORDS: Video, Molybdenum, Statistical analysis, Camera shutters, 3D imaging standards, Display technology, Video coding, Glasses, Visualization, Standards development
Subjective assessment of Quality of Experience in stereoscopic 3D requires new guidelines for the environmental setup, as existing standards such as ITU-R BT.500 may no longer be appropriate. A first step is to perform cross-lab experiments in different viewing conditions on the same video sequences. Three international labs performed Absolute Category Rating studies on a freely available video database containing degradations that mainly affect video quality. Different conditions were used in the labs: passive polarized displays, active shutter displays, differences in viewing distance, the number of parallel viewers, and the voting device. Implicit variations were introduced by the three different languages used in Sweden, South Korea, and France. Although the obtained Mean Opinion Scores (MOS) are comparable, slight differences occur as a function of the video degradations and the viewing distance. An analysis of the statistical differences obtained between the MOS of the video sequences revealed that obtaining an equivalent number of differences may require more observers in some viewing conditions. It was also seen that aligning the meaning of the attributes used in Absolute Category Rating across languages may be beneficial. A statistical analysis showed the influence of the viewing distance on votes and MOS results.
KEYWORDS: Video, Video coding, Image quality, Video processing, Visualization, Information visualization, 3D modeling, Associative arrays, 3D video compression
3D video quality is of the highest importance for the adoption of a new technology from a user’s point of view. In this paper we evaluate the impact of coding artefacts on stereoscopic 3D video quality using several existing full-reference 2D objective metrics. We analyze the performance of the objective metrics by comparing them with the results of a subjective experiment. The results show that the pixel-based Visual Information Fidelity metric fits the subjective data best. The 2D quality of the stereoscopic video seems to have a dominant impact on stereoscopic videos impaired by coding artefacts.
KEYWORDS: Visualization, Image quality, 3D modeling, 3D image processing, Image compression, Visual process modeling, 3D displays, Video, Distortion, Molybdenum
Modern stereoscopic 3DTV brings new QoE (quality of experience) to viewers, which not only enhances the 3D
sensation due to the added binocular depth, but may also induce new problems such as visual discomfort. Subjective
quality assessment is the conventional method to assess the QoE. However, the conventional perceived image quality
concept is not enough to reveal the advantages and the drawbacks of stereoscopic images in 3DTV. Higher-level
concepts such as visual experience were proposed to represent the overall visual QoE for stereoscopic images. In this
paper, both the higher-level quality indicator, i.e. visual experience, and the basic-level quality
indicators, including image quality, depth quantity, and visual comfort, are defined. We aim to explore 3D QoE by
constructing the visual experience as a weighted sum of image quality, depth quantity and visual comfort. Two experiments
in which depth quantity and image quality are varied respectively are designed to validate this model. In the first
experiment, the stimuli consist of three natural scenes and for each scene, there are four levels of perceived depth
variation in terms of depth of focus: 0, 0.1, 0.2 and 0.3 diopters. In the second experiment, five levels of JPEG 2000
compression ratio, 0, 50, 100, 175 and 250 are used to represent the image quality variation. Subjective quality
assessments based on the SAMVIQ method are used in both experiments to evaluate the subjects' opinions on the basic-level
quality indicators as well as the higher-level indicator. Statistical analysis of the results reveals how the perceived depth and
image quality variation affect different perceptual scales as well as the relationship between different quality aspects.
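The model of visual experience as a combination of image quality, depth quantity and visual comfort can be sketched as a least-squares fit. The abstract does not say how the weights are estimated or whether an intercept is used, so the no-intercept ordinary least squares below, the score values, and the weights used to generate the example targets are all hypothetical.

```python
def weighted_sum(weights, scores):
    """Visual experience modelled as a weighted sum of basic-level scores."""
    return sum(w * s for w, s in zip(weights, scores))


def fit_weights(rows, targets):
    """Least-squares weights (no intercept) via the normal equations.

    Solves (A^T A) w = A^T y with Gauss-Jordan elimination and partial pivoting.
    Assumes the columns of the score matrix are linearly independent.
    """
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    aug = [ata[i] + [aty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Hypothetical mean scores on a 0-100 scale:
# (image quality, depth quantity, visual comfort) per stimulus.
SCORES = [(80, 40, 70), (60, 55, 65), (30, 60, 50), (90, 20, 85), (50, 50, 40)]
TRUE_WEIGHTS = (0.5, 0.3, 0.2)   # made up, used only to generate the targets
EXPERIENCE = [weighted_sum(TRUE_WEIGHTS, s) for s in SCORES]

if __name__ == "__main__":
    print("estimated weights:", fit_weights(SCORES, EXPERIENCE))
```

Because the example targets are generated without noise, the fit recovers the generating weights; with real subjective scores the residuals would indicate how well a purely additive model describes visual experience.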
Crosstalk is one of the main display-related perceptual factors degrading image quality and causing visual discomfort on 3D displays. It causes visual artifacts such as ghosting, blurring, and lack of color fidelity, which are considerably annoying and can make it difficult to fuse stereoscopic images. On stereoscopic LCDs with shutter glasses, crosstalk is mainly due to dynamic temporal aspects: imprecise target luminance (highly dependent on the combination of left-view and right-view pixel color values in disparity regions) and synchronization issues between the shutter glasses and the LCD. These factors largely influence the reproducibility of crosstalk measurements across laboratories and need to be evaluated in several different locations involving similar and differing conditions.
In this paper we propose a fast and reproducible measurement procedure for crosstalk based on high-frequency temporal measurements of both display and shutter responses. It allows crosstalk to be fully characterized for any right/left color combination and at any spatial position on the screen. Such a reliable objective crosstalk measurement method at several spatial positions is considered a mandatory prerequisite for evaluating the perceptual influence of crosstalk in further subjective studies.
KEYWORDS: Video, Computer programming, Molybdenum, Scalable video coding, Video coding, Video processing, Quantization, Visualization, Light sources and illumination, Databases
In video coding, it is commonly accepted that the encoding parameters such as the quantization step-size have
an influence on the perceived quality. It is also sometimes accepted that using given encoding parameters, the perceived quality does not change significantly according to the encoded source content. In this paper, we present the outcomes of two video subjective quality assessment experiments in the context of Scalable Video
Coding. We encoded a large set of video sequences under a group of constant quality scenarios based on two spatially scalable layers. The first experiment explores the relation between a wide range of quantization parameters for each layer and the perceived quality, while the second experiment uses a subset of the encoding
scenarios on a large number of video sequences. The two experiments are aligned on a common scale using a set of shared processed video sequences, resulting in a database containing the subjective scores for 60 different sources combined with 20 SVC scenarios. We propose a detailed analysis of the experimental results of the two
experiments, giving clear insight into the relation between the combination of encoding parameters of the scalable
layers and the perceived quality, as well as shedding light on the differences in quality depending on the
encoded source content. To analyse these differences, we propose a classification of the sources
with regard to their relative behaviour when compared to the average of other source contents. We use this
classification to identify potential factors to explain the differences between source contents.
In this paper, a subjective study is presented which aims to measure the minimum perceivable depth difference on an
autostereoscopic display in order to provide an indication for visual fatigue. The developed experimental setup was used
to compare the subject's performance before and after 3D excitation on an autostereoscopic display. By comparing the
results to a verification session with 2D excitation, the effect of 3D visual fatigue can be isolated. It was seen that it is
possible to reach the threshold of acuity for stereo disparity on that autostereoscopic display. It was also found that the
measured depth acuity is slightly higher after 3D viewing than after 2D viewing.
KEYWORDS: Video, Video processing, Molybdenum, Video coding, Video compression, Visualization, Computer programming, Spatial resolution, 3D displays, Temporal resolution
Broadcasts of high definition (HD) stereo-based 3D (S3D) TV are planned, or have already begun, in Europe, the US,
and Japan. Specific data processing operations such as compression and temporal and spatial resampling are commonly
used to save network bandwidth when IPTV is the distribution form, as they result in more efficient recording
and transmission of 3DTV signals. At the same time, however, they inevitably degrade the quality of the processed
video. This paper investigates observers' quality judgments of state-of-the-art video coding schemes (simulcast
H.264/AVC or H.264/MVC), with or without added temporal and spatial resolution reduction of S3D videos, by
subjective experiments using the Absolute Category Rating (ACR) method. The results showed that a certain
spatial resolution reduction combined with high-quality video compression was the most bandwidth-efficient way
of processing video data when the required video quality is to be judged as "good" quality. As the subjective experiment
was performed in two different laboratories in two different countries in parallel, a detailed analysis of the interlab
differences was performed.
Human binocular depth perception, the most important element brought by 3DTV, has been shown to be closely connected
not only to content acquisition (camera focal length, camera baseline, etc.) but also to the viewing environment
(viewing distance, screen size, etc.). Conventional 3D stereography rules in the literature usually consider general
viewing conditions and basic human factors to guide content acquisition, such as taking the human interpupillary distance
as the maximum disparity. Many new elements or problems of stereoscopic viewing have not been considered or precisely
defined, so an advanced shooting rule is needed to guarantee the overall quality of stereoscopic video. In this paper, we
propose a new stereoscopic video shooting rule that considers the two most important issues in 3DTV: stereoscopic distortion
and the comfortable viewing zone. Firstly, a mathematical model mapping the camera space to the visualization space is
established in order to estimate the stereoscopic depth distortion geometrically. Depth and shape distortion factors are
defined and used to describe the stereoscopic distortion. Secondly, the comfortable viewing zone (or depth of focus) is
considered to reduce the problem of visual discomfort and visual fatigue. The new shooting rule optimizes the
camera parameters (focal length, camera baseline, etc.) in order to control depth and shape distortion and also to
keep the perceived scene within the comfortable viewing zone as far as possible. However, in some scenarios, the
two conditions above cannot be fulfilled simultaneously and sometimes even contradict each other, so a priority must
be decided. In this paper, experimental stereoscopic synthetic content generated with various sets of camera parameters
and various sets of scenes representing different depth ranges is presented. The proposed new shooting
rule is justified by subjective video assessment based on 3D concepts (depth rendering, visual comfort and visual experience). The
results of this study provide a new method for proposing camera parameters based on the management of new criteria
(shape distortion and depth of focus) in order to produce optimized stereoscopic images and videos.
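The viewing-side geometry behind these two criteria can be sketched with textbook similar-triangle relations. This is not the paper's camera-to-visualization model, which the abstract does not give: the 65 mm eye separation, the 0.2 diopter depth-of-focus limit, and the function names are assumptions for illustration.

```python
def perceived_distance(parallax_m, viewing_distance_m, eye_separation_m=0.065):
    """Distance at which a point is perceived for a given screen parallax.

    Similar triangles give Z = V * e / (e - p), where p is the on-screen
    separation of the right-eye and left-eye image points (positive means
    behind the screen, negative means in front of it).
    """
    if parallax_m >= eye_separation_m:
        # Parallax at or beyond the eye separation would require parallel
        # or diverging eyes.
        raise ValueError("parallax must be smaller than the eye separation")
    return viewing_distance_m * eye_separation_m / (eye_separation_m - parallax_m)


def within_comfort_zone(perceived_m, viewing_distance_m, dof_diopters=0.2):
    """True if the vergence distance stays within an assumed depth-of-focus
    limit (in diopters) around the accommodation distance, i.e. the screen."""
    return abs(1.0 / perceived_m - 1.0 / viewing_distance_m) <= dof_diopters


if __name__ == "__main__":
    screen = 3.0                                   # hypothetical viewing distance in metres
    for parallax in (-0.065, -0.01, 0.0, 0.01):    # hypothetical parallaxes in metres
        z = perceived_distance(parallax, screen)
        print(parallax, round(z, 3), within_comfort_zone(z, screen))
```

A shooting rule of the kind described would choose the camera parameters so that the resulting parallax range maps into the comfortable zone while limiting depth and shape distortion; the sketch only covers the comfort check.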
The influence of a monocular depth cue, blur, on the apparent depth of stereoscopic scenes will be studied in this paper.
When 3D images are shown on a planar stereoscopic display, binocular disparity becomes a pre-eminent depth cue. But
it induces simultaneously the conflict between accommodation and vergence, which is often considered as a main reason
for visual discomfort. If we limit this visual discomfort by decreasing the disparity, the apparent depth also decreases.
We propose to decrease the (binocular) disparity of 3D presentations, and to reinforce (monocular) cues to compensate
for the loss of perceived depth and keep the apparent depth unaltered. We conducted a subjective experiment using a two-alternative
forced choice task. Observers were required to identify the larger perceived depth in a pair of 3D images
with/without blur. By fitting the result to a psychometric function, we obtained points of subjective equality in terms of
disparity. We found that when blur is added to the background of the image, the viewer can perceive larger depth
compared to the images without any blur in the background. The increase of perceived depth can be considered as a
function of the relative distance between the foreground and background, while it is insensitive to the distance between
the viewer and the depth plane at which the blur is added.
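How a point of subjective equality (PSE) is read off a psychometric function can be sketched as follows. The abstract does not name the function or the fitting procedure, so this example assumes a logistic function and uses a simple logit-transform linear regression in place of a maximum-likelihood fit; the disparity levels and response proportions are hypothetical.

```python
import math


def logistic(x, pse, spread):
    """Assumed psychometric function: P(comparison judged deeper) at level x."""
    return 1.0 / (1.0 + math.exp(-(x - pse) / spread))


def fit_pse(levels, proportions):
    """Estimate (pse, spread) by linear regression on logit-transformed data.

    logit(p) = (x - pse) / spread is linear in x, so the PSE is the level at
    which the fitted line crosses zero (p = 0.5). Proportions must lie
    strictly between 0 and 1.
    """
    logits = [math.log(p / (1.0 - p)) for p in proportions]
    n = len(levels)
    mean_x = sum(levels) / n
    mean_y = sum(logits) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels, logits))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope, 1.0 / slope


# Hypothetical comparison disparities (arbitrary units) and noise-free
# proportions of "comparison looks deeper" responses.
LEVELS = [5.0, 10.0, 15.0, 20.0, 25.0]
PROPORTIONS = [logistic(x, 14.0, 3.0) for x in LEVELS]

if __name__ == "__main__":
    pse, spread = fit_pse(LEVELS, PROPORTIONS)
    print("PSE:", pse, "spread:", spread)
```

In an experiment of this kind, a shift of the PSE between the blurred and unblurred conditions would express the depth added by blur in units of disparity.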
Scalable Video Coding (SVC) provides a way to encapsulate several video layers with increasing quality and resolution in a
single bitstream. Thus it is particularly adapted to address heterogeneous networks and a wide variety of decoding devices.
In this paper, we evaluate the interest of SVC in a different context, which is error concealment after transmission on
networks subject to packet loss. The encoded scalable video streams contain two layers with different spatial and temporal
resolutions designed for mobile video communications with medium size and average to low bitrates. The main idea is
to use the base layer to conceal errors in the higher layers if they are corrupted or lost. The base layer is first upscaled
either spatially or temporally to reach the same resolution as the layer to conceal. Two error-concealment techniques
using the base layer are then proposed for the MPEG-4 SVC standard, involving frame-level concealment and pixel-level
concealment. These techniques are compared to the upscaled base layer as well as to a classical single-layer MPEG-4
AVC/H.264 error-concealment technique. The comparison is carried out through a subjective experiment, in order to
evaluate the Quality-of-Experience of the proposed techniques. We study several scenarios involving various bitrates
and resolutions for the base layer of the SVC streams. The results show that SVC-based error concealment can provide
significantly higher visual quality than single-layer-based techniques. Moreover, we demonstrate that the resolution and
bitrate of the base layer have a strong impact on the perceived quality of the concealment.
Viewing 3D content on an autostereoscopic display is an exciting experience. This is partly due to the fact that the 3D
effect is seen without glasses. Nevertheless, it is an unnatural condition for the eyes as the depth effect is created
by the disparity of the left and the right view on a flat screen instead of having a real object at the corresponding
location. Thus, it may be more tiring to watch 3D than 2D. This question is investigated in this contribution by
a subjective experiment. A search task experiment is conducted and the behavior of the participants is recorded
with an eyetracker. Several indicators for both low-level perception and the task performance itself are
evaluated. In addition, two optometric tests are performed. A verification session with conventional 2D viewing
is included. The results are discussed in detail and it can be concluded that the 3D viewing does not have a
negative impact on the task performance used in the experiment.
In this paper we address the problem of crosstalk reduction for autostereoscopic displays. Crosstalk refers to the
perception of one or more unwanted views in addition to the desired one. Specifically, the proposed approach consists of
three different stages: a crosstalk measurement stage, where the crosstalk is modeled, a filter design stage, based on the
results obtained out of the measurements, to mitigate the crosstalk effect, and a validation test carried out by means of
subjective measurements performed in a controlled environment as recommended in ITU-R BT.500-11. Our analysis,
synthesis, and subjective experiments are performed on the Alioscopy® display, which is a lenticular multiview display.