We propose an audio-visual approach to video genre classification using content descriptors that exploit audio, color, temporal, and contour information. Audio information is extracted at the block level, which has the advantage of capturing local temporal information. At the temporal structure level, we consider action content in relation to human perception. Color perception is quantified using statistics of color distribution, elementary hues, color properties, and relationships between colors. Further, we compute statistics of contour geometry and relationships. The main contribution of our work lies in harnessing the descriptive power of the combination of these descriptors for genre classification. Validation was carried out on more than 91 hours of video footage spanning 7 common video genres, yielding average precision and recall ratios of 87% to 100% and 77% to 100%, respectively, and an overall average correct classification of up to 97%. In addition, an experimental comparison conducted as part of the MediaEval 2011 benchmarking campaign demonstrated the effectiveness of the proposed audio-visual descriptors over other existing approaches. Finally, we discuss a 3-D video browsing platform that displays movies using feature-based coordinates and thus groups them by genre.
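As a rough illustration of how such descriptors could be combined, the sketch below concatenates simple stand-ins for the block-level audio, temporal action, color, and contour statistics and feeds them to an SVM. The concrete feature formulas, the synthetic data, and the SVM classifier are assumptions made here for illustration; they are not the descriptors or classifier reported above.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def extract_descriptors(video_frames, audio_blocks):
    """Placeholder extractors: crude summary statistics standing in for the
    block-level audio, temporal action, color, and contour descriptors."""
    audio = audio_blocks.mean(axis=0)                                   # block-level audio stats
    action = np.array([np.abs(np.diff(video_frames.mean(axis=(1, 2, 3)))).mean()])
    color = np.histogram(video_frames, bins=16, range=(0, 255))[0] / video_frames.size
    contour = np.array([np.abs(np.diff(video_frames, axis=2)).mean()])  # crude edge proxy
    return np.concatenate([audio, action, color, contour])

# Synthetic stand-ins: 20 clips of 30 RGB frames (64x64) plus 10 audio blocks of 12 features each.
rng = np.random.default_rng(0)
X = np.stack([
    extract_descriptors(rng.integers(0, 256, (30, 64, 64, 3)).astype(float),
                        rng.normal(size=(10, 12)))
    for _ in range(20)
])
y = rng.integers(0, 7, size=20)  # 7 genre labels

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)
print(clf.predict(X[:3]))
```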
KEYWORDS: RGB color model, Fuzzy logic, Distance measurement, Reflectivity, Colorimetry, Visualization, Visual process modeling, Quantization, Modeling, Human vision and color perception
A color similarity test was conducted on the 24 color patches of a Gretag Macbeth color checker. Color similarities were measured either by distances between standard colorimetric representations (such as RGB, Lab, or spectral reflectance curves) or by human observer judgments. In each case, the dissimilarity matrix was processed by a classical metric multidimensional scaling algorithm in order to produce a visually interpretable two-dimensional plot of color dissimilarity. The analysis of the plots yields some interesting conclusions. First, the plots produced by the Lab, RGB, and spectral representations exhibit clear variation axes corresponding to luminance and to the basic chromatic differences (red-green, blue-yellow). This behavior (trivial for the Lab representation) suggests that color similarity measurement by chromatic differences is implicitly embedded in the RGB and spectral representations. The color dissimilarity plots associated with the human judgments (for any individual, as well as for an “average” observer) exhibit a different organization, which mixes hue, saturation, and luminance (HSV). According to these plots, the human similarity judgment is not entirely HSV-based. We prove that it is possible to obtain the same color dissimilarity plots if a fuzzy color model is assumed. The fuzzy color model provides similarity coefficients (similarity degrees) between pairs of colors, based on their inter-distance, according to an imposed “color confusion” control parameter that appears to be relevant to human vision.
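A minimal sketch of this kind of analysis is given below, assuming a Gaussian form for the fuzzy similarity degree governed by a "color confusion" parameter sigma, Euclidean RGB inter-distances, and synthetic patches in place of the Macbeth chart; the classical (Torgerson) metric MDS step is standard, but the exact fuzzy membership used in the paper is not reproduced here.

```python
import numpy as np

def fuzzy_similarity(d, sigma):
    """Similarity degree in [0, 1] from inter-color distance d; the Gaussian
    form and the 'color confusion' parameter sigma are assumed choices."""
    return np.exp(-(d / sigma) ** 2)

def classical_mds(D, k=2):
    """Classical (Torgerson) metric MDS of a dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J          # double-centered Gram matrix
    w, V = np.linalg.eigh(B)             # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# Toy stand-in for the 24 color patches: random RGB triplets.
rng = np.random.default_rng(1)
colors = rng.uniform(0, 255, size=(24, 3))
D_rgb = np.linalg.norm(colors[:, None, :] - colors[None, :, :], axis=-1)

sigma = 60.0                                     # assumed confusion parameter
D_fuzzy = 1.0 - fuzzy_similarity(D_rgb, sigma)   # dissimilarity = 1 - similarity

coords = classical_mds(D_fuzzy)                  # 24 x 2 plot coordinates
print(coords[:3])
```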
This paper describes a preliminary study aimed at improving the quality of soft blue-veined cheeses by means of magnetic resonance image (MRI) analysis. MRI measurements were performed on thirty-two samples from two different processing conditions and at three different stages, from day 3 after production to day 37. A segmentation algorithm based on a self-organizing map (SOM) was used to segment the images into six classes, after which the cavities were extracted. A principal component analysis was then computed on variables describing the surface distribution of the cavities. The results revealed differences between the two types of cheese, particularly at day 3 and day 37, which confirms the interest of using MRI to analyze such products. Further investigations are planned to analyze other characteristics of the cheeses and to evaluate other segmentation methods.
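The pipeline described above (SOM segmentation into six classes, cavity extraction, PCA on the cavity surface distribution) could be sketched roughly as below; the tiny 1-D SOM over pixel intensities, the "darkest class = cavity" rule, the area-histogram descriptor, and the synthetic slices are illustrative assumptions, not the study's actual processing.

```python
import numpy as np
from scipy import ndimage
from sklearn.decomposition import PCA

def train_som(X, n_nodes=6, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Tiny 1-D self-organizing map over pixel intensities (assumed stand-in
    for the SOM used in the study)."""
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_nodes, replace=False)].astype(float)
    for t in range(n_iter):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(np.abs(W - x))                    # best-matching unit
        frac = 1 - t / n_iter                             # decaying rate / radius
        h = np.exp(-((np.arange(n_nodes) - bmu) ** 2) / (2 * (sigma0 * frac + 1e-3) ** 2))
        W += lr0 * frac * h * (x - W)
    return np.sort(W)

def cavity_descriptor(image, prototypes, bins):
    """Segment into six intensity classes, treat the darkest class as cavities,
    and summarize the cavity surface (area) distribution as a histogram."""
    classes = np.argmin(np.abs(image[..., None] - prototypes), axis=-1)
    labels, n = ndimage.label(classes == 0)               # connected cavity regions
    areas = np.bincount(labels.ravel())[1:]               # skip the background label
    return np.histogram(areas, bins=bins)[0]

rng = np.random.default_rng(2)
images = [rng.random((64, 64)) for _ in range(8)]          # synthetic MRI-like slices
prototypes = train_som(np.concatenate([im.ravel() for im in images]))
bins = np.array([1, 5, 20, 80, 320, 1e9])                  # assumed area-histogram edges
F = np.stack([cavity_descriptor(im, prototypes, bins) for im in images])
scores = PCA(n_components=2).fit_transform(F.astype(float))
print(scores)
```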
Histogram equalization (HE) is one of the simplest and most effective techniques for enhancing gray-level images. For color images, HE becomes a more difficult task, due to the vectorial nature of the data. We propose a new method for color image enhancement that uses two hierarchical levels of HE: global and local. In order to preserve the hue, equalization is applied only to intensities. For each pixel (called the 'seed' while it is processed), a variable-sized, variable-shaped neighborhood is determined that contains pixels 'similar' to the seed. The histogram of this region is then stretched over a range computed from the statistical parameters of the region (mean, variance) and from the global HE function of intensities, and only the seed is given a new intensity value. We applied the proposed color HE method to various images and found the results to be subjectively 'pleasant to the human eye,' with emphasized details, preserved colors, and an intensity histogram close to the ideal uniform one. The results compared favorably with those of three other methods (histogram explosion, histogram decimation, and three-dimensional histogram equalization) in terms of subjective visual quality.
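A minimal sketch of the global stage only is shown below, assuming that intensity is the RGB mean and that each pixel is rescaled by the equalized-to-original intensity ratio so that hue is preserved; the variable-sized, variable-shaped per-seed neighborhood stage of the method is not reproduced here.

```python
import numpy as np

def equalize_intensity(rgb):
    """Global stage only: equalize the intensity channel I = mean(R, G, B)
    and rescale each pixel's RGB by newI / I so hue is preserved.
    (Assumed simplification; the local per-seed stretch is omitted.)"""
    rgb = rgb.astype(float)
    I = rgb.mean(axis=-1)
    hist, _ = np.histogram(I, bins=256, range=(0, 255))
    cdf = hist.cumsum() / hist.sum()
    new_I = 255.0 * cdf[np.clip(I, 0, 255).astype(int)]   # classic HE mapping
    scale = new_I / np.maximum(I, 1e-6)
    return np.clip(rgb * scale[..., None], 0, 255).astype(np.uint8)

rng = np.random.default_rng(3)
img = rng.integers(40, 120, size=(64, 64, 3)).astype(np.uint8)  # low-contrast toy image
out = equalize_intensity(img)
print(img.mean(), out.mean())
```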