Multimedia content description, organization and retrieval applications currently tend to rely on unimodal databases, which hold a single type of content such as text, voice or images. Bimodal databases, in contrast, semantically associate two different types of content, such as audio-video or image-text. Generating a bimodal audio-video database implies connecting the two media through the semantic relation that associates the actions in both types of information. This paper describes in detail the characteristics and methodology used to create a bimodal database of violent content; the semantic relationship is established through the proposed concepts that describe the audiovisual information. Using bimodal databases in audiovisual content processing applications increases semantic performance if and only if these applications process both types of content. The database contains 580 annotated audiovisual segments, with a duration of 28 minutes, divided into 41 classes. Bimodal databases are a tool for building semantic web applications.
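As an illustration only, an annotated audiovisual segment in a bimodal database could be represented as a record that ties an audio concept and a video concept to the same time span. The sketch below is a minimal one: the field names, the concept labels, the file names and the choice to treat a class as the pair of concepts are all hypothetical and are not taken from the database described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BimodalSegment:
    """One annotated audiovisual segment (hypothetical schema)."""
    source: str          # hypothetical video file name
    start_s: float       # segment start, in seconds
    end_s: float         # segment end, in seconds
    audio_concept: str   # concept describing the audio track
    video_concept: str   # concept describing the visual track

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s

    @property
    def bimodal_class(self) -> tuple:
        # Assumption: a class is the pair (audio concept, video concept).
        return (self.audio_concept, self.video_concept)


def summarize(segments):
    """Return segment count, total duration in seconds and number of classes."""
    total = sum(s.duration_s for s in segments)
    classes = {s.bimodal_class for s in segments}
    return {"segments": len(segments), "total_s": total, "classes": len(classes)}


# Illustrative entries only; they are not real annotations.
SEGMENTS = [
    BimodalSegment("clip_a.mp4", 0.0, 2.5, "gunshot", "shooting"),
    BimodalSegment("clip_a.mp4", 10.0, 13.0, "scream", "fight"),
    BimodalSegment("clip_b.mp4", 4.0, 6.0, "gunshot", "shooting"),
]

print(summarize(SEGMENTS))
```

Keeping both concepts on the same record is one way to make the semantic link between the two media explicit; an application that reads only one of the two fields would not benefit from it.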
The automatic identification and classification of musical genres, based on the sound similarities that form musical textures, is a very active research area. In this context, musical genre recognition systems have been created that combine time-frequency feature extraction methods with classification methods. The selection of these methods is important for the performance of the recognition system. In this article, we propose Mel-Frequency Cepstral Coefficients (MFCC) as the feature extractor and Support Vector Machines (SVM) as the classifier for our system. The MFCC parameters established through our time-frequency analysis represent the range of Mexican musical genres considered in this article. For a musical genre classification system to be precise, the descriptors must represent the correct spectrum of each genre; to achieve this, the MFCC must be parametrized correctly, as we present in this article. The developed system yields satisfactory detection results: the lowest identification rate for a musical genre was 66.67% and the highest was 100%.
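MFCC extraction depends on several parameters, among them the number of mel filters and the analyzed frequency range. As a minimal sketch of that part of the parametrization, the code below places filter center frequencies evenly on the mel scale using the widely used conversion m = 2595 log10(1 + f/700). The values of 26 filters and 0 to 8000 Hz are illustrative assumptions, not the parameters established in this article, and the function names are hypothetical. The remaining MFCC steps (framing, spectrum, logarithm, cosine transform) and the SVM classifier are not shown.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to mels (common 2595/700 formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> list:
    """Center frequencies (Hz) of n_filters filters spaced evenly in mels."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_filters + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_filters)]


# Assumed, illustrative parameters (not taken from the article).
centers = mel_center_frequencies(n_filters=26, f_min=0.0, f_max=8000.0)
print([round(c) for c in centers[:5]])
```

Because the spacing is uniform in mels, the filters are dense at low frequencies and sparse at high frequencies, so changing the filter count or the frequency range changes which parts of the spectrum the descriptors resolve.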
Current search engines are based on search methods that combine words (text-based search), which
has been efficient until now. However, growing demand means that Internet content becomes more diverse with each
passing day. Text-based searches are becoming limiting, as most of the information on the Internet is found in different
types of content known as multimedia content (images, audio files, video files).
What needs to be improved in current search engines is the searchable content and the precision, as well as the accurate display
of the results the user expects. Any search can be made more precise by using more text parameters, but this does not
improve the content or the speed of the search itself. One solution is to improve them by characterizing the
content of multimedia files for search. This article presents an analysis of new-generation multimedia search engines,
focusing on the needs that arise from new technologies.
Multimedia content has become a central part of the flow of information in our daily life. This reflects the need for
multimedia search engines, as well as for knowing the actual tasks they must fulfill. This analysis shows
that few search engines can perform content-based searches. Research on new-generation multimedia search engines
is a multidisciplinary area in constant growth, generating tools that satisfy the different needs of
new-generation systems.
In the computer world, the consumption and generation of multimedia content are growing constantly due to the popularization of mobile devices and new communication technologies. Retrieving information from multimedia content to describe Mexican buildings is a challenging problem. Our objective is to determine patterns related to three building eras (Pre-Hispanic, colonial and modern). For this purpose, existing recognition systems need to process a large number of videos and images. Machine learning systems train their recognition capability on a semantically annotated database. We built the database taking into account high-level feature concepts, user knowledge and experience. The annotations help correlate context and content in order to understand the data in multimedia files. Without a method, the user would have to remember everything and record this data manually. This article presents a methodology for quick image annotation using a graphical interface and intuitive controls, emphasizing the two most important aspects: the time consumed by the annotation task and the quality of the selected images. We classify images only by era and quality. Finally, we obtain a dataset of Mexican buildings that preserves contextual information through semantic annotations, for training and testing building recognition systems. Research on low-level content descriptors is another possible use for this dataset.
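To make the two annotation dimensions concrete, the following minimal sketch stores one era label and one quality score per image, then filters and counts the annotations. The three eras come from the text; the record layout, the 1 to 5 quality scale, the threshold and the file names are assumptions made for illustration and do not describe the actual annotation tool.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Era(Enum):
    PRE_HISPANIC = "pre-hispanic"
    COLONIAL = "colonial"
    MODERN = "modern"


@dataclass(frozen=True)
class ImageAnnotation:
    image_path: str   # hypothetical file name
    era: Era          # building era chosen by the annotator
    quality: int      # assumed scale: 1 (poor) to 5 (excellent)


def usable(annotations, min_quality=3):
    """Keep annotations whose quality meets an assumed threshold."""
    return [a for a in annotations if a.quality >= min_quality]


def count_by_era(annotations):
    """Number of annotated images per era, including eras with zero images."""
    counts = Counter(a.era for a in annotations)
    return {era.value: counts.get(era, 0) for era in Era}


# Illustrative entries only; they are not real annotations.
ANNOTATIONS = [
    ImageAnnotation("img_001.jpg", Era.PRE_HISPANIC, 5),
    ImageAnnotation("img_002.jpg", Era.COLONIAL, 2),
    ImageAnnotation("img_003.jpg", Era.COLONIAL, 4),
    ImageAnnotation("img_004.jpg", Era.MODERN, 3),
]

print(count_by_era(usable(ANNOTATIONS)))
```

Counting per era after filtering is a quick way to check whether a quality threshold leaves the three eras reasonably balanced before the set is split for training and testing.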