29 January 2007 Video to the rescue of audio: shot boundary assisted speaker change detection
Proceedings Volume 6506, Multimedia Content Access: Algorithms and Systems; 650609 (2007) https://doi.org/10.1117/12.703114
Event: Electronic Imaging 2007, San Jose, CA, United States
Abstract
Speaker change detection (SCD) is a preliminary step for many audio applications, such as speaker segmentation and recognition, so its robustness is crucial to good performance in the later steps. Misses (false negatives) are especially harmful to the results. For some applications, domain-specific characteristics can be used to improve the reliability of SCD. In broadcast news and discussions, the co-occurrence of shot boundaries and speaker change points provides a robust cue for speaker changes. This paper presents two multimodal approaches that use the results of a shot boundary detection (SBD) step to improve the robustness of SCD. Both approaches clearly outperform the audio-only approach, but they are applicable only to TV broadcast news and plenary discussions.
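The abstract does not describe how the two multimodal approaches combine the audio and video cues. Purely as an illustration of the general idea, the sketch below shows one hypothetical fusion rule: an audio-based change candidate is accepted with a relaxed score threshold when it lies close to a detected shot boundary. The function names, the thresholds, and the tolerance window are all assumptions made for this example and are not taken from the paper.

```python
from bisect import bisect_left


def nearest_distance(sorted_times, t):
    """Return the distance from t to the closest value in sorted_times.

    Returns infinity when sorted_times is empty.
    """
    if not sorted_times:
        return float("inf")
    i = bisect_left(sorted_times, t)
    # Only the neighbours on either side of the insertion point can be closest.
    neighbours = sorted_times[max(i - 1, 0):i + 1]
    return min(abs(t - s) for s in neighbours)


def fuse_scd_with_sbd(candidates, shot_boundaries, threshold=1.0,
                      relaxed_threshold=0.5, tolerance=0.5):
    """Hypothetical fusion rule (an assumption, not the paper's method).

    candidates: list of (time_in_seconds, audio_change_score) pairs.
    shot_boundaries: list of shot boundary times in seconds.
    A candidate within `tolerance` seconds of a shot boundary only needs to
    reach `relaxed_threshold`; all other candidates must reach `threshold`.
    Relaxing the threshold near shot boundaries is meant to reduce misses.
    """
    shots = sorted(shot_boundaries)
    accepted = []
    for time, score in candidates:
        near_shot = nearest_distance(shots, time) <= tolerance
        limit = relaxed_threshold if near_shot else threshold
        if score >= limit:
            accepted.append(time)
    return accepted


# Illustrative, made-up values: (time, audio score) pairs and shot times.
example_candidates = [(3.0, 0.7), (10.0, 0.7), (15.2, 1.2), (20.0, 0.3)]
example_shots = [2.8, 20.1]
fused = fuse_scd_with_sbd(example_candidates, example_shots)
audio_only = fuse_scd_with_sbd(example_candidates, [])
```

With these made-up values, the weak candidate at 3.0 s is recovered because a shot boundary lies 0.2 s away, whereas the audio-only call (no shot boundaries) misses it. A rule of this kind only makes sense in domains where speaker changes and shot changes tend to coincide, which matches the restriction to broadcast news and plenary discussions stated in the abstract.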
© (2007) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Amjad Samour, Mustafa Karaman, Lutz Goldmann, and Thomas Sikora "Video to the rescue of audio: shot boundary assisted speaker change detection", Proc. SPIE 6506, Multimedia Content Access: Algorithms and Systems, 650609 (29 January 2007); https://doi.org/10.1117/12.703114
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
Video
Single crystal X-ray diffraction
Image segmentation
Multimedia
Reliability
Speaker recognition
Acoustics