A video keyframe retrieval algorithm based on CLIP tech
21 July 2023
Zuoxing Zhang, Xiaowei Su, Hongyi Hou, Kesheng Qi, Baoguo Lu
Proceedings Volume 12717, 3rd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2023); 1271723 (2023) https://doi.org/10.1117/12.2684607
Event: 3rd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2023), 2023, Wuhan, China
Abstract
Traditional video keyframe retrieval algorithms are limited in retrieval accuracy and user experience; for example, they cannot accurately capture the semantic information of the video, and the correlation between keyframes is weak. To address these issues, this paper proposes a video keyframe retrieval algorithm based on CLIP. The algorithm extracts video keyframes with a clustering method based on SIFT features, then uses a pre-trained CLIP model to extract both visual and semantic features of those keyframes for more accurate retrieval. It also models inter-frame context with a bidirectional LSTM, which strengthens the correlation between keyframes and further improves retrieval accuracy. Experimental results show that the proposed algorithm outperforms traditional algorithms in both retrieval accuracy and user experience.
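The two core steps named in the abstract, clustering-based keyframe selection and embedding-based retrieval, can be illustrated with a minimal sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the abstract does not name the clustering algorithm, so plain k-means is used here; the per-frame feature vectors stand in for aggregated SIFT descriptors; the keyframe and query embeddings stand in for CLIP outputs; and the bidirectional LSTM context model is omitted entirely. The function names `select_keyframes` and `retrieve` are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def assign(frame_features, centroids):
    """Group frame indices by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for idx, feat in enumerate(frame_features):
        nearest = min(range(len(centroids)),
                      key=lambda c: sq_dist(feat, centroids[c]))
        clusters[nearest].append(idx)
    return clusters


def select_keyframes(frame_features, k, iterations=10):
    """Pick one keyframe per cluster (assumed k-means, not from the paper).

    frame_features: one feature vector per frame, a stand-in for
    aggregated SIFT descriptors. Returns sorted frame indices.
    """
    n = len(frame_features)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of frames")
    # Deterministic initialisation: evenly spaced frames as centroids.
    centroids = [list(frame_features[(i * n) // k]) for i in range(k)]
    for _ in range(iterations):
        clusters = assign(frame_features, centroids)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                dim = len(centroids[c])
                centroids[c] = [
                    sum(frame_features[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]
    # The keyframe of a cluster is the member closest to its centroid.
    clusters = assign(frame_features, centroids)
    keyframes = [
        min(members, key=lambda i: sq_dist(frame_features[i], centroids[c]))
        for c, members in enumerate(clusters) if members
    ]
    return sorted(keyframes)


def retrieve(query_embedding, keyframe_embeddings, top_n=1):
    """Rank keyframes by cosine similarity to the query embedding.

    keyframe_embeddings: dict mapping frame index to an embedding,
    a stand-in for CLIP image features; query_embedding stands in
    for a CLIP text feature.
    """
    ranked = sorted(keyframe_embeddings.items(),
                    key=lambda kv: cosine(query_embedding, kv[1]),
                    reverse=True)
    return [idx for idx, _ in ranked[:top_n]]


# Toy data: six frames forming two visually distinct groups.
features = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
            [5.0, 5.1], [5.1, 5.0], [5.05, 5.05]]
keyframes = select_keyframes(features, k=2)

# Hypothetical embeddings for the selected keyframes and a query.
embeddings = {keyframes[0]: [1.0, 0.0], keyframes[1]: [0.0, 1.0]}
best = retrieve([0.1, 0.9], embeddings)
print(keyframes, best)
```

In a real system the toy vectors would be replaced by descriptors from a SIFT extractor and by embeddings from a pre-trained CLIP model; comparing text and image embeddings by cosine similarity is the usual way CLIP features are matched, but whether the paper ranks keyframes exactly this way is not stated in the abstract.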
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zuoxing Zhang, Xiaowei Su, Hongyi Hou, Kesheng Qi, and Baoguo Lu "A video keyframe retrieval algorithm based on CLIP tech", Proc. SPIE 12717, 3rd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2023), 1271723 (21 July 2023); https://doi.org/10.1117/12.2684607
KEYWORDS
Video
Feature extraction
Semantic video
Video surveillance
Semantics
Video processing
Visualization