A video keyframe retrieval algorithm based on CLIP tech

Zuoxing Zhang; Xiaowei Su; Hongyi Hou; Kesheng Qi; Baoguo Lu

doi:10.1117/12.2684607

21 July 2023

Proceedings Volume 12717, 3rd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2023); 1271723 (2023) https://doi.org/10.1117/12.2684607
Event: 3rd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2023), 2023, Wuhan, China

Abstract

Traditional video keyframe retrieval algorithms have certain limitations in terms of retrieval accuracy and user experience, such as the inability to accurately capture the semantic information of the video and weak correlation between keyframes. To address these issues, this paper proposes a video keyframe retrieval algorithm based on CLIP technology. The algorithm extracts video keyframes using a clustering method based on SIFT features and uses a pre-trained CLIP model to extract both visual and semantic features of the keyframes, achieving more accurate retrieval. Additionally, the algorithm uses a bidirectional LSTM model to model inter-frame contexts, enhancing the correlation between keyframes and improving retrieval accuracy. Experimental results show that the proposed algorithm outperforms traditional algorithms in terms of retrieval accuracy and user experience.
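The abstract names three stages: clustering-based keyframe extraction, CLIP feature extraction, and bidirectional LSTM context modeling. The following is a minimal sketch of the first stage and of the final retrieval step only, under stated assumptions, and is not the authors' implementation. The abstract does not name the clustering algorithm or the similarity measure. Plain k-means and cosine similarity (the measure CLIP uses to score image-text pairs) are assumed here. Small hand-written vectors stand in for SIFT-derived frame descriptors and for CLIP embeddings, and the bidirectional LSTM stage is omitted. The names `select_keyframes` and `rank_keyframes` are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_keyframes(descriptors, k, iters=10):
    """Cluster per-frame descriptors with k-means (an assumed choice) and
    return, for each non-empty cluster, the index of the frame that lies
    closest to the cluster centroid."""
    n, dim = len(descriptors), len(descriptors[0])
    # Deterministic initialization (assumption): k frames spread evenly in time.
    centroids = [list(descriptors[(i * n) // k]) for i in range(k)]
    clusters = []
    for _ in range(max(1, iters)):
        # Assignment step: each frame joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for idx, d in enumerate(descriptors):
            nearest = min(range(k), key=lambda c: sq_dist(d, centroids[c]))
            clusters[nearest].append(idx)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [
                    sum(descriptors[i][j] for i in members) / len(members)
                    for j in range(dim)
                ]
    return sorted(
        min(members, key=lambda i: sq_dist(descriptors[i], centroids[c]))
        for c, members in enumerate(clusters)
        if members
    )


def rank_keyframes(query_emb, frame_embs, top_k=3):
    """Return the indices of the top_k keyframes whose embeddings are most
    similar to the query embedding, best match first."""
    scored = [(cosine(query_emb, emb), idx) for idx, emb in frame_embs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:top_k]]


# Toy stand-ins for SIFT-derived descriptors: two visually distinct shots.
descriptors = [
    [0.0, 0.0], [0.2, 0.0], [0.1, 0.1],        # frames 0-2
    [10.0, 10.0], [10.2, 10.0], [10.1, 10.1],  # frames 3-5
]
keyframes = select_keyframes(descriptors, k=2)

# Toy stand-ins for CLIP image embeddings of the keyframes and a text query.
frame_embs = {keyframes[0]: [1.0, 0.0], keyframes[1]: [0.0, 1.0]}
query_emb = [0.9, 0.1]
print(rank_keyframes(query_emb, frame_embs, top_k=1))
```

In a fuller pipeline, the toy vectors would be replaced by descriptors computed from the video frames and by embeddings from a pre-trained CLIP image encoder and text encoder. The ranking step itself would stay the same.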

Citation

Zuoxing Zhang, Xiaowei Su, Hongyi Hou, Kesheng Qi, and Baoguo Lu "A video keyframe retrieval algorithm based on CLIP tech", Proc. SPIE 12717, 3rd International Conference on Artificial Intelligence, Automation, and High-Performance Computing (AIAHPC 2023), 1271723 (21 July 2023); https://doi.org/10.1117/12.2684607

Proceedings paper, 6 pages

KEYWORDS

Video

Feature extraction

Semantic video

Video surveillance

Semantics

Video processing

Visualization
