Paper
14 February 2020 Image caption generation method based on adaptive attention mechanism
Huazhong Jin, Yu Wu, Fang Wan, Man Hu, Qingqing Li
Proceedings Volume 11430, MIPPR 2019: Pattern Recognition and Computer Vision; 114301C (2020) https://doi.org/10.1117/12.2539338
Event: Eleventh International Symposium on Multispectral Image Processing and Pattern Recognition (MIPPR2019), 2019, Wuhan, China
Abstract
An image caption generation model with an adaptive attention mechanism is proposed to address the limitations of caption models that rely solely on local image features. Within an encoder-decoder framework, the encoder extracts local and global image features using the Inception V3 and VGG19 networks, respectively. Because the proposed adaptive attention mechanism automatically determines the relative importance of local and global image information, the decoder can generate sentences that describe the image more intuitively and accurately. The model is trained and tested on the Microsoft COCO dataset. Experimental results show that, compared with an image caption model based only on local features, the proposed method extracts richer and more complete information from the image and generates more accurate sentences.
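The core idea of the abstract, blending local and global image features under a single attention distribution, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the projection matrices `W_f` and `W_h`, and the toy dimensions are all hypothetical, and the global feature is simply stacked alongside the local region features so one softmax can trade them off against the decoder state.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def adaptive_attention(local_feats, global_feat, hidden, params):
    """Score each local region and the global feature against the
    decoder hidden state, then blend them into one context vector.

    local_feats: (k, d) region features (e.g. from Inception V3)
    global_feat: (d,)   whole-image feature (e.g. from VGG19)
    hidden:      (h,)   decoder hidden state
    """
    W_f, W_h, w = params  # hypothetical projection parameters
    # Treat the global feature as one extra "region" so the softmax
    # decides automatically how much local vs. global info to use.
    feats = np.vstack([local_feats, global_feat])      # (k+1, d)
    scores = np.tanh(feats @ W_f + hidden @ W_h) @ w   # (k+1,)
    alpha = softmax(scores)                            # attention weights
    context = alpha @ feats                            # (d,) context vector
    return context, alpha

# Toy dimensions: 4 local regions, feature dim 8, hidden dim 6
k, d, h = 4, 8, 6
local = rng.standard_normal((k, d))
glob = rng.standard_normal(d)
hidden = rng.standard_normal(h)
params = (rng.standard_normal((d, d)),   # W_f
          rng.standard_normal((h, d)),   # W_h
          rng.standard_normal(d))        # w

context, alpha = adaptive_attention(local, glob, hidden, params)
print(context.shape, alpha.shape)
```

In a full captioning model the context vector would be fed, together with the hidden state, into the decoder's word-prediction layer at every time step; the weights here are random purely for shape checking.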
© (2020) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Huazhong Jin, Yu Wu, Fang Wan, Man Hu, and Qingqing Li "Image caption generation method based on adaptive attention mechanism", Proc. SPIE 11430, MIPPR 2019: Pattern Recognition and Computer Vision, 114301C (14 February 2020); https://doi.org/10.1117/12.2539338
KEYWORDS: Computer programming, Neural networks, Visualization, Feature extraction, Image enhancement, Data modeling, Image compression
