3 January 2020
Generating description with multi-feature and saliency maps of image
Lisha Liu, Chunna Tian, Ruiguo Zhang, Yuxuan Ding
Proceedings Volume 11373, Eleventh International Conference on Graphics and Image Processing (ICGIP 2019); 113730Z (2020) https://doi.org/10.1117/12.2557584
Event: Eleventh International Conference on Graphics and Image Processing, 2019, Hangzhou, China
Abstract
Automatically generating a description of an image is a task that connects computer vision and natural language processing, and it has attracted growing attention in the field of artificial intelligence. In this paper, we present a model that generates descriptions for images based on an RNN (recurrent neural network), with each image represented by a multi-feature representation weighted by object attention. We use LSTM (long short-term memory), a type of RNN, to translate the multi-feature representation of an image into text. Most existing methods use a single CNN (convolutional neural network) trained on ImageNet to extract image features, which mainly focus on the objects in images. However, the scene context is also informative for image captioning, so we incorporate a scene feature extracted with a CNN trained on Places205. We evaluate our model on the MSCOCO dataset using standard metrics. Experiments show that the multi-feature representation performs better than a single feature. In addition, the saliency weight emphasizes the salient objects in an image as the subject of its description. The results show that our model performs better than several state-of-the-art methods on image captioning.
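The abstract does not specify how the object feature, the saliency weight, and the scene feature are combined, so the following is only a minimal sketch under stated assumptions: the object CNN is assumed to yield a spatial feature map, the saliency map is assumed to act as normalized pooling weights over that map, and the result is assumed to be concatenated with the scene feature before being passed to the LSTM. The function names (`saliency_weighted_pool`, `fuse_features`), the array shapes, and the random inputs are all hypothetical and are not taken from the paper.

```python
import numpy as np

def saliency_weighted_pool(feature_map, saliency):
    """Pool an (H, W, C) feature map into a (C,) vector.

    The (H, W) saliency map is normalized to sum to 1 and used as
    spatial weights (an assumed fusion rule, not the paper's).
    """
    weights = saliency / (saliency.sum() + 1e-8)
    # Contract the two spatial axes, leaving one value per channel.
    return np.tensordot(weights, feature_map, axes=([0, 1], [0, 1]))

def fuse_features(object_map, saliency, scene_vec):
    """Concatenate the saliency-weighted object feature with a scene feature."""
    obj_vec = saliency_weighted_pool(object_map, saliency)
    return np.concatenate([obj_vec, scene_vec])

# Random stand-ins for CNN outputs; all shapes are hypothetical.
rng = np.random.default_rng(0)
object_map = rng.random((7, 7, 512))  # stand-in for an ImageNet-trained CNN feature map
saliency = rng.random((7, 7))         # stand-in for a saliency map at the same resolution
scene_vec = rng.random(205)           # stand-in for a Places205-trained CNN feature

fused = fuse_features(object_map, saliency, scene_vec)
print(fused.shape)
```

In this sketch, a uniform saliency map reduces the pooling to plain average pooling, so the saliency map only changes the representation where it departs from uniform.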
© (2020) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Lisha Liu, Chunna Tian, Ruiguo Zhang, and Yuxuan Ding "Generating description with multi-feature and saliency maps of image", Proc. SPIE 11373, Eleventh International Conference on Graphics and Image Processing (ICGIP 2019), 113730Z (3 January 2020); https://doi.org/10.1117/12.2557584
KEYWORDS
Visualization
Feature extraction
Neural networks
Visual process modeling
Content addressable memory
Image retrieval
Image segmentation