A CNN-LSTM-based model for fashion image aesthetic captioning

Binbin Yan

doi:10.1117/12.2660052

3 February 2023

Proceedings Volume 12511, Third International Conference on Computer Vision and Data Mining (ICCVDM 2022); 1251119 (2023) https://doi.org/10.1117/12.2660052
Event: Third International Conference on Computer Vision and Data Mining (ICCVDM 2022), 2022, Hulun Buir, China

Abstract

We propose a new task, fashion image aesthetic captioning: describing apparel in an aesthetic way. The task can benefit e-commerce, where large numbers of garments need captions that catch customers’ attention, and it can also help people understand fashion better. We adopt an encoder-decoder architecture as our baseline. To help the encoder extract more accurate features from clothing images, we introduce two classifiers: a color harmony classifier pretrained on the AVA dataset and a clothes type classifier. For the decoder, we use an LSTM with an attention mechanism to generate sentences. Additionally, we build a new dataset containing 79,105 fashion images with aesthetic descriptions and attributes. Experiments on this dataset show that our model performs well.
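As a rough illustration of the attention step in such a decoder, the sketch below computes attention weights over image-region features and the resulting context vector for a single decoding step. It is not the paper's implementation: the abstract does not specify the scoring function, so plain dot-product scoring is assumed here, and the names `attend`, `region_features`, and `hidden_state` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, hidden_state):
    """Return (weights, context) for one decoding step.

    region_features: one feature vector per image region (encoder output).
    hidden_state: the decoder's current hidden vector, same length as each feature.
    Scoring is a plain dot product, which is an assumption made for this sketch.
    """
    scores = [
        sum(f * h for f, h in zip(feat, hidden_state))
        for feat in region_features
    ]
    weights = softmax(scores)
    dim = len(hidden_state)
    # Context vector: attention-weighted sum of the region features.
    context = [
        sum(w * feat[i] for w, feat in zip(weights, region_features))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
weights, context = attend(regions, hidden)
```

In a full model, the context vector would typically be fed into the LSTM together with the previous word embedding at each step, so the decoder can focus on different parts of the garment as it generates each word.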

Binbin Yan "A CNN-LSTM-based model for fashion image aesthetic captioning", Proc. SPIE 12511, Third International Conference on Computer Vision and Data Mining (ICCVDM 2022), 1251119 (3 February 2023); https://doi.org/10.1117/12.2660052

PROCEEDINGS
7 PAGES

KEYWORDS

Computer programming

Data modeling

Neural networks

Performance modeling
