Paper
19 July 2024 Image generation method based on diversified feature learning attention mechanism
Wenhao Sun, Gengchen Liu, Weiqi Xue, Qingqi Zhou, Jiaju Wang, Jiaqing Yan
Proceedings Volume 13213, International Conference on Image Processing and Artificial Intelligence (ICIPAl 2024); 1321306 (2024) https://doi.org/10.1117/12.3035290
Event: International Conference on Image Processing and Artificial Intelligence (ICIPAl 2024), 2024, Suzhou, China
Abstract
Although attention mechanisms are widely applied in natural language processing (NLP) tasks, their use in computer vision remains limited. To combine the advantages of convolutional neural networks (CNNs) with those of attention mechanisms, this study proposes a multimodal image generation model based on the Stable Diffusion architecture. The model incorporates two types of convolutional modules, ICB and TCB. By replacing the linear networks with convolutional neural networks, the model improves its capability to process complex features. Low-dimensional encoding and reconstruction of images are then achieved by maximizing the mutual information between the input of the new model and the output of the encoder. Finally, the proposed model is validated on a publicly available dataset.
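The general idea of swapping linear projections for convolutions inside an attention layer can be illustrated with a minimal Python sketch. This is not the authors' ICB or TCB module, whose design the abstract does not describe. The function names (`conv1d_same`, `conv_attention`), the 1D token layout, and the kernel shapes are all assumptions made for illustration.

```python
import math


def conv1d_same(x, kernels):
    """Zero-padded 1D convolution over the token axis (hypothetical helper).

    x:       list of T tokens, each a list of C_in floats.
    kernels: list of K taps, each a C_in x C_out matrix (nested lists).
    Returns a list of T tokens, each with C_out floats.
    """
    n_tokens, n_taps = len(x), len(kernels)
    c_in, c_out = len(kernels[0]), len(kernels[0][0])
    half = n_taps // 2
    out = []
    for t in range(n_tokens):
        row = [0.0] * c_out
        for k in range(n_taps):
            s = t + k - half  # neighbouring token read by this tap
            if 0 <= s < n_tokens:  # positions outside the sequence act as zeros
                for i in range(c_in):
                    for j in range(c_out):
                        row[j] += x[s][i] * kernels[k][i][j]
        out.append(row)
    return out


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def conv_attention(x, wq, wk, wv):
    """Scaled dot-product attention whose Q, K, V come from convolutions.

    Assumption: each projection is a 1D convolution over tokens instead of
    a per-token linear map, so every query, key, and value also sees its
    local neighbourhood.
    """
    q = conv1d_same(x, wq)
    k = conv1d_same(x, wk)
    v = conv1d_same(x, wv)
    scale = 1.0 / math.sqrt(len(q[0]))
    outputs, weights = [], []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[t] * v[t][j] for t in range(len(v))) for j in range(len(v[0]))]
        )
    return outputs, weights
```

With a kernel of one tap, `conv1d_same` reduces to an ordinary per-token linear projection, so standard attention is the special case of this sketch. Wider kernels let each projection mix information from adjacent tokens before the attention weights are computed.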
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Wenhao Sun, Gengchen Liu, Weiqi Xue, Qingqi Zhou, Jiaju Wang, and Jiaqing Yan "Image generation method based on diversified feature learning attention mechanism", Proc. SPIE 13213, International Conference on Image Processing and Artificial Intelligence (ICIPAl 2024), 1321306 (19 July 2024); https://doi.org/10.1117/12.3035290
KEYWORDS
Convolution
Diffusion
Feature extraction
Education and training
Convolutional neural networks
Image quality
Image processing