Traffic sign recognition using optimized YOLO network based on multi-scale feature extraction and attention mechanism

Yong Yan; Haoran Zhao; Yunpeng Li; Xuejun Liu; Weiwei Wang; Jiacheng Guo; Shuo Zhang; Yun Sha; Xuerui Wang; Yinan Jiang

doi:10.1117/1.JEI.33.5.053023

Journal of Electronic Imaging, Vol. 33, Issue 5, 053023 (September 2024). https://doi.org/10.1117/1.JEI.33.5.053023

Abstract

In natural driving conditions, an autonomous vehicle must make driving decisions according to road traffic rules, which requires precise recognition of traffic signs at different distances and angles. However, changes in the camera's observation point cause traffic signs to vary in apparent size and to appear against complex backgrounds. The coupling of these characteristics poses severe challenges for traffic sign identification and localization. In this paper, a multi-scale feature extraction and coordinate network based on the YOLO framework (MFEC-YOLOv7) is proposed to achieve accurate traffic sign recognition in complex situations. Specifically, features of traffic signs at different scales are extracted and enhanced by fusing a pooling layer and group convolution in the convolutional architecture of the backbone network, which improves multi-scale feature perception. Furthermore, a coordinate attention (CA) module is introduced to focus on traffic signs under multiple interference factors. A bidirectional feature pyramid is introduced into the neck network and combined with the CA module to accurately locate the position information of features and enhance the feature extraction ability of the model. In addition, depthwise separable convolution is adopted to significantly reduce the number of convolution parameters and improve efficiency without reducing the effectiveness of feature extraction. Experimental results on the public TT100K data set and a self-built data set show that, compared with YOLOv7, MFEC-YOLOv7 improves detection accuracy by 19.9%, reduces the amount of parameter computation by 9.5%, and improves speed by about 16.9%.
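The parameter saving attributed to depthwise separable convolution can be illustrated with a short calculation. The sketch below is not taken from the paper: the layer sizes (256 input channels, 256 output channels, 3 x 3 kernel) and the function names are hypothetical, and bias terms are ignored. It compares the weight count of one standard convolution with that of a depthwise convolution followed by a 1 x 1 pointwise convolution. The per-layer ratio it computes is not the whole-model 9.5% figure reported above, which depends on how many layers are replaced.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in one k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 256, 256, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1/c_out + 1/k^2 for any layer sizes

print(f"standard: {std} weights")
print(f"depthwise separable: {sep} weights")
print(f"ratio: {ratio:.4f}")
```

For this hypothetical layer the separable form needs 67,840 weights against 589,824 for the standard convolution. The ratio follows the general identity 1/c_out + 1/k^2, so the saving grows with both kernel size and output channel count.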

Citation

Yong Yan, Haoran Zhao, Yunpeng Li, Xuejun Liu, Weiwei Wang, Jiacheng Guo, Shuo Zhang, Yun Sha, Xuerui Wang, and Yinan Jiang "Traffic sign recognition using optimized YOLO network based on multi-scale feature extraction and attention mechanism," Journal of Electronic Imaging 33(5), 053023 (28 September 2024). https://doi.org/10.1117/1.JEI.33.5.053023

Received: 21 May 2024; Accepted: 3 September 2024; Published: 28 September 2024

KEYWORDS

Convolution

Object detection

Feature extraction

Target detection

Neck

Performance modeling

Feature fusion
