Open Access Paper
12 November 2024
Global context-aware and attention mechanism method for small-scale pedestrian detection
Tian Li, Mingxing Li
Proceedings Volume 13395, International Conference on Optics, Electronics, and Communication Engineering (OECE 2024); 133951Z (2024) https://doi.org/10.1117/12.3049502
Event: International Conference on Optics, Electronics, and Communication Engineering, 2024, Wuhan, China
Abstract
Small-scale pedestrian detection is a major challenge because of limited pixel resolution and insufficient distinguishing features, which frequently lead to incorrect or missed detections. To address this, the paper proposes a global context-aware and attention mechanism algorithm for small-scale pedestrian detection. First, because small-scale pedestrian features gradually diminish with network depth, we leverage the ability of Transformers to capture long-range dependencies and design a global context information module that retains a large number of small-scale pedestrian features. Second, because small-scale pedestrian features are easily confused with background information, we propose a Coordinate and Channel Attention Module (CCAM). Coordinate attention captures direction-aware and position-sensitive information, which helps the model locate and recognize objects of interest more accurately. Channel attention enhances small-scale pedestrian features and suppresses background information. Experimental results on the CrowdHuman dataset demonstrate that the proposed method significantly improves detection of small-scale pedestrians.
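The abstract does not give the internal design of CCAM, so the following is only an illustrative sketch of the two attention ideas it names, not the authors' module. It assumes a feature map stored as nested Python lists (channels, then rows, then columns) and omits every learned layer: published coordinate-attention and channel-attention designs pass the pooled statistics through trainable transforms before the sigmoid, whereas here the pooled means are gated directly. The function names, the parameter-free gating, and the order in which the two steps are composed are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function used to turn pooled statistics into gates in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def coordinate_attention(feat):
    """Gate each position by statistics pooled along its row and its column.

    feat: list of C channels, each an H x W list of lists of floats.
    Pooling separately along the two spatial directions is what makes the
    gate direction-aware and position-sensitive. No learned layers (assumption).
    """
    out = []
    for ch in feat:
        h, w = len(ch), len(ch[0])
        row_mean = [sum(row) / w for row in ch]  # pool along width
        col_mean = [sum(ch[i][j] for i in range(h)) / h for j in range(w)]  # pool along height
        row_gate = [sigmoid(m) for m in row_mean]
        col_gate = [sigmoid(m) for m in col_mean]
        out.append([[ch[i][j] * row_gate[i] * col_gate[j] for j in range(w)]
                    for i in range(h)])
    return out


def channel_attention(feat):
    """Rescale each channel by a gate computed from its global average.

    No learned layers (assumption): the gate is sigmoid(mean of the channel).
    """
    out = []
    for ch in feat:
        h, w = len(ch), len(ch[0])
        mean = sum(sum(row) for row in ch) / (h * w)  # global average pool
        gate = sigmoid(mean)
        out.append([[v * gate for v in row] for row in ch])
    return out


def ccam_sketch(feat):
    """Hypothetical composition: coordinate attention, then channel attention."""
    return channel_attention(coordinate_attention(feat))


if __name__ == "__main__":
    # Two channels of size 2 x 2; only one position carries a strong response.
    feat = [[[0.0, 0.0], [0.0, 4.0]],
            [[0.0, 0.0], [0.0, 0.0]]]
    for channel in ccam_sketch(feat):
        print(channel)
```

In this toy version the gates depend only on the input statistics, so the sketch shows where each kind of attention looks (rows and columns versus whole channels) rather than what a trained module would learn to emphasize.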
© (2024) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Tian Li and Mingxing Li "Global context-aware and attention mechanism method for small-scale pedestrian detection", Proc. SPIE 13395, International Conference on Optics, Electronics, and Communication Engineering (OECE 2024), 133951Z (12 November 2024); https://doi.org/10.1117/12.3049502
KEYWORDS
Object detection
Detection and tracking algorithms
Transformers
Visual process modeling
Performance modeling
Ablation
Convolution
