4 December 2024
ETSS-Net: a one-stage detection model for infrared small target images based on efficient transformer and scale sequence attention mechanism
Jingcheng Shi, Jiaxing Li, Zhiwei Jiang, Shu Zhang, Ningning Song, Shiguo Chen, Liping Wei
Proceedings Volume 13283, Conference on Spectral Technology and Applications (CSTA 2024); 132833G (2024) https://doi.org/10.1117/12.3037121
Event: Conference on Spectral Technology and Applications (CSTA 2024), 2024, Dalian, China
Abstract
Infrared small target detection is crucial in applications such as nighttime vessel inspection and disaster warning systems. Compared with traditional object detection tasks, infrared scenes present distinctive challenges. First, because of the target's distance or shape, the target occupies an exceedingly small proportion of the image while the background information is complex. In extreme cases, the target comprises only a few pixels. Second, the targets in infrared detection tasks are typically sparsely distributed and low in contrast: an image contains only a few instances, each of which occupies a minuscule portion of the entire infrared image. Meanwhile, the processing system has strict real-time requirements, and multi-frame performance is difficult to achieve. To solve these problems, we propose ETSS-Net, a novel Detection Transformer (DETR) network for infrared images that integrates an efficient transformer and a novel scale sequence attention mechanism. The contributions of this work can be summarized as follows: (1) Following the success of Transformers in computer vision tasks, recent studies have tried to reduce the complexity of Transformers in detection tasks. However, these Transformer variants still have considerably more parameters than some lightweight convolutional neural networks. Motivated by this, we design a self-attention Transformer block called Frequency-based Intrascale Feature Interaction (FIFI), which is inspired by the interaction and expression of frequency information between image pixels. (2) We propose a plug-and-play dimension scale selection module (DSSM) to balance detection speed, detection quality, and the number of parameters, so that the proposed SLAM improves the performance of the detection model. The module simultaneously incorporates spatial and channel information, as well as local and global information, during training.
(3) The proposed ETSS-Net improves the detection performance for infrared small targets. The designed backbone Transformer and attention mechanism enhance the feature learning ability of the model. Experiments on multiple infrared datasets show that the proposed method improves the feature expression and detection ability of small target detection. Our method also outperforms state-of-the-art methods in terms of both accuracy and parameter count.
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Jingcheng Shi, Jiaxing Li, Zhiwei Jiang, Shu Zhang, Ningning Song, Shiguo Chen, and Liping Wei "ETSS-Net: a one-stage detection model for infrared small target images based on efficient transformer and scale sequence attention mechanism", Proc. SPIE 13283, Conference on Spectral Technology and Applications (CSTA 2024), 132833G (4 December 2024); https://doi.org/10.1117/12.3037121
KEYWORDS
Object detection

Infrared radiation

Infrared imaging

Small targets

Infrared detectors

Target detection

Transformers
