SKAttention improves the model's flexibility and accuracy on aircraft images of different scales by adaptively selecting the most suitable convolution kernel size. MSCAM complements it by strengthening channel attention at multiple scales, which improves the model's handling of both fine aircraft details and background information. With both modules integrated into the Vision Transformer architecture, our model achieved an accuracy of 98.7% and a recall of 98.2%. These results support the effectiveness of SKAttention and MSCAM in improving Vision Transformer-based aircraft detection, and they suggest new approaches and research directions for aviation image processing and object detection.
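To make the kernel-selection idea concrete, the following is a minimal sketch of the "fuse, then select" step commonly associated with selective-kernel attention, written in plain Python. It is not the implementation used in this work. The function name `sk_select`, the toy inputs, and the per-branch scalar weights `branch_weights` (standing in for the learned fully connected layers) are all assumptions made for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sk_select(branch_maps, branch_weights):
    """Mix the outputs of several convolution branches per channel.

    branch_maps[b][c] is the flattened feature map of branch b (one kernel
    size per branch) for channel c. branch_weights[b][c] is a hypothetical
    scalar that replaces the learned projection layers of a real module.
    Returns (mixed_maps, attention), where attention[c][b] is the weight
    given to branch b for channel c.
    """
    n_branches = len(branch_maps)
    n_channels = len(branch_maps[0])
    mixed_maps, attention = [], []
    for c in range(n_channels):
        maps = [branch_maps[b][c] for b in range(n_branches)]
        # Fuse: element-wise sum across branches, then global average pooling.
        fused = [sum(vals) for vals in zip(*maps)]
        descriptor = sum(fused) / len(fused)
        # Select: one logit per branch, softmax across branches.
        logits = [branch_weights[b][c] * descriptor for b in range(n_branches)]
        weights = softmax(logits)
        attention.append(weights)
        # Weighted sum of the branch outputs for this channel.
        mixed_maps.append(
            [sum(w * m[i] for w, m in zip(weights, maps)) for i in range(len(fused))]
        )
    return mixed_maps, attention


# Toy input: two branches (say, a small and a large kernel), one channel.
small_kernel = [[1.0, 3.0]]
large_kernel = [[3.0, 5.0]]
mixed, attn = sk_select([small_kernel, large_kernel], [[5.0], [-5.0]])
```

In this toy call the positive weight on the first branch drives the softmax almost entirely toward the small-kernel output. In a trained module the weights would be learned, so the preferred kernel size would vary with the input.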
Object detection
Visual process modeling
Image processing
Target detection
Transformers
Performance modeling
Convolution