7 October 2022 Semantic aggregation for accurate and efficient object detection
Proceedings Volume 12344, International Conference on Intelligent and Human-Computer Interaction Technology (IHCIT 2022); 123440T (2022) https://doi.org/10.1117/12.2655778
Event: International Conference on Intelligent and Human-Computer Interaction Technology (IHCIT 2022), 2022, Zhuhai, China
Abstract
The Single Shot MultiBox Detector (SSD) detects and recognizes objects on multi-scale feature maps, which greatly improves the performance of single-stage approaches, but it still struggles with small objects. Much follow-up work therefore focuses on enhancing the features in the feature pyramid. However, many networks simply merge several feature maps and fail to fully aggregate the semantics across features of different scales. To address this, this paper proposes an efficient semantic aggregation module and a lightweight feature combination module, which significantly improve detection accuracy when built on SSD. In the semantic aggregation module, feature maps of different sizes are adjusted and integrated through different channels to obtain enhanced features with rich semantic information, which improves the discriminative and expressive power of the features. In the feature combination module, the detector fully utilizes the multi-scale convolution layers in the feature pyramid to produce more descriptive and representative enhanced features. With a 512×512 input, the proposed network achieves 82.6% mAP (mean Average Precision) on the VOC2007 test set and 81.3% mAP on the VOC2012 test set. Experiments and ablation studies show that the method outperforms many advanced detectors in both accuracy and speed.
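The abstract does not give the internals of the semantic aggregation module, so the following is only a rough sketch of the general idea it describes: bringing feature maps of different sizes to a common resolution and integrating them across channels. Everything here is an assumption made for illustration, not the authors' design: the function names `resize_nearest` and `aggregate`, the nearest-neighbour resizing, the channel concatenation followed by a 1×1 projection, and the tensor shapes.

```python
import numpy as np


def resize_nearest(fmap, size):
    """Resize a (C, H, W) feature map to (C, size, size) by nearest-neighbour sampling.

    Hypothetical helper: the paper's actual resizing operator is not stated in the abstract.
    """
    _, h, w = fmap.shape
    rows = np.arange(size) * h // size  # source row for each target row
    cols = np.arange(size) * w // size  # source column for each target column
    return fmap[:, rows[:, None], cols[None, :]]


def aggregate(fmaps, size, weights):
    """Fuse multi-scale feature maps into one map at a common resolution.

    Assumed scheme: resize every map to (size, size), concatenate along the
    channel axis, then mix channels with a 1x1 projection.
    weights has shape (C_out, C_total), where C_total is the summed channel count.
    """
    resized = [resize_nearest(f, size) for f in fmaps]
    stacked = np.concatenate(resized, axis=0)  # (C_total, size, size)
    # A 1x1 convolution is a per-pixel linear map over channels.
    return np.einsum("oc,chw->ohw", weights, stacked)


# Illustrative shapes only: three pyramid levels with 4, 8 and 16 channels.
rng = np.random.default_rng(0)
fmaps = [
    rng.standard_normal((4, 64, 64)),
    rng.standard_normal((8, 32, 32)),
    rng.standard_normal((16, 16, 16)),
]
weights = rng.standard_normal((8, 4 + 8 + 16))
fused = aggregate(fmaps, 32, weights)
print(fused.shape)  # (8, 32, 32)
```

In this sketch the finest map is downsampled and the coarsest is upsampled so that all three meet at the middle resolution; a real detector would typically use learned convolution or deconvolution layers for these steps rather than fixed sampling.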
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zhipeng Duan, Jing Wang, Shu Zhan, Zhenping Ruan, Fei Li, and Zhen Yang "Semantic aggregation for accurate and efficient object detection", Proc. SPIE 12344, International Conference on Intelligent and Human-Computer Interaction Technology (IHCIT 2022), 123440T (7 October 2022); https://doi.org/10.1117/12.2655778
KEYWORDS
Sensors

Deconvolution

Convolution

Multilayers

Information science

Ions

Lithium
