Loop closure detection based on feature pyramids and NetVLAD
22 December 2023
Mingrong Ren, Bo Gao
Abstract

Most traditional loop closure detection (LCD) methods rely on hand-crafted features, which are sensitive to environmental conditions. Convolutional neural networks (CNNs) cope better with illumination changes by extracting hierarchical features, but they ignore the local spatial characteristics of images. We propose an LCD algorithm that combines VGG16, NetVLAD, and image pyramids to improve accuracy and robustness. Specifically, a three-level image pyramid was constructed via downsampling, and a feature pyramid (FP) layer was then obtained by extracting features with the VGG16 network at each image resolution. The resulting FPs were passed to the VLAD model, which output VLAD vectors by summing residuals and applying L2 normalization. Finally, a triplet loss function was used for training. Experimental results on two benchmark datasets and a real-scenario dataset showed that the algorithm outperformed both the NetVLAD baseline and the VGG16 network, exhibiting superior feature-learning capability and achieving higher LCD accuracy. It also maintained real-time performance, with only a 2% increase in processing time. The results indicate that the proposed method detects loop closures even in complex environments with varying conditions and viewpoints. The approach can therefore be used in large-scale visual simultaneous localization and mapping applications, such as autonomous driving, where LCD plays a crucial role in mapping.
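The pipeline described in the abstract (image pyramid, local features, VLAD aggregation, triplet loss) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several parts are assumptions made only for illustration: per-pixel channel vectors stand in for VGG16 feature maps, downsampling is 2x2 average pooling, descriptors from all pyramid levels are simply concatenated before aggregation, soft assignment uses a softmax over negative squared distances with a hypothetical sharpness parameter `alpha`, and the function names (`build_pyramid`, `vlad`, `describe`, `triplet_loss`) and the `margin` value are hypothetical.

```python
import numpy as np


def build_pyramid(image, levels=3):
    """Return `levels` images (H x W x C), each half the resolution of the previous.

    Assumption: downsampling is 2x2 average pooling.
    """
    pyramid = [image]
    for _ in range(levels - 1):
        img = pyramid[-1]
        h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2  # crop to even size
        img = img[:h, :w]
        pooled = img.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))
        pyramid.append(pooled)
    return pyramid


def vlad(descriptors, centers, alpha=10.0):
    """Aggregate N local descriptors (N x D) against K cluster centers (K x D).

    Residuals to each center are weighted by a soft assignment, summed over
    all descriptors, flattened, and L2-normalized.
    """
    residuals = descriptors[:, None, :] - centers[None, :, :]  # (N, K, D)
    logits = -alpha * (residuals ** 2).sum(axis=2)              # (N, K)
    logits -= logits.max(axis=1, keepdims=True)                 # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)               # softmax over centers
    v = (weights[:, :, None] * residuals).sum(axis=0).ravel()   # (K * D,)
    return v / (np.linalg.norm(v) + 1e-12)


def describe(image, centers, levels=3):
    """Global descriptor of an image from all pyramid levels.

    Stand-in for the CNN: each pixel's channel vector is one local descriptor.
    """
    descs = [lvl.reshape(-1, lvl.shape[-1]) for lvl in build_pyramid(image, levels)]
    return vlad(np.concatenate(descs, axis=0), centers)


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge loss that pushes the negative farther from the anchor than the positive."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```

In a trained system the cluster centers and the feature extractor would be learned jointly by minimizing the triplet loss over (query, matching place, non-matching place) image triplets; here the centers are simply passed in as an array.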

© 2023 SPIE and IS&T
Mingrong Ren and Bo Gao "Loop closure detection based on feature pyramids and NetVLAD," Journal of Electronic Imaging 32(6), 063033 (22 December 2023). https://doi.org/10.1117/1.JEI.32.6.063033
Received: 12 July 2023; Accepted: 6 December 2023; Published: 22 December 2023
KEYWORDS
Education and training

Liquid crystal displays

Performance modeling

Feature extraction

Visualization

Data modeling

Image resolution
