Paper
30 April 2018 Benchmarking deep learning trackers on aerial videos
Breton Minnehan, Anthony Salmin, Karl Salva, Andreas Savakis
Abstract
In this paper, we benchmark five state-of-the-art trackers on aerial platform videos: the Multi-domain Convolutional Neural Network (MDNET) tracker, which won the VOT2015 tracking challenge; the Fully Convolutional Neural Network Tracker (FCNT); the Spatially Regularized Correlation Filter (SRDCF) tracker; the Continuous Convolution Operator Tracker (CCOT), which won the VOT2016 challenge; and the Tree structure Convolutional Neural Network (TCNN) tracker. We assess performance in terms of both tracking accuracy and processing speed on two sets of videos: a subset of the OTB dataset in which the cameras are located at a high vantage point, and a new dataset of aerial videos captured from a moving platform. Our results indicate that the trackers performed as expected on the videos in the OTB subset; however, tracker performance degraded significantly on the aerial videos due to target size, camera motion, and target occlusions. The CCOT tracker yielded the best overall accuracy, while the SRDCF tracker was the fastest.
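The abstract does not state which accuracy and speed measures were used. As a rough illustration only, the sketch below assumes an OTB-style overlap criterion (intersection over union between predicted and ground-truth boxes, thresholded to give a success rate) and a simple frames-per-second figure. The function names, the (x, y, width, height) box layout, and the 0.5 threshold are all assumptions made for this example, not details taken from the paper.

```python
from typing import Sequence, Tuple

# Assumed box layout for this sketch: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_rate(pred: Sequence[Box], truth: Sequence[Box],
                 threshold: float = 0.5) -> float:
    """Fraction of frames whose IoU meets the (assumed) overlap threshold."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of frames")
    if not pred:
        return 0.0
    hits = sum(1 for p, t in zip(pred, truth) if iou(p, t) >= threshold)
    return hits / len(pred)


def frames_per_second(num_frames: int, elapsed_seconds: float) -> float:
    """Processing speed given a frame count and measured wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_frames / elapsed_seconds
```

Sweeping the threshold from 0 to 1 and plotting the resulting success rates would give an OTB-style success curve. Small aerial targets tend to be penalized heavily under this criterion, since a few pixels of drift removes a large share of the overlap.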
© (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Breton Minnehan, Anthony Salmin, Karl Salva, and Andreas Savakis "Benchmarking deep learning trackers on aerial videos", Proc. SPIE 10649, Pattern Recognition and Tracking XXIX, 1064915 (30 April 2018); https://doi.org/10.1117/12.2323866
KEYWORDS
Video
Optical tracking
Video surveillance
Detection and tracking algorithms
Convolutional neural networks
Feature extraction
Network architectures