Object detection from images captured by Unmanned Aerial Vehicles (UAVs) is widely used for surveillance, precision agriculture, package delivery, and aerial photography, among other applications. Very recently, VisDrone2018, a benchmark for object detection on UAV-collected images, has been released. However, a large performance drop is observed when current state-of-the-art object detection approaches, developed primarily for ground-to-ground images, are applied directly to the VisDrone2018 dataset. For example, the best detection model on VisDrone2018 achieves a detection accuracy of only 0.31 mAP, significantly lower than that of ground-based object detection. This performance drop is mainly caused by several challenges, such as 1) flying altitudes varying from 10 feet to 1000 feet, 2) different weather conditions such as fog, rain, and low light, and 3) a wide range of camera viewing angles. To overcome these challenges, in this paper we propose a novel adversarial training approach that aims to learn domain-invariant features with respect to varying altitudes, viewing angles, weather conditions, and object scales. The adversarial training draws on “free” metadata that comes with UAV datasets and provides information about the data themselves, such as heights, scene visibility, and viewing angles. We demonstrate the effectiveness of our proposed algorithm on the recently proposed UAVDT dataset, and show that it generalizes well when applied to a different dataset, VisDrone2018. We also show the robustness of the proposed approach to variations in altitude, viewing angle, weather, and object scale.