Autonomous vehicles typically employ several types of sensors simultaneously, such as LiDAR (Light Detection and Ranging) sensors, visual cameras, and radar, to provide information about the surrounding scene while driving. Ideally, coupling multiple sensors improves upon a system that utilizes only a single sensing modality for its perception task. For an autonomous system to understand the scene intelligently, it must be equipped with object localization and classification, which are commonly performed using a visual camera. Object detection and classification may also be applied to LiDAR and thermal sensors to further enhance the scene awareness of the autonomous system. Herein, we investigate fusing information obtained from visual (RGB), LiDAR, and infrared (IR) sensors in order to improve object classification accuracy and heighten scene awareness. In autonomy, there are several levels of fusion that can be employed, such as sensor-level, feature-level, and decision-level; for the scope of this research, we explore the impact of decision-level fusion. Three state-of-the-art object detection and classification models (visual-based, LiDAR-based, and thermal-based) will be employed in parallel, and object predictions will be fused using voting-based (or rules-based) fusion methods. Several questions remain regarding the discrepancies that could occur: what can we do to mitigate sensor disagreement? Does the coupling of sensor decisions generally boost confidence or induce confusion? Will different fusion methods have differing levels of impact on the final solution? Additionally, does this multi-source fusion application transfer well to different scenes? A qualitative and quantitative analysis is presented for applications of simple and complex fusion methods, and, drawing on past research, we hypothesize that multi-modality perception algorithms boost the final solution by balancing individual sensor strengths and weaknesses. The experiments herein are conducted on a novel multi-sensor autonomous driving dataset created at the Center for Advanced Vehicular Systems (CAVS) at Mississippi State University in collaboration with the US Army’s Engineer Research and Development Center (ERDC).
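As a minimal illustration of the decision-level, voting-based fusion described above, the sketch below assumes each modality's detector has already produced a class label and confidence score for the same co-registered object; the tie-breaking rule and example values are our own illustrative choices, not the exact fusion rules evaluated in the paper.

```python
from collections import Counter

def fuse_decisions(predictions):
    """Majority-vote fusion; predictions is a list of (label, confidence) pairs,
    one per modality (e.g., RGB, LiDAR, IR)."""
    votes = Counter(label for label, _ in predictions)
    top_count = max(votes.values())
    tied = [label for label, count in votes.items() if count == top_count]
    if len(tied) == 1:
        fused_label = tied[0]
    else:
        # Tie-break on the highest single-modality confidence among tied labels.
        fused_label = max((p for p in predictions if p[0] in tied),
                          key=lambda p: p[1])[0]
    # Report the mean confidence of the modalities that agree with the vote.
    agreeing = [conf for label, conf in predictions if label == fused_label]
    return fused_label, sum(agreeing) / len(agreeing)

# Example: RGB and LiDAR agree on "car"; the IR detector disagrees.
print(fuse_decisions([("car", 0.91), ("car", 0.78), ("pedestrian", 0.55)]))
```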
In autonomous driving, utilizing deep learning models to help make decisions has become a popular theme, particularly in the realm of computer vision. These models are heavily geared to make decisions based on the environment in which they are trained, yet very few datasets currently exist for off-road autonomy. In order for autonomous vehicles to traverse off-road or unstructured settings, the vehicles must have an understanding of these environments. This paper seeks to lay the groundwork for a new, off-road, multi-modality dataset, with initial data being collected at Mississippi State University’s Center for Advanced Vehicular Systems (MSU CAVS). This dataset will include co-aligned and co-registered LiDAR, thermal, and visual sensor data. Additionally, it will include semantically segmented visual data as well as object detection and classification labels for all three modalities. However, performing semantic segmentation on each image individually would be strenuous and time-consuming due to the large quantity of images. Thus, this paper explores the utility of transfer learning for auto-labeling. Specifically, it considers how transfer learning for a unique and under-represented (data-wise) domain performs in reducing the burden associated with hand-annotating datasets for deep learning applications.
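To make the auto-labeling idea concrete, the hedged sketch below uses a torchvision DeepLabV3 model pretrained on a public benchmark to generate draft segmentation masks that annotators would then correct; the specific backbone, weights, and file name are illustrative stand-ins rather than the dataset's actual labeling pipeline.

```python
import torch
from PIL import Image
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights="DEFAULT").eval()
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def auto_label(image_path):
    """Return a per-pixel class-index mask to seed hand annotation."""
    image = Image.open(image_path).convert("RGB")
    batch = preprocess(image).unsqueeze(0)
    with torch.no_grad():
        logits = model(batch)["out"]            # (1, num_classes, H, W)
    return logits.argmax(dim=1).squeeze(0)      # (H, W) draft label mask

# draft = auto_label("frame_0001.png")          # hypothetical frame from the dataset
```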
The growing emergence of UAV swarms in industrial and security applications, such as search-and-rescue and emergency response, underlines a need for efficient and accurate aerial navigation in potentially cluttered and dynamic environments. However, the introduction of multiple drones presents a two-fold complexity within the collaboration space: 1) the agents must be able to perform some desired task(s) cooperatively or independently, and 2) the agents must avoid colliding with one another (or other obstacles) in the process. In this work, we investigate a conceptual method for path planning and teaming behaviors for UAV swarms. Specifically, we explore the conceptual implementation of a gravitationally inspired spatial organization and path planning method for autonomously controlling a UAV swarm in a simple wayfinding scenario. The method models the drones and the environment itself with characteristics of orbital dynamics. Preliminary experiments are demonstrated, showing the potential applicability to individual drone navigation and obstacle avoidance, along with a proposed concept for swarming behaviors.
Keywords: drones, teaming, swarm, wayfinding, autonomy
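A conceptual sketch of the gravitation-inspired force model is given below: the goal (waypoint) exerts an attractive inverse-square pull on a drone, while obstacles and neighboring drones exert repulsive pushes, and the drone steps along the net force. The gains, step size, and simple Euler update are illustrative assumptions, not the exact dynamics used in the study.

```python
import numpy as np

def net_force(pos, goal, repellers, g_attract=1.0, g_repel=0.5, eps=1e-6):
    """Sum an attractive pull toward the goal and repulsive pushes away from repellers."""
    def inverse_square(src, strength, sign):
        d = src - pos
        r = np.linalg.norm(d) + eps
        return sign * strength * d / r**3       # magnitude ~ strength / r^2
    force = inverse_square(goal, g_attract, +1.0)
    for obstacle in repellers:
        force += inverse_square(obstacle, g_repel, -1.0)
    return force

def step(pos, goal, repellers, dt=0.1):
    """One Euler step of a drone moving along the net gravitational-style force."""
    return pos + dt * net_force(pos, goal, repellers)

# Hypothetical usage: one drone heading to a waypoint while avoiding a neighbor.
# p = np.array([0.0, 0.0])
# for _ in range(1000):
#     p = step(p, goal=np.array([5.0, 5.0]), repellers=[np.array([2.5, 2.4])])
```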
While deep learning is a very popular subfield of machine learning and has been used for decades, an availability gap exists for both knowledge and datasets in unstructured environments or in-the-wild applications. Knowledge of mobility in these free environments is an important stepping-stone for both Department of Defense applications and industrial autonomy applications. A few datasets exist for unstructured environments, such as RELLIS-3D for robotics, RUGD for navigation, and GOOSE for perception; however, due to the limited selection of datasets for this type of environment, most deep learning algorithms have not been thoroughly tested in such scenarios. In this article, we implement multiple deep learning methods on an in-house dataset to evaluate performance. Specifically, this article investigates the performance of pretrained, publicly available YOLOv4, ResNet-50, and Single Shot Detector (SSD) models on detection of unknown object classes encountered in the wild, toward improved, safe, and reliable maneuverability with minimized impediment in unstructured environments. The models are tested using a dataset developed in-house for unstructured environment studies, and their performance is assessed with multiple metrics. The data used in this experiment was collected by the United States Army Corps of Engineers Engineer Research and Development Center.
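Since the abstract leaves the evaluation metrics unspecified, the sketch below shows one representative metric we assume could be among them: greedy matching of predicted boxes to ground-truth boxes at an IoU threshold, yielding per-image precision and recall. Boxes are (x1, y1, x2, y2) tuples, and the example values are made up.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def precision_recall(preds, truths, thresh=0.5):
    """Greedily match each prediction to an unused ground-truth box."""
    matched, tp = set(), 0
    for p in preds:
        candidates = [i for i in range(len(truths)) if i not in matched]
        if not candidates:
            break
        best = max(candidates, key=lambda i: iou(p, truths[i]))
        if iou(p, truths[best]) >= thresh:
            matched.add(best)
            tp += 1
    return tp / max(len(preds), 1), tp / max(len(truths), 1)

print(precision_recall([(0, 0, 10, 10)], [(1, 1, 9, 9)]))   # -> (1.0, 1.0)
```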
The utilization of hyperspectral image data has contributed to improved performance of machine learning tasks by providing spectrally rich information that other, more common sensor data lacks. An issue that can arise when using hyperspectral imagery is that it can often be computationally burdensome to collect and process. This study investigates the incorporation of hyperspectral image data collected on a co-aligned VNIR-SWIR sensor for the purpose of hyperspectral image classification. Specifically, the evaluation focuses on the distinct effects of the VNIR data, the SWIR data, and the combination of the two data types with regard to hyperspectral image classification performance on vehicles. The experiments were run on data collected by the US Army Corps of Engineers Engineer Research and Development Center.
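As a hedged illustration of the three evaluation settings (VNIR only, SWIR only, and combined), the sketch below assumes co-registered cubes of shape (rows, cols, bands) with a per-pixel label map, and uses a random-forest pixel classifier purely as a stand-in for whichever classifier the study employs.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def pixel_accuracy(cube, labels):
    """Train/test a per-pixel classifier on a (rows, cols, bands) cube."""
    X = cube.reshape(-1, cube.shape[-1])
    y = labels.reshape(-1)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

# vnir, swir: hypothetical co-aligned cubes; labels: hypothetical class map.
# acc_vnir     = pixel_accuracy(vnir, labels)
# acc_swir     = pixel_accuracy(swir, labels)
# acc_combined = pixel_accuracy(np.concatenate([vnir, swir], axis=-1), labels)
```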
Recent years have seen the emergence of novel UAV swarm methodologies being developed for numerous applications within the Department of Defense. Such applications include, but are not limited to, search-and-rescue missions; intelligence, surveillance, and reconnaissance activities; and rapid disaster relief assessment. Herein, this article investigates an initial implementation of learning UAV swarm behaviors using reinforcement learning (RL). Specifically, we present a study implementing a leader-follower UAV swarm with RL-learned behaviors in a search-and-rescue task. Experiments are performed through simulations on synthetic data, specifically using a cross-platform flight simulator with an Unreal Engine virtual environment. Performance is assessed by measuring key objective metrics, such as time to complete the mission, redundant actions, stagnation time, and goal success. This article seeks to provide an increased understanding and assessment of current reinforcement learning strategies being developed for controlling (or, at a minimum, suggesting) UAV swarm behaviors.
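The abstract does not name a specific RL algorithm, so the sketch below simply shows the tabular Q-learning update that many swarm-behavior studies start from, with a gridworld-style action set standing in for the flight simulator's control interface; all of this is an assumption for illustration.

```python
import random
from collections import defaultdict

ACTIONS = ["north", "south", "east", "west", "hover"]
Q = defaultdict(float)              # Q[(state, action)] -> value estimate
alpha, gamma, epsilon = 0.1, 0.95, 0.1

def choose_action(state):
    """Epsilon-greedy action selection for the leader agent."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)                       # explore
    return max(ACTIONS, key=lambda a: Q[(state, a)])        # exploit

def update(state, action, reward, next_state):
    """One-step Q-learning update after the simulator returns a reward."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```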
With numerous technologies seeking to utilize deep learning-based object detection algorithms, there is an increased need for innovative approaches to comparing one model to another. Often, models are compared in one of two overarching ways: through performance metrics or through statistical measures on the dataset. One common approach for training an object detector for a new problem is to transfer-learn a model that was often initially trained extensively on the ImageNet dataset; however, why one feature backbone was selected over another is at times overlooked. Additionally, while the benchmark dataset on which a model was trained (e.g., ImageNet or COCO) is usually noted, it is not necessarily considered by many practitioners outside the deep learning research community who seek to implement a state-of-the-art detector for their specific problem. This article proposes new strategies for comparing deep learning models that are associated with the same task, e.g., object detection.
Processing hyperspectral image data can be computationally expensive and difficult to employ for real-time applications due to its extensive spatial and spectral information. Further, applications in which computational resources may be limited, such as those requiring artificial intelligence at the edge, can be hindered by the volume of data that is common with airborne hyperspectral image data. This paper proposes utilizing band selection to down-select the number of spectral bands for a given classification task so that classification can be performed at the edge with lower computational complexity. Specifically, we consider popular techniques for band selection and investigate their ability to identify discriminative bands such that classification performance is not drastically hindered. This would greatly benefit applications where time-sensitive solutions are needed to ensure optimal outcomes (e.g., defense, natural disaster relief/response, and agriculture). Performance of the proposed approach is measured in terms of classification accuracy and run time.
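One popular band-selection baseline is sketched below under the assumption that the pixels have been flattened to a matrix X of shape (pixels, bands) with labels y: bands are ranked by mutual information with the class labels and only the top k are kept for edge-side classification. The choice of criterion and of k is illustrative, not necessarily among the techniques evaluated here.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def select_bands(X, y, k=10):
    """Rank spectral bands by mutual information with the labels; keep the top k."""
    scores = mutual_info_classif(X, y, random_state=0)
    keep = np.argsort(scores)[::-1][:k]
    return np.sort(keep)                        # band indices, in spectral order

# reduced = X[:, select_bands(X, y, k=10)]      # smaller input for edge inference
```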
Object detection remains an important and ever-present component of computer vision applications. While deep learning has been the focal point of much of the research actively being conducted in this area, there still exist certain applications for which such a sophisticated and complex system is not required. For example, if a very specific object or set of objects is to be automatically identified, and these objects' appearances are known a priori, then a much simpler and more straightforward approach known as matched filtering, or template matching, can be a very accurate and powerful tool for object detection. In our previous work, we investigated using machine learning, specifically the improved Evolution COnstructed features framework, to identify (near-)optimal templates for matched filtering given a specific problem. Herein, we explore how different search algorithms, e.g., the genetic algorithm, particle swarm optimization, and the gravitational search algorithm, can derive not only (near-)optimal templates but also promote templates that are more efficient. Specifically, given a defined template for a particular object of interest, can these search algorithms identify a subset of information that enables more efficient detection algorithms while minimizing degradation of detection performance? Performance is assessed in the context of algorithm efficiency, accuracy of the object detection algorithm and its associated false alarm rate, and search algorithm performance. Experiments are conducted on handpicked images of commercial aircraft from the xView dataset, one of the largest publicly available datasets of overhead imagery.
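To make the template-subset idea concrete, the hedged sketch below scores a patch against a template using only the pixels selected by a binary mask, and defines a fitness that a GA, PSO, or GSA could maximize: separate known positive and negative patches while keeping the mask sparse. The normalized-correlation score and the sparsity penalty are our own illustrative choices, not the exact objective used in the study.

```python
import numpy as np

def masked_match_score(patch, template, mask):
    """Normalized correlation between an image patch and the masked template."""
    t = (template * mask).ravel()
    p = (patch * mask).ravel()
    denom = np.linalg.norm(t) * np.linalg.norm(p) + 1e-9
    return float(t @ p) / denom

def fitness(mask, positives, negatives, template, sparsity_weight=0.1):
    """Reward masks that separate positives from negatives using few template pixels."""
    pos = np.mean([masked_match_score(p, template, mask) for p in positives])
    neg = np.mean([masked_match_score(n, template, mask) for n in negatives])
    return (pos - neg) - sparsity_weight * mask.mean()
```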