Robots are ideal surrogates for performing tasks that are dull, dirty, and dangerous. To fully achieve this ideal, a robotic teammate should be able to autonomously perform human-level tasks in unstructured environments where we do not want humans to go. In this paper, we take a step toward realizing that vision by integrating state-of-the-art advancements in intelligence, perception, and manipulation on the RoMan (Robotic Manipulation) platform. RoMan comprises two 7 degree-of-freedom (DoF) limbs connected to a 1 DoF torso and mounted on a tracked base. Multiple lidars are used for navigation, and a stereo depth camera provides point clouds for grasping. Each limb has a 6 DoF force-torque sensor at the wrist, with a dexterous 3-finger gripper on one limb and a stronger 4-finger claw-like hand on the other. Tasks begin with an operator specifying a mission type, a desired final destination for the robot, and a general region where the robot should look for grasps. All other portions of the task are completed autonomously. This includes navigation, object identification and pose estimation (via deep learning if the object is known, or perception through search otherwise), fine maneuvering, grasp planning via a grasp library, arm motion planning, and manipulation planning (e.g., dragging if the object is deemed too heavy to lift freely). Finally, we present initial test results on two notional tasks: clearing a road of debris, such as a heavy tree or a pile of unknown light debris, and opening a hinged container to retrieve a bag inside it.
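To make the described task flow concrete, the sketch below lays out the stages listed in the abstract as a single mission loop. It is a minimal illustration only: the class names, helper functions, and the 10 kg lift threshold are assumptions made for the example, not the RoMan stack's actual interfaces or parameters.

from dataclasses import dataclass
from typing import Optional, Tuple

MAX_FREE_LIFT_KG = 10.0  # assumed payload limit; heavier objects are dragged instead

@dataclass
class MissionSpec:
    mission_type: str                         # e.g. "clear_debris" or "open_container"
    destination: Tuple[float, float]          # desired final destination for the robot
    grasp_region: Tuple[float, float, float]  # rough (x, y, radius) region to search for grasps

@dataclass
class SceneObject:
    label: str
    is_known: bool
    estimated_mass_kg: float
    pose: Optional[Tuple[float, ...]] = None

def navigate_to(goal) -> None:
    print(f"navigating to {goal} with lidar-based planning")

def perceive(region) -> SceneObject:
    # Known objects: deep-learned detection and pose estimation.
    # Unknown debris: fall back to perception through search.
    print(f"searching region {region} for graspable objects")
    return SceneObject(label="tree_limb", is_known=True, estimated_mass_kg=35.0)

def select_grasp(obj: SceneObject) -> str:
    print(f"querying grasp library for '{obj.label}'")
    return "grasp_candidate_0"

def execute_mission(spec: MissionSpec) -> None:
    navigate_to(spec.grasp_region)            # coarse navigation, then fine maneuvering
    obj = perceive(spec.grasp_region)
    grasp = select_grasp(obj)
    print(f"planning arm motion to reach {grasp}")
    if obj.estimated_mass_kg > MAX_FREE_LIFT_KG:
        print(f"object too heavy to lift freely; dragging to {spec.destination}")
    else:
        print(f"lifting and carrying to {spec.destination}")

if __name__ == "__main__":
    execute_mission(MissionSpec("clear_debris", (12.0, 4.0), (10.0, 3.5, 1.0)))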
Visual perception has become a core technology in autonomous robotics for identifying and localizing objects of interest to ensure successful and safe task execution. As part of the recently concluded Robotics Collaborative Technology Alliance (RCTA) program, a collaborative research effort among government, academic, and industry partners, a vision acquisition and processing pipeline was developed and demonstrated to support manned-unmanned teaming for Army-relevant applications. The perception pipeline provided accurate and cohesive situational awareness to support autonomous robot capabilities for maneuver in dynamic and unstructured environments, collaborative human-robot mission planning and execution, and mobile manipulation. Development of the pipeline involved a) collecting domain-specific data, b) curating ground-truth annotations, e.g., bounding boxes and keypoints, c) retraining deep networks to obtain updated object detection and pose estimation models, and d) deploying and testing the trained models on ground robots. We discuss the process of delivering this perception pipeline under tight time and resource constraints and without a priori knowledge of the operational environment. We focus on experiments conducted to optimize the models despite noisy data with sparse examples for some object classes. Additionally, we discuss the augmentation techniques used to enhance the data set given its skewed class distribution. These efforts highlight initial work directly related to learning and updating visual perception systems quickly in the field under sudden environment or mission changes.
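As one illustration of handling a skewed class distribution, the sketch below oversamples minority classes with a label-preserving augmentation until every class reaches the size of the largest one. The toy data layout and the augment() placeholder are assumptions made for the example, not the augmentation techniques actually used in the RCTA pipeline.

import random
from collections import Counter

def augment(sample):
    # Placeholder for a label-preserving augmentation (flip, crop, color jitter, ...).
    copy = dict(sample)
    copy["augmented"] = True
    return copy

def rebalance(samples, target_per_class=None):
    # Oversample each minority class until it reaches target_per_class
    # (by default, the count of the most frequent class).
    counts = Counter(s["label"] for s in samples)
    target = target_per_class or max(counts.values())
    balanced = list(samples)
    for label, count in counts.items():
        pool = [s for s in samples if s["label"] == label]
        balanced.extend(augment(random.choice(pool)) for _ in range(target - count))
    random.shuffle(balanced)
    return balanced

if __name__ == "__main__":
    # Skewed toy set: many examples of one class, few of another.
    data = [{"label": "barrel"}] * 50 + [{"label": "crate"}] * 5
    print(Counter(s["label"] for s in rebalance(data)))  # roughly equal counts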