27 May 2022 DTTNet: depth transverse transformer network for monocular depth estimation
Abstract
Depth estimation is an essential component in understanding the 3D geometry of a scene. In comparison to traditional depth estimation methods such as structure from motion and stereo vision matching, determining depth relations using a single camera is challenging. Recent advancements in convolutional neural networks have accelerated research in monocular depth estimation. However, most existing methods infer depth maps from lower-resolution images due to network capacity and complexity issues. Another challenge in depth estimation is ambiguous and sparse depth maps, which are caused by labeling errors, hardware faults, or occlusions. This paper presents a novel end-to-end trainable convolutional neural network architecture, the depth transverse transformer network (DTTNet). The proposed network is designed and optimized to perform monocular depth estimation. It explores multi-resolution representations to perform pixel-wise depth estimation more accurately. To further improve the accuracy of depth estimation, different kinds of ad hoc networks are then proposed. Extensive computer simulations on the NYU Depth V2 and SUN RGB-D datasets demonstrate the effectiveness of the proposed DTTNet against state-of-the-art methods. DTTNet can potentially optimize depth perception in intelligent systems such as automated driving, video surveillance, computational photography, and augmented reality. The source code is available at https://github.com/shreyaskamathkm/DTTNet
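To illustrate why sparse ground-truth depth maps complicate evaluation, the following minimal sketch computes two commonly used depth errors (root-mean-square error and absolute relative error) over valid pixels only. It is not taken from the paper or its repository: the function name `masked_depth_errors` is hypothetical, and treating non-positive depth values as missing is an assumption about how holes are encoded.

```python
import math


def masked_depth_errors(pred, target):
    """Return (rmse, abs_rel) computed only where ground truth is valid.

    pred and target are 2D lists of depth values with the same shape.
    Assumption: a ground-truth value <= 0 marks a missing measurement.
    """
    sq_sum = 0.0
    rel_sum = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t <= 0:
                # Skip holes in the sparse ground-truth depth map.
                continue
            sq_sum += (p - t) ** 2
            rel_sum += abs(p - t) / t
            count += 1
    if count == 0:
        raise ValueError("no valid ground-truth pixels")
    return math.sqrt(sq_sum / count), rel_sum / count


# Toy 2x2 example: the top-right ground-truth pixel is missing (0.0).
pred = [[2.0, 4.0], [1.0, 3.0]]
target = [[2.0, 0.0], [2.0, 4.0]]
rmse, abs_rel = masked_depth_errors(pred, target)
print(f"RMSE={rmse:.4f}, AbsRel={abs_rel:.4f}")
```

Without the mask, the missing pixel would be scored as a large error (and would divide by zero in the relative term), which is one reason sparse labels need explicit handling.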
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Shreyas Kamath K. M., Srijith Rajeev, Karen Panetta, and Sos S. Agaian "DTTNet: depth transverse transformer network for monocular depth estimation", Proc. SPIE 12100, Multimodal Image Exploitation and Learning 2022, 1210003 (27 May 2022); https://doi.org/10.1117/12.2618535
KEYWORDS
Transformers
Computer programming
Network architectures
Convolution
Convolutional neural networks
Data modeling
Sun