Poster + Paper
Real-time deep learning semantic segmentation for 3-D augmented reality
28 November 2023
V. Voronin, E. Semenishchev, A. Zelensky, M. Zhdanova, N. Gapon
Abstract
Augmented reality is a visualization technology that displays information by overlaying virtual images on the real world. In many cases, augmented reality requires recognition of the current scene, yet extracting foreground objects from video in real time on limited hardware, such as a smartphone, is demanding. An augmented reality system must maintain a model of the environment that indicates where objects should be detected or masks applied. One way to recognize a scene without prior information is to use semantic segmentation. This article proposes a new neural network architecture for efficient semantic image segmentation in the task of building augmented reality. The developed architecture combines ShuffleNet V2 with the Dense Prediction Cell (DPC), achieving good performance through a balance between predictive accuracy and efficiency. First, the ShuffleNet V2 backbone extracts features from RGB images. The resulting feature maps are then passed to a DeepLab V3+ Dense Prediction Cell encoder. At the final stage, the features are decoded by bilinear interpolation to produce segmentation masks. The augmented reality construction algorithm is based on the ARCore framework and the OpenGL interface for embedded systems (OpenGL ES). The proposed approach recognizes scene objects in augmented reality via semantic segmentation, providing information in real time. The implementation shows that detected objects can be tracked in 3-D space using visual-inertial odometry without constantly updating the environment model. The frequency of object detection and semantic mask generation can therefore be reduced, saving battery and processing power, which is critical for mobile and embedded systems. The semantic information provided by this solution can be used in autonomous driving, robotics navigation, localization, and scene recognition under limited resources.
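The decoding step mentioned above can be illustrated with a minimal sketch: low-resolution per-class score maps are upsampled by bilinear interpolation and converted to a segmentation mask via per-pixel argmax. This is not the paper's implementation; the tiny score maps and function names below are made-up stand-ins for real network output.

```python
# Sketch only: bilinear decoding of per-class score maps into a segmentation
# mask, as described in the abstract. Pure Python, no frameworks assumed.

def bilinear_upsample(grid, out_h, out_w):
    """Upsample a 2-D list-of-lists to (out_h, out_w) via bilinear interpolation."""
    in_h, in_w = len(grid), len(grid[0])
    out = []
    for i in range(out_h):
        # Map output coordinates back into input space (align-corners convention).
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bot = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bot * wy)
        out.append(row)
    return out

def decode_masks(class_scores, out_h, out_w):
    """Upsample each per-class score map, then take the per-pixel argmax."""
    up = [bilinear_upsample(g, out_h, out_w) for g in class_scores]
    return [[max(range(len(up)), key=lambda c: up[c][i][j])
             for j in range(out_w)] for i in range(out_h)]

# Two hypothetical 2x2 score maps: class 0 strong on the left, class 1 on the right.
scores = [
    [[1.0, 0.0], [1.0, 0.0]],   # class 0
    [[0.0, 1.0], [0.0, 1.0]],   # class 1
]
mask = decode_masks(scores, 4, 4)  # each row: [0, 0, 1, 1]
```

In a real pipeline the score maps come from the DPC encoder and the upsampling target is the camera-frame resolution; frameworks perform this step on the GPU rather than in Python loops.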
In augmented reality, the proposed approach can remove objects from a scene, draw attention to objects, or supply scene-recognition results to application logic. The experimental results confirm the high efficiency of the proposed method compared with state-of-the-art techniques for real-time 3-D augmented reality construction.
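The battery-saving scheme the abstract describes — running the segmentation network only occasionally and letting visual-inertial tracking keep the last mask anchored in 3-D between runs — can be sketched as follows. This is an assumption-laden illustration, not the paper's code: `segment`, `track_with_odometry`, and the keyframe interval are hypothetical stand-ins.

```python
# Sketch only: reuse anchored segmentation masks between keyframes so the
# expensive network runs infrequently, as the abstract suggests for mobile AR.

SEGMENT_EVERY = 10  # hypothetical interval: run the network once per 10 frames

def segment(frame):
    """Stand-in for the ShuffleNet V2 + DPC network; returns a fake mask record."""
    return {"mask_for_frame": frame}

def track_with_odometry(anchored, frame):
    """Stand-in for ARCore-style visual-inertial tracking of an anchored mask."""
    return {"reused_from": anchored["mask_for_frame"], "shown_at": frame}

def process(frames):
    results, anchored, network_runs = [], None, 0
    for idx, frame in enumerate(frames):
        if anchored is None or idx % SEGMENT_EVERY == 0:
            anchored = segment(frame)       # expensive: fresh semantic mask
            network_runs += 1
            results.append(anchored)
        else:
            # cheap: 3-D pose tracking keeps the old mask aligned with the scene
            results.append(track_with_odometry(anchored, frame))
    return results, network_runs

results, runs = process(list(range(25)))  # network runs 3 times (frames 0, 10, 20)
```

The design point is that tracking an anchor is far cheaper than a forward pass, so lowering the keyframe rate trades mask freshness for battery life on mobile and embedded hardware.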
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
V. Voronin, E. Semenishchev, A. Zelensky, M. Zhdanova, and N. Gapon "Real‐time deep learning semantic segmentation for 3-D augmented reality", Proc. SPIE 12772, Real-time Photonic Measurements, Data Management, and Processing VII, 127720L (28 November 2023); https://doi.org/10.1117/12.2691152
KEYWORDS
Augmented reality
Image segmentation
Semantics
Neural networks
RGB color model
Video
Detection and tracking algorithms