This work introduces a neural processor architecture for hardware acceleration of convolutional neural network (CNN) processing. The purpose of the study is to design the architecture and microarchitecture of a neural processor for scene analysis and recognition of objects in augmented reality systems implemented as energy-efficient, compact wearable devices. The proposed architecture allows the variable parameters of the data processing and storage blocks to be adjusted to optimize the performance, energy consumption, and resource usage of the neural processor. Variants of scaling the computing and memory blocks of the architecture are proposed for increasing the performance of the end product. The paper also describes a tool that generates a neural processor from given constraints on power consumption and performance and from the structure of the convolutional neural networks to be processed. The proposed tool can become a valuable product in the field of designing hardware accelerators for convolutional neural networks, as it increases the degree of automation in synthesizing neural processors for implementation in mixed reality systems made as portable devices. Overall, this work provides tools that allow a developer of CNN-based software for mixed reality systems to synthesize energy-efficient processors that accelerate CNN processing.
KEYWORDS: 3D modeling, Visual process modeling, Point clouds, Depth maps, Cameras, Sensors, Modeling, Medical imaging, Machine learning, Imaging systems
Monocular vision-based 3D-scanning systems have revolutionized object and environment capture. The authors have developed a system that uses a rotating camera and an angle sensor to capture images from different viewpoints. Depth maps are extracted from the images using deep learning techniques, and the resulting depth map is used to generate a 3D point-cloud model, which is preprocessed to improve efficiency. The research considers several methods for improving the depth accuracy of panoramic images, as well as for recalibrating depth maps and smoothing 360-degree 3D models; the proposed method produces better results than the pre-trained model. LiDAR distance measurements are used to further increase model accuracy. A web framework was developed to visualize the model: the rendered model can be accessed through a web browser, which provides functionality such as coordinate selection and distance calculation.
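The step from a depth map to a 3D point cloud can be sketched as a pinhole-camera back-projection. This is a minimal illustration, not the authors' pipeline; the intrinsics `fx`, `fy`, `cx`, `cy` and the toy depth map are hypothetical:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map into an (N, 3) point cloud using a
    pinhole camera model with hypothetical intrinsics fx, fy, cx, cy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    # Stack per-pixel coordinates and drop invalid (zero-depth) points
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]

# Example: a 2x2 depth map with one missing (zero-depth) pixel
depth = np.array([[1.0, 2.0], [0.0, 1.5]])
cloud = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(cloud.shape)  # (3, 3): three valid points, each with x, y, z
```

A real system would follow this with the preprocessing the abstract mentions, such as outlier removal and downsampling of the cloud.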
This work investigates the applicability of depth determination methods to the problem of building 3D models of rooms. The authors propose a combined method of disparity estimation and statistical signal processing that uses auxiliary data obtained by laser illumination of the scene. Solutions used for forming 3D models of objects or of the surrounding space are reviewed, leading to the choice of stereo reconstruction as the most appropriate method for building the scanning system. A method for finding disparity by naive gradient descent is presented, along with results obtained with the scanning system.
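The idea of finding disparity by naive gradient descent can be illustrated with a toy sketch: a single scalar disparity between two 1D image rows is estimated by descending on the photometric error with a numeric gradient. The signal, learning rate, and step count below are assumptions for illustration, not the authors' formulation:

```python
import numpy as np

def estimate_disparity(left, right, d0=0.0, lr=0.5, steps=300):
    """Minimise E(d) = sum_x (left(x) - right(x + d))^2 over a scalar
    disparity d by naive gradient descent, using a central-difference
    numeric gradient and linear interpolation for subpixel sampling."""
    x = np.arange(len(left), dtype=float)
    d = d0
    eps = 1e-3
    for _ in range(steps):
        e_plus = np.sum((left - np.interp(x + d + eps, x, right)) ** 2)
        e_minus = np.sum((left - np.interp(x + d - eps, x, right)) ** 2)
        d -= lr * (e_plus - e_minus) / (2 * eps)
    return d

# Toy rows: the right row is the left row shifted by 3 pixels
x = np.arange(30, dtype=float)
left = np.exp(-((x - 10.0) ** 2) / 20.0)
right = np.interp(x - 3.0, x, left)   # right(x) = left(x - 3)
print(round(estimate_disparity(left, right), 2))
```

Gradient descent of this kind only converges when the initial guess lies within the basin of attraction of the true shift, which is one reason such methods are combined with auxiliary data like the laser illumination mentioned above.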
Stereo matching is one of the most important computer vision tasks. Several methods can be used to compute the matching cost between two images. This paper proposes a method that uses convolutional neural networks to compute the matching cost. The network architecture is described, as well as the training process. The matching cost metric based on the output of the neural network is applied to a baseline method that uses a grid of support points (ELAS). The proposed method was tested on the Middlebury benchmark images and showed an accuracy improvement over the baseline.
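The structure of a convolutional matching cost can be sketched as a small convolutional feature extractor followed by a cosine-similarity cost. The filters here are random stand-ins for learned weights, so this shows only the shape of such a cost, not the trained network from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical fixed 3x3 filters standing in for trained weights
FILTERS = rng.standard_normal((8, 3, 3))

def conv_features(patch):
    """Valid 3x3 convolution + ReLU over a 2D patch, flattened and
    L2-normalised into a single feature vector."""
    h, w = patch.shape
    feats = []
    for f in FILTERS:
        out = np.zeros((h - 2, w - 2))
        for i in range(h - 2):
            for j in range(w - 2):
                out[i, j] = np.sum(patch[i:i + 3, j:j + 3] * f)
        feats.append(np.maximum(out, 0.0).ravel())
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-8)

def matching_cost(patch_l, patch_r):
    """Cost = 1 - cosine similarity; identical patches give cost ~0."""
    return 1.0 - float(conv_features(patch_l) @ conv_features(patch_r))

p = rng.standard_normal((9, 9))
q = rng.standard_normal((9, 9))
print(matching_cost(p, p))                       # ~0 for identical patches
print(matching_cost(p, q) > matching_cost(p, p))  # True
```

In a trained network the filters (and usually several stacked layers) are learned so that matching patches map to similar feature vectors, and the resulting cost replaces hand-crafted metrics inside the support-point method.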
The article describes an approach that reconstructs the image formed by a video see-through mixed reality system so that it corresponds to the convergence of the user's eyes. Convergence is determined from the positions of the user's pupils, acquired from the eye tracking system of the mixed reality device. The image reconstruction method relies on an extended (2.5-dimensional) representation of the image, obtained, for example, with a 3D scanner that builds a depth map of the scene. In the proposed solution, the lens optical systems that form images of the real world on the LCD screens, and the eyepieces that project these images into the user's eyes, do not change their characteristics or positions. The image is reconstructed by projecting the points of the original image onto the image points corresponding to the required convergence, "refocusing" each point according to its distance. The advantages and disadvantages of this method are shown. An approach is also proposed that reduces the visual discomfort caused by an ambiguous distance to an image point, for example in the case of mirror-like or transparent objects. Virtual prototyping of the mixed reality system showed that the proposed approach reduces the visual discomfort caused by the mismatch between the convergence of the human eyes and the images formed by the lenses of the mixed reality system.
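The per-point "refocusing" can be sketched as a forward warp of a 2.5D image row, in which each pixel is shifted by the disparity induced by its depth relative to the convergence distance. The shift model and all parameters below are simplifying assumptions for illustration, not the authors' exact formulation:

```python
import numpy as np

def refocus_row(colors, depths, fx, half_ipd, z_conv):
    """Forward-warp one image row so that points at the convergence
    distance z_conv keep their positions, while nearer or farther
    points shift by the induced disparity. Assumed shift model:
    shift(u) = fx * half_ipd * (1/z - 1/z_conv). Collisions are
    resolved with a 1D z-buffer (nearest point wins)."""
    w = len(colors)
    out = np.zeros_like(colors)
    zbuf = np.full(w, np.inf)
    for u in range(w):
        z = depths[u]
        shift = fx * half_ipd * (1.0 / z - 1.0 / z_conv)
        v = int(round(u + shift))
        if 0 <= v < w and z < zbuf[v]:
            zbuf[v] = z
            out[v] = colors[u]
    return out

colors = np.arange(5.0)
# Points lying exactly at the convergence distance are unchanged
print(refocus_row(colors, np.full(5, 2.0), fx=100.0, half_ipd=0.03, z_conv=2.0))
```

The z-buffer makes the occlusion handling explicit; the ambiguous-distance case discussed in the abstract (mirror-like or transparent objects) is exactly where a single `depths[u]` value per pixel is insufficient.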
This paper proposes a stereo matching method that uses a grid of support points to compute the prior disparity. Convolutional neural networks are used to compute the matching cost between pixels in the two images. The network architecture is described, as well as the training process. The method was evaluated on the Middlebury benchmark images, and the accuracy obtained when LIDAR data are used as input for the support point grid is reported. This approach can be used in multi-sensor devices and can improve accuracy by up to 15%.