As a rule, specially trained neural networks are used for the recognition and classification of objects in RGB and RGB-D images. The quality of object recognition depends on the quality of neural network training. Since a network cannot go far beyond the limits of its training set, the problems of forming datasets and annotating them correctly are of particular relevance. These tasks are time-consuming and can be difficult to perform in real-world conditions, since it is not always possible to reproduce the required illumination and observation conditions. Synthesized images with a high degree of realism can therefore be used as input data for deep learning. To synthesize realistic images, it is necessary to create appropriate realistic models of scene objects and of illumination and observation conditions, including those achieved with special optical devices. However, this alone is not enough to create a dataset, since thousands of images must be generated, which is hardly possible to do manually. Therefore, an automated solution is proposed that processes the scene automatically: it observes the scene from different angles, modifies it by adding, deleting, moving, or rotating individual objects, and then performs automatic annotation of the desired scene image. As a result, not only directly visible scene objects but also their reflections can be annotated. In addition to the segmented image, a segmented point cloud and a depth-map image (RGB-D) are built, which helps in training neural networks that work with such data. For this purpose, a Python scripting interpreter was built into a realistic rendering system; it allows any action permitted in the user interface to be performed on the scene, so that image synthesis and segmentation can be controlled automatically. The paper provides examples of automatic dataset generation and the results of neural networks trained on the generated data.
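A scripted generation loop of the kind described above can be sketched as follows. The `Scene`, `move_object`, and `render_and_annotate` names below are hypothetical stand-ins for the API exposed by the embedded Python interpreter, not the actual interface of the rendering system:

```python
# Illustrative sketch of an automated dataset-generation loop.
# All class and function names here are hypothetical stand-ins for
# the scripting API embedded in the rendering system.
import random

class Scene:
    """Minimal stand-in for a scripted 3D scene."""
    def __init__(self):
        self.objects = {"cube": (0.0, 0.0, 0.0)}
        self.camera_angle = 0.0

    def move_object(self, name, dx, dy, dz):
        x, y, z = self.objects[name]
        self.objects[name] = (x + dx, y + dy, z + dz)

    def set_camera_angle(self, degrees):
        self.camera_angle = degrees % 360.0

def render_and_annotate(scene):
    """Stand-in for realistic rendering plus automatic segmentation:
    returns a per-object annotation for one scene state."""
    return {name: {"position": pos, "camera": scene.camera_angle}
            for name, pos in scene.objects.items()}

# Generate many annotated samples by varying viewpoint and object placement.
random.seed(0)
scene = Scene()
dataset = []
for i in range(1000):
    scene.set_camera_angle(i * 0.36)  # sweep the camera around the scene
    scene.move_object("cube", random.uniform(-0.01, 0.01), 0.0, 0.0)
    dataset.append(render_and_annotate(scene))

print(len(dataset))  # 1000 annotated samples
```

The point of the sketch is the structure, not the stubs: because every scene modification is an ordinary Python call, the same loop can drive the real renderer to emit segmented RGB, depth-map, and point-cloud data for each sample.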
The current work introduces a neural processor architecture that provides hardware acceleration for convolutional neural networks (CNNs). The purpose of this study is to design the architecture and microarchitecture of a neural processor that can be used for environmental analysis and recognition of objects in the scene in augmented reality systems implemented as energy-efficient, compact wearable devices. The proposed architecture makes it possible to adjust the parameters of the data-processing and data-storage blocks in order to optimize the performance, energy consumption, and hardware resources of the neural processor. The article offers variants of scaling the computing and memory blocks in the neural processor architecture, which can be used to increase the performance of the end product. The paper describes a tool that generates a neural processor from given constraints on power consumption and performance and from the structure of the convolutional neural networks to be used for data processing. The proposed tool can become a valuable product in the field of hardware accelerators for convolutional neural networks, as it increases the degree of automation in the synthesis of neural processors for subsequent implementation in mixed reality systems built as portable devices. In general, this work presents tools that allow a developer of CNN-based software for mixed reality systems to synthesize energy-efficient processors that accelerate the processing of convolutional neural networks.
This paper proposes a stereo matching method that uses a support-point grid to compute the prior disparity. Convolutional neural networks are used to compute the matching cost between pixels in the two images. The network architecture is described, as well as the training process. The method was evaluated on the Middlebury benchmark images. The accuracy obtained when LIDAR data are used as input for the support-point grid is also estimated. This approach can be used in multi-sensor devices and can give an accuracy advantage of up to 15%.
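The role of a disparity prior can be illustrated with a minimal 1-D sketch. Here a constant `prior_d` stands in for the interpolated support-point grid (or LIDAR prior), restricting the search range, and a plain sum-of-absolute-differences cost stands in for the learned CNN matching cost; none of these choices come from the paper itself:

```python
# Minimal 1-D block-matching sketch: a disparity prior (stand-in for the
# support-point grid) limits the search window, and SAD is a stand-in
# for the CNN matching cost described in the abstract.
import numpy as np

def sad_cost(patch_l, patch_r):
    """Sum-of-absolute-differences matching cost (CNN-cost surrogate)."""
    return np.abs(patch_l - patch_r).sum()

def match_row(left, right, prior_d, search=2, win=1):
    """Per-pixel disparity for one scanline, searched around prior_d."""
    h = left.shape[0]
    out = np.zeros(h, dtype=int)
    for x in range(win, h - win):
        best, best_d = np.inf, prior_d
        for d in range(max(0, prior_d - search), prior_d + search + 1):
            if x - d - win < 0:
                continue  # right-image patch would fall off the edge
            c = sad_cost(left[x - win:x + win + 1],
                         right[x - d - win:x - d + win + 1])
            if c < best:
                best, best_d = c, d
        out[x] = best_d
    return out

# Synthetic scanline: the right view is the left view shifted by 3 pixels.
rng = np.random.default_rng(0)
left = rng.random(64)
true_d = 3
right = np.roll(left, -true_d)

disp = match_row(left, right, prior_d=3)
recovered = (disp[4:62] == true_d).mean()
print(f"fraction of interior pixels at true disparity: {recovered:.2f}")
```

With the prior centered on the true shift, the search over only `2 * search + 1` candidates recovers the correct disparity for every interior pixel; the same narrowing is what makes a support-point prior cheap and robust at full image scale.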
Numerical modeling of the influence of aberrations on the image quality of thin periodic structures, taking into account a high entrance numerical aperture (NA), is presented on the basis of the vector theory of diffraction. The simulation algorithm and brief theoretical considerations are discussed. It is shown that the influence of the entrance NA on the image quality of microscope objectives differs for different values of NA, and that the contrast increases with a higher entrance NA.
A theoretical investigation of the distribution of light intensity close to the lens focus is discussed, the distribution itself being treated as a sum of unit-amplitude vector plane waves. Each wave is characterized by a matrix coefficient, a wave vector, a polarization vector, a matrix of polarization orientation, and a Maxwell-Jones vector. This approach makes it easy to take into account polarization effects and aberrations of an optical system in image modeling. Calculations are based on the fast Fourier transform.
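The core idea of the summation can be shown in a deliberately simplified scalar form. This sketch drops the polarization matrices and aberration terms, sums the plane waves directly rather than via FFT, and uses illustrative values for the wavelength and NA; it only demonstrates that a coherent sum of unit plane waves over the aperture cone peaks at the focus:

```python
# Scalar 1-D sketch of the plane-wave summation near focus.
# Polarization matrices and aberrations are omitted; wavelength, NA, and
# sampling density are illustrative assumptions, not values from the paper.
import numpy as np

wavelength = 0.5e-6            # 500 nm, illustrative
k = 2 * np.pi / wavelength     # wavenumber
NA = 0.5
n_waves = 41                   # plane waves sampled across the aperture (1-D)

# Wave-vector components spanning the cone of angles admitted by the NA.
theta = np.linspace(-np.arcsin(NA), np.arcsin(NA), n_waves)
kx = k * np.sin(theta)

# Evaluate the summed field along the x-axis in the focal plane.
x = np.linspace(-2e-6, 2e-6, 201)
field = np.exp(1j * np.outer(x, kx)).sum(axis=1)  # sum of unit plane waves
intensity = np.abs(field) ** 2
intensity /= intensity.max()
```

At the focus (x = 0) all phases align, so the normalized intensity is exactly 1 there and falls off away from it. An aberration would enter this model as an extra angle-dependent phase factor inside the exponential, and a polarization state as a vector weight per wave, which is precisely where the matrix coefficients of the full method attach.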