Three-dimensional (3D) imaging has recently been applied to human gesture recognition using depth maps from RGB-D sensors. An alternative that has been scarcely explored is 3D Integral Imaging, which has been shown to give very competitive results in object reconstruction and recognition tasks, even under challenging conditions (e.g., low illumination or occlusions). Integral Imaging has remarkable advantages over other 3D sensing modalities such as RGB-D sensors, most notably its long working range: many such sensors lose accuracy at depths of 2 m or more. In this paper we present results on the application of the Integral Imaging 3D acquisition technique to the recognition of human gestures in the presence of occlusions that may hinder recognition. We also compare its performance against that of an RGB-D sensor (Kinect) and against that obtained when only one camera of the array is used. Our results show that Integral Imaging performs comparably to Kinect and to the monocular case when there are no occlusions, but much more favorably when occlusions are present. We also show that camera spatial resolution is an issue to account for in gesture recognition under occlusions in the monocular case, whereas Integral Imaging is less sensitive to it, because the features extracted from Integral Imaging appear to be more descriptive and discriminative than their monocular counterparts.
Imaging systems based on microstructured illumination and single-pixel detection offer several advantages over conventional imaging techniques. They are an effective method for imaging through scattering media, even when the medium is dynamic. They work efficiently under low light levels, and the simplicity of the detector makes it easy to design imaging systems operating outside the visible spectrum and to acquire multidimensional information. In particular, several approaches have been proposed to record 3D information. The technique is based on sampling the object with a sequence of microstructured light patterns codified onto a programmable spatial light modulator while the light intensity is measured with a single-pixel detector. The image is retrieved computationally from the photocurrent fluctuations provided by the detector. In this contribution we describe an optical system able to produce full-color stereoscopic images using only a few simple optoelectronic components. In our setup we use an off-the-shelf digital light projector (DLP) based on a digital micromirror device (DMD) to generate the light patterns. To capture the color of the scene we take advantage of the codification procedure used by the DLP for color video projection. To record stereoscopic views we use a 90° beam splitter and two mirrors, allowing us to project the patterns from two different viewpoints. By using a single monochromatic photodiode we obtain a pair of color images that can be used as input to a 3D display. To reduce the pattern-projection time we use a compressive sampling algorithm. Experimental results are shown.
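The sampling-and-retrieval step described above can be sketched numerically. The following is a minimal simulation, not the authors' implementation: it assumes Hadamard illumination patterns, splits each ±1 pattern into a complementary binary pair (a DMD can only switch micromirrors on or off), and retrieves the image from the differential bucket signal. The toy scene values are illustrative.

```python
import numpy as np
from scipy.linalg import hadamard

# Toy scene (illustrative values only, not data from the paper).
n = 8
N = n * n
rng = np.random.default_rng(0)
scene = rng.random(N)

# A DMD displays binary on/off mirror states, so each +1/-1 Hadamard
# pattern is split into a complementary 0/1 pair and the two bucket
# readings are subtracted (differential measurement).
H = hadamard(N)                     # rows are the +1/-1 patterns
P_on = (1 + H) / 2                  # mirrors "on"
P_off = (1 - H) / 2                 # mirrors "off"

# Single-pixel ("bucket") signal: one scalar per projected pattern pair.
y = P_on @ scene - P_off @ scene    # equals H @ scene

# Computational retrieval: the Hadamard basis is orthogonal (H H^T = N I),
# so the image follows from a single transpose multiply.
recon = (H.T @ y) / N

print(np.allclose(recon, scene))    # True: exact recovery with a full basis
```

With a full pattern basis the recovery is exact; the compressive sampling algorithm mentioned in the abstract reduces the number of projected patterns below N at the cost of an approximate reconstruction.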
KEYWORDS: Integral imaging, 3D image processing, Cameras, Video, Gesture recognition, 3D image reconstruction, Visualization, 3D acquisition, 3D displays, Image resolution
In this keynote address paper, we present an overview of our previously published work on the application of pattern recognition techniques and integral imaging for human gesture recognition.
Although imaging systems that scan a single element benefit from mature technology, they suffer from acquisition times that scale linearly with spatial resolution. A promising option is to use a single-pixel system that benefits from data collection strategies based on compressive sampling. Single-pixel systems also offer the possibility of using dedicated sensors, such as a fiber spectrometer for multispectral imaging or a distribution of photodiodes for 3D imaging. The image is obtained by lighting the scene with microstructured masks implemented onto a programmable spatial light modulator. The masks act as generalized measurement modes in which the object information is expressed, and the image is recovered through algebraic optimization. The fundamental reason why the bucket detection strategy can outperform conventional optical array detection is the use of a single-channel detector that simultaneously integrates all the photons transmitted through the patterned scene. We demonstrate that spatial frequencies not transmitted by low-quality collection optics are nevertheless present in the retrieved image. Our work makes two specific contributions within the field of single-pixel imaging through patterned illumination. First, we demonstrate that single-pixel imaging improves the resolution of conventional imaging systems, overcoming the Rayleigh criterion; an analysis of resolution using a low-NA microscope objective imaging onto a CCD camera shows that single-pixel cameras are not limited by the optical quality of the collection optics. Second, we experimentally demonstrate the capability of our technique to properly recover an image even when an optical diffuser is placed between the sample and the bucket detector.
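As a rough illustration of the compressive-sampling idea (not the authors' reconstruction algorithm), the sketch below keeps only a quarter of the Hadamard "bucket" measurements of a toy scene and recovers the image algebraically. The oracle coefficient selection and the scene itself are assumptions made for the demo; practical systems fix the pattern subset in advance or use a sparsity-promoting (l1-minimization) solver.

```python
import numpy as np
from scipy.linalg import hadamard

# Toy 16x16 scene (illustrative only; not data from the paper).
n = 16
N = n * n
scene = np.zeros((n, n))
scene[4:12, 4:12] = 1.0         # a simple bright square
x = scene.ravel()

H = hadamard(N)                 # generalized measurement modes (+1/-1 rows)
y = H @ x                       # bucket detector: one scalar per pattern

# Compressive sampling sketch: keep only M of the N measurements. Here an
# oracle keeps the largest-magnitude coefficients, which stands in for a
# proper sparsity-promoting reconstruction.
M = N // 4
keep = np.argsort(-np.abs(y))[:M]

# Algebraic recovery via the adjoint of the kept modes (unmeasured
# coefficients are implicitly set to zero).
x_hat = (H[keep].T @ y[keep]) / N

err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
print(f"relative error with {M}/{N} measurements: {err:.4f}")
```

Because this blocky scene is highly compressible in the Hadamard basis, a quarter of the measurements already yields an accurate reconstruction, which is the point of replacing raster scanning with compressive sampling.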