One of the roles of emergency first responders (e.g., police and fire departments) is to prevent and protect against events that can jeopardize the safety and well-being of a community. In the case of criminal gang activity, tools are needed for finding, documenting, and taking the necessary actions to mitigate the problem. We describe an integrated mobile-based system capable of using location-based services, combined with image analysis, to track and analyze gang activity through the acquisition, indexing, and recognition of gang graffiti images. This approach uses image analysis methods for color recognition, image segmentation, and image retrieval and classification. A database of gang graffiti images is described that includes not only the images but also metadata related to the images, such as date and time, geoposition, gang, gang member, colors, and symbols. The user can then query the data in a useful manner. We have implemented these features both as applications for Android and iOS hand-held devices and as a web-based interface.
We discuss the problem of recognizing the shape of planar objects consisting of "blobs" that can be modeled as Gaussian mixture densities. We describe an empirical comparison method, assuming a large number of independent samples are given for each distribution. Instead of comparing the Gaussian mixtures directly, we compare the underlying distribution of distances of each mixture. Since distances are invariant under rotations and translations, this provides a workaround to the problem of aligning the objects before comparing them, thus speeding up the comparison process. We prove that the distribution of distances is a lossless representation of the shape of generic Gaussian mixtures. Our numerical experiments indicate that, when all the components of the Gaussian mixtures are equally weighted and have the same standard deviation matrix, the proposed method is no less accurate than methods that compare the planar mixtures directly. The extension of our method to the problem of recognizing halftone patterns is briefly discussed.
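The invariance claim in the abstract above can be checked numerically. The following is a minimal sketch, not the authors' comparison method: it draws samples from an equally weighted isotropic Gaussian mixture, applies a rotation and translation, and confirms that the sorted list of pairwise distances is unchanged up to rounding. The function names, mixture centers, and parameters are illustrative assumptions.

```python
import math
import random

def sample_mixture(centers, sigma, n, rng):
    """Draw n points from an equally weighted isotropic Gaussian mixture (assumed model)."""
    points = []
    for _ in range(n):
        cx, cy = rng.choice(centers)
        points.append((rng.gauss(cx, sigma), rng.gauss(cy, sigma)))
    return points

def rigid_motion(points, angle, shift):
    """Rotate every point by `angle` about the origin, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]

def pairwise_distances(points):
    """Sorted distances between all unordered pairs of points."""
    dists = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dists.append(math.dist(points[i], points[j]))
    return sorted(dists)

rng = random.Random(0)
centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]  # hypothetical "blob" centers
samples = sample_mixture(centers, 0.3, 60, rng)
moved = rigid_motion(samples, 1.1, (5.0, -2.0))

# Largest difference between the two sorted distance lists; it should be
# at the level of floating-point rounding, since no alignment is needed.
gap = max(abs(a - b) for a, b in zip(pairwise_distances(samples),
                                     pairwise_distances(moved)))
```

A real comparison would contrast the distance samples of two different mixtures with a statistical test; the sketch only illustrates why no alignment step is needed beforehand.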
In this paper we describe a mobile-based system that allows first responders to identify and track gang graffiti
by combining the use of image analysis and location-based-services. The gang graffiti image and metadata
(geoposition, date and time) obtained automatically are transferred to a server and uploaded to a database of
graffiti images. The database can then be queried with the matched results sent back to the mobile device where
the user can then review the results and provide extra inputs to refine the information.
We present a light-weight method for automatically detecting shapes that have an approximate rotational
symmetry (e.g., a square or equilateral triangle) on discrete-space images. Our motivation is the problem
of automatically detecting and recognizing hazardous material placards on a mobile platform (e.g., a mobile
telephone) equipped with a camera. The proposed method is well-suited for mobile device applications,
which are characterized by limited memory, processing power and battery life. It is based on comparing the
magnitude of the coefficients of the Fourier series of the centralized moments of the Radon transform of the
image after segmentation. However, in our approach, the computation of the Radon transform is bypassed
as we obtain these coefficients directly from the rows of the Pascal Triangle of the segmented image. The
Pascal Triangle of an image is composed of complex moments arranged in a pyramidal fashion similar to
the binomial coefficients. These complex moments are obtained from a coarse segmentation of the shape
represented by a gray-scale image. In particular, the contours of the object do not need to be precisely
defined, and the shape need not be connected. Moreover, our approach is invariant under translation,
rotation, and scaling. We tested our method on images from the MPEG-7 shape database as well as images
from our own database of hazardous material placards.
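A standard property of complex moments helps explain why they are useful here: if a shape is unchanged by a rotation of 2π/n about its centroid, then the central complex moment c(p, q) must vanish unless p - q is a multiple of n. The sketch below is an illustration of that property only, not the Fourier-coefficient test of the paper; it reports the smallest order p for which c(p, 0) is not negligible. The function names and the tolerance are assumptions, and a non-vanishing moment at order p is only a necessary condition for p-fold symmetry.

```python
import math

def central_complex_moment(points, p, q):
    """Sum of (z - centroid)^p * conj(z - centroid)^q over the points, with z = x + iy."""
    zs = [complex(x, y) for x, y in points]
    centroid = sum(zs) / len(zs)
    return sum((z - centroid) ** p * (z - centroid).conjugate() ** q for z in zs)

def first_nonvanishing_order(points, max_order=8, tol=1e-6):
    """Smallest p >= 1 whose normalized |c(p, 0)| exceeds tol, or None if there is none."""
    zs = [complex(x, y) for x, y in points]
    centroid = sum(zs) / len(zs)
    for p in range(1, max_order + 1):
        # Normalizing by the sum of |z|^p makes the test independent of scale.
        scale = sum(abs(z - centroid) ** p for z in zs)
        if scale > 0 and abs(central_complex_moment(points, p, 0)) / scale > tol:
            return p
    return None

# A filled square on a pixel grid, shifted away from the origin.
square = [(x + 7, y + 3) for x in range(-5, 6) for y in range(-5, 6)]

# The three vertices of an equilateral triangle, rotated and shifted.
triangle = [(10 + 2 * math.cos(0.3 + 2 * math.pi * k / 3),
             -4 + 2 * math.sin(0.3 + 2 * math.pi * k / 3)) for k in range(3)]

square_order = first_nonvanishing_order(square)
triangle_order = first_nonvanishing_order(triangle)
```

Because the moments are taken about the centroid and normalized, the result does not depend on where the shape sits in the image or on its size.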
When traveling in a region where the local language is not written using a "Roman alphabet," translating
written text (e.g., documents, road signs, or placards) is a particularly difficult problem since the text cannot
be easily entered into a translation device or searched using a dictionary. To address this problem, we are
developing the "Rosetta Phone," a handheld device (e.g., PDA or mobile telephone) capable of acquiring an
image of the text, locating the region (word) of interest within the image, and producing both an audio and a
visual English interpretation of the text. This paper presents a system targeted for interpreting words written in
Arabic script. The goal of this work is to develop an autonomous, segmentation-free Arabic phrase recognizer,
with computational complexity low enough to deploy on a mobile device. A prototype of the proposed system
has been deployed on an iPhone with a suitable user interface. The system was tested on a number of noisy
images, in addition to the images acquired from the iPhone's camera. It identifies Arabic words or phrases by
extracting appropriate features and assigning "codewords" to each word or phrase. On a dictionary of 5,000
words, the system uniquely mapped (word-image to codeword) 99.9% of the words. The system has an 82%
recognition accuracy on images of words captured using the iPhone's built-in camera.
We propose a method for automatically enhancing the eyes of all the faces in a digital image by whitening
their scleras (i.e., the white part of their eyes). The scleras are identified by combining existing face detection
and feature alignment technology with a color-based sclera probability map. We then smooth, brighten, and
desaturate the scleras. This reduces the appearance of blood vessels and produces a healthier, more "refreshed"
look.
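The brighten-and-desaturate step can be sketched per pixel as follows. This is a hypothetical illustration rather than the authors' pipeline: it assumes RGB values in [0, 1] and a per-pixel sclera probability in [0, 1], it omits the smoothing step, and the `desat` and `brighten` strengths are made-up parameters.

```python
import colorsys

def whiten_pixel(rgb, prob, desat=0.6, brighten=0.2):
    """Desaturate and brighten one RGB pixel in proportion to its sclera probability."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    s_new = s * (1.0 - desat * prob)               # pull the color toward gray/white
    v_new = min(1.0, v * (1.0 + brighten * prob))  # raise brightness, clamped at 1
    return colorsys.hsv_to_rgb(h, s_new, v_new)

def whiten_image(pixels, prob_map):
    """Apply whiten_pixel to a 2D list of RGB tuples using a matching probability map."""
    return [[whiten_pixel(px, pr) for px, pr in zip(row, prow)]
            for row, prow in zip(pixels, prob_map)]

# One slightly reddish "sclera" pixel and one skin pixel with zero sclera probability.
image = [[(0.8, 0.6, 0.55), (0.7, 0.5, 0.4)]]
probs = [[1.0, 0.0]]
result = whiten_image(image, probs)
```

Weighting the adjustment by the probability map leaves pixels outside the sclera untouched and avoids a hard mask boundary.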
The motivating application for this research is the problem of recognizing a planar object consisting of points
from a noisy observation of that object. Given is a planar Gaussian mixture model ρT(x) representing an object
along with a noise model for the observation process (the template). Also given are points representing the
observation of the object (the query). We propose a method to determine if these points were drawn from
a Gaussian mixture ρQ(x) with the same shape as the template. The method consists in comparing samples
from the distribution of distances of ρT(x) and ρQ(x), respectively. The distribution of distances is a faithful
representation of the shape of generic Gaussian mixtures. Since it is invariant under rotations and translations
of the Gaussian mixture, it provides a workaround to the problem of aligning objects before recognizing their
shape without sacrificing accuracy. Experiments using synthetic data show a robust performance against type I
errors, and few type II errors when the given template Gaussian mixtures are well distinguished.
Misalignment between the color planes used to print color images creates undesirable artifacts in printed images.
Color trapping is a technique used to diminish these artifacts. It consists of creating small overlaps between
the color planes, either at the page description language level or the rasterized image level. Existing color
trapping algorithms for rasterized images trap pixels independently. Once a pixel is trapped, the next pixel is
processed without making use of the information already acquired. We propose a more efficient strategy which
makes use of this information. Our strategy is based on the observation of some important properties of color
edges. Combined with any existing algorithm for trapping rasterized images, this strategy significantly reduces
its complexity. We implement this strategy in combination with a previously proposed color trapping algorithm
(WBTA08). Our numerical tests indicate an average reduction of close to 38% in the combined number of
multiplications, additions, and "if" statements required to trap a page, as compared with WBTA08 by itself.
We propose a new method for registering a cloud of points in 2D onto a planar curve. This method does not
require the knowledge of an initial guess for the position of the point cloud and proceeds without having to
order, smooth out or otherwise process the points of the query point cloud in any way. The method consists
in representing the planar curve by an algebraic curve, and in fitting the algebraic curve to the points of the
point cloud by solving a corresponding over-constrained system of polynomial equations. The solution of this
system is obtained using a new solution method for polynomial systems of equations, which we introduce in this
paper. This solution method, which can be seen as an extension of the pseudo-inverse approach to solving linear
systems of equations, naturally handles over-constrained systems of equations in a robust fashion.
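For context, the linear pseudo-inverse approach that the abstract says is being extended can be illustrated with the simplest algebraic curve, a circle. The sketch below is not the polynomial-system solver introduced in the paper, and it fits a free circle rather than registering points onto a known curve. It writes the circle as x² + y² + a·x + b·y + c = 0, which is linear in (a, b, c), and solves the resulting over-constrained system in the least-squares sense through the normal equations. All names are illustrative.

```python
import math
import random

def solve_linear(A, b):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def fit_circle(points):
    """Least-squares algebraic circle fit; returns (center_x, center_y, radius)."""
    rows = [(x, y, 1.0) for x, y in points]
    rhs = [-(x * x + y * y) for x, y in points]
    # Normal equations (A^T A) u = A^T rhs, equivalent to applying the pseudo-inverse.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(3)]
    a, b, c = solve_linear(AtA, Atb)
    cx, cy = -a / 2.0, -b / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - c)

# Unordered points on a circle of center (2, -1) and radius 3; no initial guess is used.
rng = random.Random(1)
angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(20)]
cloud = [(2.0 + 3.0 * math.cos(t), -1.0 + 3.0 * math.sin(t)) for t in angles]
cx, cy, radius = fit_circle(cloud)
```

As in the abstract, the points are used in arbitrary order and without preprocessing; the 20 equations in 3 unknowns are handled by the least-squares solve.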
When traveling in a region where the local language is not written using the Roman alphabet, translating written
text (e.g., documents, road signs, or placards) is a particularly difficult problem since the text cannot be easily
entered into a translation device or searched using a dictionary. To address this problem, we are developing
the "Rosetta Phone," a handheld device (e.g., PDA or mobile telephone) capable of acquiring a picture of the
text, identifying the text within the image, and producing both an audible and a visual English interpretation
of the text. We started with English, as a development language, for which we achieved close to 100% accuracy
in identifying and reading text. We then modified the system to be able to read and translate words written
using the Arabic character set. We currently achieve approximately 95% accuracy in reading words from a small
directory of town names.
CMYK color separation is a technique commonly used in printing to reproduce multi-color images. However,
the color planes are generally not perfectly aligned with respect to each other when they are rendered by the
imaging stations. This phenomenon, called color plane mis-registration, causes gap and halo artifacts. Trapping
algorithms aim to reduce these artifacts by scanning through an image, determining the edges susceptible to
mis-registration errors, and moving the edge boundaries of the lighter colorants underneath the edge boundaries
of the darker colorants. In this paper, we propose a low-complexity approach to automatic color trapping which
hides the effects of small color plane mis-registrations without negatively affecting the overall quality of the
printed image.
Accurately reconstructing the 3D geometry of a scene or object observed on 2D images is a difficult problem: there
are many unknowns involved (camera pose, scene structure, depth factors) and solving for all these unknowns
simultaneously is computationally intensive and suffers from numerical instability. In this paper, we algebraically
decouple some of the unknowns so that they can be solved for independently. Decoupling the pose from the other
variables has been previously discussed in the literature. Unfortunately, pose estimation is an ill-conditioned
problem. In this paper, we algebraically eliminate all the camera pose parameters (i.e., position and orientation)
from the structure-from-motion equations for an internally calibrated camera. We then also fully eliminate the
structure coordinates from the equations. This yields a very simple set of homogeneous polynomial equations of
low degree involving only the depths of the observed points. When considering a small number of tracked points
and pictures (e.g., five points on two pictures), these equations can be solved using the sparse resultant method.
KEYWORDS: Cameras, Atomic force microscopy, Error analysis, Numerical stability, Calibration, Motion models, 3D acquisition, 3D modeling, Reconstruction algorithms, Systems modeling
Structure from motion (SFM) is the problem of reconstructing the geometry of a scene from a stream of images on which features have been tracked. In this paper, we consider a projective camera model and assume that the internal parameters of the camera are known. Our goal is to reconstruct the geometry of the scene up to a rigid motion (i.e., Euclidean reconstruction). It has been shown that estimating the pose of the camera from the images is an ill-conditioned problem, as variations in the camera orientation and camera position cannot be distinguished. Unfortunately, the camera pose parameters are an intrinsic part of current formulations of SFM. This leads to numerical instability in the reconstruction of the scene. Using algebraic methods, we obtain a basis for a new formulation of SFM which does not involve pose estimation and thus eliminates this cause of instability.