In this paper, we describe a novel algorithm based on diffraction tomography for 3D map generation using the backscattered radar electromagnetic (EM) field received by spatially distributed multi-static radar sensors. At a given time, one sensor transmits a radar waveform, and all sensors, including the transmitting one, receive the waveform backscattered from the objects. By changing the sensor locations, a data cube of received data is created at each sensor. This data cube is used to generate the 3D object profiles at each sensor, and a fused 3D map, containing the fused 3D object profiles or structures obtained from all sensors, is then output. If there is more than one object in the field of interest, inter-object backscattering occurs, so the sensors receive mixed signals, which can degrade the generated 3D map/structure. To reduce the effect of inter-object backscattering, we use a probabilistic blind source separation (BSS) technique for convolutive mixture separation. Before applying the mixture separation technique, we estimate the number of sources with a technique we have developed. In this paper, all these techniques are described, and results using real radar backscattered data are provided. A description of how these 3D maps can be used for biomimetics is also provided.
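The abstract does not detail the authors' source-counting technique. As a point of reference only, the sketch below implements a standard eigenvalue-based alternative, the MDL criterion of Wax and Kailath, applied to the sample covariance of the multi-sensor snapshots; all names and shapes are illustrative assumptions.

```python
import numpy as np

def estimate_num_sources(snapshots):
    """Estimate the number of sources with the MDL criterion
    (Wax & Kailath), a standard stand-in for the paper's own
    technique. `snapshots` is a (p, N) array: p sensor channels,
    N time snapshots."""
    p, N = snapshots.shape
    # Sample covariance of the received data across sensors.
    R = snapshots @ snapshots.conj().T / N
    # Eigenvalues in descending order, clipped for numerical safety.
    lam = np.clip(np.sort(np.linalg.eigvalsh(R))[::-1], 1e-12, None)
    mdl = np.empty(p)
    for k in range(p):
        tail = lam[k:]                       # candidate noise-subspace eigenvalues
        geo = np.exp(np.mean(np.log(tail)))  # geometric mean
        ari = np.mean(tail)                  # arithmetic mean
        mdl[k] = -N * (p - k) * np.log(geo / ari) \
                 + 0.5 * k * (2 * p - k) * np.log(N)
    return int(np.argmin(mdl))               # k minimizing MDL = source count
```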
A low bit rate speech coder based on Gaussian adaptive wavelets is described. This speech coder overcomes the problem of having to use two different mother wavelets to model voiced and unvoiced sounds. In addition, we demonstrate in this paper that Gaussian adaptive wavelets model both voiced and unvoiced sounds better than the Morlet and Daubechies wavelets. The bit rate and speech quality of this coder are compared with those of a speech coder based on the Morlet wavelet. Experimental results using the TIMIT speech database are discussed with examples.
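The abstract does not give the exact parameterization of the Gaussian adaptive wavelet. Below is a minimal sketch of the usual form, a Gaussian envelope modulating a cosine with adaptable shift, scale, and modulation frequency; the coder would quantize and transmit only the per-atom parameters, which is where the low bit rate comes from. The function and values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gaussian_wavelet(t, shift, scale, freq):
    """One Gaussian adaptive wavelet atom: a Gaussian envelope
    modulating a cosine. shift/scale/freq are the parameters an
    adaptation loop would tune per speech frame (assumed form)."""
    u = (t - shift) / scale
    return np.exp(-0.5 * u**2) * np.cos(freq * u)

# Example: a 20 ms frame at 8 kHz, approximated by one adapted atom.
# A real coder would fit several atoms and transmit their quantized
# (shift, scale, freq, amplitude) tuples instead of the samples.
t = np.linspace(0.0, 0.02, 160)
frame = gaussian_wavelet(t, shift=0.01, scale=0.002, freq=40.0)
```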
Our objective is to demonstrate the applicability of adaptive wavelets to speech applications. In particular, we discuss two applications: classification of unvoiced sounds and speaker identification. First, we describe a method to classify unvoiced sounds using adaptive wavelets, which would help in developing a unified algorithm for classifying phonemes (speech sounds). Next, we show that adaptive wavelets can identify speakers from very short speech data (one pitch period). The described text-independent, phoneme-based speaker identification algorithm identifies a speaker by first modeling the phonemes and then clustering all the phonemes belonging to the same speaker into one class. For both applications, we use a feed-forward neural network architecture. We demonstrate the performance of both the unvoiced sound classifier and the speaker identification algorithm on representative real speech examples.
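One plausible front end, assumed here rather than taken from the paper, is to correlate the one-pitch-period segment against a small bank of adapted wavelet atoms and pass the correlation magnitudes to the network as a fixed-length feature vector:

```python
import numpy as np

def wavelet_features(segment, bank):
    """Project a one-pitch-period segment onto a bank of adapted
    wavelet atoms (same length, unit norm) and return correlation
    magnitudes as features for the neural-network classifier.
    (Illustrative sketch, not the paper's feature extractor.)"""
    seg = segment / (np.linalg.norm(segment) + 1e-12)  # energy-normalize
    return np.array([abs(np.dot(seg, w)) for w in bank])
```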
In this paper, we describe a text-independent, phoneme-based speaker identification system that uses adaptive wavelets to model the phonemes. The system identifies a speaker by modeling a very short segment of each phoneme and then clustering all the phonemes belonging to the same speaker into one class. Classification is performed by a two-layer feed-forward neural network. The performance of this speaker identification system is demonstrated on phonemes extracted from various sentences spoken by three speakers in the TIMIT acoustic-phonetic speech corpus.
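The abstract specifies only that the classifier is a two-layer feed-forward network. A minimal numpy sketch of such a network, with a tanh hidden layer and a softmax output trained by plain gradient descent, is given below; the layer sizes, learning rate, and epoch count are assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_two_layer(X, y, hidden=16, classes=3, lr=0.1, epochs=500):
    """Two-layer feed-forward classifier: tanh hidden layer,
    softmax output, cross-entropy loss, batch gradient descent.
    X: (n_samples, n_features) wavelet features; y: integer labels."""
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, classes)); b2 = np.zeros(classes)
    Y = np.eye(classes)[y]                       # one-hot targets
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                 # hidden activations
        Z = H @ W2 + b2
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)        # softmax posteriors
        G = (P - Y) / n                          # output-layer error
        dW2 = H.T @ G; db2 = G.sum(0)
        dH = G @ W2.T * (1 - H**2)               # backprop through tanh
        dW1 = X.T @ dH; db1 = dH.sum(0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return W1, b1, W2, b2
```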
KEYWORDS: Wavelets, Neural networks, Acoustics, Fast wavelet transforms, Visual information processing, Sensors, Statistical analysis, Feature extraction, Error analysis, Signal to noise ratio
In this paper, we describe a method to represent and classify unvoiced sounds using the concept of super wavelets. A super wavelet is a linear combination of wavelets that can itself be treated as a wavelet. Since unvoiced sounds are high-frequency and noise-like, we use the Daubechies wavelet of order three to generate the super wavelet. The wavelet parameters for representing and classifying unvoiced sounds are generated using neural networks. Although this paper addresses both signal representation and classification, the emphasis is on classification, since it is natural to tune the wavelets adaptively while training the classifier so as to select the wavelet coefficients that carry the most information for discriminating between the classes. We demonstrate the applicability of this method to the representation and classification of unvoiced sounds with representative examples.
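Taking the definition above literally, a super wavelet is a weighted sum of dilated and translated copies of a mother wavelet. The sketch below builds one from the sampled Daubechies order-three wavelet in PyWavelets; the weights, dilations, and translations shown are placeholders for values a neural network would learn, not the paper's.

```python
import numpy as np
import pywt

# Sampled Daubechies-3 mother wavelet psi on the grid x.
_, psi, x = pywt.Wavelet('db3').wavefun(level=8)

def super_wavelet(t, weights, dilations, shifts):
    """Super wavelet s(t) = sum_k a_k * psi(d_k * t - t_k):
    a linear combination of dilated/translated db3 wavelets.
    Interpolates the sampled mother wavelet; zero outside support."""
    s = np.zeros_like(t, dtype=float)
    for a, d, t0 in zip(weights, dilations, shifts):
        s += a * np.interp(d * t - t0, x, psi, left=0.0, right=0.0)
    return s

# Placeholder parameters; a trained network would supply these.
t = np.linspace(0.0, 5.0, 512)
s = super_wavelet(t, weights=[1.0, -0.5], dilations=[1.0, 2.0], shifts=[0.0, 1.5])
```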
Methods are presented for adaptively generating wavelet templates for signal representation and classification using neural networks. Representation and classification require different network structures and energy functions, and both are given. The idea of a "super-wavelet" is introduced: a linear combination of wavelets that is itself treated as a wavelet. The super-wavelet allows the shape of the wavelet to adapt to a particular problem, which goes beyond adapting the parameters of a fixed-shape wavelet. Simulations are given for 1-D signals, and the concepts extend to imagery. Ideas are also discussed for applying these concepts to phoneme and speaker recognition.
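For the representation task, one natural energy function, assumed here rather than quoted from the paper, is the squared reconstruction error; with the dilations and translations held fixed, adapting the super-wavelet combination weights reduces to gradient descent on a least-squares objective. The separate classification energy function mentioned above is not shown.

```python
import numpy as np

def adapt_weights(signal, atoms, lr=0.01, epochs=2000):
    """Adapt super-wavelet combination weights a_k to minimize
    E = ||signal - sum_k a_k * atoms[k]||^2 by gradient descent.
    atoms: (K, T) array of sampled wavelet templates; the factor
    of 2 in dE/da is folded into the learning rate lr."""
    a = np.zeros(atoms.shape[0])
    for _ in range(epochs):
        residual = signal - a @ atoms   # current reconstruction error
        a += lr * (atoms @ residual)    # step along -dE/da
    return a
```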
Since the wavelet transform (WT), a time-scale representation, is linear by definition, it does not have any cross terms. However, since signal processors often use plots of the quadratic squared magnitude, i.e., the energy distribution of the WT, to represent a signal, there exist nonlinear cross terms that can cause problems when analyzing multicomponent signals. In this paper, we show that these WT cross terms do exist and discuss their nature and geometry by deriving mathematical expressions for the energy distribution of the WT of a multicomponent signal. From these expressions, we infer that the nature of the WT 'cross terms' is comparable to that of the cross terms found in the Wigner distribution (WD), a quadratic time-frequency representation, and in the short-time Fourier transform (STFT) of closely spaced signals. The 'cross terms' of the WT and STFT energy distributions occur where the respective transforms of the two components intersect in the time-scale and time-frequency planes, whereas for the WD, cross terms occur at the midpoint in time and frequency. The parameters of the cross terms are a function of the difference in frequency and time of the superimposed signals. In all three cases, the amplitude of these cross terms can be as large as twice the product of the magnitudes of the transforms of the two signals in question. We also discuss the significance of WT cross terms when analyzing a multicomponent signal, with representative examples.
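The origin of the cross term is visible in one line. For a two-component signal $x = x_1 + x_2$, linearity gives $W_x = W_{x_1} + W_{x_2}$, and the energy distribution expands as

\[
\lvert W_x(a,b)\rvert^{2}
= \lvert W_{x_1}(a,b)\rvert^{2} + \lvert W_{x_2}(a,b)\rvert^{2}
+ 2\,\operatorname{Re}\bigl\{ W_{x_1}(a,b)\, W_{x_2}^{*}(a,b) \bigr\}.
\]

The last term is the cross term. Its magnitude is bounded by $2\lvert W_{x_1}(a,b)\rvert\,\lvert W_{x_2}(a,b)\rvert$, consistent with the statement above that it can reach twice the product of the two transform magnitudes, and it vanishes wherever the two transforms do not overlap in the time-scale plane, which is why WT cross terms appear only at the intersection of the components' WT supports.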