Paper
The speech scale, the Mel scale, and the tube model for speech
6 December 2002
Srinivasan Umesh, Leon Cohen, Douglas J. Nelson
Abstract
We use the tube model of speech production to study the speech-hearing connection. Recently, using real speech, we showed that sounds made by different individuals and perceived to be the same can be transformed into each other by a universal warping function. We call this transformation function the speech scale, and we have shown that it is similar to the Mel scale, thus experimentally establishing the speech-hearing connection. In this paper we explore the possible origins of the speech scale and attempt to understand it from the point of view of the tube model of speech. We use the two-tube model for various vowels and study the effect of varying the lengths of the tubes on the locations of the formant frequencies. We show that, under the common assumption that the length of the front tube changes much less than that of the back tube across different individuals enunciating the same sound, the corresponding formant frequencies are non-uniformly scaled. Using the same method that we used for real speech, we compute the warping function.
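The non-uniform scaling described in the abstract can be illustrated with a minimal sketch of the textbook lossless two-tube model. This is not the authors' code or method. It assumes a back tube closed at the glottis and a front tube open at the lips, for which the resonances satisfy tan(k·l_back)·tan(k·l_front) = A_front/A_back with k = 2πf/c. The function names, the tube dimensions, and the speed of sound (35000 cm/s) are illustrative assumptions and are not taken from the paper.

```python
import math

SPEED_OF_SOUND = 35000.0  # cm/s; assumed textbook value, not from the paper


def resonance_condition(f, l_back, l_front, a_back, a_front, c=SPEED_OF_SOUND):
    """Zero exactly where tan(k*l_back) * tan(k*l_front) = a_front / a_back.

    Written in product form to avoid the poles of the tangent.
    """
    k = 2.0 * math.pi * f / c
    return (a_front * math.cos(k * l_back) * math.cos(k * l_front)
            - a_back * math.sin(k * l_back) * math.sin(k * l_front))


def two_tube_formants(l_back, l_front, a_back, a_front,
                      f_max=4000.0, step=1.0, c=SPEED_OF_SOUND):
    """Return resonance frequencies (Hz) below f_max, in ascending order.

    Scans a frequency grid for sign changes and refines each by bisection.
    """
    def g(f):
        return resonance_condition(f, l_back, l_front, a_back, a_front, c)

    roots = []
    f_lo, g_lo = step, g(step)
    while f_lo < f_max:
        f_hi = f_lo + step
        g_hi = g(f_hi)
        if g_lo == 0.0:
            roots.append(f_lo)  # grid point happens to be an exact root
        elif g_lo * g_hi < 0.0:
            lo, hi, g_at_lo = f_lo, f_hi, g_lo
            for _ in range(60):  # bisection refinement
                mid = 0.5 * (lo + hi)
                g_mid = g(mid)
                if g_at_lo * g_mid <= 0.0:
                    hi = mid
                else:
                    lo, g_at_lo = mid, g_mid
            roots.append(0.5 * (lo + hi))
        f_lo, g_lo = f_hi, g_hi
    return roots


# Hypothetical /a/-like configuration: narrow back tube, wide front tube.
# Only the back-tube length differs between the two "speakers".
reference = two_tube_formants(l_back=9.0, l_front=8.0, a_back=1.0, a_front=7.0)[:3]
shorter = two_tube_formants(l_back=7.5, l_front=8.0, a_back=1.0, a_front=7.0)[:3]

for n, (f_ref, f_short) in enumerate(zip(reference, shorter), start=1):
    print(f"F{n}: {f_ref:7.1f} Hz -> {f_short:7.1f} Hz  (ratio {f_short / f_ref:.3f})")
```

If the whole tract were scaled uniformly, every formant would shift by the same ratio. In this sketch only the back tube is shortened, so the printed ratios differ from formant to formant, which is the kind of non-uniform scaling the abstract refers to.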
© (2002) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Srinivasan Umesh, Leon Cohen, and Douglas J. Nelson "The speech scale, the Mel scale, and the tube model for speech", Proc. SPIE 4791, Advanced Signal Processing Algorithms, Architectures, and Implementations XII, (6 December 2002); https://doi.org/10.1117/12.456493
KEYWORDS
Commercial off the shelf technology
Composites
Fourier transforms
Ear
Mouth
Acoustics
Defense and security

