Improvement of the learning process of the automated speaker recognition system for critical use with HMM-DNN component

Mykola M. Bykov; Viacheslav V. Kovtun; Iryna M. Kobylyanska; Waldemar Wójcik; Saule Smailova

doi:10.1117/12.2536888

6 November 2019


Proceedings Volume 11176, Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2019; 1117620 (2019) https://doi.org/10.1117/12.2536888
Event: Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2019, 2019, Wilga, Poland

Abstract

The article presents the results of adapting the hybrid HMM-DNN speech synthesis model for use in an automated speaker recognition system for critical use (ASRSCU). In particular, two aspects have been improved: the learning process of the HMM-DNN speech synthesis model, with estimation of the difference between the posterior probability distributions of all HMM states and the actual posterior probability distribution calculated by the DNN, and the use of semantic information in the speaker recognition process. This information is described by the features observed in the sequence of frames into which the input phonogram is divided. The results improve the efficiency of text-dependent speaker recognition when the ASRSCU is used in a noisy acoustic environment. The article formulates measures for the structural integration of the HMM-DNN component into the ASRSCU and describes the practical aspects of this process. In particular, it substantiates the choice of the type and normalization method for the frame-level vectors of basic informative features, determines the number of HMM states and the GMM parameters depending on the parameters of the chosen formation model, and describes the procedure for interpreting the recognition results. The paper also formulates measures to optimize the learning process of an ASRSCU with the HMM-DNN component that will be operated in noisy environments.
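The abstract does not name the measure used to estimate the difference between the HMM state posteriors and the DNN output. As an illustrative sketch only, the snippet below assumes the Kullback-Leibler divergence, averaged over the frames of one phonogram. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Turn raw DNN outputs for one frame into a posterior over HMM states."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two distributions over the same HMM states.

    Assumption: KL divergence stands in for the unspecified difference measure.
    """
    return sum(
        t * math.log(t / max(p, eps))
        for t, p in zip(target, predicted)
        if t > 0.0  # terms with zero target probability contribute nothing
    )

def mean_frame_divergence(target_posteriors, frame_logits):
    """Average the per-frame divergence over all frames of one phonogram."""
    per_frame = [
        kl_divergence(t, softmax(z))
        for t, z in zip(target_posteriors, frame_logits)
    ]
    return sum(per_frame) / len(per_frame)

# Hypothetical toy data: 2 frames, 3 HMM states.
targets = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
logits = [[2.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
loss = mean_frame_divergence(targets, logits)
```

In a training loop, a quantity of this kind would typically be minimized with respect to the DNN weights; whether the authors use this exact form can only be confirmed from the full article.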

Citation

Mykola M. Bykov, Viacheslav V. Kovtun, Iryna M. Kobylyanska, Waldemar Wójcik, and Saule Smailova "Improvement of the learning process of the automated speaker recognition system for critical use with HMM-DNN component", Proc. SPIE 11176, Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2019, 1117620 (6 November 2019); https://doi.org/10.1117/12.2536888
