CONF
BenZeghiba_icslp-02/IDIAP
User-Customized Password Speaker Verification based on HMM/ANN and GMM Models
BenZeghiba, Mohamed Faouzi
Bourlard, Hervé
EXTERNAL
https://publications.idiap.ch/attachments/reports/2002/rr02-10.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/benzeghiba-02a
Related documents
International Conference on Spoken Language Processing (ICSLP 2002)
2002
Denver, CO, USA
1325-1328
IDIAP-RR 02-10
In this paper, we present a new approach to user-customized password speaker verification that combines the advantages of hybrid HMM/ANN systems, which use Artificial Neural Networks (ANN) to estimate the emission probabilities of Hidden Markov Models (HMM), with those of Gaussian Mixture Models (GMM). We exploit the properties of hybrid HMM/ANN systems, which usually yield high phonetic recognition rates, to automatically infer the baseline phonetic transcription (HMM topology) associated with the user-customized password from a few enrollment utterances, using a large, speaker-independent ANN. The emission probabilities of the resulting HMMs are then modeled in terms of speaker-specific/adapted multi-Gaussian HMMs or a speaker-specific/adapted ANN. In the proposed approach, the hybrid HMM/ANN system is used as a model for utterance (password) verification, while a speaker-independent GMM is still used for speaker verification. Results, reported as equal error rates (EER), are compared to a state-of-the-art text-dependent approach using multi-Gaussian HMMs only.
REPORT
BenZeghiba-02a/IDIAP
User-Customized Password Speaker Verification based on HMM/ANN and GMM Models
BenZeghiba, Mohamed Faouzi
Bourlard, Hervé
EXTERNAL
https://publications.idiap.ch/attachments/reports/2002/rr02-10.pdf
PUBLIC
Idiap-RR-10-2002
2002
IDIAP
published in ICSLP 2002
In this paper, we present a new approach to user-customized password speaker verification that combines the advantages of hybrid HMM/ANN systems, which use Artificial Neural Networks (ANN) to estimate the emission probabilities of Hidden Markov Models (HMM), with those of Gaussian Mixture Models (GMM). We exploit the properties of hybrid HMM/ANN systems, which usually yield high phonetic recognition rates, to automatically infer the baseline phonetic transcription (HMM topology) associated with the user-customized password from a few enrollment utterances, using a large, speaker-independent ANN. The emission probabilities of the resulting HMMs are then modeled in terms of speaker-specific/adapted multi-Gaussian HMMs or a speaker-specific/adapted ANN. In the proposed approach, the hybrid HMM/ANN system is used as a model for utterance (password) verification, while a speaker-independent GMM is still used for speaker verification. Results, reported as equal error rates (EER), are compared to a state-of-the-art text-dependent approach using multi-Gaussian HMMs only.