CONF
Rasipuram_ICASSP_2011/IDIAP
Integrating articulatory features using Kullback-Leibler divergence based acoustic model for phoneme recognition
Rasipuram, Ramya
Magimai-Doss, Mathew
https://publications.idiap.ch/index.php/publications/showcite/Rasipuram_Idiap-RR-02-2011
Related documents
Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
2011
5192 - 5195
10.1109/ICASSP.2011.5947527
doi
In this paper, we propose a novel framework to integrate articulatory features (AFs) into an HMM-based ASR system. This is achieved by using posterior probabilities of different AFs (estimated by multilayer perceptrons) directly as observation features in a Kullback-Leibler divergence based HMM (KL-HMM) system. On the TIMIT phoneme recognition task, the proposed framework yields a phoneme recognition accuracy of 72.4%, which is comparable to that of a KL-HMM system using posterior probabilities of phonemes as features (72.7%). Furthermore, the best performance, 73.5% phoneme recognition accuracy, is achieved by jointly modeling AF probabilities and phoneme probabilities as features. This shows the efficacy and flexibility of the proposed approach.
REPORT
Rasipuram_Idiap-RR-02-2011/IDIAP
Integrating Articulatory Features using Kullback-Leibler Divergence based Acoustic Model for Phoneme Recognition
Rasipuram, Ramya
Magimai-Doss, Mathew
articulatory features
Automatic Speech Recognition
Kullback-Leibler divergence based hidden Markov model
multilayer perceptron
phonemes
posterior probabilities
EXTERNAL
https://publications.idiap.ch/attachments/reports/2010/Rasipuram_Idiap-RR-02-2011.pdf
PUBLIC
Idiap-RR-02-2011
2011
Idiap
February 2011
In this paper, we propose a novel framework to integrate articulatory features (AFs) into an HMM-based ASR system. This is achieved by using posterior probabilities of different AFs (estimated by multilayer perceptrons) directly as observation features in a Kullback-Leibler divergence based HMM (KL-HMM) system. On the TIMIT phoneme recognition task, the proposed framework yields a phoneme recognition accuracy of 72.4%, which is comparable to that of a KL-HMM system using posterior probabilities of phonemes as features (72.7%). Furthermore, the best performance, 73.5% phoneme recognition accuracy, is achieved by jointly modelling AF probabilities and phoneme probabilities as features. This shows the efficacy and flexibility of the proposed approach.
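The abstract describes using posterior probability vectors directly as observation features and scoring them with a Kullback-Leibler divergence. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each HMM state holds one categorical distribution per feature stream, that the local score is KL(state || frame), and that streams (for example AF posteriors and phoneme posteriors) are combined by summing their divergences. The function names, the direction of the divergence, the summation rule, and the toy values are all illustrative assumptions that the record does not specify.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps guards against log(0); terms with p_i == 0 contribute nothing.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def local_score(state_streams, frame_streams):
    """Sum of per-stream KL divergences (assumed combination rule).

    state_streams: one categorical distribution per stream for an HMM state.
    frame_streams: the matching posterior vectors observed at one frame.
    """
    if len(state_streams) != len(frame_streams):
        raise ValueError("state and frame must have the same number of streams")
    return sum(kl_divergence(y, z) for y, z in zip(state_streams, frame_streams))


# Hypothetical toy values: stream 0 has three classes, stream 1 has two.
state = [[0.7, 0.2, 0.1], [0.9, 0.1]]
frame = [[0.6, 0.3, 0.1], [0.8, 0.2]]
score = local_score(state, frame)
```

A lower score means the frame's posteriors lie closer to the state's distributions, so in a Viterbi-style decoder under these assumptions the score would act as a cost to minimise.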