CONF
Imseng_INTERSPEECH_2011/IDIAP
Improving non-native ASR through stochastic multilingual phoneme space transformations
Imseng, David
Bourlard, Hervé
Dines, John
Garner, Philip N.
Magimai-Doss, Mathew
EXTERNAL
https://publications.idiap.ch/attachments/papers/2011/Imseng_INTERSPEECH_2011.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/Imseng_Idiap-RR-19-2011
Related documents
Proceedings of Interspeech
Florence, Italy
2011
537-540
We propose a stochastic phoneme space transformation technique that allows the conversion of conditional source phoneme posterior probabilities (conditioned on the acoustics) into target phoneme posterior probabilities. The source and target phonemes can be in any language and phoneme format such as the International Phonetic Alphabet. The novel technique makes use of a Kullback-Leibler divergence based hidden Markov model and can be applied to non-native and accented speech recognition or used to adapt systems to under-resourced languages. In this paper, and in the context of hybrid HMM/MLP recognizers, we successfully apply the proposed approach to non-native English speech recognition on the HIWIRE dataset.
REPORT
Imseng_Idiap-RR-19-2011/IDIAP
Improving non-native ASR through stochastic multilingual phoneme space transformations
Imseng, David
Bourlard, Hervé
Dines, John
Garner, Philip N.
Magimai-Doss, Mathew
multilingual acoustic modeling
universal phoneme set
EXTERNAL
https://publications.idiap.ch/attachments/reports/2011/Imseng_Idiap-RR-19-2011.pdf
PUBLIC
Idiap-RR-19-2011
2011
Idiap
June 2011
We propose a stochastic phoneme space transformation technique that allows the conversion of conditional source phoneme posterior probabilities (conditioned on the acoustics) into target phoneme posterior probabilities. The source and target phonemes can be in any language and phoneme format such as the International Phonetic Alphabet. The novel technique makes use of a Kullback-Leibler divergence based hidden Markov model and can be applied to non-native and accented speech recognition or used to adapt systems to under-resourced languages. In this paper, and in the context of hybrid HMM/MLP recognizers, we successfully apply the proposed approach to non-native English speech recognition on the HIWIRE dataset.