CONF
Motlicek_INTERSPEECH2013_2013/IDIAP
Crosslingual Tandem-SGMM: Exploiting Out-Of-Language Data for Acoustic Model and Feature Level Adaptation
Motlicek, Petr
Imseng, David
Garner, Philip N.
EXTERNAL
https://publications.idiap.ch/attachments/papers/2013/Motlicek_INTERSPEECH2013_2013.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/Motlicek_Idiap-RR-39-2013
Related documents
ISCA - International Speech Communication Association - Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013)
Lyon, France
2013
ISCA
510-514
2308-457X
Recent studies have shown that speech recognizers may benefit from data in languages other than the target language through efficient acoustic model- or feature-level adaptation. Crosslingual Tandem-Subspace Gaussian Mixture Models (SGMM) can successfully combine acoustic model- and feature-level adaptation techniques. More specifically, we focus on under-resourced languages (Afrikaans in our case) and perform feature-level adaptation through the estimation of phone class posterior features with a Multilayer Perceptron that was trained on data from a similar language with large amounts of available speech data (Dutch in our case). The same Dutch data can also be exploited at the acoustic-model level by training globally shared SGMM parameters in a crosslingual way. The two adaptation techniques are indeed complementary and result in a crosslingual Tandem-SGMM system that yields a relative improvement of about 22% compared to a standard speech recognizer on an Afrikaans phoneme recognition task. Interestingly, an eventual score-level combination of the individual SGMM systems yields an additional 3% relative improvement.
REPORT
Motlicek_Idiap-RR-39-2013/IDIAP
Crosslingual Tandem-SGMM: Exploiting Out-Of-Language Data for Acoustic Model and Feature Level Adaptation
Motlicek, Petr
Imseng, David
Garner, Philip N.
Acoustic model adaptation
Automatic Speech Recognition
under-resourced languages
EXTERNAL
https://publications.idiap.ch/attachments/reports/2013/Motlicek_Idiap-RR-39-2013.pdf
PUBLIC
Idiap-RR-39-2013
2013
Idiap
Rue Marconi 19, Martigny, Switzerland
November 2013
Recent studies have shown that speech recognizers may benefit from data in languages other than the target language through efficient acoustic model- or feature-level adaptation. Crosslingual Tandem-Subspace Gaussian Mixture Models (SGMM) can successfully combine acoustic model- and feature-level adaptation techniques. More specifically, we focus on under-resourced languages (Afrikaans in our case) and perform feature-level adaptation through the estimation of phone class posterior features with a Multilayer Perceptron that was trained on data from a similar language with large amounts of available speech data (Dutch in our case). The same Dutch data can also be exploited at the acoustic-model level by training globally shared SGMM parameters in a crosslingual way. The two adaptation techniques are indeed complementary and result in a crosslingual Tandem-SGMM system that yields a relative improvement of about 22% compared to a standard speech recognizer on an Afrikaans phoneme recognition task. Interestingly, an eventual score-level combination of the individual SGMM systems yields an additional 3% relative improvement.