Boosting under-resourced speech recognizers by exploiting out of language data - Case study on Afrikaans

Type of publication:	Conference paper
Citation:	Imseng_SLTU_2012
Publication status:	Published
Booktitle:	Proceedings of the 3rd International Workshop on Spoken Languages Technologies for Under-resourced Languages
Year:	2012
Month:	May
Pages:	60--67
Location:	Cape Town
Abstract:	Under-resourced speech recognizers may benefit from data in languages other than the target language. In this paper, we boost the performance of an Afrikaans speech recognizer by using already available data from other languages. To successfully exploit available multilingual resources, we use posterior features, estimated by multilayer perceptrons that are trained on similar languages. For two different acoustic modeling techniques, Tandem and Kullback-Leibler divergence based HMMs, the proposed multilingual system yields more than 10% relative improvement compared to the corresponding monolingual systems trained only on Afrikaans.
Keywords:	Afrikaans, multilingual speech recognition, posterior features, under-resourced languages
Projects:	Idiap IM2 SNSF-MULTI
Authors:	Imseng, David; Bourlard, Hervé; Garner, Philip N.
Crossref by:	Imseng_Idiap-RR-15-2012
Attachments
Imseng_SLTU_2012.pdf