Speech recognition with speech synthesis models by marginalising over decision tree leaves



Type of publication:	Conference paper
Citation:	Dines_INTERSPEECH-2_2009
Booktitle:	Proceedings of Interspeech
Year:	2009
Month:	9
Location:	Brighton, U.K.
Crossref:	Dines_Idiap-RR-17-2009: Speech recognition with speech synthesis models by marginalising over decision tree leaves, Dines, John, Saheer, Lakshmi and Liang, Hui, Idiap-RR-17-2009
Abstract:	There has been increasing interest in the use of unsupervised adaptation for the personalisation of text-to-speech (TTS) voices, particularly in the context of speech-to-speech translation. This requires that we are able to generate adaptation transforms from the output of an automatic speech recognition (ASR) system. An approach that utilises unified ASR and TTS models would seem to offer an ideal mechanism for the application of unsupervised adaptation to TTS since transforms could be shared between ASR and TTS. Such unified models should use a common set of parameters. A major barrier to such parameter sharing is the use of differing contexts in ASR and TTS. In this paper we propose a simple approach that generates ASR models from a trained set of TTS models by marginalising over the TTS contexts that are not used by ASR. We present preliminary results of our proposed method on a large vocabulary speech recognition task and provide insights into future directions of this work.
Keywords:	decision trees, speech recognition, speech synthesis, unified models
Projects:	EMIME
Authors:	Dines, John; Saheer, Lakshmi; Liang, Hui
Attachments
Dines_INTERSPEECH-2_2009.pdf
Notes
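The abstract describes generating ASR models from TTS models by marginalising over the TTS contexts that ASR does not use. The sketch below is only an illustration of that general idea, not the authors' implementation: it assumes each TTS decision tree leaf is a one-dimensional Gaussian, that each leaf carries an occupancy count, and that the marginal for an ASR context is an occupancy-weighted mixture of the matching leaves. The toy data, the context labels, and the names `TTS_LEAVES`, `marginalise` and `asr_likelihood` are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy data, not taken from the paper. Each TTS leaf is
# (asr_context, extra_tts_context, occupancy, mean, variance).
TTS_LEAVES = [
    ("a-b+c", "stressed", 30.0, 0.0, 1.0),
    ("a-b+c", "unstressed", 10.0, 2.0, 1.0),
    ("x-y+z", "stressed", 5.0, -1.0, 0.5),
]


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def marginalise(leaves):
    """Group TTS leaves by ASR context and drop the TTS-only context.

    Assumption: the weight of each leaf is its occupancy normalised
    within the ASR context, so each ASR context maps to a mixture of
    (weight, mean, variance) components.
    """
    grouped = defaultdict(list)
    for asr_ctx, _tts_ctx, occ, mean, var in leaves:
        grouped[asr_ctx].append((occ, mean, var))
    mixtures = {}
    for asr_ctx, comps in grouped.items():
        total = sum(occ for occ, _, _ in comps)
        mixtures[asr_ctx] = [(occ / total, mean, var) for occ, mean, var in comps]
    return mixtures


def asr_likelihood(mixtures, asr_ctx, x):
    """Likelihood of observation x under the marginalised ASR model."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in mixtures[asr_ctx])


ASR_MIXTURES = marginalise(TTS_LEAVES)
```

Calling `asr_likelihood(ASR_MIXTURES, "a-b+c", 0.5)` evaluates the two-component mixture built from the stressed and unstressed leaves. A real system would use multivariate state output distributions and weights estimated during training rather than these toy values.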