Idiap Research Institute
Synthetic References for Template-based ASR using Posterior Features
Type of publication: Conference paper
Citation: Soldo_INTERSPEECH_2012
Publication status: Accepted
Booktitle: Proceedings of Interspeech
Year: 2012
Month: September
Location: Portland, Oregon, USA
Abstract: Recently, the use of phoneme class-conditional probabilities as features (posterior features) for template-based ASR has been proposed. These features have been found to generalize well to unseen data and to yield better systems than standard spectral-based features. In this paper, motivated by the high quality of current text-to-speech systems and the robustness of posterior features to undesired variability, we investigate the use of synthetic speech to generate reference templates. Using synthetic speech in template-based ASR not only addresses the issue of in-domain data collection but also allows the vocabulary to be expanded. Using 75- and 600-word task-independent and speaker-independent setups on the Phonebook database, we investigate different synthetic voices produced by the Festival HTS-based synthesizer trained on the CMU ARCTIC databases. Our study shows that synthetic speech templates can yield performance comparable to that of natural speech templates, especially with synthetic voices that have high intelligibility.
Keywords: Posterior features, speech recognition, synthetic reference templates, template-based approach
Projects Idiap
FP 7
Authors Soldo, Serena
Magimai.-Doss, Mathew
Bourlard, Hervé
Attachments
  • Soldo_INTERSPEECH_2012.pdf