Idiap Research Institute
Combining Vocal Tract Length Normalization with Linear Transformations in a Bayesian Framework
Type of publication: Idiap-RR
Citation: Saheer_Idiap-RR-11-2012
Number: Idiap-RR-11-2012
Year: 2012
Month: 4
Institution: Idiap
Abstract: Recent research has demonstrated the effectiveness of vocal tract length normalization (VTLN) as a rapid adaptation technique for statistical parametric speech synthesis. VTLN produces speech with naturalness preferable to that of MLLR-based adaptation techniques, being much closer in quality to that generated by the original average voice model. By contrast, with just a single parameter, VTLN captures very few speaker specific characteristics when compared to the available linear transform based adaptation techniques. This paper proposes that the merits of VTLN can be combined with those of linear transform based adaptation technique in a Bayesian framework, where VTLN is used as the prior information. A novel technique of propagating the gender information from the VTLN prior through constrained structural maximum a posteriori linear regression (CSMAPLR) adaptation is presented. Experiments show that the resulting transformation has improved speech quality with better naturalness, intelligibility and improved speaker similarity.
Projects Idiap
EMIME
Authors Saheer, Lakshmi
Yamagishi, Junichi
Garner, Philip N.
Dines, John
Crossref by Saheer_ICASSP_2012
Attachments
  • Saheer_Idiap-RR-11-2012.pdf (MD5: 58efa32eea25ee5d4fdc47cc66b359fb)