Bias Adaptation for Vocal Tract Length Normalization
Type of publication: Idiap-RR
Citation: Saheer_Idiap-RR-12-2013
Number: Idiap-RR-12-2013
Year: 2013
Month: 4
Institution: Idiap
Abstract: Vocal tract length normalisation (VTLN) is a well-known rapid adaptation technique. Formulated as a linear transformation in the cepstral domain, VTLN yields both a scaling factor and a translation factor. The warping factor is the spectral scaling parameter, while the translation factor, represented by a bias term, captures more speaker characteristics, especially in a rapid adaptation framework, without the risk of over-fitting. This paper presents a complete and comprehensible derivation of the bias transformation for VTLN and implements it in a unified framework for statistical parametric speech synthesis and recognition. The recognition experiments show that the bias term improves rapid adaptation performance and gives a further gain over cepstral mean normalisation. The synthesis results show that the VTLN bias term has little effect when combined with model adaptation techniques that already incorporate a bias transformation.
Projects: Idiap
Authors: Saheer, Lakshmi
Yamagishi, Junichi
Garner, Philip N.
Dines, John
