Idiap Research Institute
Syllabic Pitch Tuning for Neutral-to-Emotional Voice Conversion
Type of publication: Idiap-RR
Citation: Saheer_Idiap-RR-31-2015
Number: Idiap-RR-31-2015
Year: 2015
Month: 10
Institution: Idiap
Abstract: Prosody plays an important role in both the identification and the synthesis of emotionalized speech. Prosodic features such as pitch are usually estimated and altered at the segmental level, based on short windows of speech over which the signal is expected to be quasi-stationary. This results in a frame-wise change of acoustic parameters when synthesizing emotionalized speech. To convert neutral speech to emotional speech from the same user, it might be better to alter the pitch parameters at the suprasegmental level, for example at the syllable level, since the changes in the signal are then more subtle and smooth. In this paper we aim to show that pitch transformation in a neutral-to-emotional voice conversion system may yield better output speech quality if the transformations are performed at the suprasegmental (syllable) level rather than at the frame level. Subjective evaluation results are presented to show whether the naturalness, speaker similarity and emotion recognition tasks show any performance difference.
Projects: Idiap
Authors: Saheer, Lakshmi; Na, Xingyu; Cernak, Milos
  • Saheer_Idiap-RR-31-2015.pdf (MD5: 7896c667ae95cf669ff894ea592eb52a)
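
The contrast drawn in the abstract between frame-level and syllable-level pitch modification can be illustrated with a minimal sketch. It is not the method of the report: the function names, the use of a multiplicative pitch scale factor per frame, the geometric-mean pooling within a syllable, and the example values are all assumptions made for illustration only.

```python
import numpy as np

def syllable_level_scale(frame_scale, boundaries):
    """Pool per-frame pitch scale factors into one factor per syllable.

    frame_scale: hypothetical per-frame multiplicative F0 factors.
    boundaries:  hypothetical list of (start, end) frame indices, end exclusive.
    Pooling uses the geometric mean (mean in the log domain); this is an
    assumption of the sketch, not something stated in the abstract.
    """
    frame_scale = np.asarray(frame_scale, dtype=float)
    # Frames not covered by any syllable keep their frame-level factor.
    out = frame_scale.copy()
    for start, end in boundaries:
        out[start:end] = np.exp(np.mean(np.log(frame_scale[start:end])))
    return out

def apply_pitch_scale(f0, scale):
    """Scale voiced frames (F0 > 0); unvoiced frames stay at 0."""
    f0 = np.asarray(f0, dtype=float)
    return np.where(f0 > 0, f0 * scale, 0.0)

# Toy F0 contour in Hz (0 marks an unvoiced frame) with two syllables.
f0 = [100.0, 110.0, 0.0, 120.0, 130.0, 125.0]
frame_scale = [1.0, 1.44, 1.2, 1.1, 1.1, 1.1]
boundaries = [(0, 3), (3, 6)]

frame_level = apply_pitch_scale(f0, frame_scale)
syllable_scale = syllable_level_scale(frame_scale, boundaries)
syllable_level = apply_pitch_scale(f0, syllable_scale)
```

In this toy example the frame-level factors jump from 1.0 to 1.44 inside the first syllable, while the pooled version applies a single constant factor across it, which is the kind of smoother change the abstract attributes to suprasegmental processing.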