Idiap Research Institute
Learning to Translate Low-Resourced Swiss German Dialectal Speech into Standard German Text
Type of publication: Conference paper
Citation: Khosravani_ASRU_2021
Publication status: Accepted
Booktitle: IEEE Automatic Speech Recognition and Understanding Workshop
Year: 2021
Month: December
Publisher: IEEE
Location: Cartagena, Colombia
Abstract: For a low-resourced language like Swiss German, which has no standard orthography and varies significantly in its written form, spoken language resources are more likely to come with translations than transcriptions. Moreover, the desired output of an automatic transcription system for Swiss German multi-dialectal speech is Standard German, a requirement driven by applications such as our TV Box voice assistant and broadcast media. Translation is therefore usually required, as Swiss German and Standard German differ on all linguistic levels. Unfortunately, there are not enough parallel text corpora available to train a proper translation system, nor enough in-domain speech translation (ST) data to train an ST system. We investigate an end-to-end approach to multi-dialect Swiss German ST using transfer learning. Our ST model is based on an encoder-decoder architecture in which the encoder is initialized with a cross-lingual speech representation model that is adapted to in-domain Swiss German speech data. We demonstrate that training the decoder on an out-of-domain ST corpus while preserving the encoder, and then fine-tuning on in-domain ST data, can be more effective than a cascade or vanilla direct ST.
Projects: Idiap
Authors: Khosravani, Abbas; Garner, Philip N.; Lazaridis, Alexandros
  • Khosravani_ASRU_2021.pdf