Idiap Research Institute
Conversational Speech Recognition Needs Data? Experiments with Austrian German
Type of publication: Conference paper
Citation: Linke_LREC_2022
Publication status: Published
Booktitle: Proceedings of the 13th Language Resources and Evaluation Conference
Year: 2022
Month: June
Pages: 4684--4691
Organization: European Language Resources Association
Address: Marseille, France
Crossref: Idiap-Internal-RR-10-2022
URL: http://www.lrec-conf.org/proce...
Abstract: Conversational speech is among the most complex automatic speech recognition (ASR) tasks, owing to high inter-speaker variation in both pronunciation and conversational dynamics. This complexity is especially hard to handle in low-resourced (LR) scenarios. Recent developments in self-supervision have allowed such scenarios to take advantage of large amounts of otherwise unrelated data. In this study, we characterise an LR Austrian German conversational task. We begin with a non-pre-trained baseline and show that fine-tuning a model pre-trained using self-supervision leads to improvements consistent with those in the literature; this extends to cases where a lexicon and language model are included. We also show that the advantage of pre-training indeed arises from the larger database rather than from the self-supervision. Further, using a leave-one-conversation-out technique, we demonstrate that robustness problems remain with respect to inter-speaker and inter-conversation variation. This serves to guide where future research might best be focused in light of the current state of the art.
Projects: Idiap
Authors: Linke, Julian; Garner, Philip N.; Kubin, Gernot; Schuppler, Barbara