Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering
Type of publication: Conference paper
Citation: Rangappa_INTERSPEECH_2025
Booktitle: Proc. Interspeech
Year: 2025
Abstract: Fine-tuning pretrained ASR models for specific domains is challenging for small organizations with limited labeled data and computational resources. Here we explore different data selection pipelines and propose a robust approach that improves ASR adaptation by filtering pseudo-labels generated using Whisper (encoder-decoder) and Zipformer (transducer) models. Our approach integrates multiple selection strategies---including word error rate (WER) prediction, named entity recognition (NER), and character error rate (CER) analysis---to extract high-quality training segments. We evaluate our method on Whisper and Zipformer using a 7500-hour baseline, comparing it to a CER-based approach relying on hypotheses from three ASR systems. Fine-tuning on 7500 hours of pseudo-labeled call-center data achieves 12.3% WER, while our filtering reduces the dataset to 100 hours (1.4%) with comparable performance; the same trend holds on Fisher English.
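
To illustrate the filtering idea the abstract describes, the following is a minimal Python sketch of CER-based agreement filtering combined with a predicted-WER gate. It is an assumption-laden illustration, not the paper's implementation: the Segment fields, the select_segments helper, and the thresholds are hypothetical, it compares only two systems rather than the three used in the paper's CER baseline, and the NER stage is omitted for brevity.

    # Minimal sketch (assumptions): each segment carries hypotheses from two
    # ASR systems plus a predicted-WER score; field names and thresholds are
    # illustrative, not the paper's actual values.
    from dataclasses import dataclass

    def edit_distance(a: str, b: str) -> int:
        """Levenshtein distance, computed one row at a time."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    def cer(hyp: str, ref: str) -> float:
        """Character error rate of hyp measured against ref."""
        return edit_distance(hyp, ref) / max(len(ref), 1)

    @dataclass
    class Segment:
        audio_id: str
        whisper_hyp: str       # pseudo-label from the encoder-decoder model
        zipformer_hyp: str     # pseudo-label from the transducer model
        predicted_wer: float   # output of a hypothetical WER-prediction model

    def select_segments(segments, cer_max=0.10, wer_max=0.15):
        """Keep segments where the two systems agree (low cross-hypothesis
        CER) and the predicted WER is below a threshold."""
        kept = []
        for seg in segments:
            cross_cer = cer(seg.whisper_hyp, seg.zipformer_hyp)
            if cross_cer <= cer_max and seg.predicted_wer <= wer_max:
                kept.append(seg)
        return kept

    if __name__ == "__main__":
        data = [
            Segment("utt1", "thanks for calling support",
                    "thanks for calling support", 0.05),
            Segment("utt2", "please hold the line",
                    "police hold a lime", 0.40),
        ]
        for seg in select_segments(data):
            print(seg.audio_id, "->", seg.whisper_hyp)  # only utt1 survives

The intuition is that segments where independent systems produce near-identical transcripts are likely to have accurate pseudo-labels, which is how the pipeline can shrink 7500 hours to roughly 100 high-quality hours.
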
Keywords: data selection, speech recognition, Whisper, Zipformer
Projects: UNIPHORE
ELOQUENCE
Authors: Rangappa, Pradeep
Carofilis, Andrés
Prakash, Jeena
Kumar, Shashi
Burdisso, Sergio
Madikeri, Srikanth
Villatoro-Tello, Esaú
Sharma, Bidisha
Motlicek, Petr
Hacioğlu, Kadri
Venkatesan, Shankar
Vyas, Saurabh
Stolcke, Andreas
Attachments
  • Rangappa_INTERSPEECH_2025.pdf