Idiap Research Institute
Text-only adaptation in LLM-based ASR through text denoising
Type of publication: Conference paper
Citation: Burdisso_ICASSP2026_2026
Publication status: Accepted
Booktitle: ICASSP
Year: 2026
Abstract: Adapting automatic speech recognition (ASR) systems based on large language models (LLMs) to new domains using text-only data is a significant yet underexplored challenge. Standard fine-tuning of the LLM on target-domain text often disrupts the critical alignment between speech and text modalities learned by the projector, degrading performance. We introduce a novel text-only adaptation method that emulates the audio projection task by treating it as a text denoising task. Our approach thus trains the LLM to recover clean transcripts from noisy inputs. This process effectively adapts the model to a target domain while preserving cross-modal alignment. Our solution is lightweight, requiring no architectural changes or additional parameters. Extensive evaluation on two datasets demonstrates up to 22.1% relative improvement, outperforming recent state-of-the-art text-only adaptation methods.
Main Research Program: AI for Everyone
Additional Research Programs: AI for Everyone
Keywords:
Projects: Idiap
UNIPHORE
ELOQUENCE
Authors: Burdisso, Sergio
Villatoro-Tello, Esaú
Carofilis, Andrés
Kumar, Shashi
Hacioğlu, Kadri
Madikeri, Srikanth
Rangappa, Pradeep
E, Manjunath K
Motlicek, Petr
Venkatesan, Shankar
Stolcke, Andreas
Attachments
  • Burdisso_ICASSP2026_2026.pdf
Notes
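The abstract describes training the LLM to recover clean transcripts from noisy inputs, but it does not say how the noisy inputs are produced. The sketch below shows one way such (noisy input, clean target) training pairs could be built from target-domain text. The character-level noise model, the function names `add_noise` and `make_denoising_pairs`, and the probabilities `p_drop`, `p_sub`, and `p_dup` are all illustrative assumptions, not the authors' method.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def add_noise(transcript, rng, p_drop=0.1, p_sub=0.1, p_dup=0.05):
    """Corrupt a clean transcript character by character.

    Hypothetical noise model (an assumption, not taken from the paper):
    each non-space character is dropped, replaced by a random letter,
    duplicated, or kept, according to the given probabilities.
    """
    out = []
    for ch in transcript:
        if ch == " ":
            out.append(ch)  # keep word boundaries untouched
            continue
        r = rng.random()
        if r < p_drop:
            continue  # deletion
        elif r < p_drop + p_sub:
            out.append(rng.choice(ALPHABET))  # substitution
        elif r < p_drop + p_sub + p_dup:
            out.append(ch + ch)  # duplication
        else:
            out.append(ch)  # character kept as is
    return "".join(out)


def make_denoising_pairs(transcripts, seed=0, **noise_kwargs):
    """Build (noisy input, clean target) pairs for text-only fine-tuning."""
    rng = random.Random(seed)  # seeded so the pairs are reproducible
    return [
        {"input": add_noise(t, rng, **noise_kwargs), "target": t}
        for t in transcripts
    ]


if __name__ == "__main__":
    # Made-up target-domain sentences, for illustration only.
    domain_text = ["book a flight to geneva", "cancel my last order"]
    for pair in make_denoising_pairs(domain_text, seed=1):
        print(repr(pair["input"]), "->", repr(pair["target"]))
```

In a setup like this, each pair would be fed to the LLM with the noisy string in place of the projected audio and the clean transcript as the training target. How closely any synthetic noise resembles the projector's real output is an open question that this sketch does not address.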