TokenVerse++: Towards Flexible Multitask Learning with Dynamic Task Activation
| Type of publication: | Conference paper |
| Citation: | Kumar_IEEEASRU2025_2025 |
| Publication status: | Accepted |
| Booktitle: | 2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) |
| Year: | 2025 |
| Publisher: | IEEE |
| Abstract: | Token-based multitasking frameworks like TokenVerse require all training utterances to have labels for all tasks, hindering their ability to leverage partially annotated datasets and scale effectively. We propose TokenVerse++, which introduces learnable vectors in the acoustic embedding space of the XLSR-Transducer ASR model for dynamic task activation. This core mechanism enables training with utterances labeled for only a subset of tasks, a key advantage over TokenVerse. We demonstrate this by successfully integrating a dataset with partial labels, specifically for ASR and an additional task, language identification, improving overall performance. TokenVerse++ achieves results on par with or exceeding TokenVerse across multiple tasks, establishing it as a more practical multitask alternative without sacrificing ASR performance. |
| Main Research Program: | Human-AI Teaming |
| Additional Research Programs: | AI for Everyone |
| Keywords: | language identification, multitask training, named entity recognition, speaker change detection, speech recognition, XLSR-Transducer |
| Projects: | UNIPHORE ELOQUENCE |
| Authors: | |
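The abstract describes dynamic task activation through learnable vectors in the acoustic embedding space. As a rough illustration only, the sketch below assumes that one vector per task is added to every acoustic frame embedding when that task is active for an utterance. The abstract does not state how the vectors are combined with the embeddings, so the additive scheme, the `TaskActivation` class, the task names `ner`, `scd`, and `lid`, and the fixed initial values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List

Vector = List[float]


class TaskActivation:
    """One vector per task (hypothetical; a real model would train these)."""

    def __init__(self, dim: int, tasks: Iterable[str]) -> None:
        self.dim = dim
        # Assumption: fixed placeholder values stand in for learned parameters.
        self.vectors: Dict[str, Vector] = {
            task: [float(i + 1)] * dim for i, task in enumerate(tasks)
        }

    def apply(self, frames: List[Vector], active: Iterable[str]) -> List[Vector]:
        """Add the vectors of the active tasks to every frame embedding."""
        offset = [0.0] * self.dim
        for task in active:
            offset = [o + v for o, v in zip(offset, self.vectors[task])]
        # With no active task the offset is zero and frames pass through unchanged.
        return [[x + o for x, o in zip(frame, offset)] for frame in frames]


# Two toy "acoustic embeddings" of dimension 4.
activation = TaskActivation(dim=4, tasks=["ner", "scd", "lid"])
frames = [[0.0] * 4, [1.0] * 4]

# An utterance labeled only for ASR and language identification
# activates just the "lid" vector.
lid_only = activation.apply(frames, active=["lid"])

# An utterance labeled for all tasks activates every vector.
all_tasks = activation.apply(frames, active=["ner", "scd", "lid"])
```

Under this assumption, an utterance with partial labels simply activates fewer vectors, which is the property the abstract highlights over TokenVerse.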