Keywords:
- Aho-Corasick algorithm
- automatic speech recognition (ASR)
- Contextual Adaptation
- Contextualisation and adaptation of ASR
- Data Selection
- Domain Classification
- F1 score
- finite-state transducers
- GPU decoding
- language identification
- multitask learning
- multitask training
- named entity recognition
- pseudo-labelling
- real-time ASR
- real-time speech recognition
- shallow fusion
- Speaker change detection
- speaker turn detection
- speech recognition
- streaming transducer
- transformer transducer
- whisper
- XLSR-Transducer
- Zipformer
Publications of Karthik Pandia D S sorted by title
F
| Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper, , , , , , , , and , Idiap-RR-10-2024 |
|
| Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper, , , , , , , , and , in: Findings of the Association for Computational Linguistics: EMNLP 2024, pages 16747–16762, Association for Computational Linguistics (ACL), 2024 |
[DOI] [URL] |
| Feature Switching in the i-vector Framework for Speaker Verification, , , , and , in: Proc. of Interspeech 2014, pages 5, 2014 |
I
| Implementing contextual biasing in GPU decoder for online ASR, , , , , , and , Idiap-RR-02-2023 |
|
| Implementing contextual biasing in GPU decoder for online ASR, , , , , , and , in: Proc. Interspeech 2023, pages 4494--4498, 2023 |
[DOI] [URL] |
M
| Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers, , , , , , , and , in: Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2024, Seoul, Republic of Korea, pages 12592-12596, IEEE, 2024 |
[DOI] [URL] |
S
| Speech Data Selection for Efficient ASR Fine-Tuning using Domain Classifier and Pseudo-Label Filtering, , , , , , , , , , , and , in: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025), 2025 |
[DOI] [URL] |
T
| TokenVerse++: Towards Flexible Multitask Learning with Dynamic Task Activation, , , , , , , , , , and , in: 2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), IEEE, 2025 |
|
| TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR, , , , , , , , and , in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20988–20995, Association for Computational Linguistics (ACL), 2024 |
[DOI] [URL] |
| TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR, , , , , , , , and , Idiap-RR-07-2024 |
[URL] |
U
| Unifying Global and Near-Context Biasing in a Single Trie Pass., , , , , , , , , , , and , in: Text, Speech, and Dialogue. TSD 2025. Lecture Notes in Computer Science, Springer, Springer, 2025 |
[DOI] [URL] |