Shashi Kumar - Idiap Publications

First name(s):	Shashi
Last name(s):	Kumar

Évaluation de la reconnaissance automatique de la parole par les grands modèles de langage génératifs, Thibault Bañeras-Roux, Shashi Kumar, Driss Khalil, Petr Motlicek, Sergio Burdisso, Shiran Liu, Mickael Rouvier, Jane Wottawa and Richard Dufour, in: EvalLLM2026 : Atelier sur l'evaluation des modeles generatifs (LLM), le RAG et challenges, 2026

Reducing Prompt Sensitivity in LLM-based Speech Recognition Through Learnable Projection, Sergio Burdisso, Esaú Villatoro-Tello, Shashi Kumar, Srikanth Madikeri, Andrés Carofilis, Pradeep Rangappa, Manjunath K E, Kadri Hacioğlu, Petr Motlicek and Andreas Stolcke, in: ICASSP 2026, 2026

Text-only adaptation in LLM-based ASR through text denoising, Sergio Burdisso, Esaú Villatoro-Tello, Andrés Carofilis, Shashi Kumar, Kadri Hacioğlu, Srikanth Madikeri, Pradeep Rangappa, Manjunath K E, Petr Motlicek, Shankar Venkatesan and Andreas Stolcke, in: ICASSP, 2026

Better Semi-supervised Learning for Multi-domain ASR Through Incremental Retraining and Data Filtering, Andrés Carofilis, Pradeep Rangappa, Srikanth Madikeri, Shashi Kumar, Sergio Burdisso, Jeena Prakash, Esaú Villatoro-Tello, Petr Motlicek, Bidisha Sharma, Kadri Hacioğlu, Shankar Venkatesan, Saurabh Vyas and Andreas Stolcke, in: Interspeech 2025, Rotterdam, The Netherlands, pages 3618–3622, 2025

Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering, Pradeep Rangappa, Andrés Carofilis, Jeena Prakash, Shashi Kumar, Sergio Burdisso, Srikanth Madikeri, Esaú Villatoro-Tello, Bidisha Sharma, Petr Motlicek, Kadri Hacioğlu, Shankar Venkatesan, Saurabh Vyas and Andreas Stolcke, in: Proc. Interspeech, 2025

Fine-Tuning Pretrained Models with NVIB for Improved Generalisation, Fabio Fehr, Alina Elena Baia, Xiaoguang Chang, Andrei Catalin Coman, Karl El Hajal, Dina El Zein, Shashi Kumar, Juan Zuluaga-Gomez, Andrea Cavallaro, Damien Teney and James Henderson, in: Workshop on Spurious Correlation and Shortcut Learning: Foundations and Solutions, 2025

Latent Space Factorization in LoRA, Shashi Kumar, Yacouba Kaloga, John Mitros, Petr Motlicek and Ina Kodrasi, in: 39th Conference on Neural Information Processing Systems, 2025

Leveraging Untranscribed Data for End-to-End Speech and Callsign Recognition in Air-Traffic Communication, Petr Motlicek, Shashi Kumar, Driss Khalil, Amrutha Prasad and Christof Schüpbach, in: SESAR Innovation Days 2025 (https://www.sesarju.eu/SIDS2025), Eurocontrol, Bled, Slovenia, 2025

Performance Evaluation of SLAM-ASR: The Good, the Bad, the Ugly, and the Way Forward, Shashi Kumar, Iuliia Thorbecke, Sergio Burdisso, Esaú Villatoro-Tello, Manjunath K E, Kadri Hacioğlu, Pradeep Rangappa, Petr Motlicek, Aravind Ganapathiraju and Andreas Stolcke, in: SALMA Workshop, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hyderabad, India, IEEE, 2025

Speech Data Selection for Efficient ASR Fine-Tuning using Domain Classifier and Pseudo-Label Filtering, Pradeep Rangappa, Juan Zuluaga-Gomez, Srikanth Madikeri, Andrés Carofilis, Jeena Prakash, Sergio Burdisso, Shashi Kumar, Esaú Villatoro-Tello, Iuliia Nigmatulina, Petr Motlicek, Karthik Pandia D S and Aravind Ganapathiraju, in: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025), 2025

TokenVerse++: Towards Flexible Multitask Learning with Dynamic Task Activation, Shashi Kumar, Srikanth Madikeri, Esaú Villatoro-Tello, Sergio Burdisso, Pradeep Rangappa, Andrés Carofilis, Petr Motlicek, Karthik Pandia D S, Shankar Venkatesan, Kadri Hacioğlu and Andreas Stolcke, in: 2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), IEEE, 2025

Unifying Global and Near-Context Biasing in a Single Trie Pass, Iuliia Thorbecke, Esaú Villatoro-Tello, Juan Zuluaga-Gomez, Shashi Kumar, Sergio Burdisso, Pradeep Rangappa, Andrés Carofilis, Srikanth Madikeri, Petr Motlicek, Karthik Pandia D S, Kadri Hacioğlu and Andreas Stolcke, in: Text, Speech, and Dialogue. TSD 2025. Lecture Notes in Computer Science, Springer, 2025

XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models, Shashi Kumar, Srikanth Madikeri, Juan Zuluaga-Gomez, Esaú Villatoro-Tello, Iuliia Thorbecke, Petr Motlicek, Manjunath K E and Aravind Ganapathiraju, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hyderabad, India, IEEE, 2025

Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper, Iuliia Thorbecke, Juan Zuluaga-Gomez, Esaú Villatoro-Tello, Shashi Kumar, Pradeep Rangappa, Sergio Burdisso, Petr Motlicek, Karthik Pandia D S and Aravind Ganapathiraju, Idiap-RR-10-2024

Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper, Iuliia Thorbecke, Juan Zuluaga-Gomez, Esaú Villatoro-Tello, Shashi Kumar, Pradeep Rangappa, Sergio Burdisso, Petr Motlicek, Karthik Pandia D S and Aravind Ganapathiraju, in: Findings of the Association for Computational Linguistics: EMNLP 2024, pages 16747–16762, Association for Computational Linguistics (ACL), 2024

Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers, Shashi Kumar, Srikanth Madikeri, Iuliia Nigmatulina, Esaú Villatoro-Tello, Petr Motlicek, Karthik Pandia D S, S. Pavankumar Dubagunta and Aravind Ganapathiraju, in: Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2024, Seoul, Republic of Korea, pages 12592–12596, IEEE, 2024

Probability-Aware Word-Confusion-Network-to-Text Alignment Approach for Intent Classification, Esaú Villatoro-Tello, Srikanth Madikeri, Bidisha Sharma, Driss Khalil, Shashi Kumar, Iuliia Nigmatulina, Petr Motlicek and Aravind Ganapathiraju, in: Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2024, Seoul, Republic of Korea, pages 12617–12621, IEEE, 2024

TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR, Shashi Kumar, Srikanth Madikeri, Juan Zuluaga-Gomez, Iuliia Thorbecke, Esaú Villatoro-Tello, Sergio Burdisso, Petr Motlicek, Karthik Pandia D S and Aravind Ganapathiraju, in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20988–20995, Association for Computational Linguistics (ACL), 2024

TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR, Shashi Kumar, Srikanth Madikeri, Juan Zuluaga-Gomez, Iuliia Nigmatulina, Esaú Villatoro-Tello, Sergio Burdisso, Petr Motlicek, Karthik Pandia D S and Aravind Ganapathiraju, Idiap-RR-07-2024

XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models, Shashi Kumar, Srikanth Madikeri, Juan Zuluaga-Gomez, Esaú Villatoro-Tello, Iuliia Nigmatulina, Petr Motlicek, Manjunath K E and Aravind Ganapathiraju, Idiap-RR-08-2024
