Publications of IICT sorted by recency
| Idiap kNN-TTS System for the Blizzard Challenge 2025, in: Blizzard Challenge Workshop, 2025 |
|
| Unveiling Audio Deepfake Origins: A Deep Metric Learning and Conformer Network Approach with Ensemble Fusion, in: Proceedings of Interspeech, 2025 |
|
| Multimodal Prosody Modeling: A Use Case for Multilingual Sentence Mode Prediction, in: Proceedings of Interspeech, 2025 |
|
| Unsupervised Rhythm and Voice Conversion to Improve ASR on Dysarthric Speech, in: Proceedings of Interspeech, Rotterdam, Netherlands, ISCA, 2025 |
[URL] |
| kNN Retrieval for Simple and Effective Zero-Shot Multi-speaker Text-to-Speech, in: Proceedings of the Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), Albuquerque, New Mexico, ACL, 2025 |
[URL] |
| Unsupervised Rhythm and Voice Conversion of Dysarthric to Healthy Speech for ASR, in: Proceedings of Workshop on Speech Pathology Analysis and DEtection (SPADE), Hyderabad, India, IEEE, 2025 |
[URL] |
| What Does it Take to Generalize SER Model Across Datasets? A Comprehensive Benchmark, in: ISCA Proceedings, Greece, 2024 |
[DOI] [URL] |
| Exploring generalization to unseen audio data for spoofing: insights from SSL models, in: ISCA Proceedings, Greece, 2024 |
[DOI] [URL] |
| Unveiling Biases while Embracing Sustainability: Assessing the Dual Challenges of Automatic Speech Recognition Systems, in: ISCA Proceedings, Greece, pages 4, 2024 |
[DOI] [URL] |
| SSL-TTS: Leveraging Self-Supervised Embeddings and kNN Retrieval for Zero-Shot Multi-speaker TTS, Idiap-Internal-RR-38-2024 |
[URL] |
| Towards interfacing large language models with ASR systems using confidence measures and prompting, in: Proceedings of Interspeech, pages 2980-2984, 2024 |
[DOI] |
| Cross-transfer Knowledge between Speech and Text Encoders to Evaluate Customer Satisfaction, in: Proceedings of Interspeech, Kos Island, Greece, ISCA, 2024 |
|
| Comparing Data-Driven and Handcrafted Features for Dimensional Emotion Recognition, in: International Conference on Acoustics, Speech and Signal Processing, 2024 |
|
| On matching data and model in LF-MMI-based dysarthric speech recognition, École polytechnique fédérale de Lausanne, 2023 |
[DOI] [URL] |
| Using Commercial ASR Solutions to Assess Reading Skills in Children: A Case Report, in: Proceedings of Interspeech, pages 4573-4577, 2023 |
[DOI] [URL] |
| Few-shot Dysarthric Speech Recognition with Text-to-Speech Data Augmentation, in: Proceedings of Interspeech, pages 156-160, 2023 |
[DOI] [URL] |