Keyword: "speech recognition" - Idiap Publications

Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering, Pradeep Rangappa, Andrés Carofilis, Jeena Prakash, Shashi Kumar, Sergio Burdisso, Srikanth Madikeri, Esaú Villatoro-Tello, Bidisha Sharma, Petr Motlicek, Kadri Hacioğlu, Shankar Venkatesan, Saurabh Vyas and Andreas Stolcke, in: Proc. Interspeech, 2025

TokenVerse++: Towards Flexible Multitask Learning with Dynamic Task Activation, Shashi Kumar, Srikanth Madikeri, Esaú Villatoro-Tello, Sergio Burdisso, Pradeep Rangappa, Andrés Carofilis, Petr Motlicek, Karthik Pandia D S, Shankar Venkatesan, Kadri Hacioğlu and Andreas Stolcke, in: 2025 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), IEEE, 2025

Biologically Inspired Spiking Neural Networks for Speech Recognition, Alexandre Bittar, EPFL/EDEE, 2024

Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers, Shashi Kumar, Srikanth Madikeri, Iuliia Nigmatulina, Esaú Villatoro-Tello, Petr Motlicek, Karthik Pandia D S, S. Pavankumar Dubagunta and Aravind Ganapathiraju, in: Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2024, Seoul, Republic of Korea, pages 12592-12596, IEEE, 2024

Neurocomputational model of speech recognition for pathological speech detection: a case study on Parkinson’s disease speech detection, Sevada Hovsepyan and Mathew Magimai-Doss, in: Proceedings of Interspeech, Kos Island, Greece, pages 3590-3594, 2024

TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR, Shashi Kumar, Srikanth Madikeri, Juan Zuluaga-Gomez, Iuliia Thorbecke, Esaú Villatoro-Tello, Sergio Burdisso, Petr Motlicek, Karthik Pandia D S and Aravind Ganapathiraju, in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20988–20995, Association for Computational Linguistics (ACL), 2024

TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR, Shashi Kumar, Srikanth Madikeri, Juan Zuluaga-Gomez, Iuliia Nigmatulina, Esaú Villatoro-Tello, Sergio Burdisso, Petr Motlicek, Karthik Pandia D S and Aravind Ganapathiraju, Idiap-RR-07-2024

Customization of Automatic Speech Recognition Engines for Rare Word Detection Without Costly Model Re-Training, Mrinmoy Bhattacharjee, Petr Motlicek, Iuliia Nigmatulina, Hartmut Helmke, Oliver Ohneiser, Matthias Kleinert and Heiko Ehr, in: Proc. 13th SESAR Innovation Days, Seville, Spain, 2023

Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks, Esaú Villatoro-Tello, Srikanth Madikeri, Juan Zuluaga-Gomez, Bidisha Sharma, Seyyed Saeed Sarfjoo, Iuliia Nigmatulina, Petr Motlicek, Alexei V. Ivanov and Aravind Ganapathiraju, in: Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Efficient Transformer-Based Speech Recognition, Apoorv Vyas, École polytechnique fédérale de Lausanne, 2022

End-to-end Accented Speech Recognition, Thibault Viglino, Petr Motlicek and Milos Cernak, Idiap-RR-04-2022

From Undercomplete to Sparse Overcomplete Autoencoders to Improve LF-MMI Speech Recognition, Selen Hande Kabil and Hervé Bourlard, in: Proceedings of Interspeech Conference, 2022

Low-Level Physiological Implications of End-to-End Learning for Speech Recognition, Louise Coppieters de Gibson and Philip N. Garner, in: Proc. Interspeech 2022, pages 749-753, 2022

Readback Error Detection by Automatic Speech Recognition and Understanding – Results of HAAWAII Project for Isavia’s Enroute Airspace, Hartmut Helmke, Karel Ondřej, Shruthi Shetty, Hörður Arilíusson, Teodor S. Simiganoschi, Matthias Kleinert, Oliver Ohneiser, Heiko Ehr, Juan Zuluaga-Gomez and Pavel Smrz, in: 11th SESAR Innovation Days, SESAR, pages 9, 2022

SPARSE AUTOENCODERS TO ENHANCE SPEECH RECOGNITION, Selen Hande Kabil and Hervé Bourlard, Idiap-RR-10-2022

Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model, Apoorv Vyas, Srikanth Madikeri and Hervé Bourlard, Idiap-RR-04-2021

Handling acoustic variation in dysarthric speech recognition systems through model combination, Enno Hermann and Mathew Magimai-Doss, in: Proceedings of Interspeech, 2021

Robust Command Recognition for Lithuanian Air Traffic Control Tower Utterances, Oliver Ohneiser, Seyyed Saeed Sarfjoo, Hartmut Helmke, Shruthi Shetty, Petr Motlicek, Matthias Kleinert, Heiko Ehr and Šarūnas Murauskas, in: Interspeech, 2021

Comparison of Subword Segmentation Methods for Open-vocabulary ASR using a Difficulty Metric, Abbas Khosravani, Claudiu Musat, Philip N. Garner and Alexandros Lazaridis

COMPARISON OF SUBWORD SEGMENTATION METHODS FOR OPEN-VOCABULARY END-TO-END SPEECH RECOGNITION, Abbas Khosravani, Claudiu Musat, Philip N. Garner and Alexandros Lazaridis, Idiap-RR-34-2020

Open-Vocabulary Keyword Spotting With Audio And Text Embeddings, Niccolò Sacchi, Alexandre Nanchen, Martin Jaggi and Milos Cernak, in: Proceedings of Interspeech 2019, 2019

Segment-level training of ANNs based on acoustic confidence measures for hybrid HMM/ANN Speech Recognition, S. Pavankumar Dubagunta and Mathew Magimai-Doss, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019

Iterative Learning of Speech Recognition Models for Air Traffic Control, Ajay Srinivasamurthy, Petr Motlicek, Mittul Singh, Youssef Oualil, Matthias Kleinert, Heiko Ehr and Hartmut Helmke, in: Proceedings of Interspeech 2018, ISCA, Hyderabad, India, pages 3519-3523, 2018

Exemplar-based Sparse Representation for Posterior Features, Sara Bahaadini, Afsaneh Asaei and Hervé Bourlard, Idiap-RR-11-2014

Posterior-based Sparse Representation for Automatic Speech Recognition, Sara Bahaadini, Afsaneh Asaei, David Imseng and Hervé Bourlard, in: Proceedings of Interspeech, 2014

Comparing different acoustic modeling techniques for multilingual boosting, David Imseng, John Dines, Petr Motlicek, Philip N. Garner and Hervé Bourlard, in: Proceedings of Interspeech, Portland, Oregon, 2012

Robust triphone mapping for acoustic modeling, Milos Cernak, David Imseng and Hervé Bourlard, in: Proceedings of Interspeech, Portland, Oregon, 2012

Synthetic References for Template-based ASR using Posterior Features, Serena Soldo, Mathew Magimai-Doss and Hervé Bourlard, in: Proceedings of Interspeech, Portland, Oregon, USA, 2012

Template-based ASR using Posterior features and synthetic references: comparing different TTS systems, Serena Soldo, Mathew Magimai-Doss and Hervé Bourlard, in: SAPA-SCALE Conference, International Speech Communication Association, 2012

Model-based Compressive Sensing for Multi-party Distant Speech Recognition, Afsaneh Asaei, Hervé Bourlard and Volkan Cevher, in: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis, John Dines, Hui Liang, Lakshmi Saheer, Matthew Gibson, William Byrne, Keiichiro Oura, Keiichi Tokuda, Junichi Yamagishi, Simon King, Mirjam Wester, Teemu Hirsimäki, Reima Karhila and Mikko Kurimo, in: Computer Speech and Language, 2011

Posterior Features for Template-based ASR, Serena Soldo, Mathew Magimai-Doss, Joel Praveen Pinto and Hervé Bourlard, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Prague, Czech Republic, 2011

Measuring the gap between HMM-based ASR and TTS, John Dines, Junichi Yamagishi and Simon King, in: Proceedings of Interspeech, Brighton, U.K., 2009

Speech recognition with speech synthesis models by marginalising over decision tree leaves, John Dines, Lakshmi Saheer and Hui Liang, in: Proceedings of Interspeech, Brighton, U.K., 2009

Verified Speaker Localization Utilizing Voicing Level in Split-bands, Afsaneh Asaei, Mohammad J. Taghizadeh, Marjan Bahrololum and Mohammed Ghanbari, in: Signal Processing, 89(6):1038-1049, 2009

Ensembles for Sequence Learning, Christos Dimitrakakis, École Polytechnique Fédérale de Lausanne, 2006

TODE: A Decoder for Continuous Speech Recognition, Darren Moore, Idiap-Com-09-2002
