Publications for keyword "speech recognition"
2024
Biologically Inspired Spiking Neural Networks for Speech Recognition, EPFL/EDEE, 2024
Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers, in: Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2024, 2024
Neurocomputational model of speech recognition for pathological speech detection: a case study on Parkinson’s disease speech detection, in: Proceedings of Interspeech, Kos Island, Greece, pages 3590-3594, 2024
TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR, in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, ACL, 2024
TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR, Idiap-RR-07-2024
2023
Customization of Automatic Speech Recognition Engines for Rare Word Detection Without Costly Model Re-Training, in: Proc. 13th SESAR Innovation Days, Seville, Spain, 2023
Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks, in: Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, 2023
2022
Efficient Transformer-Based Speech Recognition, École polytechnique fédérale de Lausanne, 2022
End-to-end Accented Speech Recognition, Idiap-RR-04-2022
From Undercomplete to Sparse Overcomplete Autoencoders to Improve LF-MMI Speech Recognition, in: Proceedings of Interspeech Conference, 2022
Low-Level Physiological Implications of End-to-End Learning for Speech Recognition, in: Proc. Interspeech 2022, pages 749-753, 2022
Readback Error Detection by Automatic Speech Recognition and Understanding - Results of HAAWAII Project for Isavia’s Enroute Airspace, in: 11th SESAR Innovation Days, SESAR, pages 9, 2022
SPARSE AUTOENCODERS TO ENHANCE SPEECH RECOGNITION, Idiap-RR-10-2022
2021
Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model, Idiap-RR-04-2021
Handling acoustic variation in dysarthric speech recognition systems through model combination, in: Proceedings of Interspeech, 2021
Robust Command Recognition for Lithuanian Air Traffic Control Tower Utterances, in: Interspeech, 2021
2020
Comparison of Subword Segmentation Methods for Open-vocabulary ASR using a Difficulty Metric
COMPARISON OF SUBWORD SEGMENTATION METHODS FOR OPEN-VOCABULARY END-TO-END SPEECH RECOGNITION, Idiap-RR-34-2020
2019
Open-Vocabulary Keyword Spotting With Audio And Text Embeddings, in: Proceedings of Interspeech 2019, 2019
Segment-level training of ANNs based on acoustic confidence measures for hybrid HMM/ANN Speech Recognition, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
2018
Iterative Learning of Speech Recognition Models for Air Traffic Control, in: Proceedings of Interspeech 2018, ISCA, Hyderabad, India, pages 3519-3523, 2018
2014
Exemplar-based Sparse Representation for Posterior Features, Idiap-RR-11-2014
Posterior-based Sparse Representation for Automatic Speech Recognition, in: Proceedings of Interspeech, 2014
2012
Comparing different acoustic modeling techniques for multilingual boosting, in: Proceedings of Interspeech, Portland, Oregon, 2012
Robust triphone mapping for acoustic modeling, in: Proceedings of Interspeech, Portland, Oregon, 2012
Synthetic References for Template-based ASR using Posterior Features, in: Proceedings of Interspeech, Portland, Oregon, USA, 2012
Template-based ASR using Posterior features and synthetic references: comparing different TTS systems, in: SAPA-SCALE Conference, International Speech Communication Association, 2012
2011
Model-based Compressive Sensing for Multi-party Distant Speech Recognition, in: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing, 2011
Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis, in: Computer Speech and Language, 2011
Posterior Features for Template-based ASR, in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Prague, Czech Republic, 2011
2009
Measuring the gap between HMM-based ASR and TTS, in: Proceedings of Interspeech, Brighton, U.K., 2009
Speech recognition with speech synthesis models by marginalising over decision tree leaves, in: Proceedings of Interspeech, Brighton, U.K., 2009
Verified Speaker Localization Utilizing Voicing Level in Split-bands, in: Signal Processing, 89(6):1038-1049, 2009
2006
Ensembles for Sequence Learning, École Polytechnique Fédérale de Lausanne, 2006
2002
TODE: A Decoder for Continuous Speech Recognition, Idiap-Com-09-2002