All publications sorted by author
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 80 | 81 | 82 |
H
An Objective Evaluation Framework for Pathological Speech Synthesis, , , , , and , in: Proceedings of ITG Conference on Speech Communication, 2021 |
|
Application of Urban Scale Energy Modelling and Multi-Objective Optimization Techniques for Building Energy Renovation at District Scale, , , and , in: Sustainability, 13(20), 2021 |
[DOI] [URL] |
ChatGPT and biometrics: an assessment of face recognition, gender detection, and age estimation capabilities, , , , and , in: 2024 IEEE International Conference on Image Processing (ICIP), 2024 |
[DOI] [URL] |
Learning from demonstration for semi-autonomous teleoperation, and , in: Autonomous Robots, 43(3):713-726, 2019 |
[DOI] [URL] |
Supervisory teleoperation with online learning and optimal control, and , in: Proc. IEEE Intl Conf. on Robotics and Automation (ICRA), Singapore, pages 1534-1540, IEEE, 2017 |
[URL] |
Learning assistive teleoperation behaviors from demonstration, and , in: Proc. IEEE International Symposium on Safety, Security and Rescue Robotics, pages 258-263, 2016 |
|
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding, and , in: Proc. INTERSPEECH 2023, pages 1109-1113, 2023 |
[DOI] |
The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation, and , in: Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023, pages 4408-4423, Association for Computational Linguistics, 2023 |
[DOI] |
Deep Learning Approaches for Auditory Perception in Robotics, , École polytechnique fédérale de Lausanne, 2021 |
|
Spatial Attention for Far-Field Speech Recognition with Deep Beamforming Neural Networks, , , , , and , in: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, pages 7499-7503, 2020 |
[DOI] |
Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation, , and , in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29:1303-1317, 2021 |
[DOI] [URL] |
Multi-task Neural Network for Robust Multiple Speaker Embedding Extraction, , and , in: Proceedings of Interspeech 2021, 2021 |
Adaptation of Multiple Sound Source Localization Neural Networks with Weak Supervision and Domain-Adversarial Training, , and , Idiap-Com-01-2019 |
|
Adaptation of Multiple Sound Source Localization Neural Networks with Weak Supervision and Domain-Adversarial Training, , and , in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brighton, United Kingdom, pages 770-774, 2019 |
[DOI] |
Deep Neural Networks for Multiple Speaker Detection and Localization, , and , Idiap-RR-02-2018 |
|
Deep Neural Networks for Multiple Speaker Detection and Localization, , and , in: 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, AUSTRALIA, pages 74-79, 2018 |
[DOI] |
Joint Localization and Classification of Multiple Sound Sources Using a Multi-task Neural Network, , and , in: Proceedings of Interspeech, pages 312--316, 2018 |
[DOI] |
Joint Localization and Classification of Multiple Sound Sources Using a Multi-task Neural Network, , and , Idiap-RR-17-2018 |
|
Human Tracking and Pose Estimation in Open Spaces, , École Polytechnique Fédérale de Lausanne (EPFL), 2014 |
|
Detection-Based Multi-Human Tracking Using a CRF Model, , and , in: The Eleventh IEEE International Workshop on Visual Surveillance, 2011 |
|
Exploiting Long-Term Connectivity and Visual Motion in CRF-based Multi-Person Tracking, , and , Idiap-RR-05-2014 |
|
Exploiting Long-Term Connectivity and Visual Motion in CRF-based Multi-Person Tracking, , and , in: Transactions on Image Processing, 2014 |
|
Exploiting Long-Term Connectivity and Visual Motion in CRF-based Multi-Person Tracking, , and , Idiap-RR-06-2014 |
|
Parameter Estimation and Contextual Adaptation for a Multi-Object Tracking CRF Model, and , in: IEEE Workshop on Performance Evaluation of Tracking and Surveillance, 2013 |
|
Improving Head and Body Pose Estimation through Semi-supervised Manifold Alignment, , , , and , in: International Conference on Image Processing, 2014 |
|
Diversity and neocolonialism in Big Data research: Avoiding extractivism while struggling with paternalism, , , , , , and , in: Big Data & Society, 2023 |
[DOI] |
Automatic Speech Recognition and Understanding for Radar Label Maintenance Support Increases Safety and Reduces Air Traffic Controllers’ Workload, , , , , , , , , , , and , in: Fifteenth USA/Europe Air Traffic Management Research and Development Seminar (ATM2023), Eurocontrol (Europe), FAA (U.S.), Savannah, Georgia, USA, 2023 |
[URL] |
Readback Error Detection by Automatic Speech Recognition to Increase ATM Safety, , , , , , , , , , , , , and , in: Fourteenth USA/Europe Air Traffic Management Research and Development Seminar (ATM2021), The United States Federal Aviation Administration (FAA), EUROCONTROL, pages 10, 2021 |
[URL] |
Readback Error Detection by Automatic Speech Recognition and Understanding -- Results of HAAWAII Project for Isavia’s Enroute Airspace, , , , , , , , , and , in: 11th SESAR Innovation Days, SESAR, pages 9, 2022 |
|
Measuring Speech Recognition And Understanding Performance in Air Traffic Control Domain Beyond Word Error Rates, , , , , , , , and , in: 11th SESAR Innovation Days, 2021 |
|
Master Thesis: Integration of the Harmonic plus Noise Model (HNM) into the Hidden Markov Model-Based Speech Synthesis System (HTS), , Idiap-RR-69-2006 |
|
The Unstoppable Rise of Computational Linguistics in Deep Learning, , in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, pages 6294-6306, Association for Computational Linguistics, 2020 |
[DOI] [URL] |
A VAE for Transformers with Nonparametric Variational Information Bottleneck, and , in: The Eleventh International Conference on Learning Representations, 2023 |
[URL] |
A Variational AutoEncoder for Transformers with Nonparametric Variational Information Bottleneck, and , in: arxiv, 2022 |
[DOI] [URL] |
Transformers as Graph-to-Graph Models, , , and , in: Big Picture Workshop at EMNLP 2023, 2023 |
Estimation of Global Posteriors and Forward-Backward Training of Hybrid HMM/ANN Systems, , , and , in: EUROSPEECH'97, 1997 |
|
Sparse Probabilistic Classifiers, and , in: International Conference on Machine Learning (ICML), 2007 |
|
Sparse Probabilistic Classifiers, and , Idiap-RR-19-2007 |
|
On matching data and model in LF-MMI-based dysarthric speech recognition, , École polytechnique fédérale de Lausanne, 2023 |
[DOI] [URL] |
Multilingual bottleneck features for subword modeling in zero-resource languages, and , in: Proc. Interspeech, pages 2668-2672, 2018 |
[DOI] |
Multilingual and Unsupervised Subword Modeling for Zero-Resource Languages, , and , in: Computer Speech and Language, 65, 2021 |
[DOI] [URL] |
Few-shot Dysarthric Speech Recognition with Text-to-Speech Data Augmentation, and , in: Proceedings of Interspeech, pages 156-160, 2023 |
[DOI] [URL] |
Handling acoustic variation in dysarthric speech recognition systems through model combination, and , in: Proceedings of Interspeech, 2021 |
|
Dysarthric Speech Recognition with Lattice-Free MMI, and , in: International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 6109-6113, 2020 |
[DOI] [URL] |
Stochastic techniques in deriving perceptual knowledge, , Idiap-RR-84-2004 |
|
TRAP-TANDEM: Data-driven extraction of temporal features from speech, , in: large part published in Proceedings of ASRU-2003, 2003 |
|
TRAP-TANDEM: Data-driven extraction of temporal features from speech, , Idiap-RR-50-2003 |
|
Some Emerging Concepts in Speech Recognition., and , Idiap-RR-82-2003 |
|
Multi-resolution RASTA filtering for TANDEM-based ASR, and , in: Proceedings of Interspeech 2005, 2005 |
|
Multi-resolution RASTA filtering for TANDEM-based ASR, and , Idiap-RR-18-2005 |
|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 | 72 | 73 | 74 | 75 | 76 | 77 | 78 | 79 | 80 | 81 | 82 |