Keywords:
- ADS-B data
- air surveillance data
- Air traffic control
- air traffic control communications
- air traffic controller
- air traffic controller’s workload
- air traffic management
- articulatory features
- Artificial intelligence
- Assistant Based Speech Recognition
- assistant system
- automatic accent classification
- Automatic Speech Recognition
- automatic speech recognition (ASR)
- automatic speech recognition and understanding
- automatic speech understanding
- call sign detection
- Call-sign Detection
- Call-sign Recognition
- chunking
- Common Voice dataset
- Contextual Adaptation
- conversational speech
- Cross-lingual automatic speech recognition (ASR)
- Cross-modal Attentio
- Cross-modal Attention
- Data Selection
- deep neural networks
- diarization
- Domain Classification
- domain-adversarial neural network
- ECAPA-TDNN
- end-to-end ASR
- finite-state transducers
- GDPR
- GPU decoding
- human factors
- Human-Computer Interaction
- human-in-the-loop simulation
- language identification
- Lattice-Free MMI
- legal framework
- Low-Resource ASR
- machine learning
- multi-stream learning
- multitask acoustic modeling
- multitask training
- named entity recognition
- Natural language processing
- online speech recognition
- OpenSky Network
- personal data processing
- pseudo-labelling
- radar label
- Readback Error Detection
- real-time speech recognition
- Robust Automatic Speech Recognition
- saftety
- self-supervised pre-training
- shallow fusion
- signal processing
- situation awareness
- Speaker change detection
- speaker clustering
- speaker role classification
- speaker role detection
- speaker role identification
- speech recognition
- speech understanding
- SpeechBrain
- Spoken Language Understanding
- streaming transducer
- Text-based speaker diarization
- transfer learning
- wav2vec 2.0
- whisper
- Word Consensus Networks
- XLSR-Transducer
- Zipformer
Publications of Juan Zuluaga-Gomez sorted by recency
Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper, , , , , , , , and , in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, ACL, 2024 |
|
TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR, , , , , , , , and , in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20988–20995, Association for Computational Linguistics (ACL), 2024 |
[URL] |
Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper, , , , , , , , and , Idiap-RR-10-2024 |
|
XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models, , , , , , , and , Idiap-RR-08-2024 |
[URL] |
TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR, , , , , , , , and , Idiap-RR-07-2024 |
[URL] |
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation, , , , , , and , in: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), Singapore, 2023 |
[URL] |
Lessons Learned in Transcribing 5000 h of Air Traffic Control Communications for Robust Automatic Speech Understanding, , , , , , , , , , and , in: Aerospace, 10(10):898, 2023 |
[DOI] [URL] |
An Automatic Speaker Clustering Pipeline for the Air Traffic Communication Domain, , , , , , and , in: Aerospace, 10(10):876, 2023 |
[DOI] [URL] |
A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers, , , , and , in: Aerospace, 10(5), 2023 |
[DOI] [URL] |
Validating Automatic Speech Recognition and Understanding for Pre-Filling Radar Labels-Increasing Safety While Reducing Air Traffic Controllers' Workload, , , , , , , and , in: Aerospace, 10(6):538, 2023 |
[DOI] |
Automatic Speech Recognition and Understanding for Radar Label Maintenance Support Increases Safety and Reduces Air Traffic Controllers’ Workload, , , , , , , , , , , and , in: Fifteenth USA/Europe Air Traffic Management Research and Development Seminar (ATM2023), Eurocontrol (Europe), FAA (U.S.), Savannah, Georgia, USA, 2023 |
[URL] |
HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition, , , and , in: Proc. Interspeech 2023, Ireland, 2023 |
|
Implementing contextual biasing in GPU decoder for online ASR, , , , , , and , in: Proc. Interspeech 2023, 2023 |
|
CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice, , , and , in: Proc. Interspeech 2023, 2023 |
[URL] |
Implementing contextual biasing in GPU decoder for online ASR, , , , , , and , Idiap-RR-02-2023 |
|
Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks, , , , , , , , and , in: Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, 2023 |
|
Speech and Natural Language Processing Technologies for Pseudo-Pilot Simulator, , , , , and , in: 12th SESAR Innovation Days, 2022 |
|
Grammar Based Speaker Role Identification for Air Traffic Control Speech Recognition, , , , , , and , in: 12th SESAR Innovation Days, 2022 |
|
IDIAPers @ Causal News Corpus 2022: Extracting Cause-Effect-Signal Triplets via Pre-trained Autoregressive Language Model, , , , , , and , in: The 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE @ EMNLP 2022), 2022 |
[URL] |
IDIAPers @ Causal News Corpus 2022: Efficient Causal Relation Identification Through a Prompt-based Few-shot Approach, , , , , , and , in: The 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE @ EMNLP 2022), 2022 |
[URL] |
Readback Error Detection by Automatic Speech Recognition and Understanding -- Results of HAAWAII Project for Isavia’s Enroute Airspace, , , , , , , , , and , in: 11th SESAR Innovation Days, SESAR, pages 9, 2022 |
|
How Does Pre-trained Wav2Vec 2.0 Perform on Domain-Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications, , , , , , , , and , in: 2023 IEEE Spoken Language Technology Workshop (SLT), IEEE, 2023 |
[URL] |
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications, , , , , , and , in: 2023 IEEE Spoken Language Technology Workshop (SLT), IEEE, 2023 |
[URL] |
IDIAPers @ Causal News Corpus 2022: Extracting Cause-Effect-Signal Triplets via Pre-trained Autoregressive Language Model, , , , , , and , Idiap-RR-12-2022 |
|
IDIAPers @ Causal News Corpus 2022: Efficient Causal Relation Identification Through a Prompt-based Few-shot Approach, , , , , , and , Idiap-RR-13-2022 |
|
A two-step approach to leverage contextual data: speech recognition in air-traffic communications, , , , and , in: Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 |
|
Domain-Adversarial Based Model with Phonological Knowledge for Cross-Lingual Speech Recognition, , , , , and , in: Electronics, 10(24):1-15, 2021 |
[DOI] [URL] |
Automatic processing pipeline for collecting and annotating air-traffic voice communication data, , , , , , , , , and , in: Proceedings of 9th OpenSky Symposium 2020, OpenSky Network, Brussels, Belgium, pages 1-9, MDPI, 2021 |
|
BERTraffic: A Robust BERT-Based Approach for Speaker Change Detection and Role Identification of Air-Traffic Communications, , , , , , and , Idiap-RR-15-2021 |
Grammar Based Identification Of Speaker Role For Improving ATCO And Pilot ASR, , , , , , and , Idiap-RR-22-2021 |
|
Improving callsign recognition with air-surveillance data in air-traffic communication, , , and , Idiap-RR-20-2021 |
[URL] |
Boosting of contextual information in ASR for air-traffic call-sign recognition, , , , , , , and , in: Interspeech 2021, 2021 |
|
Contextual Semi-Supervised Learning: An Approach To Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems, , , , , , and , in: Interspeech 2021, 2021 |
[URL] |
Contextual Semi-Supervised Learning: An Approach To Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems, , , , , , and , Idiap-RR-14-2021 |
[URL] |
Automatic Call Sign Detection: Matching Air Surveillance Data with Air Traffic Spoken Communications, , , , , , , , , , , , , , , and , in: Proceedings of 8th OpenSky Symposium 2020, OpenSky Network, pages 1-10, MDPI, 2020 |
[DOI] [URL] |
Automatic Speech Recognition Benchmark for Air-Traffic Communications, , , , and , in: Proc. Interspeech 2020, pages 2297-2301, 2020 |
[DOI] |