Keywords:
- ADS-B data
- Air traffic control
- air traffic control communications
- air traffic management
- Assistant Based Speech Recognition
- Automatic Speech Recognition
- automatic speech recognition and understanding
- Call-sign Recognition
- chunking
- Contextual Adaptation
- contextual biasing
- Cross-modal Alignment
- Cross-modal Attentio
- Cross-modal Attention
- Customization of model
- diarization
- domain adaptation
- F1 score
- finite-state transducers
- GPU decoding
- Human-Computer Interaction
- Integration of prior knowledge
- Intent Classification
- knowledge distillation
- language identification
- model adaptation
- multitask acoustic modeling
- multitask learning
- multitask training
- named entity recognition
- Natural language processing
- online speech recognition
- OpenSky Network
- rare word recognition
- Rare-word integration
- real-time speech recognition
- Robust Automatic Speech Recognition
- self-supervised pre-training
- signal processing
- Speaker change detection
- speaker clustering
- speaker role classification
- speaker role detection
- speaker role identification
- speaker turn detection
- speech recognition
- Spoken Language Understanding
- Text-based speaker diarization
- wav2vec 2.0
- Word Consensus Networks
- Word-Confusion-Networks
- XLSR-Transducer
Publications of Nigmatulina Iuliia
2024
CONTEXTUAL BIASING METHODS FOR IMPROVING RARE WORD DETECTION IN AUTOMATIC SPEECH RECOGNITION, , , , , , , and , in: Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2024, Seoul, Korea, 2024 |
|
Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers, , , , , , , and , in: Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2024, Seoul, Republic of Korea, pages 12592-12596, IEEE, 2024 |
[DOI] [URL] |
Probability-Aware Word-Confusion-Network-to-Text Alignment Approach for Intent Classification, , , , , , , and , in: Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2024, Seoul, Republic of Korea, pages 12617-12621, IEEE, 2024 |
[DOI] [URL] |
TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR, , , , , , , , and , Idiap-RR-07-2024 |
[URL] |
XLSR-Transducer: Streaming ASR for Self-Supervised Pretrained Models, , , , , , , and , Idiap-RR-08-2024 |
[URL] |
2023
A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers, , , , and , in: Aerospace, 10(5), 2023 |
[DOI] [URL] |
An Automatic Speaker Clustering Pipeline for the Air Traffic Communication Domain, , , , , , and , in: Aerospace, 10(10):876, 2023 |
[DOI] [URL] |
Automatic Speech Analysis Framework for ATC Communication in HAAWAII, , , , , and , in: 13th SESAR Innovation Days, 2023 |
|
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications, , , , , , and , in: 2023 IEEE Spoken Language Technology Workshop (SLT), IEEE, 2023 |
[URL] |
Customization of Automatic Speech Recognition Engines for Rare Word Detection Without Costly Model Re-Training, , , , , , and , in: Proc. 13th SESAR Innovation Days, Seville, Spain, 2023 |
[DOI] [URL] |
Effectiveness of Text, Acoustic, and Lattice-based representations in Spoken Language Understanding tasks, , , , , , , , and , in: Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, 2023 |
|
How Does Pre-trained Wav2Vec 2.0 Perform on Domain-Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications, , , , , , , , and , in: 2023 IEEE Spoken Language Technology Workshop (SLT), IEEE, 2023 |
[URL] |
Implementing contextual biasing in GPU decoder for online ASR, , , , , , and , Idiap-RR-02-2023 |
|
Implementing contextual biasing in GPU decoder for online ASR, , , , , , and , in: Proc. Interspeech 2023, 2023 |
|
Lessons Learned in Transcribing 5000 h of Air Traffic Control Communications for Robust Automatic Speech Understanding, , , , , , , , , , and , in: Aerospace, 10(10):898, 2023 |
[DOI] [URL] |
2022
A two-step approach to leverage contextual data: speech recognition in air-traffic communications, , , , and , in: Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 |
|
Grammar Based Speaker Role Identification for Air Traffic Control Speech Recognition, , , , , , and , in: 12th SESAR Innovation Days, 2022 |
|
Speech and Natural Language Processing Technologies for Pseudo-Pilot Simulator, , , , , and , in: 12th SESAR Innovation Days, 2022 |
|
2021
Automatic processing pipeline for collecting and annotating air-traffic voice communication data, , , , , , , , , and , in: Proceedings of 9th OpenSky Symposium 2020, OpenSky Network, Brussels, Belgium, pages 1-9, MDPI, 2021 |
|
BERTraffic: A Robust BERT-Based Approach for Speaker Change Detection and Role Identification of Air-Traffic Communications, , , , , , and , Idiap-RR-15-2021 |
Contextual Semi-Supervised Learning: An Approach To Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems, , , , , , and , Idiap-RR-14-2021 |
[URL] |
Contextual Semi-Supervised Learning: An Approach To Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems, , , , , , and , in: Interspeech 2021, 2021 |
[URL] |
Grammar Based Identification Of Speaker Role For Improving ATCO And Pilot ASR, , , , , , and , Idiap-RR-22-2021 |
|
Improving callsign recognition with air-surveillance data in air-traffic communication, , , and , Idiap-RR-20-2021 |
[URL] |