Automatic Speech Recognition Benchmark for Air-Traffic Communications
Type of publication: | Conference paper |
Citation: | Motlicek_INTERSPEECH_2020 |
Publication status: | Accepted |
Booktitle: | Proc. Interspeech 2020 |
Year: | 2020 |
Month: | October |
Pages: | 2297-2301 |
Crossref: | http://dx.doi.org/10.21437/Interspeech.2020-2173 |
DOI: | 10.21437/Interspeech.2020-2173 |
Abstract: | Advances in Automatic Speech Recognition (ASR) over the last decade have opened new areas of speech-based automation, such as in Air-Traffic Control (ATC) environments. Currently, voice communication and Controller Pilot Data Link Communications are the only means of contact between pilots and Air-Traffic Controllers (ATCo); the former is the most widely used, while the latter is a non-speech method that is mandatory for oceanic messages and limited for some domestic issues. ASR systems in ATCo environments inherit increasing complexity due to accents from non-English speakers, cockpit noise, speaker-dependent biases and small in-domain ATC databases for training. In this paper, we review the latest advances in ASR for ATCo communication. Then, we introduce CleanSky EC H2020 ATCO2, a project that aims to develop a platform to collect, organize and automatically pre-process ATCo data from the airspace. We apply transfer learning from an out-of-domain corpus coupled with adaptation on seven command-related corpora. The acoustic modelling is based on conventional TDNN-HMMs trained using the lattice-free MMI objective function. The developed ASR achieves a relative improvement in word error rate of 29% when using transfer learning and an additional 36% when adapting the model with seven command-related databases; these results were obtained on the EC H2020 SESAR project MALORCA Vienna database. |
Keywords: | Air traffic control, Automatic Speech Recognition, deep neural networks, Lattice-Free MMI, transfer learning |
Projects |
Idiap EC H2020-ATCO2 |
Authors | |