Idiap Research Institute
CONTEXTUAL BIASING METHODS FOR IMPROVING RARE WORD DETECTION IN AUTOMATIC SPEECH RECOGNITION
Type of publication: Conference paper
Citation: Bhattacharjee_ICASSP_2024
Publication status: Accepted
Booktitle: Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP) 2024
Year: 2024
Month: April
Location: Seoul, Korea
Abstract: In specialized domains like Air Traffic Control (ATC), a notable challenge in porting a deployed Automatic Speech Recognition (ASR) system from one airport to another is the alteration in the set of crucial words that must be accurately detected in the new environment. Typically, such words have limited occurrences in training data, making it impractical to retrain the ASR system. This paper explores innovative word-boosting techniques to improve the detection rate of such rare words in the ASR hypotheses for the ATC domain. Two acoustic models are investigated: a hybrid CNN-TDNNF model trained from scratch and a pre-trained wav2vec2-based XLSR model fine-tuned on a common ATC dataset. The word boosting is done in three ways. First, an out-of-vocabulary word addition method is explored. Second, G-boosting is explored, which amends the language model before building the decoding graph. Third, the boosting is performed on the fly during decoding using lattice rescoring. The results indicate that the G-boosting method performs best and provides an approximately 30-43% relative improvement in recall of the boosted words. Moreover, a relative improvement of up to 48% is obtained upon combining G-boosting and lattice rescoring.
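The on-the-fly boosting idea in the abstract can be illustrated with a deliberately simplified sketch. It rescores an n-best list rather than a real lattice, and it is not the paper's implementation. The function names, the `bonus` value, the example words, and all scores below are hypothetical assumptions chosen only for illustration.

```python
def rescore_nbest(nbest, boosted_words, bonus=2.0):
    """Pick the best hypothesis after adding a bonus for boosted words.

    nbest: list of (words, log_score) pairs, where a higher score is better.
    boosted_words: words whose detection should be encouraged.
    bonus: hypothetical score added per occurrence of a boosted word.
    """
    boosted = set(boosted_words)

    def boosted_score(item):
        words, score = item
        # Each occurrence of a boosted word raises the hypothesis score.
        return score + bonus * sum(1 for w in words if w in boosted)

    return max(nbest, key=boosted_score)[0]


def relative_improvement(baseline, improved):
    """Relative improvement in percent, e.g. of recall over a baseline."""
    return 100.0 * (improved - baseline) / baseline


# Made-up n-best list: the rare word "rudik" loses without boosting.
nbest = [
    (["proceed", "direct", "rude", "ick"], -10.0),
    (["proceed", "direct", "rudik"], -11.5),
]
print(rescore_nbest(nbest, boosted_words=[]))         # no boosting applied
print(rescore_nbest(nbest, boosted_words=["rudik"]))  # rare word boosted
```

With no boosted words the first hypothesis wins on its raw score; with the bonus applied, the second hypothesis reaches -9.5 and overtakes it. `relative_improvement` shows how a figure such as "30-43% relative improvement in recall" is computed from a baseline recall and a boosted recall.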
Keywords:
Projects Idiap
Authors Bhattacharjee, Mrinmoy
Nigmatulina, Iuliia
Prasad, Amrutha
Rangappa, Pradeep
Madikeri, Srikanth
Motlicek, Petr
Helmke, Hartmut
Kleinert, Matthias
Attachments
  • Bhattacharjee_ICASSP_2024.pdf