CONF
eurospeech01/IDIAP
MAP Combination of Multi-Stream HMM or HMM/ANN Experts
Morris, Andrew
Hagen, Astrid
Bourlard, Hervé
missing data
multi-band
multi-band combination
multi-stream
robust ASR
EXTERNAL
https://publications.idiap.ch/attachments/reports/2001/morris-2001-eurospeech.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/morris-rr-01-14
Related documents
Proc. Eurospeech
2001
Aalborg, Denmark
Automatic speech recognition (ASR) performance falls dramatically as the mismatch between training and test data increases. The human ability to recognise speech when a large proportion of frequencies is dominated by noise has inspired the "missing data" and "multi-band" approaches to noise-robust ASR. "Missing data" ASR identifies low-SNR spectral data in each data frame and then ignores it. Multi-band ASR trains a separate model for each position of missing data, estimates a reliability weight for each model, then combines the model outputs in a weighted sum. A problem with both approaches is that local data reliability estimation is inherently inaccurate and also relies on the assumption that all of the training data was clean. In this article we present a model in which adaptive multi-band expert weighting is incorporated naturally into the maximum a posteriori (MAP) decoding process.
REPORT
morris-RR-01-14/IDIAP
MAP Combination of Multi-Stream HMM or HMM/ANN Experts
Morris, Andrew
Hagen, Astrid
Bourlard, Hervé
missing data
multi-band
multi-band combination
multi-stream
robust ASR
EXTERNAL
https://publications.idiap.ch/attachments/reports/2001/rr01-14.pdf
PUBLIC
Idiap-RR-14-2001
2001
IDIAP
Automatic speech recognition (ASR) performance falls dramatically as the mismatch between training and test data increases. The human ability to recognise speech when a large proportion of frequencies is dominated by noise has inspired the "missing data" and "multi-band" approaches to noise-robust ASR. "Missing data" ASR identifies low-SNR spectral data in each data frame and then ignores it. Multi-band ASR trains a separate model for each position of missing data, estimates a reliability weight for each model, then combines the model outputs in a weighted sum. A problem with both approaches is that local data reliability estimation is inherently inaccurate and also relies on the assumption that all of the training data was clean. In this article we present a model in which adaptive multi-band expert weighting is incorporated naturally into the maximum a posteriori (MAP) decoding process.
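The weighted-sum combination that the abstract attributes to conventional multi-band ASR can be illustrated with a minimal sketch. This is not the MAP model the paper proposes. It only shows the baseline step of combining per-expert class posteriors for a single frame using reliability weights. The function name `combine_experts`, the two-expert setup, and all numeric values are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence


def combine_experts(
    posteriors: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted sum of per-expert class posteriors for one frame.

    posteriors[i][k] is expert i's posterior for class k (hypothetical layout).
    weights[i] is the reliability weight of expert i; weights are
    normalised here so that the combined vector is again a distribution.
    """
    if len(posteriors) != len(weights):
        raise ValueError("need exactly one weight per expert")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(posteriors[0])
    combined = [0.0] * n_classes
    for expert_posterior, weight in zip(posteriors, weights):
        if len(expert_posterior) != n_classes:
            raise ValueError("all experts must cover the same classes")
        for k, p in enumerate(expert_posterior):
            combined[k] += (weight / total) * p
    return combined


# Illustrative values only: two sub-band experts, two classes.
frame_posteriors = [[0.7, 0.3], [0.4, 0.6]]
reliability = [0.75, 0.25]
combined = combine_experts(frame_posteriors, reliability)
```

In this sketch the weights are fixed inputs. The abstract's point is that estimating them locally is inherently inaccurate, which is the motivation for folding the expert weighting into MAP decoding instead.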