CONF
ikbal-rr-04-19p/IDIAP
Entropy Based Combination of Tandem Representations for Noise Robust ASR
Ikbal, Shajith
Misra, Hemant
Sivadas, Sunil
Hermansky, Hynek
Bourlard, Hervé
EXTERNAL
https://publications.idiap.ch/attachments/reports/2004/rr04-19.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/ikbal-rr-04-19
Related documents
Proceedings of the INTERSPEECH-ICSLP-04
2004
Jeju Island, Korea
October 2004
To appear
In this paper, we present an entropy based method for combining tandem representations of the recently proposed Phase AutoCorrelation (PAC) based features and Mel-Frequency Cepstral Coefficient (MFCC) features. PAC based features, which are derived from a nonlinear transformation of autocorrelation coefficients and have been shown to be noise robust, become even more robust to additive noise in their tandem representation. MFCC features, on the other hand, show a significant improvement in recognition performance on clean speech in their tandem representation. The entropy based combination method investigated in this paper adaptively gives a higher weight to the MFCC representation in clean speech and to the PAC based representation in noisy speech, thus yielding robust recognition performance in all conditions.
REPORT
ikbal-rr-04-19/IDIAP
Entropy Based Combination of Tandem Representations for Noise Robust ASR
Ikbal, Shajith
Misra, Hemant
Sivadas, Sunil
Hermansky, Hynek
Bourlard, Hervé
EXTERNAL
https://publications.idiap.ch/attachments/reports/2004/rr04-19.pdf
PUBLIC
Idiap-RR-19-2004
2004
IDIAP
Martigny, Switzerland
In this paper, we present an entropy based method for combining tandem representations of the recently proposed Phase AutoCorrelation (PAC) based features and Mel-Frequency Cepstral Coefficient (MFCC) features. PAC based features, which are derived from a nonlinear transformation of autocorrelation coefficients and have been shown to be noise robust, become even more robust to additive noise in their tandem representation. MFCC features, on the other hand, show a significant improvement in recognition performance on clean speech in their tandem representation. The entropy based combination method investigated in this paper adaptively gives a higher weight to the MFCC representation in clean speech and to the PAC based representation in noisy speech, thus yielding robust recognition performance in all conditions.
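The abstract does not state the exact weighting rule, so the following is only a minimal sketch of one plausible form of entropy based combination: inverse-entropy weighting of per-frame class posteriors from the two streams. The function names, the epsilon guard, and the example posterior values are all assumptions made for illustration and are not taken from the paper.

```python
import math


def entropy(posteriors):
    """Shannon entropy (in nats) of one frame's class posterior vector."""
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)


def inverse_entropy_weights(entropies, eps=1e-8):
    """Weights proportional to 1/entropy, normalised to sum to 1.

    eps is an assumed guard against division by zero when a stream
    is completely certain (entropy of exactly 0).
    """
    inverse = [1.0 / (h + eps) for h in entropies]
    total = sum(inverse)
    return [v / total for v in inverse]


def combine_frame(mfcc_post, pac_post):
    """Combine the two streams' posteriors for a single frame.

    The stream with the lower entropy (the more confident one)
    receives the higher weight.
    """
    weights = inverse_entropy_weights([entropy(mfcc_post), entropy(pac_post)])
    combined = [weights[0] * a + weights[1] * b
                for a, b in zip(mfcc_post, pac_post)]
    return combined, weights


# Hypothetical posteriors for one frame (illustrative values only):
# the MFCC stream is confident, the PAC stream is not.
mfcc_post = [0.90, 0.05, 0.05]
pac_post = [0.40, 0.30, 0.30]
combined, weights = combine_frame(mfcc_post, pac_post)
```

In this sketch a confident, low-entropy stream dominates the combination, which matches the behaviour the abstract describes: the MFCC representation is favoured in clean speech and the PAC based representation in noisy speech, assuming that noise raises the entropy of the MFCC stream's posteriors more than that of the PAC stream.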