CONF Garner_ASRU_2009/IDIAP
SNR Features for Automatic Speech Recognition
Garner, Philip N.
EXTERNAL https://publications.idiap.ch/attachments/papers/2009/Garner_ASRU_2009.pdf
PUBLIC https://publications.idiap.ch/index.php/publications/showcite/Garner_Idiap-RR-25-2009
Related documents
Proceedings of the IEEE workshop on Automatic Speech Recognition and Understanding
Merano, Italy
2009
December 2009
When combined with cepstral normalisation techniques, the features normally used in Automatic Speech Recognition are based on Signal to Noise Ratio (SNR). We show that calculating SNR from the outset, rather than relying on cepstral normalisation to produce it, gives features with a number of practical and mathematical advantages over power-spectral based ones. In a detailed analysis, we derive Maximum Likelihood and Maximum a-Posteriori estimates for SNR based features, and show that they can outperform more conventional ones, especially when subsequently combined with cepstral variance normalisation. We further show anecdotal evidence that SNR based features lend themselves well to noise estimates based on low-energy envelope tracking.

REPORT Garner_Idiap-RR-25-2009/IDIAP
SNR Features for Automatic Speech Recognition
Garner, Philip N.
EXTERNAL https://publications.idiap.ch/attachments/reports/2009/Garner_Idiap-RR-25-2009.pdf
PUBLIC
Idiap-RR-25-2009
2009
Idiap
September 2009
When combined with cepstral normalisation techniques, the features normally used in Automatic Speech Recognition are based on Signal to Noise Ratio (SNR). We show that calculating SNR from the outset, rather than relying on cepstral normalisation to produce it, gives features with a number of practical and mathematical advantages over power-spectral based ones. In a detailed analysis, we derive Maximum Likelihood and Maximum a-Posteriori estimates for SNR based features, and show that they can outperform more conventional ones, especially when subsequently combined with cepstral variance normalisation. We further show anecdotal evidence that SNR based features lend themselves well to noise estimates based on low-energy envelope tracking.