ARTICLE
Ganapathy_JASA-EL_2008/IDIAP
Modulation Frequency Features For Phoneme Recognition In Noisy Speech
Ganapathy, Sriram
Thomas, Samuel
Hermansky, Hynek
EXTERNAL
https://publications.idiap.ch/attachments/papers/2008/Ganapathy_JASA-EL_2008.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/Ganapathy_Idiap-RR-70-2008
Related documents
Journal of the Acoustical Society of America - Express Letters
2008
November 2008
In this letter, a new feature extraction technique is proposed, based on the modulation spectrum derived from syllable-length segments of sub-band temporal envelopes. These sub-band envelopes are derived from auto-regressive modelling of Hilbert envelopes of the signal in critical bands, processed by both a static (logarithmic) and a dynamic (adaptive loops) compression. The resulting features are then used for machine recognition of phonemes in telephone speech. Without degrading the performance in clean conditions, the proposed features show significant improvements compared to other state-of-the-art speech analysis techniques. In addition to the overall phoneme recognition rates, the performance on broad phonetic classes is reported.
REPORT
Ganapathy_Idiap-RR-70-2008/IDIAP
Modulation Frequency Features For Phoneme Recognition In Noisy Speech
Ganapathy, Sriram
Thomas, Samuel
Hermansky, Hynek
EXTERNAL
https://publications.idiap.ch/attachments/reports/2008/Ganapathy_Idiap-RR-70-2008.pdf
PUBLIC
Idiap-RR-70-2008
2008
Idiap
October 2008
In this paper, a new feature extraction technique is proposed, based on the modulation spectrum derived from syllable-length segments of sub-band temporal envelopes. These sub-band envelopes are derived from auto-regressive modelling of Hilbert envelopes of the signal in critical bands, processed by both a static (logarithmic) and a dynamic (adaptive loops) compression. The resulting features are then used for machine recognition of phonemes in telephone speech. Without degrading the performance in clean conditions, the proposed features show significant improvements compared to other state-of-the-art speech analysis techniques. In addition to the overall phoneme recognition rates, the performance on broad phonetic classes is reported.
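The processing chain named in the abstract can be illustrated with a heavily simplified, single-band sketch. It is not the authors' implementation. It covers only three of the steps: the Hilbert envelope of a sub-band signal, static logarithmic compression, and a Fourier transform of the compressed envelope to obtain a modulation spectrum. The critical-band decomposition, the auto-regressive modelling of the envelope, and the adaptive-loop compression are omitted. The function names (dft, hilbert_envelope, modulation_spectrum), the toy amplitude-modulated signal, and its parameters are all assumptions made for illustration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short toy signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def hilbert_envelope(x):
    """Magnitude of the analytic signal, built by zeroing negative frequencies."""
    n = len(x)
    spec = dft(x)
    gain = [0.0] * n
    gain[0] = 1.0  # keep DC as is
    if n % 2 == 0:
        gain[n // 2] = 1.0  # keep the Nyquist bin as is
        for k in range(1, n // 2):
            gain[k] = 2.0  # double the positive frequencies
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    analytic = dft([s * g for s, g in zip(spec, gain)], inverse=True)
    return [abs(a) for a in analytic]


def modulation_spectrum(envelope, floor=1e-8):
    """Log-compress the envelope, remove its mean, return DFT magnitudes up to Nyquist."""
    log_env = [math.log(max(e, floor)) for e in envelope]  # static compression
    mean = sum(log_env) / len(log_env)
    spec = dft([v - mean for v in log_env])
    return [abs(s) for s in spec[: len(spec) // 2 + 1]]


# Hypothetical toy "sub-band" signal: a carrier at DFT bin 64 whose amplitude
# is modulated at DFT bin 4 (all values chosen purely for illustration).
N, CARRIER_BIN, MOD_BIN = 256, 64, 4
signal = [
    (1.0 + 0.5 * math.cos(2 * math.pi * MOD_BIN * n / N))
    * math.cos(2 * math.pi * CARRIER_BIN * n / N)
    for n in range(N)
]

envelope = hilbert_envelope(signal)
mod_spec = modulation_spectrum(envelope)

# Strongest non-DC modulation component
peak_bin = max(range(1, len(mod_spec)), key=mod_spec.__getitem__)
print("peak modulation bin:", peak_bin)
```

In this toy case the strongest modulation component falls at the bin used to modulate the carrier, which is the kind of slow envelope fluctuation a modulation spectrum is meant to expose. A fuller treatment would apply the same steps per critical band over syllable-length segments and replace the raw Hilbert envelope with its auto-regressive model, as the abstract describes.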