CONF
magimai04icslp/IDIAP
Modelling Auxiliary Features in Tandem Systems
Magimai.-Doss, Mathew
Stephenson, Todd Andrew
Ikbal, Shajith
Bourlard, Hervé
EXTERNAL
https://publications.idiap.ch/attachments/reports/2004/mathew-icslp2004.pdf
PUBLIC
https://publications.idiap.ch/index.php/publications/showcite/magimai04a
Related documents
Proceedings of ICSLP
2004
South Korea
IDIAP-RR 04-21
Tandem systems transform the cepstral features into posterior probabilities of subword units using artificial neural networks (ANNs), which are processed to form input features for conventional speech recognition systems. They have been shown to perform better than conventional speech recognition systems using cepstral features. Recent studies have shown that modelling cepstral features with auxiliary sources of knowledge leads to improvement in the performance of speech recognition systems. In this paper, we study two approaches to incorporate auxiliary knowledge sources such as pitch frequency, short-term energy, etc. (referred to as auxiliary features), in a tandem-based automatic speech recognition system. In the first approach, we model the auxiliary features in the process of training an ANN, which is later used to extract tandem-features. In the second approach, we extract the tandem-features from an ANN trained with cepstral features only and then model them jointly with auxiliary features. Recognition studies conducted on a connected word recognition task under clean and noisy conditions show that the performance of the tandem system can be improved by incorporating auxiliary features.
REPORT
magimai04a/IDIAP
Modelling Auxiliary Features in Tandem Systems
Magimai.-Doss, Mathew
Stephenson, Todd Andrew
Ikbal, Shajith
Bourlard, Hervé
EXTERNAL
https://publications.idiap.ch/attachments/reports/2004/rr04-21.pdf
PUBLIC
Idiap-RR-21-2004
2004
IDIAP
Tandem systems transform the cepstral features into posterior probabilities of subword units using artificial neural networks (ANNs), which are processed to form input features for conventional speech recognition systems. They have been shown to perform better than conventional speech recognition systems using cepstral features. Recent studies have shown that modelling cepstral features with auxiliary sources of knowledge leads to improvement in the performance of speech recognition systems. In this paper, we study two approaches to incorporate auxiliary knowledge sources such as pitch frequency, short-term energy, etc. (referred to as auxiliary features), in a tandem-based automatic speech recognition system. In the first approach, we model the auxiliary features in the process of training an ANN, which is later used to extract tandem-features. In the second approach, we extract the tandem-features from an ANN trained with cepstral features only and then model them jointly with auxiliary features. Recognition studies conducted on a connected word recognition task under clean and noisy conditions show that the performance of the tandem system can be improved by incorporating auxiliary features.