Spectro-Temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain

Type of publication:	Conference paper
Citation:	tsamuel:eusipco:2008
Booktitle:	EUSIPCO 2008
Year:	2008
Note:	IDIAP-RR 08-05
Crossref:	tsamuel:rr08-05: Spectro-Temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain, Thomas, Samuel, Ganapathy, Sriram and Hermansky, Hynek, Idiap-RR-05-2008
Abstract:	Frequency Domain Linear Prediction (FDLP) provides an efficient way to represent the temporal envelopes of a signal using auto-regressive models. For the input speech signal, we use FDLP to estimate temporal trajectories of sub-band energy by applying linear prediction to the cosine transform of the sub-band signals. The sub-band FDLP envelopes are used to extract spectral and temporal features for speech recognition. The spectral features are derived by integrating the temporal envelopes over short-term frames, and the temporal features are formed by converting these envelopes into modulation frequency components. These features are then combined at the phoneme posterior level and used as the input features for a hybrid HMM-ANN based phoneme recognizer. The proposed spectro-temporal features provide a phoneme recognition accuracy of $69.1 \%$ (an improvement of $4.8 \%$ over the Perceptual Linear Prediction (PLP) baseline) on the TIMIT database.
Userfields:	ipdmembership={speech},
Keywords:
Projects	Idiap
Authors	Thomas, Samuel; Ganapathy, Sriram; Hermansky, Hynek
Attachments
tsamuel-eusipco-2008.pdf tsamuel-eusipco-2008.ps.gz
Notes
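
The core idea in the abstract, linear prediction applied to the cosine transform of a signal, can be illustrated with a minimal full-band sketch. This is not the authors' implementation: the function name `fdlp_envelope`, the `order` and `n_points` parameters, and the use of the autocorrelation (Yule-Walker) method are assumptions made for illustration, and the sub-band decomposition and feature extraction stages described in the abstract are omitted. The AR model fitted to the DCT coefficients is evaluated like a power spectrum, but its axis corresponds to time, so the result approximates the squared temporal envelope of the input.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz


def fdlp_envelope(signal, order=20, n_points=None):
    """Approximate the squared temporal envelope of `signal` with an AR model.

    Hypothetical helper: full-band only, autocorrelation method.
    """
    x = np.asarray(signal, dtype=float)
    n = len(x)
    if n_points is None:
        n_points = n

    # Cosine transform of the signal (DCT-II, orthonormal scaling).
    c = dct(x, type=2, norm="ortho")

    # Autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: n - k], c[k:]) for k in range(order + 1)])

    # Yule-Walker equations: Toeplitz(r[0..order-1]) @ a = r[1..order].
    a = solve_toeplitz(r[:order], r[1 : order + 1])

    # Prediction error energy, used as the gain of the all-pole model.
    gain = r[0] - np.dot(a, r[1 : order + 1])

    # Prediction error filter A(z) = 1 - sum_k a_k z^{-k}.
    poly = np.concatenate(([1.0], -a))

    # Evaluate the all-pole model on [0, pi); point m maps to time m * n / n_points.
    w = np.pi * np.arange(n_points) / n_points
    A = np.exp(-1j * np.outer(w, np.arange(order + 1))) @ poly
    return gain / np.abs(A) ** 2
```

For a signal whose energy is concentrated around one instant, the returned envelope should peak near that instant. A higher `order` follows finer temporal detail, while a lower one gives a smoother envelope.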