Improving Speech Recognition Using a Data-Driven Approach

Type of publication:	Conference paper
Citation:	aradilla-rr-05-66b
Booktitle:	Proceedings of Interspeech, 2005
Number:	66
Year:	2005
Address:	Martigny, Switzerland
Note:	IDIAP-RR 05-66
Crossref:	aradilla-rr-05-66: Improving Speech Recognition Using a Data-Driven Approach, Aradilla, Guillermo, Vepa, Jithendra and Bourlard, Hervé, Idiap-RR-66-2005
Abstract:	In this paper, we investigate the possibility of enhancing state-of-the-art HMM-based speech recognition systems using data-driven techniques, where the whole set of training utterances is used as reference models and recognition is then performed through the well-known template matching technique, dynamic time warping (DTW). This approach allows us to better capture the temporal dynamics of the speech signal while avoiding some of the HMM assumptions, such as piecewise stationarity. Potentially, such data-driven techniques also allow us to better exploit meta-data and environmental information, such as speaker, gender, accent and noise conditions. However, we cannot entirely abandon HMMs, which are very powerful and scalable models. Thus, we investigate one way to take advantage of both approaches by combining the scores of HMMs and reference templates. Experiments on the Numbers95 database showed that this combination yields a 22% relative improvement in word error rate over the baseline HMM performance. Applying K-means clustering to the acoustic vectors speeds up the decoding, while still retaining a significant improvement in recognition accuracy.
Userfields:	ipdinar={2005}, ipdmembership={speech}, language={English},
Keywords:
Projects:	Idiap
Authors:	Aradilla, Guillermo; Vepa, Jithendra; Bourlard, Hervé
Attachments
rr05-66.pdf rr05-66.ps.gz