Towards Robust and Adaptive Speech Recognition Models

Type of publication:	Idiap-RR
Citation:	bourlard-rr-02-47
Number:	Idiap-RR-47-2002
Year:	2002
Institution:	IDIAP
Address:	Martigny, Switzerland
Note:	Published: Mathematical Foundations of Speech Processing and Recognition, IMA, Eds. R. Rosenfeld and M. Ostendorf
Abstract:	In this paper, we discuss a family of new Automatic Speech Recognition (ASR) approaches that deviate somewhat from the usual ASR approaches but have recently been shown to be more robust to nonstationary noise, without requiring specific adaptation or "multi-style" training. More specifically, we motivate and briefly describe new approaches based on multi-stream and subband ASR. These approaches extend the standard hidden Markov model (HMM) based approach by assuming that the different (frequency) streams representing the speech signal are processed by different (independent) "experts", each expert focusing on a different characteristic of the signal, and that the different stream likelihoods (or posteriors) are combined at some (temporal) stage to yield a global recognition output. As a further extension to multi-stream ASR, we finally introduce a new approach, referred to as HMM2, in which the HMM emission probabilities are estimated via state-specific, feature-based HMMs responsible for merging the stream information and modeling their possible correlation.
Userfields:	ipdinar={2002}, ipdmembership={speech}, language={English},
Keywords:
Projects	Idiap
Authors	Bourlard, Hervé; Bengio, Samy; Weber, Katrin
Attachments
rr02-47.pdf rr02-47.ps.gz
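
The abstract states that per-stream likelihoods (or posteriors) are combined at some stage to yield a global recognition output, without giving the combination rule in this record. As a rough illustration only, the sketch below assumes a weighted log-linear (geometric) combination of per-stream class posteriors for a single frame. The function name `combine_stream_posteriors`, the weights, and the example values are hypothetical and are not taken from the report, which may use a different rule.

```python
import math


def combine_stream_posteriors(stream_posteriors, weights):
    """Combine per-stream class posteriors for one frame.

    Assumption (not from the report): a weighted log-linear rule,
    P(k) proportional to prod_s P_s(k) ** w_s, renormalized over classes k.
    """
    if len(stream_posteriors) != len(weights):
        raise ValueError("need exactly one weight per stream")
    n_classes = len(stream_posteriors[0])
    eps = 1e-12  # floor to avoid log(0)
    # Weighted sum of log posteriors for each class.
    log_scores = [
        sum(w * math.log(max(p[k], eps))
            for p, w in zip(stream_posteriors, weights))
        for k in range(n_classes)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    unnormalized = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical example: one informative stream and one stream that noise
# has made uninformative (uniform posteriors over three classes).
clean = [0.7, 0.2, 0.1]
noisy = [1 / 3, 1 / 3, 1 / 3]
combined = combine_stream_posteriors([clean, noisy], [0.5, 0.5])
```

Under this assumed rule, a stream with uniform posteriors adds the same term to every class, so it does not change which class ranks highest. This is one simple way to picture why combining independent subband "experts" can tolerate noise confined to part of the spectrum.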