Developing and Enhancing Posterior Based Speech Recognition Systems

Type of publication:	Idiap-RR
Citation:	hamed-rr05-23
Number:	Idiap-RR-23-2005
Year:	2005
Institution:	IDIAP
Abstract:	Local state or phone posterior probabilities are often investigated as local scores (e.g., hybrid HMM/ANN systems) or as transformed acoustic features (e.g., "Tandem") to improve speech recognition systems. In this paper, we present initial results towards boosting these approaches by improving posterior estimates, using acoustic context (e.g., as available in the whole utterance), as well as possible prior information (such as topological constraints). In the present work, the enhanced posterior distribution is associated with the "gamma" distribution typically used in standard HMM training, and estimated from local likelihoods (GMM) or local posteriors (ANN). This approach results in a family of new HMM-based systems, where only posterior probabilities are used, while also providing a new, principled approach towards a hierarchical use/integration of these posteriors, from the frame level up to the phone and word levels, integrating the appropriate context and prior knowledge at each level. In the present work, we used the resulting posteriors as local scores in a Viterbi decoder. On the OGI Numbers'95 database, this resulted in improved recognition performance compared to a state-of-the-art hybrid HMM/ANN system.
Userfields:	ipdmembership={speech},
Keywords:
Projects	Idiap
Authors	Ketabdar, Hamed; Vepa, Jithendra; Bengio, Samy; Bourlard, Hervé
Crossref by	hamed00
Added by:	[UNK]
Total mark:	0
Attachments
rr05-23.pdf rr05-23.ps.gz
Notes
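The "gamma" distribution mentioned in the abstract is the state posterior given the whole utterance, as computed by the standard forward-backward recursions. The following is a minimal sketch of that textbook computation, not the report's implementation. The function name `gamma_posteriors`, the argument layout, and the toy two-state left-to-right topology are assumptions made for illustration. `local_scores[t][i]` stands for any per-frame emission score, for example a GMM likelihood or, as an assumed convention, an ANN posterior divided by the state prior.

```python
def gamma_posteriors(local_scores, trans, init):
    """Return gamma[t][i] = P(state i at frame t | all frames).

    local_scores: T x N per-frame emission scores (hypothetical input layout).
    trans:        N x N transition probabilities; zeros encode topology constraints.
    init:         length-N initial state probabilities.
    Assumes at least one path has non-zero probability.
    """
    T, N = len(local_scores), len(init)
    alpha = [[0.0] * N for _ in range(T)]
    beta = [[0.0] * N for _ in range(T)]
    scale = [0.0] * T

    # Forward pass, rescaled at every frame to avoid numerical underflow.
    alpha[0] = [init[i] * local_scores[0][i] for i in range(N)]
    scale[0] = sum(alpha[0])
    alpha[0] = [a / scale[0] for a in alpha[0]]
    for t in range(1, T):
        for j in range(N):
            incoming = sum(alpha[t - 1][i] * trans[i][j] for i in range(N))
            alpha[t][j] = local_scores[t][j] * incoming
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]

    # Backward pass, using the same scaling factors.
    beta[T - 1] = [1.0] * N
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(
                trans[i][j] * local_scores[t + 1][j] * beta[t + 1][j]
                for j in range(N)
            ) / scale[t + 1]

    # Combine and normalize so each frame's posteriors sum to one.
    gamma = []
    for t in range(T):
        row = [alpha[t][i] * beta[t][i] for i in range(N)]
        total = sum(row)
        gamma.append([g / total for g in row])
    return gamma


# Toy example: two states, left-to-right topology, three frames.
scores = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
topology = [[0.5, 0.5], [0.0, 1.0]]  # state 1 cannot return to state 0
start = [1.0, 0.0]                   # utterance must start in state 0
gamma = gamma_posteriors(scores, topology, start)
```

In this sketch the zeros in `topology` and `start` play the role of the prior topological constraints, and the backward pass is what brings in context from later frames. Each row of `gamma` could then be used as a local score in a Viterbi decoder, which is the usage the abstract describes.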