Gradient estimates of return distributions

Type of publication:	Conference paper
Citation:	dimitrakakis:pascal:2005
Booktitle:	PASCAL Workshop on Principled Methods of Trading Exploration and Exploitation
Year:	2005
Note:	IDIAP-RR 05-29
Crossref:	dimitrakakis:rr05-29: Gradient estimates of return, Dimitrakakis, Christos and Bengio, Samy, Idiap-RR-29-2005
Abstract:	We present a general method for maintaining estimates of the distribution of parameters in arbitrary models. This is then applied to the estimation of probability distributions over actions in value-based reinforcement learning. While this approach is similar to other techniques that maintain a confidence measure for action-values, it nevertheless offers an insight into current techniques and hints at potential avenues of further research.
Userfields:	ipdmembership={learning},
Projects:	Idiap
Authors:	Dimitrakakis, Christos; Bengio, Samy
Attachments:	dimitrakakis-pascal-2005.pdf, dimitrakakis-pascal-2005.ps.gz