Idiap Research Institute
Gradient estimates of return distributions
Type of publication: Conference paper
Citation: dimitrakakis:pascal:2005
Booktitle: PASCAL Workshop on Principled Methods of Trading Exploration and Exploitation
Year: 2005
Note: IDIAP-RR 05-29
Crossref: dimitrakakis:rr05-29:
Abstract: We present a general method for maintaining estimates of the distribution of parameters in arbitrary models. This is then applied to the estimation of probability distributions over actions in value-based reinforcement learning. While this approach is similar to other techniques that maintain a confidence measure for action-values, it nevertheless offers an insight into current techniques and hints at potential avenues of further research.
Userfields: ipdmembership={learning},
Projects: Idiap
Authors: Dimitrakakis, Christos; Bengio, Samy
  • dimitrakakis-pascal-2005.pdf
  • dimitrakakis-pascal-2005.ps.gz