REPORT dimitrak-bengio_04-72/IDIAP
Estimates of Parameter Distributions for Optimal Action Selection
Dimitrakakis, Christos; Bengio, Samy
EXTERNAL https://publications.idiap.ch/attachments/reports/2004/rr-04-72.pdf
PUBLIC
Idiap-RR-72-2004
2004
IDIAP
We present a general method for maintaining estimates of the distribution of parameters in arbitrary models. This is then applied to the estimation of a probability distribution over actions in value-based reinforcement learning. While this approach is similar to other techniques that maintain a confidence measure for action-values, it nevertheless offers new insight into current techniques and reveals potential avenues for further research.