Gradient estimates of return
Type of publication: | Idiap-RR |
Citation: | dimitrakakis:rr05-29 |
Number: | Idiap-RR-29-2005 |
Year: | 2005 |
Institution: | IDIAP |
Note: | Published in PASCAL Workshop in Principled Methods of Trading Exploration and Exploitation, London, UK, 2005 |
Abstract: | The exploration-exploitation trade-off that arises when one considers simple point estimates of expected returns no longer appears when full distributions are considered. This work develops a simple gradient-based approach for maintaining such distributions and investigates methods for using them to direct exploration. |
Projects: | Idiap |
Authors | |
Crossref by: | dimitrakakis:pascal:2005 |