Gradient estimates of return
| Type of publication: | Idiap-RR |
| Citation: | dimitrakakis:rr05-29 |
| Number: | Idiap-RR-29-2005 |
| Year: | 2005 |
| Institution: | IDIAP |
| Note: | Published in PASCAL Workshop in Principled Methods of Trading Exploration and Exploitation, London, UK, 2005 |
| Abstract: | The exploration-exploitation trade-off that arises when one considers simple point estimates of expected returns no longer appears when full distributions are considered. This work develops a simple gradient-based approach for maintaining such distributions and investigates methods for using them to direct exploration. |
| Userfields: | ipdmembership={learning}, |
| Keywords: | |
| Projects: | Idiap |
| Authors: | |
| Crossref by: | dimitrakakis:pascal:2005 |