Reinforcement learning of trajectory distributions: Applications in assisted teleoperation and motion planning

Type of publication:	Conference paper
Citation:	Ewerton_IROS_2019
Publication status:	Published
Booktitle:	IEEE International Conference on Intelligent Robots and Systems
Year:	2019
Abstract:	Most learning-from-demonstration approaches do not address suboptimal demonstrations or cases where the environment changes drastically after the demonstrations were made. For example, in real teleoperation tasks, the demonstrations provided by the user are often suboptimal due to interface and hardware limitations. In tasks involving co-manipulation and manipulation planning, the environment often changes due to unexpected obstacles, rendering previous demonstrations invalid. This paper presents a reinforcement learning algorithm that uses relevance functions to tackle such problems. It introduces the Pearson correlation as a measure of the relevance of policy parameters with regard to each component of the cost function to be optimized. The method is demonstrated in a static environment where the quality of the teleoperation is compromised by the visual interface (operating a robot in a three-dimensional task using a simple 2D monitor). The method is then tested in a dynamic environment using a real 7-DoF robot arm, where distributions are computed online via Gaussian process regression.
Keywords:
Projects	Idiap
Authors	Ewerton, Marco; Maeda, Guilherme; Koert, Dorothea; Kolev, Zlatko; Takahashi, Masaki; Peters, Jan