<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
	<record>
		<datafield tag="980" ind1=" " ind2=" ">
			<subfield code="a">REPORT</subfield>
		</datafield>
		<datafield tag="970" ind1=" " ind2=" ">
			<subfield code="a">Ewerton_Idiap-RR-03-2021/IDIAP</subfield>
		</datafield>
		<datafield tag="245" ind1=" " ind2=" ">
			<subfield code="a">An Attention Mechanism for Deep Q-Networks with Applications in Robotic Pushing</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Ewerton, Marco</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Calinon, Sylvain</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Odobez, Jean-Marc</subfield>
		</datafield>
		<datafield tag="856" ind1="4" ind2="0">
			<subfield code="i">EXTERNAL</subfield>
			<subfield code="u">http://publications.idiap.ch/attachments/reports/2021/Ewerton_Idiap-RR-03-2021.pdf</subfield>
			<subfield code="x">PUBLIC</subfield>
		</datafield>
		<datafield tag="088" ind1=" " ind2=" ">
			<subfield code="a">Idiap-RR-03-2021</subfield>
		</datafield>
		<datafield tag="260" ind1=" " ind2=" ">
			<subfield code="c">2021</subfield>
			<subfield code="b">Idiap</subfield>
		</datafield>
		<datafield tag="771" ind1="2" ind2=" ">
			<subfield code="d">April 2021</subfield>
		</datafield>
		<datafield tag="520" ind1=" " ind2=" ">
			<subfield code="a">Humans effortlessly solve push tasks in everyday life, but unlocking these capabilities remains a research challenge in robotics. Physical models are often inaccurate or unattainable. State-of-the-art data-driven approaches learn to compensate for these inaccuracies or dispense with approximate physical models altogether. Nevertheless, data-driven approaches such as Deep Q-Networks (DQNs) frequently get stuck in local optima in large state-action spaces. We propose an attention mechanism for DQNs to improve their sampling efficiency and demonstrate in simulation experiments with a UR5 robot arm that such a mechanism helps the DQN learn faster and achieve higher performance in a push task involving objects with unknown dynamics.</subfield>
		</datafield>
	</record>
</collection>