<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
	<record>
		<datafield tag="980" ind1=" " ind2=" ">
			<subfield code="a">CONF</subfield>
		</datafield>
		<datafield tag="970" ind1=" " ind2=" ">
			<subfield code="a">FunesMora_ICMI_DC_2013/IDIAP</subfield>
		</datafield>
		<datafield tag="245" ind1=" " ind2=" ">
			<subfield code="a">3D Head Pose and Gaze Tracking and Their Application to Diverse Multimodal Tasks</subfield>
		</datafield>
		<datafield tag="700" ind1=" " ind2=" ">
			<subfield code="a">Funes Mora, Kenneth Alberto</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">Gaze estimation</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">HCI</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">Head pose tracking</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">HHI</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">HRI</subfield>
		</datafield>
		<datafield tag="653" ind1="1" ind2=" ">
			<subfield code="a">Speech</subfield>
		</datafield>
		<datafield tag="711" ind1="2" ind2=" ">
			<subfield code="a">Doctoral consortium of the 15th ACM International Conference on Multimodal Interaction</subfield>
			<subfield code="c">Sydney, Australia</subfield>
		</datafield>
		<datafield tag="260" ind1=" " ind2=" ">
			<subfield code="c">2013</subfield>
		</datafield>
		<datafield tag="024" ind1="7" ind2=" ">
			<subfield code="a">10.1145/2522848.2532192</subfield>
			<subfield code="2">doi</subfield>
		</datafield>
		<datafield tag="520" ind1=" " ind2=" ">
			<subfield code="a">In this PhD thesis, the problem of 3D head pose and gaze tracking under minimal user cooperation is addressed. By exploiting the characteristics of RGB-D sensors, contributions have been made to the problems that arise from this lack of cooperation: in particular, head pose and inter-person appearance variability, as well as the handling of low-resolution input. The resulting system has enabled diverse multimodal applications; in particular, recent work combined multiple RGB-D sensors to detect gazing events in dyadic interactions. The research plan consists of: i) improving the robustness, accuracy, and usability of the head pose and gaze tracking system; ii) using additional multimodal cues, such as speech and dynamic context, to train and adapt gaze models in an unsupervised manner; and iii) extending the application of 3D gaze estimation to diverse multimodal tasks, including visual focus of attention estimation with multiple visual targets, e.g. people in a meeting-like setup.</subfield>
		</datafield>
	</record>
</collection>