Reference-based vs. task-based evaluation of human language technology

Type of publication:	Conference paper
Citation:	Popescu-Belis_LREC_2008
Booktitle:	LREC 2008 ELRA Workshop on Evaluation
Year:	2008
Location:	Marrakech, Morocco
Organization:	ELRA
Abstract:	This paper starts from the ISO distinction of three types of evaluation procedures – internal, external and in use – and proposes to match these types to the three types of human language technology (HLT) systems: analysis, generation, and interactive. The paper explains why internal evaluation is not suitable to measure the qualities of HLT systems, and shows that reference-based external evaluation is best adapted to ‘analysis’ systems, task-based evaluation to ‘interactive’ systems, while ‘generation’ systems can be subject to both types of evaluation. In particular, some limits of reference-based external evaluation are shown in the case of generation systems. Finally, the paper shows that contextual evaluation, as illustrated by the FEMTI framework for MT evaluation, is an effective method for getting reference-based evaluation closer to the users of a system.
Keywords:
Projects:	Idiap IM2
Authors:	Popescu-Belis, Andrei
Attachments
Popescu-Belis_LREC_2008.pdf