BLESS: Benchmarking Large Language Models on Sentence Simplification

Type of publication:	Conference paper
Citation:	Kew_EMNLP2023_2023
Publication status:	Accepted
Booktitle:	Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Year:	2023
Month:	December
Location:	Singapore
Abstract:	We present BLESS, a comprehensive performance benchmark of the most recent state-of-the-art large language models (LLMs) on the task of text simplification (TS). We examine how well off-the-shelf LLMs can solve this challenging task, assessing a total of 44 models, differing in size, architecture, pre-training methods, and accessibility, on three test sets from different domains (Wikipedia, news, and medical) under a few-shot setting. Our analysis considers a suite of automatic metrics as well as a large-scale quantitative investigation into the types of common edit operations performed by the different models. Furthermore, we perform a manual qualitative analysis on a subset of model outputs to better gauge the quality of the generated simplifications. Our evaluation indicates that the best LLMs, despite not being trained on TS, perform comparably with state-of-the-art TS baselines. Additionally, we find that certain LLMs demonstrate a greater range and diversity of edit operations. Our performance benchmark will be available as a resource for the development of future TS methods and evaluation metrics.
Keywords:	evaluation, LLM, NLP, Text Simplification
Projects:	Idiap
Authors:	Kew, Tannon; Chi, Alison; Vásquez-Rodríguez, Laura; Agrawal, Sweta; Aumiller, Dennis; Alva-Manchego, Fernando; Shardlow, Matthew
Attachments:	Kew_EMNLP2023_2023.pdf