[2312.12450] Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions



Authors: Federico Cassano and 7 other authors

Abstract: A significant amount of research is focused on developing and evaluating large language models for a variety of code synthesis tasks. These include synthesizing code from natural language instructions, synthesizing tests from code, and synthesizing explanations of code. In contrast, the behavior of instructional code editing with LLMs is understudied. These are tasks in which the model is instructed to update a block of code provided in a prompt. The editing instruction may ask for a feature to be added or removed, describe a bug and ask for a fix, ask for a different kind of solution, or request many other common code editing tasks.
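To make the task definition concrete, here is a minimal sketch of what one instructional code editing task could look like: a block of code, a natural language instruction, and a check applied to the edited result. The abstract does not describe the benchmark's actual data format or scoring, so everything below (the `EditTask` class, the `passes` helper, the `mean` example, and the idea of judging an edit by a test) is a hypothetical illustration, not the authors' method.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class EditTask:
    """Hypothetical container for one instructional code editing task."""
    before: str        # code block given to the model in the prompt
    instruction: str   # natural language editing instruction
    check: Callable[[Dict[str, object]], bool]  # test run on the edited code


def passes(task: EditTask, edited_code: str) -> bool:
    """Execute the edited code and report whether the task's check succeeds."""
    namespace: Dict[str, object] = {}
    try:
        exec(edited_code, namespace)  # illustration only: never exec untrusted code
        return bool(task.check(namespace))
    except Exception:
        # Any crash while defining or testing the code counts as a failure.
        return False


# A bug-fix style task: the instruction describes the bug and asks for a fix.
task = EditTask(
    before="def mean(xs):\n    return sum(xs) / len(xs)\n",
    instruction="Fix the bug: mean([]) should return 0.0 instead of raising.",
    check=lambda ns: ns["mean"]([]) == 0.0 and ns["mean"]([1, 2, 3]) == 2.0,
)

# A candidate edit of the kind a model might produce (written by hand here).
candidate = (
    "def mean(xs):\n"
    "    if not xs:\n"
    "        return 0.0\n"
    "    return sum(xs) / len(xs)\n"
)

original_ok = passes(task, task.before)   # unedited code still has the bug
candidate_ok = passes(task, candidate)    # edited code satisfies the check
print(original_ok, candidate_ok)
```

In this sketch the unedited code fails the check because `mean([])` raises, while the hand-written candidate passes. A feature-addition or feature-removal task would use the same shape with a different instruction and check.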

We introduce a carefully crafted benchmark of code editing tasks and use it to evaluate several cutting-edge LLMs. Our evaluation exposes a significant gap between the capabilities of state-of-the-art open and closed models. For example, even GPT-3.5-Turbo is 8.8% better than the best open model at editing code.

We also introduce a new, carefully curated, permissively licensed training set of code edits coupled with natural language instructions. Using this training set, we show that we can fine-tune open Code LLMs to significantly improve their code editing capabilities.

Submission history

From: Federico Cassano
[v1] Mon, 11 Dec 2023 02:27:45 UTC (1,345 KB)
[v2] Thu, 21 Dec 2023 13:43:41 UTC (1,345 KB)


