https://arxiv.org/pdf/2511.04486
EDIT-Bench: Evaluating LLM Abilities to Perform Real-World Instructed Code Edits
A benchmark for evaluating LLM code-editing capabilities, built on real-world edit contexts and instructions collected in the wild from 500 developers. EDIT stands for Evaluation of Developer Instructed Tasks.
https://waynechi.com/edit-bench/

Seonglae Cho