DAMONJU | Apr 30, 2026 | arXiv:2604.27296

To Diff or Not to Diff? Structure-Aware and Adaptive Output Formats for Efficient LLM-based Code Editing

Wei Cheng, Yongchang Cao, Chen Shen, Binhua Li, Jue Chen, Yongbin Li, Wei Hu

AI Summary

This paper analyzes the inefficiencies of full-code generation and conventional diff formats for LLM-based code editing. It introduces BlockDiff and FuncDiff, structure-aware diff formats that represent changes as block-level rewrites of syntactically coherent units. The authors then propose AdaEdit, an adaptive edit strategy that trains LLMs to dynamically choose the most token-efficient format (diff or full code), achieving over 30% reduction in latency and cost without sacrificing accuracy on long-code editing tasks.
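To make "block-level rewrite" concrete, here is a minimal Python sketch of how such an edit could be applied. It is not the paper's BlockDiff or FuncDiff format, which this page does not specify. The function name `apply_block_rewrite` and the (old block, new block) representation are assumptions made for illustration. The point is only that the edit names a whole syntactic unit by its content rather than by line offsets.

```python
def apply_block_rewrite(source: str, old_block: str, new_block: str) -> str:
    """Replace one syntactically coherent block with its rewritten version.

    Hypothetical helper: the block is located by its exact text, so the edit
    carries no line numbers or hunk offsets that could drift.
    """
    matches = source.count(old_block)
    if matches != 1:
        # Refuse ambiguous or missing targets instead of guessing.
        raise ValueError(f"expected exactly one matching block, found {matches}")
    return source.replace(old_block, new_block, 1)


source = (
    "def total(xs):\n"
    "    s = 0\n"
    "    for x in xs:\n"
    "        s += x\n"
    "    return s\n"
)

# The edit rewrites the whole `for` loop (a control structure) as one unit.
old_block = "    for x in xs:\n        s += x\n"
new_block = (
    "    for x in xs:\n"
    "        if x is not None:\n"
    "            s += x\n"
)

edited = apply_block_rewrite(source, old_block, new_block)
```

Because the target is matched by content, the sketch fails loudly when the block is missing or appears more than once, rather than silently editing the wrong location.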

Key Contribution

LLMs can cut latency and cost by over 30% on long-code editing tasks by learning to adaptively switch between generating full code and structure-aware diffs, while matching the accuracy of full-code generation.

Abstract

Large Language Models (LLMs) are increasingly used for code editing, yet the prevalent full-code generation paradigm suffers from severe efficiency bottlenecks, posing challenges for interactive coding assistants that demand low latency and cost. Despite the predominant focus on scaling model capabilities, the edit format itself has been largely overlooked in model training. In this paper, we begin with a systematic study of conventional diff formats and reveal that fragile offsets and fragmented hunks make generation highly unnatural for LLMs. To address this, we introduce BlockDiff and FuncDiff, two structure-aware diff formats that represent changes as block-level rewrites of syntactically coherent units such as control structures and functions. Furthermore, we propose AdaEdit, a general adaptive edit strategy that trains LLMs to dynamically choose the most token-efficient format between a given diff format and full code. Extensive experiments demonstrate that AdaEdit paired with structure-aware diff formats consistently matches the accuracy of full-code generation, while reducing both latency and cost by over 30% on long-code editing tasks.
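The selection criterion behind the adaptive strategy, choosing whichever output format costs fewer tokens, can be pictured with a small Python sketch. This is not AdaEdit itself: the paper trains the model to make the choice, whereas the code below only compares two outputs that are already known. The names `count_tokens` and `choose_format`, the `=>` separator in the toy diff, and the whitespace-based token count (a stand-in for a real tokenizer) are all assumptions for illustration.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated pieces."""
    return len(text.split())


def choose_format(full_code: str, diff_text: str) -> str:
    """Return "diff" if the diff output is strictly shorter, else "full"."""
    if count_tokens(diff_text) < count_tokens(full_code):
        return "diff"
    return "full"


# Long file, small change: emitting only the changed block is far cheaper.
long_file = "\n".join(f"x{i} = {i}" for i in range(50))
small_diff = "x3 = 3\n=>\nx3 = 30"
choice_long = choose_format(long_file, small_diff)

# Tiny file rewritten entirely: the diff repeats old and new text,
# so regenerating the full code costs fewer tokens.
short_file = "print('hello')"
rewrite_diff = "print('hi')\n=>\nprint('hello')"
choice_short = choose_format(short_file, rewrite_diff)
```

Under these assumptions the first case selects the diff and the second selects full code, which matches the intuition that diffs pay off mainly on long inputs with localized changes.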

Code Generation & Program Synthesis | Inference & Quantization | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
