The paper introduces MemoCoder, a multi-agent framework designed to improve LLM-based code generation through iterative debugging and adaptation to diverse problem structures. MemoCoder stores successful repairs in a "Fixing Knowledge Set" and retrieves them for later tasks, while a Mentor Agent identifies recurring error patterns and refines fixing strategies. Experiments on MBPP, HumanEval, and LiveCodeBench show that MemoCoder outperforms both zero-shot prompting and a self-repair strategy, with improvements of 3.1% to 12.1% in Pass@10 and 1.4% to 14.5% in Pass@50.
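To make the store-and-retrieve idea concrete, here is a minimal sketch of what a Fixing Knowledge Set might look like. It is not the paper's implementation: the class and method names (`Fix`, `FixingKnowledgeSet`, `store`, `retrieve`) and the token-overlap similarity are all assumptions chosen for illustration, since the summary does not say how repairs are represented or matched.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Fix:
    """One successful repair: the error it addressed and the strategy that worked."""
    error_message: str
    strategy: str


def _tokens(text: str) -> Set[str]:
    # Crude whitespace tokenization; a real system would likely use embeddings.
    return set(text.lower().split())


def _jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class FixingKnowledgeSet:
    """Hypothetical in-memory store of past repairs (illustrative only)."""
    fixes: List[Fix] = field(default_factory=list)

    def store(self, error_message: str, strategy: str) -> None:
        # Record a repair that made a previously failing solution pass.
        self.fixes.append(Fix(error_message, strategy))

    def retrieve(self, error_message: str, top_k: int = 1) -> List[Fix]:
        # Rank stored fixes by token overlap with the new error message
        # and drop entries that share no tokens at all.
        query = _tokens(error_message)
        scored = [(_jaccard(query, _tokens(f.error_message)), f) for f in self.fixes]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [f for _, f in scored[:top_k]]
```

In a repair loop, an agent would call `retrieve` with the latest failure message before attempting a fix, then call `store` once a fix passes the tests. The supervisory role of the Mentor Agent is deliberately left out of this sketch.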
LLM agents generate code more reliably when they collaborate and persistently reuse past fixes and error patterns under the supervision of a "Mentor Agent."
With the widespread adoption of tools built on Large Language Models (LLMs), such as GitHub Copilot and ChatGPT, developers increasingly rely on AI assistance for code generation. While LLMs can generate syntactically correct solutions for well-structured programming tasks, they often struggle with challenges that require iterative debugging, error handling, or adaptation to diverse problem structures. Existing approaches such as fine-tuning or self-repair strategies either require costly retraining or lack mechanisms to accumulate and reuse knowledge from previous attempts. To address these limitations, we propose MemoCoder, a multi-agent framework that enables collaborative problem solving and persistent learning from past fixes. At the core of MemoCoder is a Fixing Knowledge Set, which stores successful repairs and supports retrieval for future tasks. A central Mentor Agent supervises the repair process by identifying recurring error patterns and refining high-level fixing strategies, a supervisory role that guides the self-repair loop. We evaluate MemoCoder on three public benchmarks (MBPP, HumanEval, and LiveCodeBench) spanning a range of problem complexities. Experimental results show that MemoCoder consistently outperforms both zero-shot prompting and a Self-Repair strategy, with improvements ranging from 3.1% to 12.1% in Pass@10 and from 1.4% to 14.5% in Pass@50, demonstrating its effectiveness in iterative refinement and knowledge-guided code generation.
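For readers unfamiliar with the metric, Pass@k is the probability that at least one of k sampled solutions passes all tests for a problem. The abstract does not state how the scores were computed, so the sketch below assumes the unbiased estimator commonly used with HumanEval: given n generated samples of which c are correct, pass@k = 1 - C(n-c, k) / C(n, k). The function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples, c of which passed all tests.

    Assumes the standard unbiased estimator; the paper's exact
    evaluation procedure may differ.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples failed, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # Otherwise subtract the probability that all k drawn samples failed.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark-level score would then be the mean of `pass_at_k` over all problems, for example with k=10 and k=50 to match the figures reported here.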