Apr 14, 2026 · arXiv:2604.12913

CoDe-R: Refining Decompiler Output with LLMs via Rationale Guidance and Adaptive Inference

AI Summary

CoDe-R is a two-stage framework that uses LLMs to refine decompiler output from stripped executables. Its first stage, Semantic Cognitive Enhancement (SCE), trains the model to recover high-level algorithmic intent alongside code. Its second stage, a Dynamic Dual-Path Fallback (DDPF) mechanism, balances semantic recovery and syntactic stability during inference. With a 1.3B-parameter backbone, CoDe-R achieves state-of-the-art re-executability among lightweight models on the HumanEval-Decompile benchmark, exceeding a 50% average re-executability rate.
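To make the headline metric concrete, here is a minimal sketch of how an average re-executability rate could be computed. It assumes that a sample counts as re-executable when the refined code recompiles and passes its test assertions, and that the average is taken over per-setting rates (for example, one per compiler optimization level). The function names and the setting labels are hypothetical and do not come from the paper.

```python
from statistics import mean


def re_executability_rate(passed: list[bool]) -> float:
    """Percentage of samples whose refined code recompiled and passed its tests."""
    if not passed:
        raise ValueError("at least one sample is required")
    return 100.0 * sum(passed) / len(passed)


def average_rate(per_setting: dict[str, list[bool]]) -> float:
    """Unweighted mean of the per-setting rates (settings are an assumption)."""
    return mean(re_executability_rate(flags) for flags in per_setting.values())


# Toy outcomes, not real benchmark results.
toy_results = {
    "O0": [True, True, False, True],    # 3 of 4 samples re-execute
    "O1": [True, False, False, True],   # 2 of 4 samples re-execute
}
toy_average = average_rate(toy_results)
```

Under these assumptions, a model clears the 50% mark when the mean of its per-setting rates exceeds 50.0.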

Key Contribution

Using a lightweight 1.3B-parameter LLM to refine decompiler output from stripped binaries, CoDe-R is the first model of that size to exceed a 50% average re-executability rate on HumanEval-Decompile, narrowing the gap to expert-level performance.

Abstract

Binary decompilation is a critical reverse engineering task aimed at reconstructing high-level source code from stripped executables. Although Large Language Models (LLMs) have recently shown promise, they often suffer from "logical hallucinations" and "semantic misalignment" due to the irreversible semantic loss during compilation, resulting in generated code that fails to re-execute. In this study, we propose Cognitive Decompiler Refinement with Robustness (CoDe-R), a lightweight two-stage code refinement framework. The first stage introduces Semantic Cognitive Enhancement (SCE), a Rationale-Guided Semantic Injection strategy that trains the model to recover high-level algorithmic intent alongside code. The second stage introduces a Dynamic Dual-Path Fallback (DDPF) mechanism during inference, which adaptively balances semantic recovery and syntactic stability via a hybrid verification strategy. Evaluation on the HumanEval-Decompile benchmark demonstrates that CoDe-R (using a 1.3B backbone) establishes a new State-of-the-Art (SOTA) in the lightweight regime. Notably, it is the first 1.3B model to exceed an Average Re-executability Rate of 50.00%, significantly outperforming the baseline and effectively bridging the gap between efficient models and expert-level performance. Our code is available at https://github.com/Theaoi/CoDe-R.
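The abstract does not spell out how the dual-path fallback or its hybrid verification works, so the following is only a rough sketch of one possible reading: try a semantics-oriented generation first, accept it if it passes a check, and otherwise fall back to a more conservative path. The names `dual_path_refine`, `semantic_path`, `stable_path`, and `verify` are hypothetical, and the brace-balance check merely stands in for whatever verification CoDe-R actually performs.

```python
from typing import Callable


def dual_path_refine(
    pseudo_code: str,
    semantic_path: Callable[[str], str],
    stable_path: Callable[[str], str],
    verify: Callable[[str], bool],
) -> str:
    """Prefer the semantic path; fall back to the stable path, then to the input."""
    candidate = semantic_path(pseudo_code)
    if verify(candidate):
        return candidate
    fallback = stable_path(pseudo_code)
    if verify(fallback):
        return fallback
    # Neither path passed verification: keep the decompiler output unchanged.
    return pseudo_code


def braces_balanced(code: str) -> bool:
    """Toy verifier (assumption): a real one might compile or run the code."""
    return code.count("{") == code.count("}")


# Toy stand-ins for model calls; the first deliberately emits broken code.
refined = dual_path_refine(
    "int f(int a1) { return a1 + 1; }",
    semantic_path=lambda src: "int increment(int x) { return x + 1;",
    stable_path=lambda src: "int f(int x) { return x + 1; }",
    verify=braces_balanced,
)
```

In this toy run the semantic candidate is missing a closing brace, so the function returns the stable path's output. The design choice illustrated is that a failed check never yields worse code than the input.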

Code Generation & Program Synthesis · Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
