This paper introduces Minimal Sufficient Length (MSL), a theoretical metric for the shortest reasoning length at which an LLM still produces a correct answer. The authors give a recursive definition of MSL based on independently sampled sequences and prove that its limit exists, establishing a measurable lower bound for reasoning-chain compression. Building on this, they propose TRiMS, an RL-based method that combines the GRPO algorithm with MSL-based estimation and achieves over 80% CoT token reduction with a slight accuracy increase.
LLMs can slash over 80% of their chain-of-thought tokens with a minor accuracy boost, thanks to a new RL-based method that targets the "Minimal Sufficient Length" of reasoning.
Large language models achieve breakthroughs in complex reasoning via long chain-of-thought (CoT) sequences. However, this often leads to severe reasoning inflation, causing substantial computational redundancy. To maximize Intelligence per Token, we introduce a theoretical metric, Minimal Sufficient Length (MSL). MSL rigorously characterizes the shortest reasoning length that preserves answer correctness. We provide a recursive definition based on independently sampled sequences and prove the existence of its limit, establishing the first measurable lower bound for reasoning-chain compression. Building on an analysis of mainstream CoT compression strategies, we identify key structural factors that enable a model to approach MSL. Based on these insights, we propose TRiMS, which employs the GRPO algorithm in conjunction with MSL-based estimation during training and mitigates training instabilities through dynamic batch aggregation and advantage computation using batch-level standard deviation. TRiMS achieves over 80% CoT token reduction with a minor accuracy boost across all benchmarks.
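The abstract does not give the exact advantage formula, so the following is only a minimal sketch of what "advantage computation using batch-level standard deviation" could look like. It assumes rewards are centered on each prompt's group mean, as in standard GRPO, but scaled by the standard deviation of all rewards in the batch rather than of the group alone. The function name `batch_std_advantages`, the `eps` term, and the reward layout are illustrative assumptions, not details from the paper.

```python
from statistics import mean, pstdev


def batch_std_advantages(groups, eps=1e-6):
    """Compute group-centered advantages scaled by a batch-level std.

    groups: list of lists of scalar rewards, one inner list per prompt
            (each inner list holds the rewards of that prompt's samples).
    eps:    small constant (an assumption) to avoid division by zero.
    """
    # Standard deviation over every reward in the batch, not per group.
    all_rewards = [r for group in groups for r in group]
    batch_std = pstdev(all_rewards)

    advantages = []
    for group in groups:
        group_mean = mean(group)  # baseline stays per-prompt, as in GRPO
        advantages.append(
            [(r - group_mean) / (batch_std + eps) for r in group]
        )
    return advantages


# Two prompts with two samples each; the second group has identical rewards.
example = batch_std_advantages([[1.0, 0.0], [1.0, 1.0]])
```

One plausible motivation for this choice: with per-group normalization, a group whose rewards are nearly identical has a standard deviation close to zero, which can inflate tiny reward differences into large advantages. Sharing one scale across the batch avoids that, though whether TRiMS uses exactly this form is not stated in the abstract.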