The paper introduces Stop-Think-AutoRegress Language Diffusion Model (STAR-LDM), which interleaves autoregressive generation with a latent diffusion-based planning phase to refine semantic plans before token generation. This allows for global planning in a continuous space, addressing the limitations of token-by-token decision-making in standard autoregressive models. Experiments demonstrate that STAR-LDM outperforms comparable models on language understanding benchmarks and exhibits superior narrative coherence and commonsense reasoning, as judged by LLMs.
By pausing to "think" with latent diffusion, STAR-LDM achieves superior language understanding, narrative coherence, and controllable generation compared to standard autoregressive models of similar size.
The Stop-Think-AutoRegress Language Diffusion Model (STAR-LDM) integrates latent diffusion planning with autoregressive generation. Unlike conventional autoregressive language models limited to token-by-token decisions, STAR-LDM incorporates a "thinking" phase that pauses generation to refine a semantic plan through diffusion before continuing. This enables global planning in continuous space prior to committing to discrete tokens. Evaluations show STAR-LDM significantly outperforms similar-sized models on language understanding benchmarks and achieves $>70\%$ win rates in LLM-as-judge comparisons for narrative coherence and commonsense reasoning. The architecture also allows straightforward control through lightweight classifiers, enabling fine-grained steering of attributes without model retraining while maintaining better fluency-control trade-offs than specialized approaches.
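The "thinking" phase and classifier-based steering described above can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify the denoiser, the plan dimensionality, or how the classifier signal is applied. The sketch assumes standard classifier guidance (adding the gradient of a logistic classifier's log-probability to each refinement step), and every name in it (`denoise_step`, `classifier_grad`, `think`, `attribute_prob`) is hypothetical.

```python
import math
import random

def denoise_step(z, target, rate=0.5):
    """Toy stand-in for one diffusion denoising step: move the plan z toward target."""
    return [zi + rate * (ti - zi) for zi, ti in zip(z, target)]

def attribute_prob(z, w, b=0.0):
    """Probability that plan z has the attribute, under a logistic classifier (w, b)."""
    logit = sum(wi * zi for wi, zi in zip(w, z)) + b
    return 1.0 / (1.0 + math.exp(-logit))

def classifier_grad(z, w, b=0.0):
    """Gradient of log sigmoid(w.z + b) with respect to z, which is (1 - p) * w."""
    p = attribute_prob(z, w, b)
    return [(1.0 - p) * wi for wi in w]

def think(z, target, w, steps=10, guidance=0.0):
    """Refine a latent plan; with guidance > 0, nudge each step toward the attribute."""
    for _ in range(steps):
        z = denoise_step(z, target)
        if guidance:
            grad = classifier_grad(z, w)
            z = [zi + guidance * gi for zi, gi in zip(z, grad)]
    return z

# Assumed toy setup: a 4-dimensional plan, a zero "clean" plan, a fixed classifier.
rng = random.Random(0)
z0 = [rng.gauss(0.0, 1.0) for _ in range(4)]
target = [0.0, 0.0, 0.0, 0.0]
w = [1.0, -1.0, 0.5, 0.0]

plain = think(z0, target, w)                  # unguided refinement
steered = think(z0, target, w, guidance=0.5)  # classifier-guided refinement
```

In this sketch the generator itself is never updated; only the small classifier `(w, b)` shapes the refined plan, which mirrors the abstract's claim of steering without retraining. In a full system the refined plan would then condition the autoregressive decoder, a step omitted here.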