The paper introduces McDiffuSE, a framework that uses Monte Carlo Tree Search (MCTS) to optimize the slot infilling order in Masked Diffusion Models (MDMs) for mathematical and code reasoning. McDiffuSE runs look-ahead simulations to evaluate partial completions and systematically explores the space of generation orders. Experiments report average improvements of 3.2% over autoregressive baselines and 8.0% over baseline plan-and-infill, with the largest gains on MBPP (19.5%) and MATH500 (4.9%). The selected orders are mostly sequential, but the non-sequential steps prove essential for the best performance.
Diffusion language models get a nearly 20% boost on MBPP by strategically planning the order in which they fill in the blanks.
While plan-and-infill decoding in Masked Diffusion Models (MDMs) shows promise for mathematical and code reasoning, performance remains highly sensitive to slot infilling order, often yielding substantial output variance. We introduce McDiffuSE, a framework that formulates slot selection as decision making and optimises infilling orders through Monte Carlo Tree Search (MCTS). McDiffuSE uses look-ahead simulations to evaluate partial completions before commitment, systematically exploring the combinatorial space of generation orders. Experiments show an average improvement of 3.2% over autoregressive baselines and 8.0% over baseline plan-and-infill, with notable gains of 19.5% on MBPP and 4.9% on MATH500. Our analysis reveals that while McDiffuSE predominantly follows sequential ordering, incorporating non-sequential generation is essential for maximising performance. We observe that larger exploration constants, rather than increased simulations, are necessary to overcome model confidence biases and discover effective orderings. These findings establish MCTS-based planning as an effective approach for enhancing generation quality in MDMs.
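The idea of treating slot selection as a search problem can be illustrated with a small, generic MCTS over infilling orders. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the selection rule, so the standard UCT formula is assumed here, and `score_fn`, `toy_score`, `mcts_next_slot`, and `plan_order` are hypothetical names. In a real system the reward would come from look-ahead completions produced by the MDM; here a toy target order stands in for it. The parameters `c` and `n_sim` correspond to the exploration constant and simulation count discussed in the abstract.

```python
import math
import random


class Node:
    """One partial infilling order in the search tree."""

    def __init__(self, order, parent=None):
        self.order = order        # tuple of slot ids committed so far
        self.parent = parent
        self.children = {}        # next slot id -> Node
        self.visits = 0
        self.value_sum = 0.0


def uct(child, parent_visits, c):
    """Mean reward plus an exploration bonus scaled by the constant c (assumed UCT rule)."""
    mean = child.value_sum / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def mcts_next_slot(order, slots, score_fn, n_sim=200, c=1.4, seed=0):
    """Pick the next slot to fill after `order`, or None if every slot is filled."""
    if len(order) == len(slots):
        return None
    rng = random.Random(seed)
    root = Node(tuple(order))
    for _ in range(n_sim):
        node = root
        # Selection: descend by UCT until a node with an untried slot appears.
        while len(node.order) < len(slots):
            untried = [s for s in slots
                       if s not in node.order and s not in node.children]
            if untried:
                # Expansion: add one new child for an untried slot.
                slot = rng.choice(untried)
                node.children[slot] = Node(node.order + (slot,), parent=node)
                node = node.children[slot]
                break
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: uct(ch, parent.visits, c))
        # Simulation: finish the order at random and score the full order.
        remaining = [s for s in slots if s not in node.order]
        rng.shuffle(remaining)
        reward = score_fn(node.order + tuple(remaining))
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += reward
            node = node.parent
    # Commit to the most-visited child of the root.
    return max(root.children, key=lambda s: root.children[s].visits)


def plan_order(slots, score_fn, **kwargs):
    """Build a full infilling order one committed slot at a time."""
    order = []
    while len(order) < len(slots):
        order.append(mcts_next_slot(order, slots, score_fn, **kwargs))
    return order


def toy_score(full_order, target=(0, 2, 1, 3)):
    """Hypothetical reward: fraction of positions that match a target order."""
    return sum(a == b for a, b in zip(full_order, target)) / len(target)
```

Calling `plan_order([0, 1, 2, 3], toy_score, n_sim=100, c=1.4)` returns a permutation of the four slots. Raising `c` increases the bonus given to rarely visited slots, which is the knob the abstract identifies as more important than adding simulations; the toy reward here is too simple to reproduce that finding.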