This paper introduces a probabilistic framework for learning insertion order in variable-length insertion models. It targets a limitation of existing non-monotonic sequence generation methods, which are order-agnostic and rely on a fixed-length canvas. By establishing a bijective correspondence between insertion trajectories and permutations, the authors obtain an exact reparameterization of the data likelihood as a sum over permutations, which underlies the Insertion Process (IP) model. Experiments on goal-conditioned planning and molecular string generation indicate that learning insertion order improves modeling quality and generalization in domains without a canonical left-to-right structure.
Learning insertion order in sequence generation can improve modeling quality and generalization, offering a variable-length alternative to fixed-canvas methods.
Non-monotonic sequence generation methods, such as masked diffusion models, provide a flexible alternative to left-to-right autoregressive modeling by allowing tokens to be generated in orders that are not fixed in advance. Despite their practical advantages, most existing non-monotonic models are order-agnostic and rely on a fixed-length grid, limiting their ability to support variable-length generation and adaptive insertion order. In this work, we introduce a probabilistic framework for learning insertion order in variable-length insertion models. We formalize a bijective correspondence between insertion trajectories and permutations, which enables an exact reparameterization of the data likelihood as a sum over permutations. Building on this result, we propose the Insertion Process (IP), a stochastic generative model that jointly learns where to insert, what to insert, and when to terminate, trained via permutation-based variational inference. Unlike prior fixed-canvas approaches, IP natively supports variable-length generation and learns data-driven preferences over insertion orders. Experiments on goal-conditioned planning and molecular string generation demonstrate that learning insertion order improves both modeling quality and generalization in domains without a canonical left-to-right structure.
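The correspondence between insertion trajectories and permutations can be illustrated with a small sketch. This is not the authors' implementation: the function names `permutation_to_trajectory` and `trajectory_to_permutation` are hypothetical, and the convention that `order[t]` is the final position of the token inserted at step `t` is an assumption made for illustration. For a fixed target sequence, each permutation yields one sequence of (slot, token) insertions, and replaying those insertions recovers both the sequence and the permutation.

```python
from itertools import permutations


def permutation_to_trajectory(tokens, order):
    """Map an insertion order to a list of (slot, token) insertions.

    Assumed convention: order[t] is the final position of the token
    inserted at step t. The slot is the index in the current partial
    sequence at which the new token is placed.
    """
    placed = []      # final positions already on the canvas, kept sorted
    trajectory = []
    for pos in order:
        # The new token goes after every placed token that ends up to its left.
        slot = sum(1 for p in placed if p < pos)
        trajectory.append((slot, tokens[pos]))
        placed.insert(slot, pos)
    return trajectory


def trajectory_to_permutation(trajectory):
    """Replay a trajectory and recover (final tokens, insertion order)."""
    canvas_steps = []    # step index at which each canvas entry was inserted
    canvas_tokens = []
    for step, (slot, token) in enumerate(trajectory):
        canvas_steps.insert(slot, step)
        canvas_tokens.insert(slot, token)
    order = [0] * len(trajectory)
    for final_pos, step in enumerate(canvas_steps):
        order[step] = final_pos
    return canvas_tokens, order


# Example: generate the string "CCO" by inserting its last token first.
tokens = list("CCO")
trajectory = permutation_to_trajectory(tokens, (2, 0, 1))
recovered_tokens, recovered_order = trajectory_to_permutation(trajectory)

# Every permutation of a length-4 sequence gives a distinct trajectory.
distinct = {
    tuple(permutation_to_trajectory(list("abcd"), p))
    for p in permutations(range(4))
}
```

Under this convention, step `t` offers `t + 1` possible slots, so a length-`n` sequence has `n!` trajectories, matching the number of permutations. That count is what makes it possible to write the likelihood of a sequence as a sum over permutations rather than over trajectories.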