Mar 1, 2026 · arXiv:2603.01243

Suffix-Constrained Greedy Search Algorithms for Causal Language Models

Ayoub Hammal, Pierre Zweigenbaum, Caio Corro

AI Summary

This paper introduces suffix-constrained generation, a method for producing LLM responses whose final answers follow strict templates and are therefore guaranteed to be trivially parseable. The authors propose several greedy search algorithms that enforce this constraint during generation. Experiments on several datasets show that the approach makes answer extraction deterministic without degrading performance and, in some cases, improves results.

Key Contribution

Guaranteeing trivially parseable LLM outputs via suffix-constrained generation not only simplifies information extraction but can also improve performance on prediction tasks.

Abstract

Large language models (LLMs) are powerful tools that have found applications beyond human-machine interfaces and chatbots. In particular, their ability to generate reasoning traces has motivated their use in many prediction tasks, such as math question answering. Unfortunately, extracting the final answer from an LLM's free-form output is difficult, as it is an information extraction problem in its own right. In this work, we introduce suffix-constrained generation, which aims to produce well-formed LLM responses in which final answers follow strict templates and are guaranteed to be trivially parseable. To this end, we introduce several algorithms based on greedy search procedures. We experiment on several datasets and show that our approach guarantees trivial, deterministic extraction of the final answer from an LLM output without harming results, and in some cases improves them.
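To make the idea of a suffix constraint concrete, here is a minimal Python sketch of one naive baseline: decode greedily without constraints, then force a fixed template and restrict the remaining tokens to answer characters. This is an illustration under stated assumptions and not necessarily one of the algorithms proposed in the paper. The template string `Final answer: `, the character-level vocabulary, and the `toy_next_scores` function (a stand-in for a real language model) are all hypothetical.

```python
import re

# Hypothetical template and vocabulary, chosen only for illustration.
TEMPLATE_PREFIX = "Final answer: "
ANSWER_CHARS = set("0123456789")
EOS = "<eos>"
SCRIPT = "2 plus 2 is 4."  # what the toy "model" wants to say in free form


def toy_next_scores(text: str) -> dict[str, float]:
    """Stand-in for an LM: returns next-token scores given the text so far."""
    vocab = sorted(set(SCRIPT) | ANSWER_CHARS) + [EOS]
    scores = {tok: 0.0 for tok in vocab}
    if text.endswith(TEMPLATE_PREFIX):
        scores["4"] = 1.0                 # right after the template: give the answer
    elif TEMPLATE_PREFIX in text:
        scores[EOS] = 1.0                 # answer already written: stop
    elif len(text) < len(SCRIPT):
        scores[SCRIPT[len(text)]] = 1.0   # free-form phase: follow the script
    else:
        scores[EOS] = 1.0                 # script finished: stop
    return scores


def constrained_greedy(next_scores, max_free: int = 50, max_answer: int = 8) -> str:
    """Greedy decoding whose output always ends with TEMPLATE_PREFIX + digits."""
    text = ""
    # Phase 1: ordinary greedy decoding until the model wants to stop.
    for _ in range(max_free):
        scores = next_scores(text)
        tok = max(scores, key=scores.get)
        if tok == EOS:
            break
        text += tok
    # Phase 2: force the template instead of accepting end-of-sequence.
    text += "\n" + TEMPLATE_PREFIX
    # Phase 3: greedy decoding restricted to answer characters;
    # EOS is only allowed once at least one answer character exists.
    answer = ""
    for _ in range(max_answer):
        scores = next_scores(text)
        allowed = {t: s for t, s in scores.items()
                   if t in ANSWER_CHARS or (t == EOS and answer)}
        tok = max(allowed, key=allowed.get)
        if tok == EOS:
            break
        answer += tok
        text += tok
    return text


def extract_answer(text: str) -> str:
    """Deterministic extraction: the answer is whatever follows the template."""
    match = re.search(re.escape(TEMPLATE_PREFIX) + r"(\d+)\Z", text)
    assert match is not None, "output does not end with the expected suffix"
    return match.group(1)


output = constrained_greedy(toy_next_scores)
final_answer = extract_answer(output)
```

Because the suffix is enforced during decoding, `extract_answer` reduces to a single anchored regular expression and never needs heuristics over the free-form reasoning text. With a real model, the masking in phase 3 would be applied to the logits over the tokenizer's vocabulary rather than to a character-level dictionary.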

Inference & Quantization · Natural Language Processing · Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
