AI2 · University of California Los Angeles · Apr 19, 2026 · arXiv:2604.17290

Probabilistic Programs of Thought

Poorva Garg, Renato Lui Geh, Daniel Israel, Todd Millstein, Kyle Richardson, Guy Van den Broeck

AI Summary

This paper introduces a novel framework called probabilistic programs of thought that enhances the efficiency of code generation and mathematical reasoning tasks by leveraging the distribution of next-token probabilities from LLMs. Instead of generating multiple samples through expensive GPU computations, the proposed method allows for the representation of exponentially many deterministic programs from a single generated program, significantly reducing the computational burden. The results demonstrate that this approach yields improved performance across various benchmarks while minimizing the number of required LLM generations.

Key Contribution

By transforming LLM outputs into probabilistic programs, this approach slashes the computational cost of generating multiple code samples without sacrificing quality.

Abstract

LLMs are widely used for code generation and mathematical reasoning tasks where they are required to generate structured output. They either need to reason about code, generate code for a given specification, or reason using programs of thought. The typical approach to code generation is to prompt the model and generate samples until an appropriate program is obtained. Within this process, sampling $n$ programs from the language model requires $n$ GPU compute-intensive generations, which becomes prohibitively expensive for larger values of $n$. In this work, we address this limitation by exposing the LLM's distribution within the generated programs themselves. We propose a novel test-time framework we dub probabilistic programs of thought to obtain more samples from the model with fewer LLM generations. Given a program generated by a model and the associated next-token probabilities, we build a probabilistic program that compactly represents exponentially many deterministic programs. Since performing probabilistic reasoning in this probabilistic program is much cheaper, our approach allows sampling new programs without any additional GPU compute and little CPU overhead. We instantiate our approach on benchmarks for code generation, code understanding, and mathematical reasoning and report improvements in performance with fewer generations from the LLM.
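
To make the idea of "compactly represents exponentially many deterministic programs" concrete, the following is a minimal sketch, not the authors' construction. It assumes a single generated program split into positions, each carrying a few candidate tokens with next-token probabilities, and it assumes the positions can be sampled independently. The `positions` data, the probabilities, and the helper names `count_programs` and `sample_program` are all hypothetical and chosen for illustration only.

```python
import math
import random

# Hypothetical input: one generated program split into positions. Each position
# lists candidate tokens with made-up next-token probabilities.
positions = [
    [("return ", 1.0)],
    [("a", 0.7), ("b", 0.3)],
    [(" + ", 0.6), (" * ", 0.4)],
    [("b", 0.8), ("a", 0.2)],
]


def count_programs(positions):
    """Number of distinct deterministic programs the factored form encodes."""
    return math.prod(len(choices) for choices in positions)


def sample_program(positions, rng):
    """Draw one program by sampling every position independently.

    Independence across positions is a simplifying assumption of this sketch.
    Sampling uses only the stored probabilities, so no further LLM call is made.
    """
    tokens = []
    for choices in positions:
        candidates, weights = zip(*choices)
        tokens.append(rng.choices(candidates, weights=weights, k=1)[0])
    return "".join(tokens)


rng = random.Random(0)
print(count_programs(positions))  # grows as the product of choices per position
for _ in range(3):
    print(sample_program(positions, rng))
```

With $k$ candidates at each of $m$ uncertain positions, the structure stores on the order of $k \cdot m$ entries while covering $k^m$ programs, which is where the exponential gap between storage and coverage comes from in this simplified setting.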

Code Generation & Program Synthesis · Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
