Apr 7, 2026 | arXiv:2604.05859

When Do We Need LLMs? A Diagnostic for Language-Driven Bandits

Uljad Berdica, Fernando Acero, Anton Ipsen, Parisa Zehtabi, Michael Cashmore

AI Summary

This paper introduces LLMP-UCB, a contextual bandit algorithm that uses LLMs to derive uncertainty estimates via repeated inference for decision-making problems with textual and numerical contexts. Surprisingly, experiments show that lightweight numerical bandits using text embeddings (dense or Matryoshka) perform comparably or better than LLM-based methods, at a significantly lower computational cost. The authors propose a geometric diagnostic based on arm embeddings to guide the choice between LLM-driven reasoning and lightweight numerical bandits.
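The summary does not spell out how LLMP-UCB turns repeated inference into an uncertainty estimate, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical `predict_reward` callable that stands in for one stochastic LLM call returning a numeric reward estimate, and it uses the sample standard deviation across calls as the exploration bonus. The names `ucb_from_samples`, `select_arm`, `n_samples`, and `beta` are illustrative.

```python
import statistics
from typing import Callable, Hashable, Sequence


def ucb_from_samples(samples: Sequence[float], beta: float = 1.0) -> float:
    """Upper confidence score: sample mean plus beta times sample std.

    Assumption: disagreement between repeated predictions is used as a
    proxy for uncertainty. With a single sample the bonus is zero.
    """
    mean = statistics.fmean(samples)
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return mean + beta * spread


def select_arm(
    arms: Sequence[Hashable],
    predict_reward: Callable[[Hashable], float],
    n_samples: int = 5,
    beta: float = 1.0,
) -> Hashable:
    """Query the (hypothetical) predictor n_samples times per arm and
    return the arm with the highest mean-plus-spread score."""
    scores = {}
    for arm in arms:
        samples = [predict_reward(arm) for _ in range(n_samples)]
        scores[arm] = ucb_from_samples(samples, beta)
    return max(scores, key=scores.get)
```

Each decision in this sketch costs `len(arms) * n_samples` predictor calls, which illustrates why repeated LLM inference at every step is expensive compared with a numerical bandit.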

Key Contribution

LLMs aren't always necessary: simple text embeddings in numerical bandits can match or beat LLM-based contextual bandits at a fraction of the cost.

Abstract

We study Contextual Multi-Armed Bandits (CMABs) for non-episodic sequential decision-making problems where the context includes both textual and numerical information (e.g., recommendation systems, dynamic portfolio adjustments, and offer selection, all frequent problems in finance). While Large Language Models (LLMs) are increasingly applied to these settings, using LLMs for reasoning at every decision step is computationally expensive, and uncertainty estimates are difficult to obtain. To address this, we introduce LLMP-UCB, a bandit algorithm that derives uncertainty estimates from LLMs via repeated inference. However, our experiments demonstrate that lightweight numerical bandits operating on text embeddings (dense or Matryoshka) match or exceed the accuracy of LLM-based solutions at a fraction of their cost. We further show that embedding dimensionality is a practical lever on the exploration-exploitation balance, enabling cost-performance tradeoffs without prompt complexity. Finally, to guide practitioners, we propose a geometric diagnostic based on the arms' embeddings to decide when to use LLM-driven reasoning versus a lightweight numerical bandit. Our results provide a principled deployment framework for cost-effective, uncertainty-aware decision systems with broad applicability across AI use cases in financial services.
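The abstract does not name the numerical bandit it uses, so as an illustration only, the sketch below shows a standard disjoint LinUCB arm operating on a truncated text embedding. Everything here is an assumption rather than the paper's method: the class `LinUCBArm`, the helper `truncate`, the exploration weight `alpha`, and the choice to renormalise after keeping the first `dim` coordinates (the usual way Matryoshka-style embeddings are shortened). The inverse design matrix is maintained with a Sherman-Morrison rank-one update so that the example needs only the standard library.

```python
import math
from typing import List


def truncate(embedding: List[float], dim: int) -> List[float]:
    """Keep the first `dim` coordinates and L2-renormalise them."""
    head = embedding[:dim]
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head] if norm > 0 else head


class LinUCBArm:
    """One arm of a disjoint LinUCB bandit (illustrative sketch)."""

    def __init__(self, dim: int, alpha: float = 1.0) -> None:
        self.alpha = alpha
        # Inverse of A = I + sum of x x^T, starting from the identity.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        # b = sum of reward * x over observed contexts.
        self.b = [0.0] * dim

    def _a_inv_times(self, x: List[float]) -> List[float]:
        return [sum(r * v for r, v in zip(row, x)) for row in self.a_inv]

    def ucb(self, x: List[float]) -> float:
        """Estimated reward plus alpha * sqrt(x^T A^{-1} x)."""
        theta = self._a_inv_times(self.b)
        a_inv_x = self._a_inv_times(x)
        mean = sum(t * v for t, v in zip(theta, x))
        width = math.sqrt(max(sum(u * v for u, v in zip(a_inv_x, x)), 0.0))
        return mean + self.alpha * width

    def update(self, x: List[float], reward: float) -> None:
        """Sherman-Morrison update of A^{-1}, then accumulate b."""
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(u * v for u, v in zip(a_inv_x, x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]
```

In this sketch the per-step cost grows with the square of `dim`, so shortening the embedding makes each decision cheaper; how dimensionality shifts the exploration-exploitation balance in practice is the paper's empirical claim and is not demonstrated by this code.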

Natural Language Processing | Recommendation & Information Retrieval

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
