TableMind++ extends the TableMind programmatic agent for table reasoning by incorporating uncertainty-aware inference to mitigate LLM hallucinations. It addresses epistemic uncertainty through memory-guided plan pruning, which validates plans against historical trajectories, and aleatoric uncertainty via confidence-based action refinement, which monitors token-level probabilities for syntactic noise. Dual-weighted trajectory aggregation then synthesizes a robust consensus from multiple reasoning paths, leading to state-of-the-art performance on diverse benchmarks.
Table reasoning gets a reliability boost: TableMind++ uses uncertainty estimates to prune flawed plans and refine actions, outperforming prior models by synthesizing robust reasoning paths.
Table reasoning requires models to jointly perform semantic understanding and precise numerical operations. Most existing methods rely on a single-turn reasoning paradigm over tables, which suffers from context overflow and weak numerical sensitivity. To address these limitations, we previously proposed TableMind, a tuning-based autonomous programmatic agent that simulates human-like interaction within a lightweight large language model (LLM). TableMind internalizes planning, action, and reflection through a two-stage training strategy: supervised fine-tuning (SFT) on filtered high-quality data, followed by reinforcement learning (RL) with a multi-perspective reward and the Rank-Aware Policy Optimization (RAPO) algorithm. While TableMind establishes a solid foundation for programmatic agents, the inherent stochasticity of LLMs remains a critical challenge that leads to hallucinations. In this paper, we extend this foundation to TableMind++ by introducing a novel uncertainty-aware inference framework to mitigate hallucinations. Specifically, to address epistemic uncertainty, we propose memory-guided plan pruning, which retrieves historical trajectories to validate candidate plans and filter out logically flawed ones. To ensure execution precision and mitigate aleatoric uncertainty, we introduce confidence-based action refinement, which monitors token-level probabilities to detect and self-correct syntactic noise. Finally, we employ dual-weighted trajectory aggregation to synthesize a robust consensus from multiple reasoning paths. Extensive experiments on diverse benchmarks demonstrate that TableMind++ consistently outperforms previous baselines and proprietary models, validating the effectiveness of integrating autonomous training with uncertainty quantification. Our code is available.
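
The abstract does not give formulas for the inference-time components, so the following is only a minimal sketch of two of the ideas it names, not the authors' implementation. It assumes (hypothetically) that confidence-based action refinement flags generated code tokens whose probability falls below a fixed threshold, and that the two weights in dual-weighted aggregation are a per-trajectory plan score and action score that are multiplied together. The function names, the threshold, the weight definitions, and the example values are all illustrative.

```python
import math
from collections import defaultdict


def low_confidence_tokens(tokens, logprobs, threshold=0.5):
    """Flag tokens whose probability is below `threshold`.

    Assumption: the threshold rule is illustrative; the abstract only says
    that token-level probabilities are monitored.
    """
    flagged = []
    for index, (token, logprob) in enumerate(zip(tokens, logprobs)):
        prob = math.exp(logprob)  # convert log-probability to probability
        if prob < threshold:
            flagged.append((index, token, prob))
    return flagged


def aggregate_trajectories(trajectories):
    """Weighted vote over final answers from several reasoning paths.

    Assumption: each trajectory carries a hypothetical `plan_weight` and
    `action_weight`; a trajectory's vote is their product.
    """
    scores = defaultdict(float)
    for trajectory in trajectories:
        vote = trajectory["plan_weight"] * trajectory["action_weight"]
        scores[trajectory["answer"]] += vote
    best_answer = max(scores, key=scores.get)
    return best_answer, dict(scores)


# Hypothetical generated action with one uncertain token (a misspelled column).
tokens = ["df", "[", "'reveneu'", "]"]
logprobs = [-0.01, -0.02, -1.6, -0.05]
flagged = low_confidence_tokens(tokens, logprobs)

# Hypothetical trajectories: two moderately trusted paths agree on "42",
# one highly trusted path answers "17".
trajectories = [
    {"answer": "42", "plan_weight": 0.9, "action_weight": 0.8},
    {"answer": "42", "plan_weight": 0.6, "action_weight": 0.5},
    {"answer": "17", "plan_weight": 0.95, "action_weight": 0.9},
]
best_answer, scores = aggregate_trajectories(trajectories)
```

In this toy input, only the misspelled column token is flagged for refinement, and the two agreeing trajectories together outweigh the single more confident one, so the consensus answer is "42". A real system would replace the flagging step with an actual self-correction pass and derive the weights from the model and memory rather than hard-coding them.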