The paper introduces TaNOS, a continual pre-training framework designed to improve the generalization of numerical reasoning in large language models (LLMs) across diverse expert-domain tables. TaNOS employs header anonymization, operation sketches, and self-supervised pretraining to decouple domain semantics from numerical operation structure. Experiments show that TaNOS significantly outperforms supervised fine-tuning baselines and proprietary models like GPT-5 and Gemini-2.5-Pro, particularly in domain-shift scenarios, demonstrating enhanced robustness.
Forget memorizing table headers: TaNOS unlocks surprisingly robust numerical reasoning by pre-training on operation sketches and correctness-guaranteed programs.
Numerical reasoning over expert-domain tables often exhibits high in-domain accuracy but limited robustness to domain shift. Models trained with supervised fine-tuning (SFT) on specific datasets tend to rely on header-operation shortcuts rather than structural reasoning. We introduce TaNOS, a continual pre-training framework comprising three components: (i) header anonymization to reduce lexical memorization, (ii) operation sketches that provide minimal structural cues, and (iii) self-supervised pretraining that constructs correctness-guaranteed program-question pairs from given tables in a program-first manner. By decoupling domain semantics from numerical operation structure, TaNOS improves the transferability of numerical reasoning. Applied to an 8B instruction-tuned model, TaNOS achieves 80.13% execution accuracy on FinQA with only 10% of the training data, outperforming the SFT baseline (73.97%) trained on the full training data as well as proprietary models such as GPT-5 and Gemini-2.5-Pro. Furthermore, in domain-shift experiments, TaNOS displays a nearly negligible cross-domain gap (<2 pp), whereas standard SFT shows a gap of over 10 pp. These results suggest that structural guidance with operation sketches, header-agnostic representations, and correctness-guaranteed self-supervision can improve the robustness of numerical reasoning across diverse expert-domain tables.
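To make the three components concrete, here is a minimal, hypothetical sketch (not the authors' code) of two of them as the abstract describes: header anonymization, and program-first construction of correctness-guaranteed pairs. All function names, the placeholder scheme (`COL_0`, `COL_1`, ...), and the toy operation set are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of two TaNOS components as described in the abstract:
# header anonymization and program-first, correctness-guaranteed pair
# construction. All names and details below are illustrative assumptions.
import random

def anonymize_headers(table):
    """Replace domain-specific column headers with neutral placeholders
    (COL_0, COL_1, ...) to discourage lexical header-operation shortcuts."""
    mapping = {h: f"COL_{i}" for i, h in enumerate(table)}
    return {mapping[h]: vals for h, vals in table.items()}, mapping

def sample_program(table, rng):
    """Program-first generation: sample a simple arithmetic program over
    table cells, then execute it to obtain the gold answer -- so the
    (program, answer) pair is correct by construction."""
    col_a, col_b = rng.sample(list(table), 2)
    row = rng.randrange(len(table[col_a]))
    op = rng.choice(["subtract", "divide"])
    a, b = table[col_a][row], table[col_b][row]
    answer = a - b if op == "subtract" else a / b
    sketch = f"{op}(x, y)"  # operation sketch: structure only, no headers
    program = f"{op}({col_a}[{row}], {col_b}[{row}])"
    return sketch, program, answer

rng = random.Random(0)
table = {"revenue": [120.0, 150.0], "cost": [80.0, 90.0]}
anon_table, mapping = anonymize_headers(table)
sketch, program, answer = sample_program(anon_table, rng)
print(anon_table)  # headers replaced by COL_0, COL_1
print(sketch, program, answer)
```

In this toy setting, a natural-language question would then be generated from the sampled program (the reverse of the usual question-first pipeline), which is what makes the supervision correctness-guaranteed: the answer comes from executing the program on the table, not from an annotator or a model.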