Mar 7, 2026 · arXiv:2603.07223

Unlocking Data Value in Finance: A Study on Distillation and Difficulty-Aware Training

Chuxue Cao, Honglin Lin, Zhanping Zhong, Xin Gao, Mengzhang Cai, Conghui He, Sirui Han, Lijun Wu

AI Summary

The authors investigate the impact of data quality and difficulty on LLM performance in the finance domain. They introduce ODA-Fin-SFT-318k, a high-quality Chain-of-Thought dataset created via multi-stage distillation and verification, and ODA-Fin-RL-12k, a dataset curated for difficult but verifiable tasks. Experiments using SFT and RL pipelines demonstrate that high-quality CoT distillation improves the SFT foundation, while difficulty-aware sampling enhances RL generalization, leading to state-of-the-art performance on financial benchmarks with an 8B model.

Key Contribution

Targeted data engineering, specifically multi-stage distillation and difficulty-aware sampling, rather than scale alone, allows an 8B model to surpass open-source state-of-the-art financial LLMs of comparable size.

Abstract

Large Language Models (LLMs) have demonstrated strong general capabilities, yet their deployment in finance remains challenging due to dense domain-specific terminology, stringent numerical reasoning requirements, and low tolerance for factual errors. We conduct a controlled empirical study showing that in specialized vertical domains, performance is largely determined by the quality and difficulty/verifiability profile of post-training data. We introduce ODA-Fin-SFT-318k, constructed via multi-stage distillation and verification to produce high-quality Chain-of-Thought supervision, and ODA-Fin-RL-12k, curated for hard-but-verifiable tasks that balance reward precision and task diversity. Using standard SFT and RL pipelines, we show that high-quality CoT distillation establishes a robust foundation during SFT, while difficulty- and verifiability-aware sampling improves RL generalization. Evaluated on nine benchmarks spanning general financial tasks, sentiment analysis, and numerical reasoning, our ODA-Fin-RL-8B consistently surpasses open-source state-of-the-art (SOTA) financial LLMs of comparable size. We release our ODA-Fin-SFT-318k and ODA-Fin-RL-12k datasets, along with trained models to advance data-centric financial AI research.
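The abstract does not spell out how "hard-but-verifiable" tasks are selected. As a rough illustration only, the sketch below assumes each candidate item carries a rule-checkable flag and an estimated pass rate from a reference model, and keeps items that are checkable and frequently failed. The `Item` fields, the `select_hard_verifiable` helper, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Item:
    question: str
    answer: str        # reference answer used by a checker
    verifiable: bool   # hypothetical flag: answer can be checked by rule
    pass_rate: float   # hypothetical: fraction of sampled solutions judged correct


def select_hard_verifiable(items, max_pass_rate=0.5):
    """Keep items that are checkable and that the reference model often fails.

    Assumption: a low pass rate is used as a proxy for difficulty.
    The threshold is illustrative, not a value reported by the authors.
    """
    return [it for it in items if it.verifiable and it.pass_rate <= max_pass_rate]


pool = [
    Item("q1", "12.5", True, 0.25),             # hard and checkable: kept
    Item("q2", "positive", True, 0.9),          # checkable but too easy: dropped
    Item("q3", "free-form essay", False, 0.1),  # hard but not checkable: dropped
    Item("q4", "3.2%", True, 0.5),              # at the threshold: kept
]
selected = select_hard_verifiable(pool)
print([it.question for it in selected])
```

A filter of this shape trades pool size for reward precision: dropping unverifiable items shrinks the pool but means every remaining reward signal can be computed by rule.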

Data Curation & Synthetic Data, Eval Frameworks & Benchmarks, Inference & Quantization

Citation Metrics

Citations: 0

Influential citations: 0

References: 42

Year: 2026

Venue: N/A
