Trento | Apr 30, 2026 | arXiv:2604.27618

Math Education Digital Shadows for facilitating learning with LLMs: Math performance, anxiety and confidence in simulated students and AIs

Naomi Esposito, Anthony Tricarico, Luisa Porzio, Alì Aghazadeh Ardebili, Massimo Stella

AI Summary

The paper introduces Math Education Digital Shadows (MEDS), a dataset of 28,000 personas from 14 LLMs designed to map LLM reasoning and biases in mathematics across human- and AI-like conditions. Each record includes psychological/sociodemographic metadata, four types of math tasks (an open interview, psychometric tests, cognitive networks, and high-school math questions), and reasoning and confidence scores. Data validation shows schema integrity, consistent personas, and family-specific peculiarities such as negative math attitudes and overconfidence. The authors present the dataset as a resource for developing safer AI tutors.

Key Contribution

LLMs exhibit surprisingly human-like biases and overconfidence in math, revealed by a new dataset mapping their mathematical reasoning across diverse personas.

Abstract

To enhance LLMs' impact on math education, we need data on their mathematical prowess and biases across prompts. To fill this gap, we introduce MEDS (Math Education Digital Shadows) as a dataset mapping how large language models reason about and report mathematics across human- and AI-like conditions. MEDS involves 28,000 personas from 14 LLMs (from families like Mistral, Qwen, DeepSeek, Granite, Phi and Grok) shadowing either humans or AI assistants. Each record/shadow includes a set of prompts along with psychological/sociodemographic persona metadata and four types of math tasks: (i) open math interview, (ii) three psychometric tests about math perceptions with explanations, (iii) cognitive networks capturing math attitudes, and (iv) 18 high-school math test questions together with their reasoning and confidence scores. MEDS differs from traditional score-only math benchmarks because it integrates concepts of self-efficacy, math anxiety, and cognitive network science besides math proficiency scores. Data validation shows that the sampled LLMs exhibit schema integrity and consistent personas, together with family-specific peculiarities like human-like negative math attitudes, logical fallacies, and math overconfidence. MEDS will benefit learning analytics experts, cognitive scientists, and developers of safer AI tutors in mathematics.
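
As a rough illustration of the record layout described in the abstract, the sketch below models one shadow and computes a simple overconfidence gap (mean confidence minus accuracy) over its test questions. All class names, field names, and the metric itself are hypothetical and are not taken from the MEDS release; the sketch also assumes confidence scores have been rescaled to the range 0 to 1, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class MathItem:
    """One high-school test question (hypothetical field names)."""
    question: str
    reasoning: str
    is_correct: bool
    confidence: float  # assumption: rescaled to [0, 1]


@dataclass
class Shadow:
    """One MEDS-style record; this layout is an illustrative assumption."""
    model_family: str                                   # e.g. "Mistral"
    condition: str                                      # "human" or "ai"
    persona: dict = field(default_factory=dict)         # psychological/sociodemographic metadata
    interview: str = ""                                 # (i) open math interview
    psychometric: dict = field(default_factory=dict)    # (ii) test name -> answers and explanations
    network_edges: list = field(default_factory=list)   # (iii) cognitive network as concept pairs
    test_items: list = field(default_factory=list)      # (iv) list of MathItem


def overconfidence_gap(shadow: Shadow) -> float:
    """Mean confidence minus accuracy; positive values suggest overconfidence."""
    if not shadow.test_items:
        raise ValueError("shadow has no test items")
    accuracy = mean(1.0 if item.is_correct else 0.0 for item in shadow.test_items)
    confidence = mean(item.confidence for item in shadow.test_items)
    return confidence - accuracy
```

A gap near zero would indicate that stated confidence tracks accuracy; the actual dataset may encode confidence and correctness differently, so any real analysis should follow the published schema.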

Eval Frameworks & Benchmarks | Open-Source Models & Weights | Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
