CMU ML | Northeastern | Notre Dame | Waterloo | Jun 4, 2026 | arXiv:2606.06388

Humans' ALMANAC: A Human Collaboration Dataset of Action-Level Mental Model Annotations for Agent Collaboration

Jiaju Chen, Yuxuan Lu, Jiayi Su, Chaoran Chen, Songlin Xiao, Zheng Zhang, Yunyao Li, Jian Zhao, Tongshuang Wu, Toby Jia-Jun Li, Dakuo Wang, Bingsheng Yao

AI Summary

This paper introduces ALMANAC, a dataset of action-level mental model annotations built from the Map Task, a well-established dyadic routing task. It captures 2,987 collaboration actions, each annotated with the participant's self-reasoning, perceived partner intent, and perceived team goal, addressing the lack of authentic human collaboration data for guiding agents toward process-level collaborative competence. Benchmarking six LLMs on predicting humans' next-turn behavior and mental models demonstrates ALMANAC's utility for evaluating how well models simulate human collaborative behavior and infer the underlying mental models.

Key Contribution

ALMANAC provides 2,987 human collaboration actions with action-level mental model annotations, and a benchmark of six LLMs shows its utility for evaluating how well models predict human collaborative behavior and mental models.

Abstract

Recent advances in LLM agents have enabled complex cognitive capabilities, such as multi-step reasoning, planning, and tool use, that increasingly position these agents as human collaborators. Effective collaboration, however, requires collaborators to continuously maintain and align mental models of their own reasoning, partners' intentions, and shared goals during the collaborative process. Today's agents rarely develop such capabilities since they are primarily optimized for task completion, and the community lacks authentic human collaboration data with action-level mental model annotations that could guide agents toward process-level collaborative competence. To bridge this gap, we present ALMANAC, a dataset of Action-Level Mental model ANnotations for Agent Collaboration built from the Map Task, a classic dyadic routing task from social science. ALMANAC contains 2,987 collaboration actions, each paired with theory-informed mental model annotations that record the participants' self-reasoning, perceived partner intent, and perceived team goal. We benchmark six LLMs on predicting humans' next-turn behavior and mental models. Our results demonstrate ALMANAC's utility in evaluating models' ability to simulate human collaborative behaviors and infer their underlying mental models.
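To make the structure described in the abstract concrete, here is a minimal sketch of how one annotated action and a next-turn prediction example might be represented. Everything in it is an assumption made for illustration: the class and field names, the "giver"/"follower" role labels, and the toy dialogue are hypothetical and are not taken from the released dataset or its schema.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MentalModel:
    # The three annotation types named in the abstract (field names are hypothetical).
    self_reasoning: str
    perceived_partner_intent: str
    perceived_team_goal: str


@dataclass(frozen=True)
class CollaborationAction:
    role: str                  # hypothetical role label, e.g. "giver" or "follower"
    action: str                # what the participant said or did on this turn
    mental_model: MentalModel  # annotation attached to this single action


def split_next_turn(
    dialogue: List[CollaborationAction], turn_index: int
) -> Tuple[List[CollaborationAction], CollaborationAction]:
    """Return (context, target): all actions before turn_index, and the action at it."""
    if not 0 < turn_index < len(dialogue):
        raise ValueError("turn_index must leave at least one action of context")
    return dialogue[:turn_index], dialogue[turn_index]


def render_context(context: List[CollaborationAction]) -> str:
    """Render only the observable actions; annotations are held out as prediction targets."""
    return "\n".join(f"{a.role}: {a.action}" for a in context)


# Invented toy dialogue, for illustration only (not real ALMANAC data).
TOY_DIALOGUE = [
    CollaborationAction(
        "giver", "Start at the top-left corner.",
        MentalModel("I should anchor us on a shared starting point.",
                    "Partner is waiting for a first instruction.",
                    "Agree on where the route begins."),
    ),
    CollaborationAction(
        "follower", "Okay, I am there.",
        MentalModel("I found the corner on my map.",
                    "Partner wants confirmation before continuing.",
                    "Agree on where the route begins."),
    ),
    CollaborationAction(
        "giver", "Go down past the lake.",
        MentalModel("The lake is the next landmark on my route.",
                    "Partner is ready for the next step.",
                    "Trace the route past the first landmark."),
    ),
]

context, target = split_next_turn(TOY_DIALOGUE, 2)
prompt = render_context(context)
```

In this sketch a model would receive `prompt` and be scored against `target.action` for behavior prediction and against the fields of `target.mental_model` for mental model prediction. How the paper actually formats inputs and scores outputs is not stated in the abstract.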

Scalable Oversight & Alignment Theory | Tool Use & Agents | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
