Meta AI

Qiance and Ziqi contributed equally to

Apr 9, 2026

arXiv:2604.08342

EgoEverything: A Benchmark for Human Behavior Inspired Long Context Egocentric Video Understanding in AR Environment

Qiance Tang, Ziqi Wang, Jieyu Lin, Ziyun Li, Barbara De Salvo, Sai Qian Zhang

AI Summary

The paper introduces EgoEverything, a new benchmark for long-context egocentric video understanding in AR environments. The key innovation is the use of human attention signals, derived from gaze data, to generate more behaviorally relevant question-answer pairs. The benchmark comprises over 5,000 multiple-choice questions across 100+ hours of video, offering a more realistic evaluation setting compared to existing datasets.

Key Contribution

Most existing egocentric video benchmarks build their questions from visual content alone. EgoEverything instead uses human gaze signals during question generation, so the questions reflect what users actually attend to rather than only what the camera records.

Abstract

Long-context egocentric video understanding has recently attracted significant research attention, with augmented reality (AR) highlighted as one of its most important application domains. Nevertheless, the task remains highly challenging due to the need for reasoning over extended temporal contexts and diverse, unstructured activities. Although several benchmarks exist, most egocentric datasets rely on human-worn cameras and focus mainly on visual content, with limited consideration of underlying user behavior when forming video-related queries. EgoEverything is a benchmark that explicitly considers human behavior by leveraging human attention signals, abstracted from gaze data, when generating questions. It comprises over 5,000 multiple-choice question-answer pairs, spanning more than 100 hours of video. By integrating human attention signals during question generation, it more faithfully captures natural human behavior and offers a realistic evaluation setting for long-context egocentric video understanding in AR.

Computer Vision, Eval Frameworks & Benchmarks, Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 48

Year: 2026

Venue: N/A
