The authors introduce LifeDialBench, a new benchmark for evaluating memory systems in continuous lifelogging scenarios. It comprises two subsets: EgoMem, built on real egocentric videos, and LifeMem, built from a simulated virtual community. To prevent temporal leakage, they propose an online evaluation protocol that respects temporal causality by evaluating systems in a streaming fashion. In their experiments, complex memory systems fail to outperform a simple RAG baseline, which suggests that high-fidelity context preservation is crucial for lifelogging applications.
Current memory systems, despite their complexity, fail to outperform a simple RAG baseline in continuous lifelogging scenarios, revealing a critical need for better context preservation.
Wearable devices can now continuously lifelog ambient conversations, creating substantial opportunities for memory systems. However, existing benchmarks primarily focus on online one-on-one chatting or human-AI interactions, thus neglecting the unique demands of real-world scenarios. Given the scarcity of public lifelogging audio datasets, we propose a hierarchical synthesis framework to curate LifeDialBench, a novel benchmark comprising two complementary subsets: EgoMem, built on real-world egocentric videos, and LifeMem, constructed using a simulated virtual community. Crucially, to address the issue of temporal leakage in traditional offline settings, we propose an Online Evaluation protocol that strictly adheres to temporal causality, ensuring systems are evaluated in a realistic streaming fashion. Our experimental results reveal a counterintuitive finding: current sophisticated memory systems fail to outperform a simple RAG-based baseline. This highlights the detrimental impact of over-designed structures and lossy compression in current approaches, emphasizing the necessity of high-fidelity context preservation for lifelog scenarios. We release our code and data at https://github.com/qys77714/LifeDialBench.
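The idea of evaluating under temporal causality can be illustrated with a small sketch. This is not the benchmark's actual harness: the `Utterance`, `Question`, `NaiveMemory`, and `online_evaluate` names are hypothetical, the timestamps are arbitrary integers, and the word-overlap retriever is only a stand-in for a real RAG baseline. The point it shows is that a question asked at time t can only draw on utterances logged at or before t.

```python
from dataclasses import dataclass, field


@dataclass
class Utterance:
    time: int   # hypothetical timestamp, e.g. seconds since recording began
    text: str


@dataclass
class Question:
    time: int   # moment at which the question is asked
    text: str


@dataclass
class NaiveMemory:
    """Stand-in retrieval baseline that stores raw utterances verbatim."""
    store: list = field(default_factory=list)

    def ingest(self, utterance):
        self.store.append(utterance)

    def retrieve(self, query, k=1):
        # Rank stored utterances by word overlap with the query.
        q = set(query.lower().split())
        ranked = sorted(
            self.store,
            key=lambda u: len(q & set(u.text.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def online_evaluate(utterances, questions, memory):
    """Interleave ingestion and querying in timestamp order."""
    stream = sorted(utterances, key=lambda u: u.time)
    results = []
    i = 0
    for question in sorted(questions, key=lambda q: q.time):
        # Ingest only what has happened up to the question time,
        # so later utterances cannot leak into the answer.
        while i < len(stream) and stream[i].time <= question.time:
            memory.ingest(stream[i])
            i += 1
        results.append((question, memory.retrieve(question.text)))
    return results


# Toy log: the lunch plan changes after the first question is asked.
log = [
    Utterance(1, "lunch with Ana at noon"),
    Utterance(5, "lunch moved to one with Ana"),
]
qs = [
    Question(3, "when is lunch with Ana"),
    Question(6, "when is lunch with Ana"),
]
results = online_evaluate(log, qs, NaiveMemory())
```

In an offline setting the whole log would be ingested before any question is asked, so the first question could retrieve the update from time 5 even though it had not yet happened. The streaming loop above rules that out by construction.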