NUS · HKUST · SJTU · Apr 13, 2026 · arXiv:2604.11628

Back to Basics: Let Conversational Agents Remember with Just Retrieval and Generation

Yuqian Wu, Zhengjun Huang, Junle Chen, Qingxiang Liu, Kai Wang, Xiaofang Zhou, Yuxuan Liang

AI Summary

This paper investigates the limitations of existing conversational memory systems and identifies the Signal Sparsity Effect as a key bottleneck. The authors decompose this effect into Decisive Evidence Sparsity and Dual-Level Redundancy and show how both degrade performance in long conversations. To address this, they propose a minimalist retrieval-and-generation framework, \method, which uses Turn Isolation Retrieval and Query-Driven Pruning to improve signal density and outperforms strong baselines on multiple benchmarks.

Key Contribution

Forget complex memory architectures: simple retrieval and generation, when carefully tuned for signal density, can outperform sophisticated methods in conversational agents.

Abstract

Existing conversational memory systems rely on complex hierarchical summarization or reinforcement learning to manage long-term dialogue history, yet remain vulnerable to context dilution as conversations grow. In this work, we offer a different perspective: the primary bottleneck may lie not in memory architecture, but in the Signal Sparsity Effect within the latent knowledge manifold. Through controlled experiments, we identify two key phenomena: Decisive Evidence Sparsity, where relevant signals become increasingly isolated with longer sessions, leading to sharp degradation in aggregation-based methods; and Dual-Level Redundancy, where both inter-session interference and intra-session conversational filler introduce large amounts of non-informative content, hindering effective generation. Motivated by these insights, we propose \method, a minimalist framework that brings conversational memory back to basics, relying solely on retrieval and generation via Turn Isolation Retrieval (TIR) and Query-Driven Pruning (QDP). TIR replaces global aggregation with a max-activation strategy to capture turn-level signals, while QDP removes redundant sessions and conversational filler to construct a compact, high-density evidence set. Extensive experiments on multiple benchmarks demonstrate that \method achieves robust performance across diverse settings, consistently outperforming strong baselines while maintaining high efficiency in tokens and latency, establishing a new minimalist baseline for conversational memory.
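The contrast the abstract draws between global aggregation and a max-activation strategy can be illustrated with a small sketch. This is not the authors' implementation: the bag-of-words `embed` function, the cosine scoring, the `top_k` and `min_turn_score` parameters, and the toy sessions are all assumptions made for illustration, standing in for whatever encoder and pruning rules the paper actually uses. The sketch only shows why averaging turn scores can bury a single decisive turn in a long session, while taking the maximum does not, and how a query-driven filter can then drop low-scoring sessions and filler turns.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real dense encoder (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def turn_scores(query, session):
    """Score every turn of a session against the query independently."""
    q = embed(query)
    return [cosine(q, embed(turn)) for turn in session]


def mean_score(query, session):
    """Aggregation-style session score: the average over all turns."""
    scores = turn_scores(query, session)
    return sum(scores) / len(scores)


def max_score(query, session):
    """Max-activation session score: the best single turn decides."""
    return max(turn_scores(query, session))


def retrieve_and_prune(query, sessions, top_k=1, min_turn_score=0.2):
    """Keep the top_k sessions by max score, then drop low-scoring turns.

    Both thresholds are hypothetical knobs, not values from the paper.
    """
    ranked = sorted(sessions, key=lambda s: max_score(query, s), reverse=True)
    evidence = []
    for session in ranked[:top_k]:
        for turn, score in zip(session, turn_scores(query, session)):
            if score >= min_turn_score:
                evidence.append(turn)
    return evidence


# A long session with one decisive turn surrounded by filler.
sparse_session = [
    "hi there",
    "how are you today",
    "fine thanks",
    "my dog is named Biscuit",
    "ok bye",
]
# A short session that is mildly related to the query in every turn.
diffuse_session = ["what is the weather", "my umbrella is broken"]

query = "what is my dog named"
print(retrieve_and_prune(query, [diffuse_session, sparse_session]))
# prints ['my dog is named Biscuit']
```

In this toy setup, mean scoring ranks the diffuse session above the sparse one because four filler turns pull the average down, whereas max scoring ranks the sparse session first and the pruning step leaves only its decisive turn.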

Architecture Design (Transformers, SSMs, MoE) · Natural Language Processing · Recommendation & Information Retrieval

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
