This paper introduces HingeMem, a boundary-guided long-term memory for dialogue systems that uses event segmentation theory to index memories based on changes in person, time, location, and topic. HingeMem employs query-adaptive retrieval to determine both which memory elements to retrieve and the retrieval depth based on the query type, improving retrieval robustness and efficiency. Experiments on LOCOMO using LLMs from 0.6B to production-tier models demonstrate an approximately 20% relative improvement over baselines and a 68% reduction in question answering token cost compared to HippoRAG2.
Forget fixed Top-*k* retrieval: HingeMem's query-adaptive retrieval dynamically decides *what* and *how much* to retrieve from long-term memory, improving LOCOMO results by roughly 20% relative to baselines while cutting question answering token cost by 68% compared to HippoRAG2.
Long-term memory is critical for dialogue systems that support continuous, sustainable, and personalized interactions. However, existing methods rely on continuous summarization or OpenIE-based graph construction paired with fixed Top-*k* retrieval, leading to limited adaptability across query categories and high computational overhead. In this paper, we propose HingeMem, a boundary-guided long-term memory that operationalizes event segmentation theory to build an interpretable indexing interface via boundary-triggered hyperedges over four elements: person, time, location, and topic. When any such element changes, HingeMem draws a boundary and writes the current segment, thereby reducing redundant operations and preserving salient context. To enable robust and efficient retrieval under diverse information needs, HingeMem introduces query-adaptive retrieval mechanisms that jointly decide (a) *what to retrieve*: determining the query-conditioned routing over the element-indexed memory; and (b) *how much to retrieve*: controlling the retrieval depth based on the estimated query type. Extensive experiments across LLM scales (from 0.6B to production-tier models, e.g., Qwen3-0.6B to Qwen-Flash) on LOCOMO show that HingeMem achieves approximately 20% relative improvement over strong baselines without query-category specification, while reducing computational cost (a 68% reduction in question answering token cost compared to HippoRAG2). Beyond advancing memory modeling, HingeMem's adaptive retrieval makes it a strong fit for web applications requiring efficient and trustworthy memory over extended interactions.
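The two ideas in the abstract, boundary-triggered writing and query-adaptive retrieval, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes each turn already carries person, time, location, and topic labels (how HingeMem extracts them is not described here), and every name in it (`Turn`, `Segment`, `segment_dialogue`, `build_index`, `retrieve`, `DEPTH_BY_QUERY_TYPE`) is hypothetical. The two query types and their depths are illustrative placeholders, and keeping the most recent segments when depth is capped is likewise an assumption.

```python
from dataclasses import dataclass, field

ELEMENTS = ("person", "time", "location", "topic")


@dataclass
class Turn:
    text: str
    person: str
    time: str
    location: str
    topic: str


@dataclass
class Segment:
    keys: dict                      # element name -> value shared by the segment
    turns: list = field(default_factory=list)


def segment_dialogue(turns):
    """Start a new segment whenever any of the four elements changes."""
    segments = []
    current = None
    for turn in turns:
        keys = {e: getattr(turn, e) for e in ELEMENTS}
        if current is None or keys != current.keys:
            current = Segment(keys=keys)   # boundary: open a new segment
            segments.append(current)
        current.turns.append(turn.text)
    return segments


def build_index(segments):
    """Map (element, value) -> segment ids.

    Each segment plays the role of a hyperedge linking its four element values.
    """
    index = {}
    for sid, seg in enumerate(segments):
        for element, value in seg.keys.items():
            index.setdefault((element, value), []).append(sid)
    return index


# Hypothetical query types and depths; None means "no cap on depth".
DEPTH_BY_QUERY_TYPE = {"single_fact": 1, "aggregate": None}


def retrieve(index, segments, constraints, query_type):
    """Route by the elements named in the query, then cap the depth.

    `constraints` maps element names to values and is assumed to be
    extracted from the query upstream.
    """
    candidate_sets = [set(index.get(item, [])) for item in constraints.items()]
    ids = sorted(set.intersection(*candidate_sets)) if candidate_sets else []
    depth = DEPTH_BY_QUERY_TYPE[query_type]
    if depth is not None:
        ids = ids[-depth:]          # assumption: prefer the most recent segments
    return [segments[i] for i in ids]


dialogue = [
    Turn("I went hiking last weekend.", "Alice", "2023-05-01", "cafe", "hiking"),
    Turn("The trail was steep.", "Alice", "2023-05-01", "cafe", "hiking"),
    Turn("I also tried a new pasta recipe.", "Alice", "2023-05-01", "cafe", "cooking"),
    Turn("I want to hike too.", "Bob", "2023-05-02", "office", "hiking"),
]
segments = segment_dialogue(dialogue)
index = build_index(segments)
hiking_all = retrieve(index, segments, {"topic": "hiking"}, "aggregate")
hiking_one = retrieve(index, segments, {"topic": "hiking"}, "single_fact")
```

In this toy dialogue the topic change at the third turn and the speaker change at the fourth each trigger a boundary, so four turns are written as three segments. The aggregate query returns every hiking segment, while the single-fact query returns only one, which is the sense in which retrieval depth depends on the query type rather than on a fixed Top-*k*.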