This paper introduces SmartSearch, a conversational memory retrieval system that replaces LLM-based structuring and learned retrieval policies with a deterministic pipeline. The pipeline uses NER-weighted substring matching for recall, rule-based entity discovery for multi-hop expansion, and a CrossEncoder+ColBERT rank fusion stage. SmartSearch achieves state-of-the-art performance on the LoCoMo and LongMemEval-S benchmarks while using 8.5x fewer tokens than full-context baselines. The results indicate that ranking, rather than recall, determines how much gold evidence survives truncation to the token budget.
Forget complex LLM-based structuring: simple, deterministic retrieval with smart ranking beats state-of-the-art conversational memory systems while using 8.5x fewer tokens.
Recent conversational memory systems invest heavily in LLM-based structuring at ingestion time and learned retrieval policies at query time. We show that neither is necessary. SmartSearch retrieves from raw, unstructured conversation history using a fully deterministic pipeline: NER-weighted substring matching for recall, rule-based entity discovery for multi-hop expansion, and a CrossEncoder+ColBERT rank fusion stage -- the only learned component -- running on CPU in ~650ms. Oracle analysis on two benchmarks identifies a compilation bottleneck: retrieval recall reaches 98.6%, but without intelligent ranking only 22.5% of gold evidence survives truncation to the token budget. With score-adaptive truncation and no per-dataset tuning, SmartSearch achieves 93.5% on LoCoMo and 88.4% on LongMemEval-S, exceeding all known memory systems under the same evaluation protocol on both benchmarks while using 8.5x fewer tokens than full-context baselines.
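The ranking and truncation steps described above can be illustrated with a minimal sketch. The abstract does not specify the fusion formula or the truncation rule, so this example substitutes reciprocal rank fusion (a standard technique, not necessarily the one SmartSearch uses) and a simple relative-score cutoff. The function names, the constant `k`, and the `rel_floor` parameter are all hypothetical.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> Dict[str, float]:
    """Fuse ranked lists of passage ids into one score per id.

    Assumption: reciprocal rank fusion stands in for the paper's
    CrossEncoder+ColBERT fusion, whose exact form is not given here.
    Each list adds 1 / (k + rank) for every id it contains (rank starts at 1).
    """
    fused: Dict[str, float] = {}
    for ranking in rankings:
        for rank, pid in enumerate(ranking, start=1):
            fused[pid] = fused.get(pid, 0.0) + 1.0 / (k + rank)
    return fused


def truncate_adaptive(
    fused: Dict[str, float],
    token_counts: Dict[str, int],
    budget: int,
    rel_floor: float = 0.5,
) -> List[str]:
    """Select passages in descending fused-score order under a token budget.

    Hypothetical rule: stop once a score drops below rel_floor * top score,
    and skip any passage that would exceed the remaining budget.
    """
    # Sort by score (descending), breaking ties by id for determinism.
    ordered = sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
    if not ordered:
        return []
    top_score = ordered[0][1]
    kept: List[str] = []
    used = 0
    for pid, score in ordered:
        if score < rel_floor * top_score:
            break  # remaining passages score too low relative to the best one
        cost = token_counts[pid]
        if used + cost > budget:
            continue  # does not fit; a smaller passage further down might
        kept.append(pid)
        used += cost
    return kept


# Two hypothetical rankings of the same candidate pool.
cross_encoder_ranking = ["a", "b", "c"]
colbert_ranking = ["a", "c", "d"]
fused_scores = reciprocal_rank_fusion([cross_encoder_ranking, colbert_ranking])
token_counts = {"a": 50, "b": 30, "c": 40, "d": 20}
selected = truncate_adaptive(fused_scores, token_counts, budget=100)
```

In this toy run, passages ranked highly by both lists (`a`, `c`) are kept, while passages that appear in only one list fall below the relative cutoff. The point of the sketch is the one the abstract makes: with recall already high, what reaches the model depends on how candidates are ordered before the budget is applied.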