Even when RAG models detect poisoned information, they still act on it, but a new architecture can close this "monitoring-control gap" and slash attack success by 92%.
Aggregate benchmark scores can be misleading: models with statistically indistinguishable atomic knowledge can exhibit composition behavior differences exceeding 40 percentage points.
RAG systems can *know* the evidence contradicts their actions, yet still fail to act safely, revealing a dangerous monitoring-control gap that current evaluations miss.
RAG's hallucination problem might be solved by simply listening to the whispers in the generator's residual stream.
LLM-tool integrations are riddled with security holes, but MCP-GUARD offers a practical, multi-layered defense that achieves 96% accuracy in detecting adversarial prompts.