This study introduces SearchGEO, a framework for evaluating how vulnerable LLM-based search agents are to endorsement corruption caused by manipulated web content. Evaluating 13 LLM backends on 308 cases each, the authors find that attack success rates vary widely, from 0.0% on Claude-Sonnet-4.6 to 31.4% on Gemini-3-Flash. The findings highlight differences in how models handle endorsement under adversarial conditions and suggest that recommendation reliability should be treated as a first-class dimension of backend safety evaluation.
LLM search agents can be manipulated into endorsing attacker-published claims, and endorsement corruption rates vary widely across backends (0.0% to 31.4%), raising concerns about the reliability of their recommendations.
Large language model (LLM)-based search agents synthesize open-web content into actionable recommendations on behalf of users, creating a risk that attacker-published pages are transformed into endorsed claims. We introduce SearchGEO, a controlled evaluation framework for measuring endorsement corruption in LLM-based web-search agents, combining a web-evidence manipulation pipeline, a five-mode attack taxonomy, and multiple output-level metrics. We evaluate 13 LLM backends on 308 cases each. Results show that vulnerability patterns vary across backends: overall attack success rate (ASR) ranges from 0.0% on Claude-Sonnet-4.6 to 31.4% on Gemini-3-Flash, the strongest attack mode differs by model family, and the same deployment scaffold can amplify or reduce ASR depending on the backend. An auxiliary agent-skill probe, where endorsement becomes an install command, exposes a sharp split among otherwise robust backends: Claude over-rejects while GPT over-trusts. These findings argue for treating recommendation reliability under adversarial search content as a first-class dimension of backend safety evaluation.
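As a rough illustration of the headline metric, the sketch below computes a per-backend ASR and the strongest attack mode per backend from a list of case outcomes. It assumes ASR is simply the percentage of cases in which the agent endorsed the attacker's claim; the abstract does not spell out the exact definition or any aggregation across modes, so this is an assumption. The record layout, the function names, and the backend and mode labels (`backend_a`, `mode_1`, ...) are hypothetical and are not taken from SearchGEO.

```python
from collections import defaultdict


def attack_success_rate(results):
    """Return {backend: ASR in percent}.

    `results` is an iterable of (backend, attack_mode, corrupted) tuples,
    where `corrupted` is True if the agent endorsed the attacker's claim.
    Assumption: ASR = corrupted cases / total cases, per backend.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for backend, _mode, corrupted in results:
        totals[backend] += 1
        hits[backend] += int(corrupted)
    return {b: 100.0 * hits[b] / totals[b] for b in totals}


def strongest_mode(results):
    """Return {backend: (attack_mode, ASR in percent)} for the mode with the highest ASR."""
    per_mode = defaultdict(lambda: [0, 0])  # (backend, mode) -> [hits, total]
    for backend, mode, corrupted in results:
        per_mode[(backend, mode)][0] += int(corrupted)
        per_mode[(backend, mode)][1] += 1
    best = {}
    for (backend, mode), (hit_count, total) in per_mode.items():
        rate = 100.0 * hit_count / total
        if backend not in best or rate > best[backend][1]:
            best[backend] = (mode, rate)
    return best


# Toy outcomes with hypothetical labels; these are NOT results from the paper.
toy_results = [
    ("backend_a", "mode_1", True),
    ("backend_a", "mode_1", False),
    ("backend_a", "mode_2", False),
    ("backend_a", "mode_2", False),
    ("backend_b", "mode_1", False),
    ("backend_b", "mode_1", False),
    ("backend_b", "mode_2", True),
    ("backend_b", "mode_2", True),
]

asr = attack_success_rate(toy_results)
best = strongest_mode(toy_results)
```

On this toy data the two backends differ both in overall ASR and in which mode is most effective, which is the kind of per-backend pattern the abstract describes.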