Search papers, labs, and topics across Lattice.
Privacy-preserving LLM insight systems like Anthropic's Clio can be tricked into leaking a user's medical history given only a single symptom and basic demographics, even with layered heuristic defenses in place.
Current AI benchmarks miss the crucial effects of AI R&D automation, so here are the metrics we should be tracking instead.
LLMs harbor surprisingly consistent hidden beliefs on sensitive topics like mass surveillance and torture, even when direct questioning suggests otherwise.
Self-evolving AI societies are fundamentally unsafe: continuous self-improvement in isolated multi-agent LLM systems inevitably erodes safety alignment, regardless of initial precautions.