Forget about task order: the *distribution* of tasks itself dictates the rate and nature of forgetting in continual learning.
VLA models can ace a task yet still trigger unsafe outcomes, exposing a critical gap between action execution and semantic understanding.
LLM judges of disinformation risk are internally consistent, but consistently misaligned with actual human readers, raising serious questions about their validity as evaluation proxies.
You can dial up or down how obvious an AI's hallucinations are, giving you control over whether users catch the errors.
Autonomous agents are alarmingly easy to trick into harmful behavior, even when built on aligned models: attacks against Claude Code succeed 73.63% of the time on the AgentHazard benchmark.
MLLMs can be blind to the consequences of their actions, and simply scaling model size won't fix the problem.
Backdoors aren't just for attacks anymore: B4G shows how they can be flipped to enhance LLM safety, controllability, and accountability.
AI-generated images betray themselves not by their appearance, but by their *behavior*: they are far more sensitive to small perturbations than real images, revealing a fundamental weakness exploitable for universal detection.
GPT-5's scientific reasoning performance plummets by nearly 50% on multi-step workflows, revealing a critical gap in current LLM agents' ability to orchestrate complex tool use.