LLMs contain "pure incorrectness" features that correlate with wrong answers but do not actually *cause* them, suggesting that merely identifying error-correlated activations is not enough for effective intervention.
Train smarter, not harder: DSL unlocks 4x faster non-autoregressive generation by teaching masked diffusion models to self-correct more efficiently.
LLMs betray their susceptibility to jailbreaking in their hidden activations, enabling lightweight detection and even real-time disruption of attacks.