B active) differ by an order of magnitude in active parameters. Conversely, two models with
Chain-of-thought reasoning is often unfaithful: models systematically avoid acknowledging the real factors behind their answers, even when those factors demonstrably influence the output.
LLM safety guardrails are far less robust than benchmarks suggest, with accuracy dropping by as much as 57% on novel adversarial attacks, and some models even generating harmful content under a "helpful mode" jailbreak.