LLM upgrades are a chaotic mix of progress and decay: despite overall gains, up to 47% of questions get *worse* after an update, and single-shot evals miss almost half of these critical regressions.
Forget sophisticated deception – small LLMs "sandbagging" on tests just pick option 'E' or 'F' regardless of the question, revealing a surprising positional bias instead of true answer-aware avoidance.
Cross-entropy loss isn't just a detail – it's the unsung hero behind how well energy probes work in predictive coding networks, accounting for up to 66% of the probe-softmax gap.
Transformers get the magnitude geometry right, but completely botch the noise: unlike brains, their representations become *less* variable for larger numbers.
Even with perfect memorization of examples, autoregressive transformers fail to learn higher-order generalizations about word categories, suggesting a fundamental gap in how these models learn compared to children.
LLMs exhibit categorical-perception-like warping of their hidden-state representations at digit-count boundaries, even without explicit semantic category knowledge, revealing a surprising sensitivity to structural discontinuities in the input.