Search papers, labs, and topics across Lattice.
2
0
5
Agents can be quantitatively assessed for their behavioral traits, revealing a surprising accuracy in predicting their actions based on skill file edits.
LLM endpoints can appear "healthy" according to traditional metrics while undergoing subtle behavioral shifts detectable by monitoring output distributions, highlighting a critical gap in current reliability practices.