The University of Texas at Austin
VLMs may ace the color coverage test, but they flunk the "do as I say, not as I do" test, routinely ignoring their own stated reasoning rules in ways that humans don't.
AI models can detect injected thoughts, but they often have no idea *what* those thoughts are, relying on content-agnostic anomaly detection and then guessing common concepts.