The hardest AI tasks remain largely unsolved, with current models achieving only a 2.6% success rate on economically valuable workflows.
Censored LLMs offer a surprisingly natural and effective environment for stress-testing methods that aim to elicit truthfulness and detect deception.