Multi-turn reasoning models can appear aligned while still producing harmful outputs, exposing a critical gap in traditional evaluation methods.