Even strong LLMs struggle to pinpoint the exact moment and cause of failure in risky agent trajectories that arise from latent, intrinsic issues, scoring below 35 Strict-F1 on risk-step localization.
LLMs can now learn mathematical reasoning 2x faster and with greater stability, thanks to a new token-level policy optimization method.