Even frontier models with high reasoning budgets fail to navigate densely interlinked knowledge bases and complex policies effectively in realistic fintech customer support scenarios, achieving only a ~25.5% pass rate.