Search papers, labs, and topics across Lattice.
1
0
3
7
Even GPT-5 and Gemini 2.5 Pro still fail to efficiently couple reasoning with tool use, requiring up to 2.7x more tool calls than theoretically optimal in a new diagnostic environment.