LLMs still have a long way to go when it comes to interactive problem-solving, as revealed by a new benchmark that tests reasoning under budget constraints.