Carnegie Mellon University
Even the top-performing LLM struggles with realistic user interactions, achieving only 61% success in complex task scenarios.