University of Science and Technology of China
Code agents struggle with evolving user requirements, revealing a 38-point gap in performance across leading LLMs when faced with iterative feedback.