Renmin University
Today's code-generating AI falls apart on real-world software engineering tasks that demand cross-repository reasoning and external knowledge: it achieves less than 45% success on the new BeyondSWE benchmark.