Shanghai Jiao Tong University
LLM code generation benchmarks are likely overestimating model capabilities: adversarial test suite scaling reveals substantial weaknesses in even state-of-the-art models.
Autoformalization gets a major upgrade: DSR's neuro-symbolic approach leverages operator trees to outperform end-to-end LLMs, proving that structured representations are key to bridging human and formal mathematics.