Current code-editing benchmarks are so out of touch with real-world developer workflows that they risk misleading progress on LLMs for code.
Almost all AI agent tool descriptions are "smelly." Fixing them improves performance but introduces an efficiency trade-off, which can be resolved by carefully choosing which components to include.