LLMs struggle with long-context code QA, losing significant accuracy when answer formats are changed or irrelevant information is added, revealing a brittleness masked by standard benchmarks.