University of Southern California
LLMs can achieve significant gains in theory-of-mind reasoning by leveraging explicit state representations, challenging the notion that their failures stem solely from limited reasoning ability.
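As a toy illustration of what an "explicit state representation" can mean here (a hypothetical sketch, not the paper's actual pipeline), the world state and each agent's belief state can be tracked outside the model and serialized into the prompt for a false-belief question:

```python
# Hypothetical sketch: track ground truth and per-agent beliefs explicitly,
# then hand the serialized state to the LLM instead of leaving it implicit.
world_state = {"marble": "basket"}          # true object location
beliefs = {"Sally": {"marble": "basket"},   # what each agent last observed
           "Anne": {"marble": "basket"}}

def apply_event(obj, new_location, observers):
    """Update the true state and only the beliefs of agents who saw the move."""
    world_state[obj] = new_location
    for agent in observers:
        beliefs[agent][obj] = new_location

# Anne moves the marble while Sally is out of the room.
apply_event("marble", "box", observers=["Anne"])

prompt = (
    f"World state: {world_state}\n"
    f"Belief states: {beliefs}\n"
    "Question: Where will Sally look for the marble?"
)
print(prompt)  # the explicit belief state makes 'basket' directly readable
```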
Despite impressive unit test pass rates, today's best LLMs rewrite code instead of precisely debugging it, achieving less than 45% edit precision even when explicitly instructed to minimize changes.
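One plausible way to operationalize "edit precision" (an illustrative sketch; the paper's exact metric may differ) is the fraction of lines the model changed that the reference fix also changes, so wholesale rewrites score low even when the tests pass:

```python
import difflib

def edit_precision(original: str, gold_fix: str, model_fix: str) -> float:
    """Share of model-changed lines that the gold (minimal) fix also changes."""
    def changed_lines(before, after):
        diff = difflib.ndiff(before.splitlines(), after.splitlines())
        return {line[2:] for line in diff if line.startswith(("- ", "+ "))}
    gold_changes = changed_lines(original, gold_fix)
    model_changes = changed_lines(original, model_fix)
    if not model_changes:
        return 1.0  # no edits at all: vacuously precise
    return len(model_changes & gold_changes) / len(model_changes)

original = "def add(a, b):\n    return a - b\n"
gold_fix = "def add(a, b):\n    return a + b\n"
sloppy   = "def add(x, y):\n    '''Add two numbers.'''\n    return x + y\n"
print(edit_precision(original, gold_fix, gold_fix))  # 1.0: minimal, targeted fix
print(edit_precision(original, gold_fix, sloppy))    # 0.2: correct but rewritten
```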
Existing self-evolving prompt optimization frameworks falter when faced with the diverse memory demands of heterogeneous tasks, but a new clustering-based approach, CluE, restores generalization performance.
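A minimal sketch of the clustering idea described above (hypothetical names and toy data; not CluE's actual implementation): stored prompt-optimization memories are grouped by task embedding, and a new task consults only its nearest cluster, so heterogeneous tasks stop polluting each other's memory.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
memory_embeddings = rng.normal(size=(60, 32))              # embeddings of past tasks (toy data)
memory_entries = [f"prompt_edit_{i}" for i in range(60)]   # prompt edits learned on those tasks

# Group the memory by task similarity.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(memory_embeddings)

def retrieve(new_task_embedding: np.ndarray, k: int = 3) -> list[str]:
    """Return the k stored edits from the cluster nearest to the new task."""
    cluster = kmeans.predict(new_task_embedding[None, :])[0]
    members = np.where(kmeans.labels_ == cluster)[0]
    dists = np.linalg.norm(memory_embeddings[members] - new_task_embedding, axis=1)
    return [memory_entries[i] for i in members[np.argsort(dists)[:k]]]

print(retrieve(rng.normal(size=32)))  # only same-cluster memories are consulted
```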