Search papers, labs, and topics across Lattice.
3
0
4
Automatic attribution metrics for LLMs can flip rankings across datasets, leading to misleading evaluations that could cost researchers dearly in decision-making.
Exact-match retrieval metrics can mislead assessments of policy utility, as retrieved clauses perform nearly as well as gold-standard ones in decision-making tasks.
Direct token-level self-distillation can backfire, but Sibling-Guided Credit Distillation redefines credit assignment to enhance long-horizon tool-use without amplifying harmful behaviors.