LLMs can convincingly *say* they're conscientious, but ActTraitBench reveals they often *act* otherwise, exposing a critical gap between knowledge and behavior that grows *worse* as model size increases.
Fine-grained control over reward signals unlocks significant gains in multi-trait essay scoring, outperforming standard policy optimization techniques.
Forget monolithic models: a mixture-of-experts approach using clustered semantic domains boosts definition modeling by 7% BLEU, proving that specialization wins.