Fudan University
Model-generated skills can actually hurt agent performance, and bigger models don't necessarily make for better skill extractors or consumers.
Today's best language models can barely make sense of your messy group chats and fragmented digital life, achieving only 19% accuracy on a new benchmark of real-world reasoning.