Harbin Institute of Technology, Shenzhen
Today's best GUI agents choke on real-world, multi-application workflows, achieving a success rate below 21% and revealing a critical gap in their ability to coordinate across applications and perform conditional reasoning.
Reasoning across languages doesn't have to break the bank: a new framework slashes token costs by over 50% while maintaining accuracy, especially boosting performance in low-resource languages.
LLMs still struggle to reason in context when cultural and linguistic nuances are involved, achieving only 44% accuracy on a new grounded benchmark spanning 14 languages.
LLMs can now navigate the ever-expanding universe of external tools with significantly improved accuracy and generalization, thanks to a new agentic framework that proactively retrieves and grounds tool execution.
Traditional text embedding benchmarks fail to capture the nuances of long-horizon memory retrieval. This new benchmark reveals that bigger models don't always win and that performance on standard tasks doesn't guarantee success in complex, context-dependent memory scenarios.
Get 3.6x faster long-context LLM inference with LycheeCluster's hierarchical KV indexing, which avoids the semantic fragmentation of naive chunking.