Meituan, Zhejiang University
GUI agents can learn world knowledge more efficiently by internalizing causal relationships during mid-training, rather than relying on implicit learning through action annotations or reward signals in post-training.
Forget monolithic models: a lightweight RL policy can dynamically orchestrate ensembles of frozen experts to outperform GPT-5 and Gemini-2.5-Pro on multimodal tasks, even generalizing to unseen models and skills.