Orchestra-o1 surpasses existing methods on the OmniGAIA benchmark, improving accuracy by 10.3% by enabling collaboration across multiple modalities.
A single late fusion layer is enough to maintain multimodal performance, challenging the need for vision tokens to traverse all layers of a Transformer.
Mismatched SFT data hurting your LLM's reasoning? DART uses RL to transform it into perfectly aligned training examples, boosting generalization and efficiency.
Forget monolithic models: a lightweight RL policy can dynamically orchestrate ensembles of frozen experts to outperform GPT-5 and Gemini-2.5-Pro on multimodal tasks, even generalizing to unseen models and skills.
LLM agents can internalize skills via in-context RL, achieving zero-shot autonomous behavior without the token overhead and retrieval noise of traditional methods.