Search papers, labs, and topics across Lattice.
3
0
8
Multimodal agents can now reason, plan, and execute actions more effectively by integrating perception as a core component, not just an auxiliary interface.
Serving LoRA adapters at scale doesn't have to crush your latency SLOs: InfiniLoRA disaggregates LoRA execution to achieve 3x higher throughput and dramatically improved tail latency.
Models can learn to self-differentiate between tasks requiring rigorous planning versus direct generation in creative writing, unlocking a new level of meta-cognitive ability.