Search papers, labs, and topics across Lattice.
2
0
5
MiniMax-M2 proves that massive parameter counts don't always translate to better agentic performance; strategic activation of a smaller subset can unlock frontier-level intelligence.
Serving LoRA adapters at scale doesn't have to crush your latency SLOs: InfiniLoRA disaggregates LoRA execution to achieve 3x higher throughput and dramatically improved tail latency.