Search papers, labs, and topics across Lattice.
2
0
5
Multi-chiplet architectures can unlock significant speedups and memory savings for low-batch MoE inference by dynamically scheduling expert computations across high-bandwidth die-to-die links.
Scaling LLM-based multi-agent systems doesn't just need better prompts or models, but a whole new software engineering approach focused on managing runtime entropy.