Achieve LLaMA-level reasoning accuracy with 44% lower latency and 73% lower API costs by offloading work from large models to smaller ones, escalating to the large model only when a query demands it.
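The routing idea above can be sketched as a simple confidence-gated cascade. This is a minimal illustration, not the system's actual implementation: `small_model`, `large_model`, and the confidence threshold are all hypothetical stand-ins for real model clients and a real uncertainty estimate.

```python
# Minimal sketch of a confidence-gated model cascade.
# small_model and large_model are hypothetical stand-ins; a real
# deployment would wrap actual inference clients here.

def small_model(prompt: str) -> tuple[str, float]:
    # Stand-in for a cheap, fast model: returns an answer plus a
    # self-reported confidence (here, confident only on short prompts).
    return ("small-answer", 0.9 if len(prompt) < 40 else 0.3)

def large_model(prompt: str) -> tuple[str, float]:
    # Stand-in for the expensive, more accurate model.
    return ("large-answer", 0.99)

def cascade(prompt: str, threshold: float = 0.8) -> str:
    # Try the small model first; escalate to the large model
    # only when the small model's confidence falls below threshold.
    answer, confidence = small_model(prompt)
    if confidence >= threshold:
        return answer
    return large_model(prompt)[0]
```

Latency and cost savings come from the fraction of queries the small model answers on its own; the threshold trades accuracy against how often the large model is invoked.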