Deferring to a larger LLM only when a smaller LLM is uncertain can match the performance of the larger model alone, while slashing inference costs.
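
The deferral rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method from the summarized work: the summary does not say how uncertainty is measured, so the sketch assumes each model returns a confidence score in [0, 1] alongside its answer. The names `cascade`, `toy_small`, and `toy_large`, and the threshold of 0.8, are all hypothetical.

```python
from typing import Callable, Tuple

# Hypothetical model interface (an assumption, not a real library API):
# a model takes a prompt and returns (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cascade(
    prompt: str,
    small: Model,
    large: Model,
    threshold: float = 0.8,  # arbitrary illustrative value
) -> Tuple[str, str]:
    """Return (answer, which_model_answered).

    The small model answers first. The large model is called only when
    the small model's confidence falls below the threshold.
    """
    answer, confidence = small(prompt)
    if confidence >= threshold:
        return answer, "small"
    # The small model is uncertain, so defer to the large model.
    large_answer, _ = large(prompt)
    return large_answer, "large"


# Toy stand-ins for real models, used only to exercise the routing logic.
def toy_small(prompt: str) -> Tuple[str, float]:
    if prompt == "2+2?":
        return "4", 0.95
    return "unsure", 0.30


def toy_large(prompt: str) -> Tuple[str, float]:
    return "large-model answer", 0.99


if __name__ == "__main__":
    print(cascade("2+2?", toy_small, toy_large))
    print(cascade("a harder question", toy_small, toy_large))
```

The cost saving comes from the fraction of prompts that never reach the large model. In a sketch like this, the threshold would be the knob that trades cost against quality: raising it sends more prompts to the large model.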