3 papers from Apple ML Research on Eval Frameworks & Benchmarks
Efficient conditioning methods for LLMs often sacrifice fluency, revealing a critical trade-off that could reshape deployment strategies.
Realistic user simulation is now possible: Pare offers a framework that moves beyond flat tool-calling APIs to model stateful user interactions, enabling better evaluation of proactive agents.
Just 20% of a strong model's chain-of-thought can unlock a weaker model's reasoning abilities, revealing the surprising transferability of CoT mechanics.