4 papers from Apple ML Research on Natural Language Processing
Forget coarse sequence-level hacks: LenVM lets you precisely dial in token generation length, boosting a 7B model's length accuracy from 30.9 to 64.8 and crushing closed-source rivals.
Forget painstakingly collecting user data – PersonaTrace lets you bootstrap realistic digital footprints with LLMs, and models trained on this synthetic data actually generalize better to real-world tasks.
Fine-tuning a specialized LLM to generate textual relevance labels for search ranking not only beats larger pre-trained models, but also drives significant real-world gains in App Store conversion rates, especially for tail queries.
Sticking to a single HTML-to-text extractor in your LLM pretraining pipeline could be leaving 71% of the data on the table.