LLMs can be systematically debugged and improved by treating training data as code, allowing for targeted "patches" that fix concept-level gaps and reasoning errors.
LLM datasets aren't independent islands: tracing their lineage reveals hidden redundancy, benchmark contamination, and opportunities for more diverse training data.
Moving beyond simplistic synthetic data, ChartVerse generates complex charts and reliable reasoning data from scratch, enabling an 8B model to outperform its 30B teacher on chart reasoning.