IBM Research
Stop re-running full benchmarks: Calibrate new LLM datasets against existing suites with just 100 "anchor" questions and still get highly accurate performance predictions.
AI agents are far better at automating data engineering tasks than previously thought, but flawed benchmarks are obscuring their true potential.