Yotam Perlitz

Faculty of Data and Decision Science, Technion, IBM Research

Research focus

Eval Frameworks & Benchmarks (3), Tool Use & Agents (2), Data Curation & Synthetic Data (1), Training Efficiency & Optimization (1)

Frequent co-authors

Asaf Yehudai (2), Tomer Keren (1), Asaf Yehudai (1), Roi Reichert (1)

Papers (3)

May 27, 2026

also Faculty of Data and Decision Science, HUJI, IBM Research

A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks

Agents that excel on traditional benchmarks may crumble under the pressure of newly synthesized tasks, revealing the limitations of current evaluation methods.

Tomer Keren, Asaf Yehudai +2

Data Curation & Synthetic Data, Eval Frameworks & Benchmarks, Tool Use & Agents

Apr 14, 2026

AI2 · also MIT CSAIL, Faculty of Data and Decision Science, HUJI, IBM Research +1

Growing Pains: Extensible and Efficient LLM Benchmarking Via Fixed Parameter Calibration

Stop re-running full benchmarks: Calibrate new LLM datasets against existing suites with just 100 "anchor" questions and still get highly accurate performance predictions.

Asaf Yehudai, Yotam Perlitz, Leshem Choshen

Eval Frameworks & Benchmarks, Training Efficiency & Optimization

Mar 31, 2026

also ETH, Faculty of Data and Decision Science, Technion, UIUC

ELT-Bench-Verified: Benchmark Quality Issues Underestimate AI Agent Capabilities

AI agents are far better at automating data engineering tasks than previously thought, but flawed benchmarks are obscuring their true potential.

Andrea Giovannini, Tengjun Jin, Yotam Perlitz

Code Generation & Program Synthesis, Eval Frameworks & Benchmarks, Tool Use & Agents
