Search papers, labs, and topics across Lattice.
2
0
4
3
Long-form video generation struggles with transitions, scoring only 0.256 on transition quality even when prompt fulfillment is high (0.71), revealing a critical bottleneck exposed by the new DirectorBench diagnostic benchmark.
Saturated LLM benchmarks can be revived without creating new datasets: a self-improving LLM judge in an elimination tournament recovers ranking signal and breaks ties.