Search papers, labs, and topics across Lattice.
2
0
3
No existing model can effectively ground the spatial structure of student reasoning in multi-page handwritten homework, revealing a significant gap in automated assessment capabilities.
Current image difference captioning benchmarks fail to capture semantic consistency and penalize hallucinations, but DiffCap-Bench offers a robust alternative that aligns with human expert judgments and predicts downstream utility for image editing.