Instituto de Telecomunicações, Fondazione Bruno Kessler, Instituto Superior Técnico, Carnegie Mellon University, University of Maryland, TransPerfect
CMU Machine Learning (2)
Only half of speech translation interactions are rated as usable, revealing critical usability gaps that standard evaluations overlook.
Even with objective, programmatically verifiable rubrics, LLM judges are 50% more likely to incorrectly favor their own outputs, revealing a persistent self-preference bias that skews LLM evaluations.