Search papers, labs, and topics across Lattice.
University of Alabama
2
0
4
LLMs' susceptibility to invisible-character prompt injections that flip paper review scores reveals a critical vulnerability in their application to academic peer review.
LLM judges in healthcare show promise for scalable evaluation, but their reliability swings wildly across tasks, demanding careful design and validation before trusting their verdicts.