LLMs are surprisingly good at pinpointing what's *wrong* with student writing, even outperforming human graders at identifying relative weaknesses.
LLMs beat rule-based systems at understanding nuanced grammar in language learners, but good old-fashioned rules still win on pure syntax.
LLMs aren't equally reliable as NLG evaluators, but a Bradley-Terry extension called BT-sigma can learn judge reliability from pairwise comparisons alone, improving ranking accuracy without human supervision.