Lemao Liu

Translation metrics can exhibit significant cross-lingual scoring bias, meaning they unfairly penalize or reward translations depending on the language, even when the quality is the same.

Jingxuan Liu, Zhi Qu, Jin Tei +3

Data Curation & Synthetic Data · Eval Frameworks & Benchmarks · Natural Language Processing

Apr 13, 2026

Pengcheng Laboratory · Apr 13, 2026 · also DAMO, Astronautics WeChat AI, CAS, CUHK +3

Judge Like Human Examiners: A Weighted Importance Multi-Point Evaluation Framework for Generative Tasks with Long-form Answers

Human-like evaluation of long-form generated answers is now possible, thanks to a new framework that breaks reference answers down into weighted, context-aware scoring points.

Guoxin Yu, Chulun Zhou, Lemao Liu +4

Eval Frameworks & Benchmarks · Natural Language Processing

Papers (3)