Lattice AI Research

Research focus

Eval Frameworks & Benchmarks (2) · Natural Language Processing (1) · Reasoning & Chain-of-Thought (1) · RLHF & Preference Learning (1)

Frequent co-authors

Andrew Feng (1), Yu-Wei Luo (1), Lin Fan (1), Yilin Zhou (1)

Papers (2)

Tsinghua AI · Apr 21, 2026

HoWToBench: Holistic Evaluation for LLM's Capability in Human-level Writing using Tree of Writing

LLM-as-a-judge can be made far more reliable by explicitly modeling the aggregation weights of sub-features in a tree structure, achieving near-human agreement on complex writing tasks.

Andrew Feng, Cunxiang Wang, Yu-Wei Luo +4

Eval Frameworks & Benchmarks · Natural Language Processing · Reasoning & Chain-of-Thought

Tsinghua AI · Mar 5, 2026 · also Westlake, Zhipu

IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation

Current judge models for instruction-following are surprisingly unreliable, but a new benchmark exposes their flaws and offers a path to better alignment.

Bosi Wen, Yilin Niu +9

Eval Frameworks & Benchmarks · RLHF & Preference Learning

Cunxiang Wang
