Junyi Jessy Li

The University of Texas at Austin

Research focus

Eval Frameworks & Benchmarks (5), Natural Language Processing (4), Constitutional AI & AI Ethics (1), Interpretability & Mechanistic Interp (1)

Frequent co-authors

Hongli Zhan (2), Javier Hernandez (2), Jina Suh (2), Joydeep Biswas (1)

Papers (5)

Apr 15, 2026

2w ago·also Oregon State, Sony AI, UAlberta, UMich

AI-Assisted Peer Review at Scale: The AAAI-26 AI Review Pilot

AI-generated peer reviews aren't just viable at scale; researchers prefer them over human reviews for technical accuracy and actionable feedback.

Joydeep Biswas, Sheila Schoepp, Gautham Vasan +10

Eval Frameworks & Benchmarks, Natural Language Processing

Apr 13, 2026

2w ago·also Microsoft Research, UW

Discourse Diversity in Multi-Turn Empathic Dialogue

LLMs are twice as likely as humans to repeat the same support tactic in a conversation, but a simple RL reward for tactic novelty can fix it.

Hongli Zhan, Emma S. Gueorguieva, Javier Hernandez +2

Eval Frameworks & Benchmarks, Natural Language Processing

Apr 9, 2026

3w ago·also Microsoft Research

AI generates well-liked but templatic empathic responses

Turns out LLMs aren't actually empathic; they're just really good at regurgitating a well-liked empathy template.

Emma S. Gueorguieva, Hongli Zhan +8

Constitutional AI & AI Ethics, Eval Frameworks & Benchmarks, Natural Language Processing

Apr 7, 2026

3w ago·also UT Austin

When to Call an Apple Red: Humans Follow Introspective Rules, VLMs Don't

VLMs may ace the color coverage test, but they flunk the "do as I say, not as I do" test, routinely ignoring their own stated reasoning rules in ways that humans don't.

Jonathan Nemitz, Carsten Eickhoff, Junyi Jessy Li +3

Eval Frameworks & Benchmarks, Interpretability & Mechanistic Interp, Multimodal Models

Mar 10, 2026

Meta AI·also UT Austin

CREATE: Testing LLMs for Associative Creativity

LLMs struggle to generate diverse and specific connections between concepts, even with high token budgets and "thinking" prompts, revealing a gap in creative associative reasoning.

Manya Wadhwa, Tiasa Singha Roy, Harvey Lederman +3

Eval Frameworks & Benchmarks, Natural Language Processing, Reasoning & Chain-of-Thought
