Shijie Zhang

Gemini Embedding 2's unified multimodal embeddings beat specialized models across diverse tasks and even generalize zero-shot to niche fields like astronomy and culinary arts.

Madhuri Shanbhogue, Zhe Li +160

Eval Frameworks & Benchmarks · Multimodal Models · Recommendation & Information Retrieval

Feb 26, 2026 · also Honor Device Co., PKU

Know What You Know: Metacognitive Entropy Calibration for Verifiable RL Reasoning

By explicitly modeling and calibrating a model's intrinsic uncertainty, EGPO unlocks significant gains in reasoning performance for RL-trained language models.

Qiannian Zhao, Cheng Yang, Jinhao Jing +7

Reasoning & Chain-of-Thought · RLHF & Preference Learning

Feb 25, 2026 · also Tsinghua AI, BUPT, PKU, University of Science and Technology

Explore-on-Graph: Incentivizing Autonomous Exploration of Large Language Models on Knowledge Graphs with Path-refined Reward Modeling

LLMs can now explore knowledge graphs on their own, discovering better reasoning paths and outperforming even closed-source models on question answering.

Shiqi Yan, Ruiqi Zhou, Zhengxi Yao +5

Reasoning & Chain-of-Thought · Recommendation & Information Retrieval · Tool Use & Agents