Jinyang Wu

Research focus

Tool Use & Agents (3), Multimodal Models (2), Reasoning & Chain-of-Thought (2), Training Efficiency & Optimization (2)

Frequent co-authors

Fan Zhang (1), Vireo Zhang (1), Shengju Qian (1), Haoxuan Li (1)

Papers (5)

Jun 10, 2026

Fan Zhang +10 · 1w ago · also Research Institute, TU Munich, Unicom Data Intelligence

Orchestra-o1: Omnimodal Agent Orchestration

Surpassing existing methods, Orchestra-o1 achieves a 10.3% accuracy improvement on the OmniGAIA benchmark by enabling seamless collaboration across multiple modalities.

Fan Zhang, Vireo Zhang, Shengju Qian +8

Multimodal Models, Tool Use & Agents

Jun 8, 2026

1w ago

Late-Layer Fusion is Enough: Dual-Path Vision Token Routing for Multimodal Large Language Models under Visual Saturation

A single late fusion layer is enough to maintain multimodal performance, challenging the need for vision tokens to traverse all layers of a Transformer.

Siyuan Liu, Jinyang Wu

Computer Vision, Multimodal Models

May 26, 2026

3w ago · also Tsinghua AI, Beijing Advanced Innovation Center for Future, Hangzhou International Innovation, NTU

Learning to Adapt SFT Data for Better Reasoning Generalization

Mismatched SFT data hurting your LLM's reasoning? DART uses RL to transform it into perfectly aligned training examples, boosting generalization and efficiency.

Lisong Sun, Li Wang, Jinyang Wu +3

Data Curation & Synthetic Data, Reasoning & Chain-of-Thought, Training Efficiency & Optimization

May 21, 2026

Tsinghua AI · May 21, 2026 · also CUHK, Meituan, NTU, Tongji +1

Maestro: Reinforcement Learning to Orchestrate Hierarchical Model-Skill Ensembles

Forget monolithic models: a lightweight RL policy can dynamically orchestrate ensembles of frozen experts to outperform GPT-5 and Gemini-2.5-Pro on multimodal tasks, even generalizing to unseen models and skills.

Jinyang Wu, Guocheng Zhai, Ruihan Jin +6

Natural Language Processing, Reasoning & Chain-of-Thought, Tool Use & Agents

Apr 2, 2026

Zhengxi Lu +12 · Apr 2, 2026 · also Meituan, ZJU

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

LLM agents can internalize skills via in-context RL, achieving zero-shot autonomous behavior without the token overhead and retrieval noise of traditional methods.

Zhengxi Lu, Zhiyuan Yao, Zhiyuan Yao +10

RLHF & Preference Learning, Tool Use & Agents, Training Efficiency & Optimization
