By tightly coupling reasoning, searching, and generation, Unify-Agent achieves state-of-the-art world-grounded image synthesis, rivaling closed-source models and opening new avenues for agent-based multimodal generation.
Image generation takes a leap toward real-world knowledge by training an agent that actively searches for and integrates external information, substantially boosting performance on knowledge-intensive tasks.
LongCat-Next breaks with the language-centric paradigm by unifying text, vision, and audio in a single autoregressive model with minimal modality-specific design, reconciling understanding and generation in discrete vision modeling.
Achieve photorealistic and structurally consistent weather editing for autonomous driving videos without the massive datasets typically required by generative models.
Representation-Pivoted Autoencoders enable diffusion models to generate and edit images with higher fidelity by learning a compressed latent space that preserves the semantics of pre-trained visual representations.