Xiao Wang

Chinese Academy of Sciences, University of Chinese Academy of Sciences

Research focus

Computer Vision (4) · Architecture Design (Transformers, SSMs, MoE) (3) · Natural Language Processing (2) · Robotics & Embodied AI (2)

Frequent co-authors

Yizhe Zeng (1) · Yunpeng Li (1) · Juxin Xiao (1) · Yuling Liu (1)

Papers (9)

Apr 8, 2026

Apr 8, 2026·also BUPT

MirageBackdoor: A Stealthy Attack that Induces Think-Well-Answer-Wrong Reasoning

LLMs can be backdoored to "think well but answer wrong," even while generating seemingly correct reasoning traces, making attacks far harder to detect.

Yizhe Zeng, Yunpeng Li, Juxin Xiao +2

Reasoning & Chain-of-Thought · Red-Teaming & Adversarial Robustness

Mar 5, 2026

Mar 5, 2026·also CAS

UniPAR: A Unified Framework for Pedestrian Attribute Recognition

Forget training separate models for each pedestrian attribute dataset – a single Transformer can now handle RGB images, video sequences, and even event streams with comparable accuracy to specialized methods.

Minghe Xu, Rouying Wu, Jiarui Xu +6

Computer Vision

Mar 4, 2026

Mar 4, 2026·also Beihang, UB

MistyPilot: An Agentic Fast-Slow Thinking LLM Framework for Misty Social Robots

Social robots can now autonomously orchestrate complex tasks with improved efficiency and emotional alignment, thanks to a novel fast-slow thinking LLM framework.

Xiao Wang, Jingchen Sun, Ifeoma Nwogu +2

Natural Language Processing · Robotics & Embodied AI · Tool Use & Agents

Mar 2, 2026

Mar 2, 2026·also Beihang, CAS, China Telecom Bestpay

Toward Graph-Tokenizing Large Language Models with Reconstructive Graph Instruction Tuning

GTokenLLMs suffer from a text-dominant bias, but RGLM offers a way to fix this by reconstructing graph information directly from the LLM's graph token outputs.

Zhongjian Zhang, Xiao Wang, Mengmei Zhang +2

Architecture Design (Transformers, SSMs, MoE) · Natural Language Processing

Mar 2, 2026·also CAS

Zero-shot Low-Field MRI Enhancement via Diffusion-Based Adaptive Contrast Transport

Achieve state-of-the-art low-field MRI enhancement by explicitly modeling and correcting the intensity distribution shift between low-field and high-field domains using a differentiable Sinkhorn optimal transport module within a diffusion framework.

Muyu Liu, Chenhe Du, Xuanyu Tian +5

Computer Vision · Scientific Discovery & Drug Design

Feb 25, 2026

Anhui University (AHU) · Feb 25, 2026 · also Beihang, CAS, School of Computer Science and Technology

RGB-Event HyperGraph Prompt for Kilometer Marker Recognition based on Pre-trained Foundation Models

Event cameras can significantly boost the robustness of pre-trained OCR models for kilometer marker recognition in challenging metro environments, even under GNSS-denied conditions.

Xiaoyu Xian, Shiao Wang, Xiao Wang +1

Computer Vision · Multimodal Models · Robotics & Embodied AI

Anhui University · Feb 25, 2026 · also Beihang, CAS, NJU

NESTOR: A Nested MOE-based Neural Operator for Large-Scale PDE Pre-Training

A nested Mixture-of-Experts architecture lets neural operators pre-trained on diverse PDEs transfer more effectively to downstream tasks.

Dengdi Sun, Xiaoya Zhou, Xiao Wang +3

Architecture Design (Transformers, SSMs, MoE) · Scientific Discovery & Drug Design · Training Efficiency & Optimization

Feb 15, 2026

Feb 15, 2026·also CAS, HIT

REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents

Unlock SOTA performance in long-horizon search tasks with REDSearcher, a framework that slashes the cost of training by strategically synthesizing complex tasks and boosting core LLM capabilities *before* reinforcement learning.

Zheng Chu, Xiao Wang, Jack Hong +8

Recommendation & Information Retrieval · Tool Use & Agents · Training Efficiency & Optimization

Feb 15, 2026·also CAS

UniWeTok: An Unified Binary Tokenizer with Codebook Size 2^{128} for Unified Multimodal Large Language Model

A single tokenizer, UniWeTok, now handles both high-fidelity image reconstruction and complex semantic understanding for multimodal LLMs, outperforming existing methods with far less training data.

Shaobin Zhuang, Yuang Ai, Weijia Mao +3

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Multimodal Models