Xin Wang

Predicting drug synergy for novel compounds just got a whole lot better with a new GraphLLM that bridges the gap between molecular structure and semantic understanding.

Xin Wang, Linxin Xiao, Yang Yao +1

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Scientific Discovery & Drug Design

May 25, 2026

May 25, 2026·also ASU, UCF, UNC, Vienna

Detecting Unfaithful Chain-of-Thought via Circuit-Guided Internal-External Discrepancy

Spotting unfaithful reasoning in LLMs just got easier: a new method efficiently compares a model's internal computations against its stated rationale.

Zhen Tan, Song Wang, Pingjun Hong +2

Interpretability & Mechanistic Interp Reasoning & Chain-of-Thought

May 25, 2026·also Fuzhou University Affiliated Provincial, Macao Polytechnic University, Netherlands Cancer Institute

SAFE-Diff: Scale-Aware Attention and Feature-Dispersive Diffusion with Uncertainty Estimation for Contrast-Enhanced Breast MRI Synthesis

Synthesized breast MRIs can now better mimic real-world lesion complexity thanks to a diffusion model that explicitly handles multi-scale features and heterogeneous enhancement.

Tianyu Zhang, Xinglong Liang, Jarek van Dijk +13

Computer Vision Data Curation & Synthetic Data Scientific Discovery & Drug Design

May 25, 2026·also TikTok

On-Policy Adversarial Flow Distillation for Autoregressive Video Generation

Distilling black-box video generators into autoregressive models doesn't require teacher scores or complex alignment—just cleverly paired rollouts and a discriminator.

Shengju Qian, Zirui Zhu, Xin Wang +1

Architecture Design (Transformers, SSMs, MoE) Computer Vision Inference & Quantization

May 22, 2026

May 22, 2026·also Beihang, ByteDance, Case Western, CUHK +4

SkillEvolBench: Benchmarking the Evolution from Episodic Experience to Procedural Skills

LLM agents struggle to generalize from experience to reusable skills, often performing worse than simply replaying past trajectories, revealing a critical gap in current abstraction methods.

Yingtie Lei, Zhongwei Wan, Jiankun Zhang +10

Eval Frameworks & Benchmarks Robotics & Embodied AI Tool Use & Agents

Apr 22, 2026

Paul Dobre +2·Apr 22, 2026

FurnSet: Exploiting Repeats for 3D Scene Reconstruction

Reconstructing 3D scenes from a single view gets a boost by explicitly recognizing and leveraging repeated object instances, like chairs and tables, to inform and refine the reconstruction.

Paul Dobre, Xin Wang, Hongzhou Yang

Architecture Design (Transformers, SSMs, MoE) Computer Vision

He Yang Yuan +5·Apr 22, 2026·also SJTU

Towards Secure Logging: Characterizing and Benchmarking Logging Code Security Issues with LLMs

LLMs are surprisingly bad at fixing real-world logging security vulnerabilities, despite being moderately effective at detecting them.

He Yang Yuan, Xin Wang, Kundi Yao +3

Code Generation & Program Synthesis Eval Frameworks & Benchmarks Red-Teaming & Adversarial Robustness

Apr 16, 2026

Zhiyuan Zhai +5·Apr 16, 2026·also CUHK

Does RL Expand the Capability Boundary of LLM Agents? A PASS@(k,T) Analysis

RL unlocks genuinely new tool-use capabilities in LLMs by enabling compositional strategies that surpass what's achievable through mere re-sampling, challenging the notion that RL only improves reliability.

Zhiyuan Zhai, Zhiyuan Zhai, Wenjing Yan +3

Eval Frameworks & Benchmarks RLHF & Preference Learning Tool Use & Agents

Zhiyuan Zhai +3·Apr 16, 2026

Adaptive Test-Time Compute Allocation for Reasoning LLMs via Constrained Policy Optimization

Stop wasting compute: a learned policy can intelligently allocate LLM inference budgets, boosting accuracy by up to 12.8% compared to uniform allocation.

Zhiyuan Zhai, Bingcong Li, Bingnan Xiao +1

Eval Frameworks & Benchmarks Inference & Quantization Reasoning & Chain-of-Thought

Apr 14, 2026

Chengyin Hu +6·Apr 14, 2026·also Defense Innovation Institute, Intelligent Game and Decision Laboratory

Challenging Vision-Language Models with Physically Deployable Multimodal Semantic Lighting Attacks

VLMs can be easily fooled in the real world by strategically manipulating lighting, causing them to misinterpret scenes and hallucinate nonsensical captions.

Chengyin Hu, Qike Zhang, Xin Wang +4

Computer Vision Multimodal Models Red-Teaming & Adversarial Robustness

Apr 13, 2026

Apr 13, 2026·also Fuzhou University Affiliated Provincial, Macao Polytechnic University, Netherlands Cancer Institute

LoGo-MR: Screening Breast MRI for Cancer Risk Prediction by Efficient Omni-Slice Modeling

Breast cancer risk prediction from MRI just got a whole lot faster and more interpretable, thanks to a novel 2.5D approach that beats both 2D and 3D models.

Xin Wang, Yuan Gao, George Yiasemis +12

Computer Vision Scientific Discovery & Drug Design

Apr 9, 2026

Yassine El Kheir +17·Apr 9, 2026

DeepFense: A Unified, Modular, and Extensible Framework for Robust Deepfake Audio Detection

The best deepfake audio detectors are surprisingly biased by audio quality, speaker gender, and language, undermining their real-world reliability.

Yassine El Kheir, Y. E. Kheir, Arnab Das +15

Eval Frameworks & Benchmarks Open-Source Models & Weights Speech & Audio

Mar 19, 2026

Context Bootstrapped Reinforcement Learning

Injecting demonstrations with a carefully annealed probability can drastically improve exploration in RLVR, even for tasks requiring novel reasoning or domain-specific knowledge.

Saaket Agashe, Jayanth Srinivasa, Gaowen Liu +4

Reasoning & Chain-of-Thought RLHF & Preference Learning Tool Use & Agents +1

Mar 19, 2026

SQL-Commenter: Aligning Large Language Models for SQL Comment Generation with Direct Preference Optimization

Forget struggling with cryptic SQL: a new LLM fine-tuned with human preferences generates comments so good, they beat Qwen3-14B by up to 13% on standard metrics.

Lei Yu, Jingyuan Zhang, Xin Wang +4

Code Generation & Program Synthesis Natural Language Processing RLHF & Preference Learning


Papers (16)