Dingkang Yang

ByteDance {zhs, ylliu, xbai}@hust.edu.cn, jingquntang@bytedance.com https://github.com/CIawevy/TextPecker

Research focus

Eval Frameworks & Benchmarks (2), Multimodal Models (2), Computer Vision (1), Data Curation & Synthetic Data (1)

Frequent co-authors

Hanshen Zhu (1), Yuliang Liu (1), An-Lan Wang (1), Anlan Wang (1)

Papers (2)

Feb 24, 2026 · also HUST

TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering

Even state-of-the-art text-to-image models like Qwen-Image can be significantly improved in structural fidelity and semantic alignment of rendered text using a novel RL strategy that rewards structural anomaly quantification.

Hanshen Zhu, Yuliang Liu, An-Lan Wang +6

Computer Vision, Eval Frameworks & Benchmarks, Multimodal Models

Sep 17, 2025 · also Southwest Minzu

SAIL-VL2 Technical Report

Open-sourcing SAIL-VL2 gives the multimodal community a new SOTA vision-language model under 4B parameters, driven by innovations in data curation, progressive training, and sparse MoE architectures.

Weijie Yin, Yongjie Ye, Fangxun Shu +116

Data Curation & Synthetic Data, Eval Frameworks & Benchmarks, Multimodal Models
