Qike Zhang

Papers on Lattice

Total citations

Topics

h-index

Publication activity (papers/week, last 8 weeks)

Research focus

Multimodal Models (5), Computer Vision (4), Red-Teaming & Adversarial Robustness (3), Data Curation & Synthetic Data (1)

Frequent co-authors

Chengyin Hu (5), Jiaju Han (4), Xuemeng Sun (4), Jiahuan Long (4)

Papers (5)

Jul 7, 2026

Cong Su +9 · 2w ago · also China University of Petroleum-Beijing at Karamay, Shenzhen Research Institute of Big Data

AirflowAttack: Thermal-Airflow Adversarial Perturbations against Infrared Remote-Sensing Vision-Language Models

AirflowAttack reveals that adversarial perturbations can not only deceive infrared VLMs but also enhance their false confidence in erroneous classifications.

Cong Su, Jiajun Han, Jiaju Han +7

Multimodal Models, Red-Teaming & Adversarial Robustness

China University of Petroleum-Beijing at Karamay · 2w ago · also Guizhou University, Shenzhen Research Institute of Big Data

MonoIR-RS: Infrared Remote Sensing Vision-Language Learning with CLIP and VLM Adaptation

Infrared-aware adaptation can boost CLIP performance by over 12 points, transforming how models interpret thermal imagery.

Jiaju Han, Jiajun Han, Ma Yaqi +9

Computer Vision, Multimodal Models

Jun 15, 2026

China University of Petroleum-Beijing at Karamay · Jun 15, 2026 · also Shenzhen Research Institute of Big Data, TJU, UESTC

FusionRS: A Large-Scale RGB-Infrared Remote Sensing Dataset for Dual-Modal Vision-Language Foundation Models

Infrared data, often overlooked, can dramatically enhance vision-language models, as shown by FusionRS's ability to improve dual-modal understanding and captioning performance.

Jiaju Han, Ben Zhang, Xuemeng Sun +5

Computer Vision, Data Curation & Synthetic Data, Multimodal Models

Apr 14, 2026

Chengyin Hu +6 · Apr 14, 2026 · also Defense Innovation Institute, Intelligent Game and Decision Laboratory

Challenging Vision-Language Models with Physically Deployable Multimodal Semantic Lighting Attacks

VLMs can be easily fooled in the real world by strategically manipulating lighting, causing them to misinterpret scenes and hallucinate nonsensical captions.

Chengyin Hu, Qike Zhang, Xin Wang +4

Computer Vision, Multimodal Models, Red-Teaming & Adversarial Robustness

Mar 30, 2026

Chengyin Hu +5 · Mar 30, 2026 · also China University of Petroleum-Beijing at Karamay, Shenzhen Research Institute of Big Data

XSPA: Crafting Imperceptible X-Shaped Sparse Adversarial Perturbations for Transferable Attacks on VLMs

VLMs can be devastatingly fooled by modifying less than 2% of image pixels in a fixed, X-shaped pattern, causing them to fail spectacularly across diverse tasks like classification, captioning, and question answering.

Chengyin Hu, Jiaju Han, Xuemeng Sun +3