DAMO, SYSU | Apr 21, 2026 | arXiv:2604.19218

Thinking Before Matching: A Reinforcement Reasoning Paradigm Towards General Person Re-Identification

Quan Zhang, Jingze Wu, Xiaohua Xie, Jianhuang Lai, Hongbo Chen

AI Summary

This paper introduces ReID-R, a reasoning-driven paradigm for person re-identification (ReID) that incorporates chain-of-thought (CoT) to achieve explicit identity understanding. The approach has two stages: first, a discriminative reasoning warm-up that trains the model in a CoT label-free manner, and second, efficient reinforcement learning that uses non-trivial sampling to construct scene-generalizable data. Experiments show that ReID-R achieves competitive identity discrimination using only 14.3K non-trivial samples (20.9% of the existing data scale), while also providing high-quality interpretations of its results.

Key Contribution

Achieves competitive person re-identification with only 20.9% of the existing data scale by explicitly teaching the model to "think" before matching identities.

Abstract

Learning identity-discriminative representations that generalize across multiple scenes has become a critical objective in person re-identification (ReID). However, mainstream perception-driven paradigms tend to fit identities from massive annotated data rather than understand identity-causal cues, which yields representations that are fragile under multiple disruptions. In this work, ReID-R is proposed as a novel reasoning-driven paradigm that achieves explicit identity understanding and reasoning by incorporating chain-of-thought (CoT) into the ReID pipeline. Specifically, ReID-R consists of two stages: (i) discriminative reasoning warm-up, where a model is trained in a CoT label-free manner to acquire identity-aware feature understanding; and (ii) efficient reinforcement learning, which proposes a non-trivial sampling strategy to construct scene-generalizable data. On this basis, ReID-R leverages high-quality reward signals to guide the model toward focusing on ID-related cues, achieving accurate reasoning and correct responses. Extensive experiments on multiple ReID benchmarks demonstrate that ReID-R achieves identity discrimination competitive with superior methods using only 14.3K non-trivial data samples (20.9% of the existing data scale). Furthermore, benefiting from its inherent reasoning, ReID-R can provide high-quality interpretations of its results.
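The data-scale claim can be unpacked with a little arithmetic. The sketch below takes the two figures quoted in the abstract (14.3K samples and 20.9%) and derives the implied size of the existing data scale. The derived values are back-of-the-envelope estimates from rounded numbers, not figures reported in the abstract, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
reid_r_samples = 14_300        # "14.3K non-trivial data"
fraction_of_existing = 0.209   # "20.9% of the existing data scale"

# Derived estimates (assumption: both quoted figures are rounded,
# so these values are approximate and not stated in the source).
implied_existing_samples = reid_r_samples / fraction_of_existing
implied_reduction = 1 - fraction_of_existing

print(f"implied existing scale: ~{implied_existing_samples / 1000:.1f}K samples")
print(f"implied reduction in data: {implied_reduction:.1%}")
```

Under these rounded inputs, the existing data scale works out to roughly 68K samples, so the reported 14.3K corresponds to using about one fifth of that data.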

Computer Vision | Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 52

Year: 2026

Venue: N/A
