Central South University | FitX Technology (Hong Kong) Limited | HIT | Apr 20, 2026 | arXiv:2604.17914

Beyond Binary Contrast: Modeling Continuous Skeleton Action Spaces with Transitional Anchors

Yingjie Feng, Jiaze Wang, Anfeng Liu, Zhuotao Tian

AI Summary

This paper introduces TranCLR, a self-supervised contrastive learning framework for skeleton-based action recognition that addresses the limitations of binary contrastive objectives by explicitly modeling the continuous geometry of the action space. The authors propose Action Transitional Anchor Construction (ATAC) to model transitional states and Multi-Level Geometric Manifold Calibration (MGMC) to adaptively calibrate the action manifold. Experiments on the NTU RGB+D, NTU RGB+D 120, and PKU-MMD datasets show that TranCLR achieves superior accuracy and calibration performance, learning continuous and uncertainty-aware skeleton representations.

Key Contribution

Stop treating human motion as a series of discrete actions: TranCLR learns smoother, more accurate skeleton representations by explicitly modeling the continuous transitions between actions.

Abstract

Self-supervised contrastive learning has emerged as a powerful paradigm for skeleton-based action recognition by enforcing consistency in the embedding space. However, existing methods rely on binary contrastive objectives that overlook the intrinsic continuity of human motion, resulting in fragmented feature clusters and rigid class boundaries. To address these limitations, we propose TranCLR, a Transitional anchor-based Contrastive Learning framework that captures the continuous geometry of the action space. Specifically, the proposed Action Transitional Anchor Construction (ATAC) explicitly models the geometric structure of transitional states to enhance the model's perception of motion continuity. Building upon these anchors, a Multi-Level Geometric Manifold Calibration (MGMC) mechanism is introduced to adaptively calibrate the action manifold across multiple levels of continuity, yielding a smoother and more discriminative representation space. Extensive experiments on the NTU RGB+D, NTU RGB+D 120 and PKU-MMD datasets demonstrate that TranCLR achieves superior accuracy and calibration performance, effectively learning continuous and uncertainty-aware skeleton representations. The code is available at https://github.com/Philchieh/TranCLR.
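To make the contrast between a binary objective and a continuity-aware one concrete, the following is a minimal sketch, not the paper's ATAC or MGMC. It assumes, purely for illustration, that a transitional anchor can be approximated by interpolating two normalized action embeddings and that continuity can be expressed as soft targets in an InfoNCE-style loss. The names `transitional_anchor` and `info_nce`, the temperature value, and the toy embeddings are all hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def info_nce(query, keys, targets, temperature=0.1):
    """Cross-entropy between softmax similarities and a target distribution.

    A one-hot `targets` gives the usual binary (positive vs. negative) objective;
    soft targets give partial credit to partially similar keys (an assumption here).
    """
    q = normalize(query)
    logits = [dot(q, normalize(k)) / temperature for k in keys]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -sum(t * (l - log_z) for t, l in zip(targets, logits))

def transitional_anchor(z_a, z_b, lam):
    """Hypothetical anchor: interpolate two embeddings, then re-project to the unit sphere."""
    za, zb = normalize(z_a), normalize(z_b)
    return normalize([(1 - lam) * x + lam * y for x, y in zip(za, zb)])

# Toy embeddings standing in for three action classes (illustrative values only).
walk, run, sit = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
keys = [walk, run, sit]

anchor = transitional_anchor(walk, run, lam=0.25)        # mostly "walk", partly "run"
hard_loss = info_nce(anchor, keys, [1.0, 0.0, 0.0])      # binary: only "walk" is positive
soft_loss = info_nce(anchor, keys, [0.75, 0.25, 0.0])    # graded: "run" is partly positive
```

Under the binary target, any similarity between the anchor and "run" is treated as an error to be pushed away; under the soft target, the loss is minimized when the similarity distribution matches the mixing ratio, which is one simple way to encode a graded notion of proximity between actions.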

Computer Vision

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
