May 26, 2026 | arXiv:2605.27080

Semi-Supervised Gaze Estimation via Disentangled Subspace Contrastive Learning

AI Summary

This paper introduces Disentangled Subspace Contrastive Learning (DSCL), a semi-supervised approach for appearance-based gaze estimation that leverages unlabeled data to improve domain generalization. DSCL uses Jacobian regularization to disentangle feature representations into gaze component-specific subspaces (pitch and yaw) and then applies contrastive learning within each subspace using ordinal ranking. Experiments on multiple benchmarks demonstrate that DSCL achieves competitive performance with significantly reduced labeled data (20%, 10%, and even 5%) in both in-domain and cross-domain settings.

Key Contribution

Gaze estimation models can now achieve comparable accuracy with 80-95% less labeled data, thanks to a semi-supervised approach that disentangles gaze components and learns robust representations via contrastive learning.

Abstract

Appearance-based gaze estimation often suffers from poor generalization due to limited annotated samples and insufficient dataset diversity. Leading approaches adopt weakly supervised learning to generate large-scale pseudo-labeled data from unconstrained real-world scenarios, aiming to mitigate domain shifts. In this work, we devise a simple yet effective semi-supervised learning architecture that leverages unlabeled data to enhance domain generalization, thereby reducing reliance on labor-intensive manual annotations. Our key insight is to impose Jacobian regularization to disentangle feature representations into discriminative subspaces dedicated to specific gaze components, such as pitch and yaw angles. We further exploit the intrinsic ordinal ranking within each subspace for contrastive learning, enabling the model to learn robust gaze representations from a small set of labeled samples and an abundance of unlabeled ones. This ultimately yields our Disentangled Subspace Contrastive Learning (DSCL) framework. Extensive experiments on multiple benchmarks verify that the proposed DSCL is plug-and-play, achieving competitive performance using only 20%, 10%, and even 5% of the annotated data under both in-domain and cross-domain evaluation settings. The public code is available at https://github.com/da60266/DSCL.
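The two ideas named in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation (their code is at the repository linked above). It assumes a feature vector split into a "pitch" half and a "yaw" half, uses a finite-difference Jacobian in place of automatic differentiation, and uses a simple triplet hinge as a stand-in for the ordinal ranking objective. All function names, the two toy heads, and the margin value are hypothetical.

```python
# Illustrative sketch only; NOT the DSCL authors' implementation.
# Assumption: features[:split] is the "pitch" subspace, features[split:] the
# "yaw" subspace, and a head maps features to [pitch, yaw] predictions.

def numerical_jacobian(head, features, eps=1e-5):
    """Central-difference Jacobian: rows are outputs, columns are features."""
    n_out = len(head(features))
    jac = [[0.0] * len(features) for _ in range(n_out)]
    for k in range(len(features)):
        plus = list(features)
        plus[k] += eps
        minus = list(features)
        minus[k] -= eps
        out_plus, out_minus = head(plus), head(minus)
        for o in range(n_out):
            jac[o][k] = (out_plus[o] - out_minus[o]) / (2 * eps)
    return jac


def cross_subspace_penalty(head, features, split):
    """Sum of squared cross terms: d(pitch)/d(yaw feats) + d(yaw)/d(pitch feats)."""
    jac = numerical_jacobian(head, features)
    pitch_row, yaw_row = jac[0], jac[1]
    leak_into_pitch = sum(v * v for v in pitch_row[split:])
    leak_into_yaw = sum(v * v for v in yaw_row[:split])
    return leak_into_pitch + leak_into_yaw


def ordinal_ranking_loss(anchor, closer, farther, margin=0.1):
    """Hinge loss: the sample ranked nearer in angle should be nearer in the subspace."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return max(0.0, dist(anchor, closer) - dist(anchor, farther) + margin)


# Hypothetical toy heads over a 4-D feature vector (2 pitch dims, 2 yaw dims).
def disentangled_head(z):
    return [2.0 * z[0] + z[1], 0.5 * z[2] - z[3]]


def entangled_head(z):
    # Pitch reads a yaw feature (z[3]) and yaw reads a pitch feature (z[0]).
    return [2.0 * z[0] + z[3], 0.5 * z[2] - z[0]]


z = [0.3, -0.1, 0.7, 0.2]
clean = cross_subspace_penalty(disentangled_head, z, split=2)
leaky = cross_subspace_penalty(entangled_head, z, split=2)
```

In this toy setup the penalty vanishes for the head whose outputs each read only their own subspace and is positive for the head that mixes them, which is the behavior a disentangling regularizer would reward. How the ranking order is obtained for unlabeled samples is not stated in the abstract, so the sketch simply takes the ordering as given.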

Computer Vision | Data Curation & Synthetic Data | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
