This paper investigates the impact of domain shift on skeleton-based action recognition when transitioning from controlled 3D skeleton data to unconstrained 2D pose estimation in real-world gym environments. The authors introduce a new Gym2D dataset and use it alongside UCF101 to evaluate a Skeleton Transformer, finding a significant drop in accuracy under zero-shot transfer. The study reveals that standard uncertainty estimation methods fail to flag the model's confidently wrong predictions on out-of-distribution samples, but a finetuned gating mechanism can improve calibration and enable safer selective classification.
Even with high AUROC scores for OOD detection, skeleton-based action recognition models can remain confidently incorrect when faced with domain shift, highlighting the limitations of standard uncertainty measures for safe deployment.
The practical deployment gap -- transitioning from controlled multi-view 3D skeleton capture to unconstrained monocular 2D pose estimation -- introduces a compound domain shift whose safety implications remain critically underexplored. We present a systematic study of this severe domain shift using a novel Gym2D dataset (style/viewpoint shift) and the UCF101 dataset (semantic shift). Our Skeleton Transformer achieves 63.2% cross-subject accuracy on NTU-120 but drops to 1.6% under zero-shot transfer to the Gym domain and 1.16% on UCF101. Critically, we demonstrate that high Out-Of-Distribution (OOD) detection AUROC does not guarantee safe selective classification. Standard uncertainty methods fail to detect this performance drop: the model remains confidently incorrect with 99.6% risk even at 50% coverage across both OOD datasets. While energy-based scoring (AUROC >= 0.91) and Mahalanobis distance provide reliable distributional detection signals, such high AUROC scores coexist with poor risk-coverage behavior when making decisions. A lightweight finetuned gating mechanism restores calibration and enables graceful abstention, substantially reducing the rate of confident wrong predictions. Our work challenges standard deployment assumptions, providing a principled safety analysis of both semantic and geometric skeleton recognition deployment.
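To make the distinction between OOD detection quality and selective-classification risk concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common convention that the energy score is the temperature-scaled log-sum-exp of the logits (higher meaning more in-distribution-like), and that risk at a given coverage is the error rate on the most confident fraction of samples. All function names and the toy numbers are hypothetical and chosen only for illustration.

```python
import math


def energy_score(logits, temperature=1.0):
    """T * logsumexp(logits / T), computed stably.

    Assumed convention: a higher value suggests an in-distribution input.
    """
    m = max(logits)
    return m + temperature * math.log(
        sum(math.exp((z - m) / temperature) for z in logits)
    )


def auroc(pos_scores, neg_scores):
    """Probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def risk_at_coverage(confidences, correct, coverage):
    """Error rate on the `coverage` fraction of samples with highest confidence."""
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    n_keep = max(1, math.ceil(coverage * len(order)))
    kept = order[:n_keep]
    errors = sum(1 for i in kept if not correct[i])
    return errors / n_keep


# Toy data (hypothetical): energy separates the two domains perfectly...
in_dist_energy = [5.0, 4.8, 5.2, 4.9]
ood_energy = [2.0, 2.5, 1.8, 2.2]
detection_auroc = auroc(in_dist_energy, ood_energy)

# ...yet on the shifted domain the most confident predictions are all wrong.
ood_confidence = [0.99, 0.98, 0.97, 0.60]
ood_correct = [False, False, False, True]
ood_risk_at_half = risk_at_coverage(ood_confidence, ood_correct, coverage=0.5)

print(f"detection AUROC: {detection_auroc:.2f}")
print(f"risk at 50% coverage on shifted data: {ood_risk_at_half:.2f}")
```

The two metrics answer different questions: AUROC measures whether a score separates in-distribution inputs from shifted ones, while risk at a fixed coverage measures whether the classifier's own confidence ranks its correct predictions above its incorrect ones. In this toy setup the first is perfect and the second is as bad as it can be, which is the kind of coexistence the abstract describes.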