BioGait-VLM, a novel tri-modal Vision-Language-Biomechanics framework, is introduced to improve the generalization and interpretability of video-based clinical gait analysis by incorporating temporal evidence distillation and biomechanical tokenization. The biomechanical tokenization branch projects 3D skeleton sequences into language-aligned semantic tokens, enabling the model to reason about joint mechanics independently of visual shortcuts. Evaluated on a unified 8-class gait taxonomy (including a new Degenerative Cervical Myelopathy cohort), BioGait-VLM achieves state-of-the-art recognition accuracy and improved clinical plausibility, as confirmed by a blinded expert study.
By explicitly modeling joint mechanics with language-aligned tokens, BioGait-VLM reduces the reliance of gait analysis models on visual shortcuts, improving both generalization and interpretability.
Video-based clinical gait analysis often generalizes poorly because models overfit to environmental biases instead of capturing pathological motion. To address this, we propose BioGait-VLM, a tri-modal Vision-Language-Biomechanics framework for interpretable clinical gait assessment. Unlike standard video encoders, our architecture incorporates a Temporal Evidence Distillation branch to capture rhythmic dynamics and a Biomechanical Tokenization branch that projects 3D skeleton sequences into language-aligned semantic tokens. This enables the model to reason explicitly about joint mechanics, independent of visual shortcuts. To ensure rigorous benchmarking, we augment the public GAVD dataset with a high-fidelity Degenerative Cervical Myelopathy (DCM) cohort to form a unified 8-class taxonomy, and we establish a strict subject-disjoint protocol to prevent data leakage. Under this setting, BioGait-VLM achieves state-of-the-art recognition accuracy. Furthermore, a blinded expert study confirms that biomechanical tokens significantly improve clinical plausibility and evidence grounding, offering a path toward transparent, privacy-enhanced gait assessment.
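The subject-disjoint protocol mentioned above can be illustrated with a minimal sketch. The abstract does not describe how the split is actually constructed, so everything here is an assumption made for illustration: the record layout `(subject_id, clip_id, label)`, the function name `subject_disjoint_split`, and the `test_fraction` and `seed` parameters are all hypothetical. The idea is only that subjects, not clips, are assigned to a partition, so no person's gait appears on both sides.

```python
import random
from collections import defaultdict


def subject_disjoint_split(samples, test_fraction=0.2, seed=0):
    """Split records so that no subject appears in both train and test.

    `samples` is a list of (subject_id, clip_id, label) tuples.
    This layout is hypothetical, not taken from the paper.
    """
    # Group every clip under the subject it was recorded from.
    by_subject = defaultdict(list)
    for sample in samples:
        by_subject[sample[0]].append(sample)

    # Shuffle subjects (not clips) with a fixed seed for reproducibility.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)

    # Hold out a fraction of subjects, at least one.
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])

    train = [s for sid in subjects if sid not in test_subjects
             for s in by_subject[sid]]
    test = [s for sid in subjects if sid in test_subjects
            for s in by_subject[sid]]
    return train, test


if __name__ == "__main__":
    # Toy data: 10 subjects, 3 clips each, labels cycling over 8 classes.
    data = [(f"subj{i:02d}", f"clip{j}", i % 8)
            for i in range(10) for j in range(3)]
    train, test = subject_disjoint_split(data)
    overlap = {s[0] for s in train} & {s[0] for s in test}
    print(len(train), len(test), overlap)
```

This sketch does not stratify by class, so a rare class could end up entirely in one partition; a real benchmark with an 8-class taxonomy would likely need a grouped, stratified split instead.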