Apr 22, 2026 | arXiv:2604.20357

SignDATA: Data Pipeline for Sign Language Translation

AI Summary

SignDATA is introduced as a config-driven preprocessing toolkit to standardize heterogeneous sign language corpora, addressing the challenges of inconsistent annotation schemas and fragmented preprocessing pipelines. It offers two end-to-end recipes: one for pose-based processing using MediaPipe or MMPose backends, and another for video-based processing with signer-cropped video packaging. The toolkit's validation includes backend comparisons, preprocessing ablations, and privacy-aware video generation, making extractor choice, normalization policy, and privacy tradeoffs explicit and empirically comparable.

Key Contribution

Standardizing sign language data preprocessing with SignDATA enables reproducible research and explicit control over extractor choice, normalization, and privacy.

Abstract

Sign-language datasets are difficult to preprocess consistently because they vary in annotation schema, clip timing, signer framing, and privacy constraints. Existing work usually reports downstream models, while the preprocessing pipeline that converts raw video into training-ready pose or video artifacts remains fragmented, backend-specific, and weakly documented. We present SignDATA, a config-driven preprocessing toolkit that standardizes heterogeneous sign-language corpora into comparable outputs for learning. The system supports two end-to-end recipes: a pose recipe that performs acquisition, manifesting, person localization, clipping, cropping, landmark extraction, normalization, and WebDataset export, and a video recipe that replaces pose extraction with signer-cropped video packaging. SignDATA exposes interchangeable MediaPipe and MMPose backends behind a common interface, typed job schemas, experiment-level overrides, and per-stage checkpointing with config- and manifest-aware hashes. We validate the toolkit through a research-oriented evaluation design centered on backend comparison, preprocessing ablations, and privacy-aware video generation on datasets. Our contribution is a reproducible preprocessing layer for sign-language research that makes extractor choice, normalization policy, and privacy tradeoffs explicit, configurable, and empirically comparable. Code is available at https://github.com/balaboom123/signdata-slt.
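The abstract mentions per-stage checkpointing with config- and manifest-aware hashes but does not say how the hashes are computed. The following is a minimal sketch of one way such a scheme could work, not SignDATA's actual implementation: the function names (`stage_key`, `plan`), the config keys, and the manifest fields are all hypothetical. The stage names follow the pose recipe listed above. Each stage's key is derived from its own config, the manifest, and the key of the stage before it, so changing one stage's settings invalidates only that stage and those downstream.

```python
import hashlib
import json

# Stage order taken from the pose recipe in the abstract; the identifiers
# themselves are illustrative, not SignDATA's real stage names.
POSE_STAGES = [
    "acquisition",
    "manifest",
    "person_localization",
    "clipping",
    "cropping",
    "landmark_extraction",
    "normalization",
    "webdataset_export",
]


def stage_key(stage, config, manifest_rows, upstream_key=""):
    """Hash a stage's name, config, manifest, and upstream key (hypothetical scheme)."""
    payload = json.dumps(
        {
            "stage": stage,
            "config": config,
            "manifest": manifest_rows,
            "upstream": upstream_key,
        },
        sort_keys=True,  # stable key order so equal configs hash equally
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def plan(stages, configs, manifest_rows, done_keys):
    """Return (stage, key, needs_run) for each stage, chaining keys in order."""
    upstream = ""
    result = []
    for stage in stages:
        key = stage_key(stage, configs.get(stage, {}), manifest_rows, upstream)
        result.append((stage, key, key not in done_keys))
        upstream = key
    return result


# Hypothetical config and manifest, for illustration only.
configs = {
    "landmark_extraction": {"backend": "mediapipe"},
    "normalization": {"center": "shoulders"},
}
manifest_rows = [{"clip_id": "a", "start": 0.0, "end": 1.5}]

first_run = plan(POSE_STAGES, configs, manifest_rows, set())
done_keys = {key for _, key, _ in first_run}

# Changing only the normalization policy leaves earlier checkpoints valid.
changed = dict(configs, normalization={"center": "hips"})
to_rerun = [s for s, _, needs_run in plan(POSE_STAGES, changed, manifest_rows, done_keys) if needs_run]
```

Under these assumptions, `to_rerun` contains only the normalization and export stages, so a normalization ablation can reuse previously extracted landmarks without recomputing them.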

Computer Vision | Data Curation & Synthetic Data | Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
