The paper introduces SyriSign, a new dataset of 1500 video samples covering 150 unique lexical signs for Syrian Arabic Sign Language (SyArSL), addressing the lack of resources for this low-resource language. The dataset is designed to facilitate text-to-SyArSL translation, aiming to improve communication accessibility for the Syrian DHH community. Experiments using MotionCLIP, T2M-GPT, and SignCLIP show the potential of generative approaches for sign representation but also highlight the limitations imposed by the dataset's size.
The first publicly available dataset for Syrian Arabic Sign Language (SyArSL) opens the door for machine translation research to improve accessibility for a historically underserved community.
Sign language is the primary means of communication for the Deaf and Hard-of-Hearing (DHH) community. While numerous benchmarks exist for high-resource sign languages, low-resource languages like Arabic remain underrepresented. Currently, there is no publicly available dataset for Syrian Arabic Sign Language (SyArSL). To address this gap, we introduce SyriSign, a dataset comprising 1500 video samples across 150 unique lexical signs, designed for text-to-SyArSL translation tasks. This work aims to reduce communication barriers in Syria, where most news is delivered in spoken or written Arabic, which is often inaccessible to the deaf community. We evaluated SyriSign using three deep learning architectures: MotionCLIP for semantic motion generation, T2M-GPT for text-conditioned motion synthesis, and SignCLIP for bilingual embedding alignment. Experimental results indicate that while generative approaches show strong potential for sign representation, the limited dataset size constrains generalization performance. We will release SyriSign publicly, hoping it serves as an initial benchmark.
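The "bilingual embedding alignment" that SignCLIP performs follows the CLIP recipe: paired text and sign-video embeddings are trained contrastively so that matched pairs score higher than mismatched ones. A minimal sketch of that symmetric InfoNCE objective is shown below; the function name, toy data, and temperature value are illustrative assumptions, not SignCLIP's actual implementation.

```python
import numpy as np

def clip_style_loss(text_emb, sign_emb, temperature=0.07):
    """Symmetric InfoNCE loss aligning paired text and sign-video embeddings.

    Matched pairs lie on the diagonal of the similarity matrix; the loss
    pulls them together and pushes mismatched pairs apart, as in CLIP.
    """
    # L2-normalize so dot products become cosine similarities
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    s = sign_emb / np.linalg.norm(sign_emb, axis=1, keepdims=True)
    logits = t @ s.T / temperature  # (batch, batch) similarity matrix

    def xent_diag(l):
        # cross-entropy with the correct pair (diagonal) as the target
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # average the text-to-sign and sign-to-text directions
    return 0.5 * (xent_diag(logits) + xent_diag(logits.T))

# Toy batch: 4 paired 8-dimensional embeddings (random, for illustration)
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
loss_mismatched = clip_style_loss(text, rng.normal(size=(4, 8)))
loss_matched = clip_style_loss(text, text)  # perfectly aligned pairs
```

In a retrieval evaluation, the same similarity matrix is reused at test time: for each text query, the sign video with the highest cosine similarity is ranked first.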