Mar 30, 2026 · arXiv:2603.28033

Transfer Learning for an Endangered Slavic Variety: Dependency Parsing in Pomak Across Contact-Shaped Dialects

AI Summary

This paper investigates cross-dialect transfer learning for dependency parsing in Pomak, an endangered Slavic language. The authors first quantify how much accuracy a parser trained on Greek-variety Pomak loses when applied zero-shot to the Turkish variety, a drop they attribute to phonological and morphosyntactic differences. They then show that fine-tuning on a newly created, manually annotated 650-sentence Turkish-variety corpus substantially improves parsing accuracy, and that cross-variety transfer learning combining the two dialects improves it further.

Key Contribution

Even a small, targeted dataset (650 manually annotated sentences) can substantially improve dependency parsing accuracy for an under-resourced dialect, and combining it with data from a related variety improves accuracy further.

Abstract

This paper presents new resources and baselines for Dependency Parsing in Pomak, an endangered Eastern South Slavic language with substantial dialectal variation and no widely adopted standard. We focus on the variety spoken in Turkey (Uzunköprü) and ask how well a dependency parser trained on the existing Pomak Universal Dependencies treebank, which was built primarily from the variety that is spoken in Greece, transfers across dialects. We run two experimental phases. First, we train a parser on the Greek-variety UD data and evaluate zero-shot transfer to Turkish-variety Pomak, quantifying the impact of phonological and morphosyntactic differences. Second, we introduce a new manually annotated Turkish-variety Pomak corpus of 650 sentences and show that, despite its small size, targeted fine-tuning substantially improves accuracy; performance is further boosted by cross-variety transfer learning that combines the two dialects.
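The abstract does not say which accuracy metric it reports. Assuming the usual dependency-parsing metrics, unlabeled and labeled attachment score (UAS and LAS), the zero-shot evaluation in the first phase could be scored roughly as in the sketch below. The function names `read_conllu` and `attachment_scores` are hypothetical, and the toy sentence uses placeholder tokens (`w1` to `w4`), not real Pomak data or results from the paper.

```python
from typing import List, Tuple

Token = Tuple[int, str]  # (head index, dependency relation)


def read_conllu(text: str) -> List[List[Token]]:
    """Parse CoNLL-U text into sentences of (head, deprel) pairs."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:  # a blank line ends the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):  # sentence-level comment
            continue
        cols = line.split("\t")
        # Skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        current.append((int(cols[6]), cols[7]))  # HEAD and DEPREL columns
    if current:
        sentences.append(current)
    return sentences


def attachment_scores(gold: List[List[Token]],
                      pred: List[List[Token]]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions over all tokens."""
    total = uas_hits = las_hits = 0
    for g_sent, p_sent in zip(gold, pred):
        if len(g_sent) != len(p_sent):
            raise ValueError("gold and predicted tokenization differ")
        for (g_head, g_rel), (p_head, p_rel) in zip(g_sent, p_sent):
            total += 1
            if g_head == p_head:
                uas_hits += 1  # correct head
                if g_rel == p_rel:
                    las_hits += 1  # correct head and correct label
    if total == 0:
        return 0.0, 0.0
    return uas_hits / total, las_hits / total


def row(idx: int, head: int, rel: str) -> str:
    """Build one 10-column CoNLL-U line with a placeholder form."""
    return "\t".join([str(idx), f"w{idx}", "_", "_", "_", "_",
                      str(head), rel, "_", "_"])


# Toy data (made up for illustration): one four-token sentence.
GOLD = "\n".join([row(1, 2, "nsubj"), row(2, 0, "root"),
                  row(3, 2, "obj"), row(4, 2, "punct")])
# The "parser output" gets one label wrong (token 3) and one head wrong (token 4).
PRED = "\n".join([row(1, 2, "nsubj"), row(2, 0, "root"),
                  row(3, 2, "obl"), row(4, 3, "punct")])

uas, las = attachment_scores(read_conllu(GOLD), read_conllu(PRED))
print(f"UAS={uas:.2f} LAS={las:.2f}")  # UAS=0.75 LAS=0.50
```

In a setup like the one described, the same scoring would be run twice on the Turkish-variety test sentences: once with the parser trained only on Greek-variety data, and once after fine-tuning, so that the two scores can be compared directly.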

Data Curation & Synthetic Data · Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
