Search papers, labs, and topics across Lattice.
3
0
5
0
Even state-of-the-art multilingual models struggle to tag parts-of-speech in Tajik when trained on isolated words, highlighting the critical role of syntactic context.
Unlock Tajik NLP: a new open-source toolkit delivers a comprehensive pipeline for processing Cyrillic-script Tajik text, complete with datasets and pre-trained embeddings.
Forget scaling laws: QLoRA-tuned Mistral 7B crushes other architectures for low-resource Tajik text generation, highlighting the importance of architecture choice in PEFT.