Adelaide University, PKU | Apr 20, 2026 | arXiv:2604.18106

Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion

Chen Zhang, Jiuheng Lin, Zhiyuan Liao, Yansong Feng

AI Summary

This paper introduces TriMix, a test-time logit fusion framework for adapting large language models to low-resource languages. It dynamically combines three sources: low-resource language competence from a continually pretrained small model, task competence from high-resource language instruction tuning, and the scaling benefits of large models. The approach addresses a limitation of Proxy Tuning, which often fails in low-resource settings because the large model's weak competence in the target language can overwhelm the knowledge of the specialized smaller model. Experiments across four model families and eight low-resource languages show that TriMix consistently outperforms single-model baselines and Proxy Tuning.

Key Contribution

TriMix shows that prioritizing the logits of a small model specialized in the low-resource language is crucial for successful adaptation, which challenges the prevalent assumption that the large model should dominate the fusion.

Abstract

Adapting large language models (LLMs) to low-resource languages (LRLs) is constrained by the scarcity of task data and computational resources. Although Proxy Tuning offers a logit-level strategy for introducing scaling effects, it often fails in LRL settings because the large model's weak LRL competence might overwhelm the knowledge of specialized smaller models. We thus propose TriMix, a test-time logit fusion framework that dynamically balances capabilities from three different sources: LRL competence from a continually pretrained small model, task competence from high-resource language instruction tuning, and the scaling benefits of large models. It is data- and compute-efficient, requiring no LRL task annotations, and only continual pretraining on a small model. Experiments across four model families and eight LRLs show that TriMix consistently outperforms single-model baselines and Proxy Tuning. Our analysis reveals that prioritizing the small LRL-specialized model's logits is crucial for success, challenging the prevalent large-model-dominant assumption.
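To make the logit-level idea concrete, here is a minimal sketch in plain Python. The `proxy_tuning_logits` function follows the usual description of Proxy Tuning, in which a large base model's logits are shifted by the difference between a small tuned model and its untuned counterpart. The `fuse_three_sources` function is only an illustrative assumption: the abstract does not give TriMix's actual fusion rule or how its weights are set dynamically, so the function name, the `alpha` and `beta` weights, and the toy numbers are all hypothetical. It simply anchors on the small LRL model's logits, reflecting the abstract's point that those logits should be prioritized.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proxy_tuning_logits(large_base, small_tuned, small_base):
    """Proxy Tuning: shift large-model logits by the small tuned-minus-base offset."""
    return [l + (t - b) for l, t, b in zip(large_base, small_tuned, small_base)]


def fuse_three_sources(small_lrl, task_delta, scale_delta, alpha, beta):
    """HYPOTHETICAL three-source fusion, not the paper's formula.

    small_lrl:   logits of the small continually pretrained (LRL) model
    task_delta:  assumed logit offset carrying instruction-tuning (task) competence
    scale_delta: assumed logit offset carrying the large model's scaling benefit
    alpha, beta: assumed per-step weights; how TriMix chooses them is not shown here
    """
    return [
        s + alpha * t + beta * c
        for s, t, c in zip(small_lrl, task_delta, scale_delta)
    ]


# Toy next-token logits over a 3-token vocabulary (made-up numbers).
small_lrl = [2.0, 0.5, 0.0]
task_delta = [0.0, 1.0, -1.0]
scale_delta = [-1.0, 0.5, 0.5]

fused = fuse_three_sources(small_lrl, task_delta, scale_delta, alpha=0.5, beta=0.5)
probs = softmax(fused)
```

With `alpha` and `beta` below 1, the small LRL model's preferences remain the anchor and the other two sources act as corrections. In Proxy Tuning, by contrast, the large base model's logits are the anchor, which is the setting the abstract identifies as problematic when the large model is weak in the target language.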

Natural Language Processing | Scaling Laws & Emergent Abilities | Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
