This paper investigates the effectiveness of parameter-efficient fine-tuning (PEFT) techniques, specifically LoRA and QLoRA, for adapting small language models (SLMs) with fewer than 3 billion parameters to code generation tasks. The study fine-tunes Gemma, Qwen2.5, and LLaMA 3 models on the CodeAlpaca-20k dataset. Results demonstrate that PEFT significantly improves SLM performance, surpassing larger baseline models such as Phi-3 Mini 4K base on ROUGE-L, with LLaMA 3 3B and Qwen2.5 3B showing gains of 54% and 55%, respectively.
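For context, ROUGE-L scores a generated output against a reference by their longest common subsequence. Below is a minimal sketch of how such a score is computed with the rouge-score Python package; the reference and candidate strings are invented for illustration and are not drawn from the paper's evaluation.

```python
# Sketch of ROUGE-L scoring with the rouge-score package.
# The reference/candidate snippets here are hypothetical examples.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

reference = "def add(a, b):\n    return a + b"
candidate = "def add(a, b):\n    result = a + b\n    return result"

# ROUGE-L is based on the longest common subsequence of the two texts.
score = scorer.score(reference, candidate)["rougeL"]
print(f"precision={score.precision:.3f} recall={score.recall:.3f} f1={score.fmeasure:.3f}")
```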
Forget huge models: parameter-efficient fine-tuning turns tiny language models into code-generating powerhouses that outperform larger, untuned counterparts.
Large language models (LLMs) have demonstrated impressive capabilities in code generation; however, their high computational demands, privacy limitations, and challenges in edge deployment restrict their practical use in domain-specific applications. This study explores the effectiveness of parameter-efficient fine-tuning for small language models (SLMs) with fewer than 3 billion parameters. We adopt a hybrid approach that combines low-rank adaptation (LoRA) with 4-bit quantization (QLoRA) to reduce fine-tuning costs while preserving semantic consistency. Experiments on the CodeAlpaca-20k dataset reveal that SLMs fine-tuned with this method outperform larger baseline models, including Phi-3 Mini 4K base, on ROUGE-L. Notably, applying our approach to the LLaMA 3 3B and Qwen2.5 3B models yielded performance improvements of 54% and 55%, respectively, over untuned counterparts. We evaluate models from three major artificial intelligence (AI) providers, Google (Gemma 2B), Meta (LLaMA 3 1B/3B), and Alibaba (Qwen2.5 1.5B/3B), and show that parameter-efficient fine-tuning enables them to serve as cost-effective, high-performing alternatives to larger LLMs. These findings highlight the potential of SLMs as scalable solutions for domain-specific software engineering tasks, supporting broader adoption and democratization of neural code synthesis.
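To make the recipe concrete, the following is a minimal sketch of LoRA fine-tuning over a 4-bit quantized base model, the QLoRA setup the abstract describes, using the Hugging Face transformers, peft, and bitsandbytes libraries. The checkpoint ID, LoRA rank, and target modules are illustrative assumptions; the paper's exact hyperparameters are not stated here.

```python
# Sketch of the LoRA + 4-bit quantization (QLoRA) setup.
# Checkpoint ID, rank, and target modules are assumptions, not the paper's values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-3.2-3B"  # assumed 3B-class checkpoint

# 4-bit NF4 quantization of the frozen base weights: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters on the attention projections; only these are trained.
lora_config = LoraConfig(
    r=16,                # assumed LoRA rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights

# From here, a standard supervised fine-tuning loop (e.g. transformers.Trainer)
# on CodeAlpaca-20k updates only the adapter weights.
```

Because the quantized base weights stay frozen and only the small adapter matrices receive gradients, memory and compute costs stay low enough to fine-tune these sub-3B models on modest hardware, which is the cost advantage the study exploits.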