This paper investigates the effectiveness of parameter-efficient fine-tuning (PEFT) techniques, specifically LoRA and QLoRA, for adapting small language models (SLMs) with fewer than 3 billion parameters to code generation tasks. The study fine-tunes Gemma, Qwen2.5, and LLaMA 3 models on the CodeAlpaca-20k dataset. Results demonstrate that PEFT significantly improves SLM performance, surpassing larger baseline models such as Phi-3 Mini 4K base on ROUGE-L, with LLaMA 3 3B and Qwen2.5 3B showing gains of 54% and 55%, respectively.
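For context, ROUGE-L scores a generated output against a reference by their longest common subsequence. Below is a minimal sketch of how such a score is computed with the rouge-score Python package; the reference and candidate strings are invented for illustration and are not drawn from the paper's evaluation.

```python
# Sketch of ROUGE-L scoring with the rouge-score package.
# The reference/candidate snippets here are hypothetical examples.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

reference = "def add(a, b):\n    return a + b"
candidate = "def add(a, b):\n    result = a + b\n    return result"

# ROUGE-L is based on the longest common subsequence of the two texts.
score = scorer.score(reference, candidate)["rougeL"]
print(f"precision={score.precision:.3f} recall={score.recall:.3f} f1={score.fmeasure:.3f}")
```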
Forget huge models: parameter-efficient fine-tuning turns tiny language models into code-generating powerhouses that outperform larger, untuned counterparts.
Large language models (LLMs) have demonstrated impressive capabilities in code generation; however, their high computational demands, privacy limitations, and challenges in edge deployment restrict their practical use in domain-specific applications. This study explores the effectiveness of parameter-efficient fine-tuning for small language models (SLMs) with fewer than 3 billion parameters. We adopt a hybrid approach that combines low-rank adaptation (LoRA) with 4-bit quantization (QLoRA) to reduce fine-tuning costs while preserving semantic consistency. Experiments on the CodeAlpaca-20k dataset reveal that SLMs fine-tuned with this method outperform larger baseline models, including Phi-3 Mini 4K base, on ROUGE-L. Notably, applying our approach to the LLaMA 3 3B and Qwen2.5 3B models yielded performance improvements of 54% and 55%, respectively, over untuned counterparts. We evaluate models from three major artificial intelligence (AI) providers, Google (Gemma 2B), Meta (LLaMA 3 1B/3B), and Alibaba (Qwen2.5 1.5B/3B), and show that parameter-efficient fine-tuning enables them to serve as cost-effective, high-performing alternatives to larger LLMs. These findings highlight the potential of SLMs as scalable solutions for domain-specific software engineering tasks, supporting broader adoption and democratization of neural code synthesis.
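To make the recipe concrete, the following is a minimal sketch of LoRA fine-tuning over a 4-bit quantized base model, the QLoRA setup the abstract describes, using the Hugging Face transformers, peft, and bitsandbytes libraries. The checkpoint ID, LoRA rank, and target modules are illustrative assumptions; the paper's exact hyperparameters are not stated here.

```python
# Sketch of the LoRA + 4-bit quantization (QLoRA) setup.
# Checkpoint ID, rank, and target modules are assumptions, not the paper's values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-3.2-3B"  # assumed 3B-class checkpoint

# 4-bit NF4 quantization of the frozen base weights: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters on the attention projections; only these are trained.
lora_config = LoraConfig(
    r=16,                # assumed LoRA rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights

# From here, a standard supervised fine-tuning loop (e.g. transformers.Trainer)
# on CodeAlpaca-20k updates only the adapter weights.
```

Because the quantized base weights stay frozen and only the small adapter matrices receive gradients, memory and compute costs stay low enough to fine-tune these sub-3B models on modest hardware, which is the cost advantage the study exploits.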