NUS · CUHK · Eastern Institute of Technology · ZJU · Jun 4, 2026 · arXiv:2606.06079

SkillComposer: Learning to Evolve Agent Skills for Specification and Generalization

Qi Zhang, Zhaopeng Feng, Xiaonan Shi, Xiaomeng Hu, Chu Liu, Pengjun Xie, Xiaobin Wang, Jieping Ye, Bryan Hooi, Haobo Wang, Junbo Zhao

AI Summary

This paper introduces SkillComposer, a framework that decomposes agent skill construction into three learnable operations: create, improve, and merge. Instead of extracting a skill in one shot, SkillComposer lets language models evolve skills at inference time, trading off task-specific guidance against transferability. In experiments on $\tau^2$-Bench, LiveCodeBench v6, and AppWorld, SkillComposer-4B improves a 27B executor by up to +4.5 on agent tasks and +3.4 on code tasks, and it generalizes to domains and task types unseen during training.

Key Contribution

SkillComposer enables language models to self-evolve skills at inference time; SkillComposer-4B improves a 27B executor by up to +4.5 on agent tasks and +3.4 on code tasks.

Abstract

Agent skills, which consist of reusable strategies that guide agent reasoning and action, have shown strong potential for improving model capability at inference time. However, current skill construction methods treat the problem as one-shot extraction, overlooking a fundamental tension: a skill tailored to a specific task fails to transfer, while an abstracted skill often provides insufficient guidance. We attribute this fragility to the absence of explicit mechanisms for skill specification and generalization. To address this gap, we introduce SkillComposer, a framework that decomposes skill construction into three learnable operations: create, improve, and merge. Trained via a systematic rejection sampling recipe, SkillComposer enables language models to self-evolve skills at inference time and supports three deployment modes: offline for building generalized libraries, online for task-specific refinement, and hybrid for combining both. Comprehensive experiments on $\tau^2$-Bench, LiveCodeBench v6, and AppWorld show that SkillComposer consistently outperforms baselines. Our SkillComposer-4B improves a 27B executor by up to +4.5 on agent tasks and +3.4 on code tasks, while generalizing across domains and task types unseen during training. Analysis reveals that merge and improve address orthogonal quality dimensions and that skill composition is a transferable meta-ability, providing a practical recipe for skill-augmented inference.
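To make the three operations and three deployment modes concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the `Skill` and `Composer` names, the function signatures, and the mapping of modes to operations (offline as create then merge, online as create then improve, hybrid as offline then improve) are all hypothetical readings of the abstract. The toy functions are string stand-ins for what would be calls to a trained language model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Skill:
    """A reusable strategy, represented here as plain text (assumption)."""
    text: str


# Hypothetical signatures for the three learnable operations.
Create = Callable[[str], Skill]          # trajectory -> new skill
Improve = Callable[[Skill, str], Skill]  # skill + task feedback -> more specific skill
Merge = Callable[[List[Skill]], Skill]   # several skills -> one more general skill


@dataclass
class Composer:
    create: Create
    improve: Improve
    merge: Merge

    def offline(self, trajectories: List[str]) -> Skill:
        # Assumed offline mode: build a generalized library skill from past runs.
        return self.merge([self.create(t) for t in trajectories])

    def online(self, trajectory: str, feedback: str) -> Skill:
        # Assumed online mode: refine a fresh skill for the current task.
        return self.improve(self.create(trajectory), feedback)

    def hybrid(self, trajectories: List[str], feedback: str) -> Skill:
        # Assumed hybrid mode: start from the library skill, then refine it.
        return self.improve(self.offline(trajectories), feedback)


# Toy stand-ins; a real system would prompt a language model here.
def toy_create(trajectory: str) -> Skill:
    return Skill(f"steps({trajectory})")


def toy_improve(skill: Skill, feedback: str) -> Skill:
    return Skill(f"{skill.text} + fix({feedback})")


def toy_merge(skills: List[Skill]) -> Skill:
    return Skill(" | ".join(s.text for s in skills))


composer = Composer(toy_create, toy_improve, toy_merge)
library_skill = composer.offline(["book flight", "cancel flight"])
task_skill = composer.online("book flight", "check refund policy")
refined_skill = composer.hybrid(["book flight", "cancel flight"], "check refund policy")
```

Keeping the operations as injected callables means the same composition logic can be exercised with toy functions, as here, or with model-backed ones.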

Scalable Oversight & Alignment Theory · Tool Use & Agents

Year: 2026

Venue: N/A
