This paper introduces SkillComposer, a framework that decomposes agent skill construction into three learnable operations: create, improve, and merge. In contrast to one-shot skill extraction, SkillComposer lets language models evolve skills at inference time. In the reported experiments, SkillComposer-4B improves a 27B executor by up to +4.5 on agent tasks and +3.4 on code tasks, and it generalizes to domains and task types unseen during training.
SkillComposer enables language models to self-evolve skills at inference time; SkillComposer-4B improves a 27B executor by up to +4.5 on agent tasks.
Agent skills, which consist of reusable strategies that guide agent reasoning and action, have shown strong potential for improving model capability at inference time. However, current skill construction methods treat the problem as one-shot extraction, overlooking a fundamental tension: a skill tailored to the specific task fails to transfer, while the abstracted skill often provides insufficient guidance. We attribute this fragility to the absence of explicit mechanisms for skill specification and generalization. To address this gap, we introduce SkillComposer, a framework that decomposes skill construction into three learnable operations: create, improve, and merge. Trained via a systematic rejection sampling recipe, SkillComposer enables language models to self-evolve skills at inference time and supports three deployment modes: offline for building generalized libraries, online for task-specific refinement, and hybrid for combining both. Comprehensive experiments on $\tau^2$-Bench, LiveCodeBench v6, and AppWorld show that SkillComposer consistently outperforms baselines. Our SkillComposer-4B improves a 27B executor by up to +4.5 on agent tasks and +3.4 on code tasks, while generalizing across domains and task types unseen during training. Analysis reveals that merge and improve address orthogonal quality dimensions and that skill composition is a transferable meta-ability, providing a practical recipe for skill-augmented inference.
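To make the three operations and three deployment modes concrete, here is a minimal Python sketch. It is not the authors' implementation. The `Skill` dataclass, the prompt wording, the function names, and `toy_llm` are all hypothetical, and the mapping of modes to operations (offline uses create then merge, online uses create then improve, hybrid improves a library skill) is an assumption drawn only from the abstract's description.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a language model call: prompt in, text out.
LLM = Callable[[str], str]


@dataclass
class Skill:
    """A reusable strategy that an executor model can be prompted with."""
    name: str
    guidance: str


def create(llm: LLM, task: str) -> Skill:
    """Draft a new skill from a single task description."""
    return Skill(name=f"skill:{task}", guidance=llm(f"CREATE a strategy for: {task}"))


def improve(llm: LLM, skill: Skill, feedback: str) -> Skill:
    """Refine an existing skill using feedback (assumed: makes it more specific)."""
    prompt = f"IMPROVE this skill:\n{skill.guidance}\nFeedback: {feedback}"
    return Skill(name=skill.name, guidance=llm(prompt))


def merge(llm: LLM, skills: List[Skill]) -> Skill:
    """Combine several skills into one (assumed: makes it more general)."""
    joined = "\n---\n".join(s.guidance for s in skills)
    return Skill(name="merged", guidance=llm(f"MERGE these skills:\n{joined}"))


def build_offline(llm: LLM, tasks: List[str]) -> Skill:
    """Offline mode (assumed): create per-task skills, then merge them."""
    return merge(llm, [create(llm, t) for t in tasks])


def refine_online(llm: LLM, task: str, feedback: str) -> Skill:
    """Online mode (assumed): create a skill for the task, then improve it."""
    return improve(llm, create(llm, task), feedback)


def refine_hybrid(llm: LLM, library_skill: Skill, task: str, feedback: str) -> Skill:
    """Hybrid mode (assumed): start from a library skill and improve it for the task."""
    return improve(llm, library_skill, f"Target task: {task}. {feedback}")


def toy_llm(prompt: str) -> str:
    """Toy model for demonstration: echoes the operation keyword of the prompt."""
    return f"[{prompt.split()[0]}] response"


if __name__ == "__main__":
    library = build_offline(toy_llm, ["book a flight", "cancel an order"])
    print(library)
    print(refine_online(toy_llm, "fix a failing test", "handle empty input"))
    print(refine_hybrid(toy_llm, library, "rebook a flight", "check fare rules"))
```

In a real system, `toy_llm` would be replaced by a call to the trained skill-composer model, and the resulting `guidance` text would be placed in the executor's context.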