ByteDance | May 26, 2026 | arXiv:2605.27366

MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation

Huawei Lin, Peng Li, Jie Song, Fuxin Jiang, Tieying Zhang

AI Summary

MUSE-Autoskill is a framework that lets LLM agents continuously improve their task-solving capability by managing skills through a lifecycle of creation, memory, management, evaluation, and refinement. Its skill-level memory accumulates experience for each skill across tasks, which improves reuse and adaptation. Experiments on SkillsBench provide initial evidence of improved task success, efficiency, reuse, and cross-agent transfer when skills are treated as long-lived, experience-aware, and testable assets.

Key Contribution

LLM agents can improve their task-solving ability by treating skills as long-lived, experience-aware, and testable assets within a managed lifecycle.

Abstract

Large language model (LLM) agents rely on reusable skills to solve complex tasks. However, existing skill creation approaches treat skills as isolated and static artifacts, limiting their reusability, reliability, and long-term improvement. We propose MUSE-Autoskill Agent (Memory-Utilizing Skill Evolution), a skill-centric agent framework that lets agents continuously improve their task-solving capability by creating, reusing, and refining skills under a unified lifecycle (creation, memory, management, evaluation, and refinement). Our framework enables agents to create skills on demand, store and reuse them across tasks, organize and select them efficiently, and evaluate them through unit tests and runtime feedback for continuous refinement. We further introduce skill-level memory that accumulates experience for each skill across tasks, enabling more effective reuse and adaptation over time. Experiments on SkillsBench provide initial evidence that lifecycle-managed skills can improve task success, efficiency, reuse, and cross-agent transfer, highlighting the importance of treating skills as long-lived, experience-aware, and testable assets.
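The lifecycle described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Skill`, `SkillLibrary`, `select`, `record`, and `refine` are hypothetical, skills are reduced to plain string functions, and keyword overlap stands in for whatever selection mechanism the framework actually uses. It only shows how unit tests can gate creation and refinement while per-skill memory persists across revisions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Skill:
    """Hypothetical skill record: callable code, unit tests, and per-skill memory."""
    name: str
    description: str
    fn: Callable[[str], str]
    tests: List[Tuple[str, str]] = field(default_factory=list)  # (input, expected output)
    memory: List[str] = field(default_factory=list)  # notes accumulated across tasks

    def passes_tests(self) -> bool:
        # Evaluation: every stored unit test must produce its expected output.
        return all(self.fn(x) == y for x, y in self.tests)


class SkillLibrary:
    """Hypothetical store covering creation, selection, memory, and refinement."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def add(self, skill: Skill) -> bool:
        # Creation is gated on evaluation: failing skills are not stored.
        if not skill.passes_tests():
            return False
        self._skills[skill.name] = skill
        return True

    def select(self, task: str) -> Optional[Skill]:
        # Assumption: naive keyword overlap between task and skill description.
        words = set(task.lower().split())
        best, best_score = None, 0
        for skill in self._skills.values():
            score = len(words & set(skill.description.lower().split()))
            if score > best_score:
                best, best_score = skill, score
        return best

    def record(self, name: str, note: str) -> None:
        # Skill-level memory: attach runtime feedback to the skill itself.
        self._skills[name].memory.append(note)

    def refine(self, name: str, new_fn: Callable[[str], str],
               new_tests: List[Tuple[str, str]]) -> bool:
        # Refinement keeps old tests and memory; the revision must pass all tests.
        old = self._skills[name]
        candidate = Skill(old.name, old.description, new_fn,
                          old.tests + list(new_tests), list(old.memory))
        if not candidate.passes_tests():
            return False
        self._skills[name] = candidate
        return True


library = SkillLibrary()
slugify = Skill(
    name="slugify",
    description="convert a title into a url slug",
    fn=lambda s: s.lower().replace(" ", "-"),
    tests=[("Hello World", "hello-world")],
)
added = library.add(slugify)
chosen = library.select("make a url slug from this title")
library.record("slugify", "input with surrounding spaces produced stray hyphens")
refined = library.refine(
    "slugify",
    lambda s: "-".join(s.lower().split()),
    [("  Hello World ", "hello-world")],
)
```

In this sketch a refinement is accepted only if it passes both the original tests and the new regression test derived from runtime feedback, which is one simple way to read "testable assets"; the memory note survives the revision so later tasks can see why the skill changed.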

Eval Frameworks & Benchmarks; Reasoning & Chain-of-Thought; Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
