This paper introduces Ro-SLM, a framework for distilling LLM knowledge into smaller, onboard SLMs for robot task planning and operation code generation. The approach uses LLMs to synthesize training data consisting of diverse task instructions and corresponding ground truth code, augmented with real-world scenarios. LLMs also serve as a reward function during fine-tuning, guiding the SLM to generate effective code. Experiments on UAV operation tasks show Ro-SLM significantly improves SLM performance, approaching that of LLMs.
Onboard SLMs can approach LLM performance in robot task planning and code generation, thanks to an LLM-distillation framework that uses LLMs for both data synthesis and reward shaping.
Recent advances in large language models (LLMs) provide robots with contextual reasoning abilities to comprehend human instructions. Yet current LLM-enabled robots typically depend on cloud-based models or high-performance computing infrastructure, which limits their deployment on robots that operate with unreliable internet connectivity or constrained computational resources, such as UAVs and small ground vehicles. Deploying fine-tuned small language models (SLMs) that can run onboard therefore offers a promising alternative. This paper introduces Ro-SLM, a framework that enables reliable SLM-driven robot operation by distilling LLMs' knowledge and reasoning. Ro-SLM starts with dataset synthesis, leveraging LLMs to generate diverse task instructions, produce corresponding ground-truth code with minimal human assistance, and augment the instructions into real-world application scenarios. Ro-SLM is then fine-tuned on this dataset, with an LLM serving as a reward function to guide training. Extensive experiments on UAV operation tasks demonstrate that Ro-SLM lifts the SLM from being incapable of supporting robotic task planning and code generation to performance approaching that of an LLM.
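The two stages described in the abstract (LLM-driven dataset synthesis, then an LLM acting as a reward function) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `Sample`, `synthesize_dataset`, and `pick_best`, the toy UAV calls `takeoff()`, `land()`, and `hover()`, the choice to reuse the base task's code for each augmented instruction, and the use of the reward to rank candidate outputs. The abstract does not say how the reward enters fine-tuning, so ranking candidates is only one plausible reading. The toy functions stand in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    instruction: str  # natural-language task instruction
    code: str         # operation code paired with the instruction


def synthesize_dataset(
    seed_tasks: List[str],
    write_code: Callable[[str], str],
    augment: Callable[[str], List[str]],
) -> List[Sample]:
    """Build (instruction, code) pairs: one base pair per seed task,
    plus one pair per scenario-augmented rewording of that task."""
    dataset = []
    for task in seed_tasks:
        code = write_code(task)        # stand-in for LLM-written reference code
        dataset.append(Sample(task, code))
        for variant in augment(task):  # stand-in for LLM scenario augmentation
            # Assumption: an augmented instruction keeps the base task's code.
            dataset.append(Sample(variant, code))
    return dataset


def pick_best(
    instruction: str,
    candidates: List[str],
    reward: Callable[[str, str], float],
) -> str:
    """Return the candidate that the reward function scores highest."""
    return max(candidates, key=lambda code: reward(instruction, code))


# Toy stand-ins (hypothetical; a real system would query an LLM here).
def toy_write_code(task: str) -> str:
    return "takeoff()\nland()" if "take off" in task else "hover()"


def toy_augment(task: str) -> List[str]:
    return [f"During a bridge inspection, {task}"]


def toy_reward(instruction: str, code: str) -> float:
    # Fraction of the calls implied by the instruction that appear in the code.
    wanted = ["takeoff()", "land()"] if "take off" in instruction else ["hover()"]
    return sum(call in code for call in wanted) / len(wanted)


dataset = synthesize_dataset(["take off and land"], toy_write_code, toy_augment)
best = pick_best("take off and land", ["hover()", "takeoff()\nland()"], toy_reward)
```

In this sketch, the synthesized pairs would feed supervised fine-tuning of the SLM, while the reward-ranked candidate could serve as a preferred output in a later training step. Both uses are illustrative assumptions, not the paper's stated procedure.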