Tsinghua AI, BJUT, Fudan, Hamburg, Hubei University of Chinese Medicine
Apr 22, 2026
arXiv:2604.20721

ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement

Yutong Shen, Hangxu Liu, Lei Zhang, Penghui Liu, Yinqi Liu, Liuxiang Yang, Tongtong Feng

AI Summary

This paper introduces ALAS, a framework for long-horizon human-scene interaction tasks that disentangles environment understanding from skill execution using a dual-stream architecture inspired by the brain's "where-what" pathways. ALAS learns spatial relationships and motor patterns independently, which supports cross-domain transfer and generalization to new combinations of environments and skills. Experiments report an average 23% improvement in subtask success rate and an average 29% improvement in execution efficiency compared with existing methods, which rely on skill chaining.

Key Contribution

Decoupled environment understanding and motor control: ALAS separates what the agent learns about the scene from what it learns about its own body, which supports long-horizon task completion and generalization to new combinations of environments and skills in human-scene interaction.

Abstract

Long-Horizon (LH) tasks in Human-Scene Interaction (HSI) are complex multi-step tasks that require continuous planning, sequential decision-making, and extended execution across domains to achieve the final goal. Existing methods rely heavily on skill chaining, which concatenates pre-trained subtasks while keeping environment observations and self-state tightly coupled. As a result, they cannot generalize to new combinations of environments and skills and fail to complete varied LH tasks across domains. To address this problem, this paper presents ALAS, a cross-domain learning framework for LH tasks based on biologically inspired dual-stream disentanglement. Inspired by the brain's "where-what" dual-pathway mechanism, ALAS comprises two core modules: i) an environment learning module for spatial understanding, which captures object functions, spatial relationships, and scene semantics and achieves cross-domain transfer through complete environment-self disentanglement; ii) a skill learning module for task execution, which processes self-state information, including joint degrees of freedom and motor patterns, and enables cross-skill transfer through independent motor pattern encoding. We conducted extensive experiments on various LH tasks in HSI scenes. Compared with existing methods, ALAS improves the average subtask success rate by 23% and the average execution efficiency by 29%.
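The dual-stream idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DualStreamPolicy`, the linear encoders, the latent sizes, and the example observation vectors are all assumptions chosen for illustration. The only property it tries to show is that the environment stream never sees self-state and the skill stream never sees environment observations, so either input can be swapped without changing the other stream's encoding.

```python
import random
from dataclasses import dataclass


def make_linear(in_dim, out_dim, rng):
    """Return a random (out_dim x in_dim) weight matrix as nested lists."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def linear(weights, x):
    """Apply a bias-free linear map to the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


@dataclass
class DualStreamPolicy:
    """Hypothetical two-stream policy; all names here are illustrative."""
    env_weights: list    # environment stream: sees environment observations only
    skill_weights: list  # skill stream: sees self-state only
    head_weights: list   # fuses both latents into an action

    def encode_env(self, env_obs):
        return linear(self.env_weights, env_obs)

    def encode_skill(self, self_state):
        return linear(self.skill_weights, self_state)

    def act(self, env_obs, self_state):
        # Concatenate the two independently computed latents, then map to an action.
        latent = self.encode_env(env_obs) + self.encode_skill(self_state)
        return linear(self.head_weights, latent)


def build_policy(env_dim, state_dim, latent_dim, action_dim, seed=0):
    rng = random.Random(seed)
    return DualStreamPolicy(
        env_weights=make_linear(env_dim, latent_dim, rng),
        skill_weights=make_linear(state_dim, latent_dim, rng),
        head_weights=make_linear(2 * latent_dim, action_dim, rng),
    )


policy = build_policy(env_dim=4, state_dim=3, latent_dim=2, action_dim=3)

# Made-up observations: two scenes and one body configuration.
scene_a = [0.5, -0.2, 0.1, 0.9]
scene_b = [-0.3, 0.8, 0.4, 0.0]
joints = [0.1, 0.2, -0.1]

action_a = policy.act(scene_a, joints)
action_b = policy.act(scene_b, joints)
skill_latent = policy.encode_skill(joints)
```

In this sketch, changing the scene changes the action but leaves `skill_latent` untouched, because `encode_skill` takes no environment input. A skill-chaining baseline with coupled inputs would instead feed one concatenated observation vector to a single encoder.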

Robotics & Embodied AI, World Models & Planning

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
