This paper introduces a flow-matching framework for generating stable and natural human-human co-manipulation motions guided by object affordances and spatial configurations. The method incorporates an adversarial interaction prior to encourage natural poses and realistic interactions, and integrates a stability-driven simulation to refine unstable states via sampling-based optimization and vector field adjustments. Experiments show the proposed approach achieves superior contact accuracy, reduced penetration, and improved distributional fidelity compared to existing human-object interaction baselines.
Explicitly modeling object affordances and spatial configurations within a flow-matching framework yields realistic and stable human co-manipulation motions.
Co-manipulation requires multiple humans to synchronize their motions with a shared object while ensuring reasonable interactions, maintaining natural poses, and preserving stable states. However, most existing motion generation approaches are designed for single-character scenarios or fail to account for payload-induced dynamics. In this work, we propose a flow-matching framework that ensures the generated co-manipulation motions align with the intended goals while maintaining naturalness and effectiveness. Specifically, we first introduce a generative model that derives explicit manipulation strategies from the object's affordance and spatial configuration, which guide the motion flow toward successful manipulation. To improve motion quality, we then design an adversarial interaction prior that promotes natural individual poses and realistic inter-person interactions during co-manipulation. We also incorporate a stability-driven simulation into the flow-matching process, which refines unstable interaction states through sampling-based optimization and directly adjusts the vector field regression to promote more effective manipulation. The experimental results demonstrate that our method achieves higher contact accuracy, lower penetration, and better distributional fidelity than state-of-the-art human-object interaction baselines. The code is available at https://github.com/boycehbz/StaCOM.
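For readers unfamiliar with the vector field regression mentioned in the abstract, the following is a minimal sketch of generic flow matching. It assumes a straight-line path between a noise sample and a data sample. It is not the authors' implementation: the function names (`interpolate`, `target_velocity`, `flow_matching_loss`, `euler_sample`) are hypothetical, motions are reduced to plain lists of floats, and the affordance conditioning, adversarial prior, and stability-driven simulation described above are omitted.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight-line path from noise x0 to data x1 at time t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Regression target for the vector field along the linear path: x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(predict_velocity, x1, rng):
    """One-sample loss: mean squared error between predicted and target velocity."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # Gaussian noise sample
    t = rng.random()                         # time drawn uniformly from [0, 1)
    xt = interpolate(x0, x1, t)
    v_target = target_velocity(x0, x1)
    v_pred = predict_velocity(xt, t)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_target)) / len(x1)

def euler_sample(predict_velocity, x0, steps=10):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-size Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        v = predict_velocity(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In practice `predict_velocity` would be a learned network trained by minimizing this loss over many samples. For a single target point `x1`, the exact field is `(x1 - x) / (1 - t)`, which gives a near-zero loss and makes `euler_sample` land on `x1`; this is a convenient sanity check for the sketch.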