COHORT, a collaborative DNN inference framework for multi-robot systems, was developed to address the challenges of deploying large DNNs on resource-constrained robots in mission-critical scenarios. The framework uses a hybrid offline-online RL approach, combining offline RL policy learning with Advantage-Weighted Regression (AWR) and online policy adaptation via Multi-Agent PPO (MAPPO). Experiments with vision-language models (CLIP and SAM) demonstrate that COHORT reduces battery consumption by 15.4% and increases GPU utilization by 51.67%, while satisfying frame-rate and deadline constraints 2.55 times more often than baseline methods.
Multi-robot systems can slash battery consumption by 15% and boost GPU utilization by 50% for large DNN inference by using a hybrid offline-online reinforcement learning strategy to dynamically schedule and distribute DNN module execution.
Large deep neural networks (DNNs), especially transformer-based and multimodal architectures, are computationally demanding and challenging to deploy on resource-constrained edge platforms such as field robots. These challenges intensify in mission-critical scenarios (e.g., disaster response), where robots must collaborate under tight constraints on bandwidth, latency, and battery life, often without infrastructure or server support. To address these limitations, we present COHORT, a collaborative DNN inference and task-execution framework for multi-robot systems built on the Robot Operating System (ROS). COHORT employs a hybrid offline-online reinforcement learning (RL) strategy to dynamically schedule and distribute DNN module execution across robots. Our key contributions are threefold: (a) offline RL policy learning with Advantage-Weighted Regression (AWR), trained on auction-based task-allocation data from heterogeneous DNN workloads across distributed robots; (b) online policy adaptation via Multi-Agent PPO (MAPPO), initialized from the offline policy and fine-tuned in real time; and (c) a comprehensive evaluation of COHORT on vision-language model (VLM) inference tasks such as CLIP and SAM, analyzing scalability with increasing robot count and workload, as well as robustness. We benchmark COHORT against genetic algorithms and multiple RL baselines. Experimental results demonstrate that COHORT reduces battery consumption by 15.4% and increases GPU utilization by 51.67%, while satisfying frame-rate and deadline constraints 2.55 times more often than the baselines.
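The offline AWR step described above can be sketched in a few lines: the policy is regressed toward logged (auction-generated) actions, with each sample weighted by the exponentiated advantage. The function names, the temperature `beta`, and the weight clipping below are illustrative assumptions, not COHORT's actual implementation.

```python
import numpy as np

def awr_weights(advantages, beta=1.0, w_max=20.0):
    """AWR sample weights: exponentiated advantages, clipped for stability."""
    return np.minimum(np.exp(np.asarray(advantages, dtype=float) / beta), w_max)

def awr_loss(logits, actions, advantages, beta=1.0):
    """Advantage-weighted negative log-likelihood for a discrete policy.

    logits:     (batch, n_actions) policy scores
    actions:    (batch,) indices of the logged (behavior-policy) actions
    advantages: (batch,) advantage estimates for those actions
    """
    logits = np.asarray(logits, dtype=float)
    # Log-softmax with max subtraction for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(actions)), actions]
    return float(np.mean(awr_weights(advantages, beta) * nll))
```

Minimizing this loss pushes the policy toward logged actions with high advantage while largely ignoring low-advantage ones; the resulting policy can then seed online MAPPO fine-tuning.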