The paper introduces HyperParallel, an AI framework designed to use supernode architectures efficiently by treating the supernode as a single logical computer. It targets three shortcomings of existing AI frameworks on these architectures: high programming complexity, load imbalance, and poor memory utilization. The framework combines HyperOffload for hierarchical memory management, HyperMPMD for fine-grained parallelism, and HyperShard for parallel strategy specification, and the authors report improved training and inference efficiency.
HyperParallel unlocks the potential of supernode architectures for AI by offering a framework that simplifies parallel programming and boosts performance through hardware-aware orchestration.
The emergence of large-scale, sparse, multimodal, and agentic AI models has coincided with a shift in hardware toward supernode architectures that integrate hundreds to thousands of accelerators with ultra-low-latency interconnects and unified memory pools. However, existing AI frameworks are not designed to exploit these architectures efficiently, leading to high programming complexity, load imbalance, and poor memory utilization. In this paper, we propose a supernode-affinity AI framework that treats the supernode as a single logical computer and embeds hardware-aware orchestration into the framework. Implemented in MindSpore, our HyperParallel architecture comprises HyperOffload for automated hierarchical memory management, HyperMPMD for fine-grained MPMD parallelism across heterogeneous workloads, and HyperShard for declarative parallel strategy specification. Together, these techniques significantly improve training and inference efficiency while reducing parallel programming and system tuning overhead, demonstrating the necessity of supernode affinity for next-generation AI frameworks.
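To make the idea of declarative parallel strategy specification concrete, here is a minimal sketch in plain Python. It is not HyperShard's API and does not use MindSpore; `DeviceMesh`, `local_shape`, and the axis names `dp` and `tp` are hypothetical names chosen for illustration. The sketch assumes a strategy is declared as a mapping from each tensor dimension to a named mesh axis (or `None` for replication), and shows how per-device shard shapes could be derived from that declaration alone.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DeviceMesh:
    """Hypothetical logical view of a supernode: named axes and their sizes."""
    axes: Dict[str, int]  # e.g. {"dp": 4, "tp": 8}

    def size(self) -> int:
        """Total number of devices in the mesh (product of axis sizes)."""
        total = 1
        for n in self.axes.values():
            total *= n
        return total


def local_shape(
    global_shape: Tuple[int, ...],
    spec: Tuple[Optional[str], ...],
    mesh: DeviceMesh,
) -> Tuple[int, ...]:
    """Derive the per-device shard shape from a declarative spec.

    spec[i] names the mesh axis that tensor dimension i is split across,
    or is None if that dimension is replicated on every device.
    """
    if len(global_shape) != len(spec):
        raise ValueError("spec must have one entry per tensor dimension")
    shard = []
    for dim, axis in zip(global_shape, spec):
        if axis is None:
            shard.append(dim)  # replicated: every device holds the full extent
            continue
        parts = mesh.axes[axis]
        if dim % parts != 0:
            raise ValueError(f"dimension {dim} is not divisible by axis '{axis}' ({parts})")
        shard.append(dim // parts)
    return tuple(shard)


# Illustrative usage: a 32-device mesh with data-parallel and tensor-parallel axes.
mesh = DeviceMesh({"dp": 4, "tp": 8})
weight_shard = local_shape((1024, 4096), (None, "tp"), mesh)
batch_shard = local_shape((1024, 4096), ("dp", "tp"), mesh)
```

In this sketch the user states only which dimension goes on which axis; the layout follows mechanically, which is the general appeal of a declarative specification over hand-written communication code.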