May 25, 2026 · arXiv:2605.25771

MDGMIX: Boundary-Aware Subgraph Mixing for Multi-Domain Graph Pre-Training

Ziyu Zheng, Yaming Yang, Ziyu Guan, Xinyan Huang

AI Summary

The paper introduces MDGMIX, a multi-domain graph pre-training framework designed to address data redundancy and computational costs in cross-domain generalization. MDGMIX constructs mixed-domain subgraphs by selecting boundary nodes and uses hierarchical discrimination losses to decouple shared and domain-specific patterns. Experiments show MDGMIX outperforms baselines in few-shot classification tasks with improved time and memory efficiency, suggesting boundary-aware subgraph mixing is an effective pre-training strategy.

Key Contribution

Multi-domain graph pre-training suffers from significant data redundancy; MDGMIX leverages boundary-aware subgraph mixing to achieve superior few-shot performance with improved efficiency.

Abstract

Multi-domain graph pre-training is a crucial step in constructing foundational graph models with cross-domain generalization capabilities. However, existing methods predominantly rely on jointly training all source domain graphs, resulting in high computational costs. Furthermore, it remains unclear whether all source domain graph data contribute equally to effective transfer. This paper empirically reveals significant data redundancy in multi-domain graph pre-training. Based on this finding, we propose the Multi-domain Graph Pre-training Framework, MDGMIX, which combines boundary-aware subgraph mixing with hierarchical discrimination. By selecting boundary nodes to construct challenging mixed-domain subgraphs, MDGMIX employs coarse-grained domain discrimination and fine-grained domain decomposition losses to decouple shared patterns from domain-specific patterns. During adaptation, MDGMIX employs a lightweight prompt weighting mechanism to transfer source domain knowledge. Extensive experiments demonstrate that MDGMIX consistently outperforms strong baselines in few-shot classification tasks while exhibiting superior time and memory efficiency. The code is available at: https://github.com/zhengziyu77/MDGMIX.
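
The abstract does not say how boundary nodes are chosen, so the following is only an illustrative sketch under a stated assumption: a node counts as "near a boundary" when its embedding is almost as close to another domain's centroid as to its own. The function names, the centroid-margin criterion, and the toy embeddings are all hypothetical and are not taken from the paper or its repository.

```python
import math


def centroid(vectors):
    """Mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def boundary_margin(vec, own_centroid, other_centroids):
    """Distance to the nearest foreign centroid minus distance to the own centroid.

    A small (or negative) margin marks a node that sits close to another domain.
    This criterion is an assumption made for illustration only.
    """
    d_own = math.dist(vec, own_centroid)
    d_other = min(math.dist(vec, c) for c in other_centroids)
    return d_other - d_own


def select_boundary_nodes(embeddings_by_domain, k):
    """For each domain, return the indices of the k nodes with the smallest margin.

    Requires at least two domains, since the margin compares against foreign centroids.
    """
    centroids = {d: centroid(vs) for d, vs in embeddings_by_domain.items()}
    selected = {}
    for d, vs in embeddings_by_domain.items():
        others = [c for name, c in centroids.items() if name != d]
        margins = [boundary_margin(v, centroids[d], others) for v in vs]
        order = sorted(range(len(vs)), key=lambda i: margins[i])
        selected[d] = order[:k]
    return selected


# Toy 2-D embeddings for two hypothetical source domains.
embeddings = {
    "A": [[0.0, 0.0], [0.2, 0.1], [0.9, 0.0]],
    "B": [[2.0, 0.0], [1.8, 0.1], [1.1, 0.0]],
}
picked = select_boundary_nodes(embeddings, k=1)
print(picked)
```

In this toy setup the selected node in each domain is the one lying between the two clusters. A mixed-domain subgraph could then be assembled from the neighborhoods of such nodes, though the paper's actual construction and its discrimination losses may differ substantially.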

Architecture Design (Transformers, SSMs, MoE) · Data Curation & Synthetic Data · Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
