HKUST | Feb 25, 2026 | arXiv:2602.21597

NGDB-Zoo: Towards Efficient and Scalable Neural Graph Databases Training

Zhongwei Xie, Jiaxin Bai, Shujie Liu, Haoyu Huang, Yufei Li, Yisen Gao, Hong Ting Tsang, Yangqiu Song

AI Summary

The paper introduces NGDB-Zoo, a unified framework for training Neural Graph Databases (NGDBs) that addresses limitations in training efficiency and expressivity by decoupling logical operators from query topologies and integrating semantic priors. NGDB-Zoo transforms the training loop into a dynamically scheduled data-flow execution, enabling multi-stream parallelism and achieving significant throughput improvements. Evaluations on six benchmarks, including large graphs, demonstrate NGDB-Zoo's high GPU utilization and its ability to mitigate representation friction in neuro-symbolic reasoning.

Key Contribution

NGDB-Zoo achieves 1.8x to 6.8x higher training throughput than baselines for Neural Graph Databases by decoupling logical operators from query topologies. It also integrates semantic priors from pre-trained text encoders while maintaining high GPU utilization.

Abstract

Neural Graph Databases (NGDBs) facilitate complex logical reasoning over incomplete knowledge structures, yet their training efficiency and expressivity are constrained by rigid query-level batching and structure-exclusive embeddings. We present NGDB-Zoo, a unified framework that resolves these bottlenecks by combining operator-level training with semantic augmentation. By decoupling logical operators from query topologies, NGDB-Zoo transforms the training loop into a dynamically scheduled data-flow execution, enabling multi-stream parallelism and achieving $1.8\times$ to $6.8\times$ higher throughput than baselines. Furthermore, we formalize a decoupled architecture to integrate high-dimensional semantic priors from Pre-trained Text Encoders (PTEs) without triggering I/O stalls or memory overflows. Extensive evaluations on six benchmarks, including massive graphs like ogbl-wikikg2 and ATLAS-Wiki, demonstrate that NGDB-Zoo maintains high GPU utilization across diverse logical patterns and significantly mitigates representation friction in hybrid neuro-symbolic reasoning.
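The contrast between query-level and operator-level batching can be illustrated with a small sketch. This is not the NGDB-Zoo implementation: the operator functions, the query encoding, and the name `run_operator_level` are all hypothetical, the "embeddings" are plain Python lists, and the operators are toy arithmetic rather than learned networks. It only shows the scheduling idea: nodes from queries with different topologies are pooled, and every node whose inputs are ready is executed in one batched call per operator type.

```python
from collections import defaultdict

# Toy batched operators (assumptions for illustration, not learned models).
def batch_anchor(args):
    # Each item is (entity_vector,).
    return [list(vec) for (vec,) in args]

def batch_project(args):
    # Each item is (input_vector, relation_vector): translate by the relation.
    return [[x + r for x, r in zip(vec, rel)] for vec, rel in args]

def batch_intersect(args):
    # Each item is a tuple of input vectors: take the element-wise mean.
    return [[sum(col) / len(col) for col in zip(*vecs)] for vecs in args]

OPS = {"anchor": batch_anchor, "project": batch_project, "intersect": batch_intersect}

def run_operator_level(queries):
    """Run all queries together, batching ready nodes by operator type."""
    results = {}  # (query index, node id) -> output vector
    pending = {(qi, nid): node
               for qi, query in enumerate(queries)
               for nid, node in query.items()}
    batched_calls = 0
    while pending:
        # Group every node whose dependencies are already computed.
        ready = defaultdict(list)
        for key, (op, deps, _const) in pending.items():
            if all((key[0], d) in results for d in deps):
                ready[op].append(key)
        if not ready:
            raise ValueError("cyclic or dangling dependency in a query graph")
        for op, keys in ready.items():
            args = []
            for qi, nid in keys:
                _op, deps, const = pending[(qi, nid)]
                inputs = tuple(results[(qi, d)] for d in deps)
                args.append(inputs + ((const,) if const is not None else ()))
            outputs = OPS[op](args)  # one call serves nodes from many queries
            batched_calls += 1
            for key, out in zip(keys, outputs):
                results[key] = out
        for keys in ready.values():
            for key in keys:
                del pending[key]
    return results, batched_calls

# Two queries with different topologies: a one-hop projection and a
# two-branch intersection. Node format: (operator, dependencies, constant).
QUERIES = [
    {"a": ("anchor", [], [1.0, 0.0]),
     "out": ("project", ["a"], [0.5, 0.5])},
    {"a": ("anchor", [], [0.0, 2.0]),
     "b": ("anchor", [], [2.0, 0.0]),
     "pa": ("project", ["a"], [1.0, 1.0]),
     "pb": ("project", ["b"], [1.0, 1.0]),
     "out": ("intersect", ["pa", "pb"], None)},
]

results, batched_calls = run_operator_level(QUERIES)
print(batched_calls)        # 3 batched calls cover all 7 operator nodes
print(results[(1, "out")])  # [2.0, 2.0]
```

In this toy run the seven operator nodes are served by three batched calls, even though the two queries have different shapes. A real system would replace the lists with GPU tensors and could dispatch independent operator groups on separate streams, as the abstract describes; the sketch does not model that.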

Topics: Distributed Systems & Hardware; Reasoning & Chain-of-Thought; Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
