This paper introduces the concept of "locality radius" to quantify the minimum structural neighborhood needed for prediction in relational schema tasks. The authors hypothesize, and report empirical evidence, that GNN performance on database learning tasks depends on the alignment between the task's locality radius and the model's aggregation depth. In experiments on tasks such as foreign key prediction and join cost estimation, they observe a consistent bias-radius alignment effect, suggesting that multi-hop reasoning is not always necessary for relational tasks.
GNNs for relational data are often overkill: performance peaks when a model's aggregation depth matches the task's "locality radius," suggesting simpler models can be optimal.
Foreign key discovery and related schema-level prediction tasks are often modeled using graph neural networks (GNNs), implicitly assuming that relational inductive bias improves performance. However, it remains unclear when multi-hop structural reasoning is actually necessary. In this work, we introduce locality radius, a formal measure of the minimum structural neighborhood required to determine a prediction in relational schemas. We hypothesize that model performance depends critically on alignment between task locality radius and architectural aggregation depth. We conduct a controlled empirical study across foreign key prediction, join cost estimation, blast radius regression, cascade impact classification, and additional graph-derived schema tasks. Our evaluation includes multi-seed experiments, capacity-matched comparisons, statistical significance testing, scaling analysis, and synthetic radius-controlled benchmarks. Results reveal a consistent bias-radius alignment effect.
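To make the notion of a structural neighborhood concrete, the following is a minimal sketch of k-hop neighborhood extraction on a schema graph. It is not the paper's formal definition of locality radius, which the abstract does not spell out. The helper name `k_hop_neighborhood` and the toy `schema` graph are hypothetical, introduced only for illustration. The sketch relies on one standard fact: a message-passing GNN with k aggregation layers can only use information from nodes within k hops of the target node.

```python
from collections import deque


def k_hop_neighborhood(adj, source, k):
    """Return the set of nodes within k hops of `source`, including `source`.

    `adj` maps each node to an iterable of its neighbors (undirected graph
    stored as a symmetric adjacency dict). Hypothetical helper for illustration.
    """
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            # Do not expand past the requested radius.
            continue
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return seen


# Hypothetical toy schema: nodes are tables, edges are foreign-key links.
schema = {
    "orders": ["customers", "order_items"],
    "customers": ["orders", "addresses"],
    "order_items": ["orders", "products"],
    "addresses": ["customers"],
    "products": ["order_items"],
}

# Neighborhood visible to a model with 0, 1, and 2 aggregation layers.
for depth in range(3):
    print(depth, sorted(k_hop_neighborhood(schema, "orders", depth)))
```

Under this reading, a task whose label for `orders` depends only on its directly linked tables would be served by one aggregation layer, while a task that depends on `products` or `addresses` would need two.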