HIT, HKUST, SEU, SUSTech, Texas A&M | Feb 25, 2026 | arXiv:2602.21864

DynamicGTR: Leveraging Graph Topology Representation Preferences to Boost VLM Capabilities on Graph QAs

Yanbin Wei, Jiangyue Yan, Chun Kang, Yang Chen, Huaizhong Liu, Hua Liu, James T. Kwok, Yu Zhang

AI Summary

The paper introduces DynamicGTR, a framework that dynamically selects the optimal Graph Topology Representation (GTR) for each query to improve Vision-Language Model (VLM) performance on graph question answering tasks. By considering model-specific and task-specific preferences for GTRs (e.g., visual images or text descriptions), DynamicGTR overcomes the limitations of using a single, fixed GTR. Experiments demonstrate that DynamicGTR enhances VLM-based graph algorithm QA performance and transfers effectively from synthetic tasks to real-world applications like link prediction and node classification, without requiring additional training.

Key Contribution

Stop feeding VLMs the same graph representation for every question: DynamicGTR picks the best one for each query, improving accuracy and brevity in graph QA without retraining.

Abstract

Vision-Language Models (VLMs) have emerged as versatile solutions for zero-shot question answering (QA) across various domains. However, enabling VLMs to effectively comprehend structured graphs and perform accurate, efficient QA remains challenging. Existing approaches typically rely on a single graph topology representation (GTR), such as fixed-style visual images or unified text descriptions. This "one-size-fits-all" strategy often neglects model-specific and task-specific preferences, resulting in inaccurate or overly long responses to graph-related queries. To address this, we propose the DynamicGTR framework, which dynamically selects the optimal GTR for each query during inference, thereby enhancing the zero-shot graph QA capabilities of VLMs with a customizable trade-off between accuracy and brevity. Extensive experiments show that DynamicGTR not only improves VLM-based graph algorithm QA performance but also successfully transfers the experience learned on synthetic graph algorithm tasks to real-world applications such as link prediction and node classification, without any additional training. Additionally, DynamicGTR demonstrates strong transferability across tasks, domains, and models, suggesting its potential as a flexible solution for broad graph scenarios.
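The abstract does not say how a GTR is scored or chosen, so the following is only a minimal sketch of what per-query selection with an adjustable accuracy and brevity trade-off could look like. Everything in it is an assumption made for illustration: the `GTRStats` fields, the `select_gtr` function, the linear penalty, the `brevity_weight` and `max_tokens` parameters, and the placeholder numbers, none of which come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GTRStats:
    """Hypothetical per-query estimates for one candidate representation."""
    name: str            # e.g. "visual_image" or "text_description"
    est_accuracy: float  # assumed estimate of answer correctness, in [0, 1]
    est_tokens: float    # assumed estimate of response length in tokens


def select_gtr(candidates, brevity_weight=0.0, max_tokens=1000.0):
    """Return the candidate with the highest accuracy-minus-length score.

    brevity_weight = 0 ranks by estimated accuracy alone; larger values
    favor representations expected to produce shorter responses.
    The linear scoring rule is an illustrative assumption.
    """
    if not candidates:
        raise ValueError("need at least one candidate GTR")

    def score(c):
        # Normalize length to [0, 1] so both terms share a scale.
        length_penalty = min(c.est_tokens / max_tokens, 1.0)
        return c.est_accuracy - brevity_weight * length_penalty

    return max(candidates, key=score)


# Placeholder values invented for this example, not results from the paper.
candidates = [
    GTRStats("visual_image", est_accuracy=0.80, est_tokens=150.0),
    GTRStats("text_description", est_accuracy=0.85, est_tokens=600.0),
]

accuracy_first = select_gtr(candidates, brevity_weight=0.0)
brevity_aware = select_gtr(candidates, brevity_weight=0.5)
print(accuracy_first.name, brevity_aware.name)
```

With these placeholder inputs, raising `brevity_weight` switches the choice from the slightly more accurate but longer representation to the shorter one, which is the kind of trade-off the abstract describes as customizable.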

Eval Frameworks & Benchmarks | Multimodal Models | Reasoning & Chain-of-Thought

Citation Metrics

Citations: 0

Influential citations: 0

References: 55

Year: 2026

Venue: N/A
