DeepAuto.ai · KAIST · May 28, 2026 · arXiv:2605.29250

OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources

Jinheon Baek, Soyeong Jeong, Sangwoo Park, Woongyeong Yeo, Minki Kang, Patara Trirat, Heejun Lee, S. Hwang

AI Summary

OmniRetrieval is introduced as a framework for unified retrieval across heterogeneous knowledge sources, including text, relational tables, and knowledge graphs, by dispatching natural language queries to source-native query engines. This approach avoids homogenizing data, preserving the structural affordances of each source. Experiments across 13 datasets and 309 knowledge bases show that OmniRetrieval outperforms single-source baselines, demonstrating its effectiveness as a general-purpose interface.

Key Contribution

Stop forcing all your data into one format: OmniRetrieval queries text, tables, and graphs through their native structures, and it outperforms single-source baselines.

Abstract

Real-world information needs require access to structurally diverse knowledge sources, from unstructured text and relational tables to knowledge graphs and property graphs. Existing retrievers, however, operate over one source at a time under a fixed query language, leaving the broader landscape of available knowledge fragmented behind incompatible interfaces. A natural attempt at unification would collapse these sources into a shared space, but this erases the structural affordances (such as schemas, ontologies, and compositional operators) that give each source its expressive power. Effective retrieval over diverse knowledge, therefore, requires not homogenization but an overarching layer that meets each source on its own terms. To achieve this, we present OmniRetrieval, a framework that takes any natural-language query, identifies appropriate knowledge sources, and dispatches source-native queries to their native execution engines. Across an extensive benchmark spanning 13 datasets and 309 distinct knowledge bases over text, relational, and graph-structured sources, OmniRetrieval exceeds single-source baselines, demonstrating that it can serve as a general-purpose interface to heterogeneous sources while preserving the structural distinctions that make each source valuable.
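The identify-then-dispatch flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the keyword rules in `route`, the three stub engines, and all data below are hypothetical stand-ins, and the hand-written rules merely take the place of whatever source selection and query generation the framework actually uses. The point is only that each source keeps its own query form (keywords, SQL, a triple pattern) instead of being collapsed into one shared representation.

```python
import re
import sqlite3

# Hypothetical toy sources; none of this data comes from the paper.
DOCS = [
    "KAIST is a research university in Daejeon.",
    "SPARQL is a query language for RDF graphs.",
]
TRIPLES = [("KAIST", "located_in", "Daejeon"), ("Daejeon", "country", "South Korea")]

CONN = sqlite3.connect(":memory:")
CONN.execute("CREATE TABLE papers (title TEXT, year INTEGER)")
CONN.executemany("INSERT INTO papers VALUES (?, ?)", [("A", 2024), ("B", 2026)])


def text_engine(keywords):
    """Native query: a list of keywords; returns documents containing all of them."""
    return [d for d in DOCS if all(k in d.lower() for k in keywords)]


def relational_engine(native):
    """Native query: (SQL string, parameters), executed by SQLite itself."""
    sql, params = native
    return CONN.execute(sql, params).fetchall()


def graph_engine(pattern):
    """Native query: a (subject, predicate) pattern matched against triples."""
    subj, pred = pattern
    return [o for s, p, o in TRIPLES if s == subj and p == pred]


ENGINES = {"text": text_engine, "relational": relational_engine, "graph": graph_engine}


def route(question: str):
    """Pick a source and build its native query (hand-written rules, assumed)."""
    q = question.lower()
    year = re.search(r"\d{4}", q)
    if "how many" in q and year:
        sql = "SELECT COUNT(*) FROM papers WHERE year = ?"
        return "relational", (sql, (int(year.group()),))
    if "where is" in q:
        entity = question.rstrip("?").split()[-1]
        return "graph", (entity, "located_in")
    keywords = [w for w in q.rstrip("?").split() if len(w) > 4]
    return "text", keywords


def retrieve(question: str):
    """Dispatch the question to the selected source's own engine."""
    source, native_query = route(question)
    return source, ENGINES[source](native_query)
```

Calling `retrieve("Where is KAIST?")` goes to the graph stub, while `retrieve("How many papers were published in 2026?")` runs as SQL. Keeping one engine per source is the design choice the abstract argues for; the routing rules here are deliberately simplistic.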

Natural Language Processing · Recommendation & Information Retrieval

Citation Metrics

Citations: 0

Influential citations: 0

References: 51

Year: 2026

Venue: N/A
