May 5, 2026 | arXiv:2605.03275

Beyond Similarity Search: A Unified Data Layer for Production RAG Systems

Venkata Krishna Prasanth Budigi, Siri Chandana Sirigiri

AI Summary

This paper analyzes the performance gap between prototype and production RAG systems, attributing it to data staleness, tenant data leakage, and query composition explosion stemming from split-system data layers. It proposes a unified data layer built on PostgreSQL with pgvector and HNSW indexing to address these issues. Benchmarks on 50,000 documents demonstrate significant latency reductions (up to 92%) and the elimination of data leakage compared to conventional approaches.

Key Contribution

Replacing the split-system RAG stack with a unified PostgreSQL data layer cuts latency by up to 92% and eliminates cross-tenant data leakage in the reported benchmarks.

Abstract

Retrieval-Augmented Generation (RAG) systems have become the standard architecture for grounding large language models in organizational knowledge. Yet production deployments consistently expose a gap between clean prototype performance and real-world reliability. This paper identifies three root causes of that gap: data staleness, tenant data leakage, and query composition explosion. All three trace back to the conventional split-system data layer. We propose and evaluate a unified data layer built on PostgreSQL with native vector search (pgvector) and HNSW indexing. Controlled benchmarks on 50,000 documents show a 92% latency reduction for date-filtered queries, a 74% reduction for tenant-scoped queries, zero synchronization inconsistency, complete elimination of cross-tenant data leakage, and 93% less synchronization code. We additionally discuss a recommended hybrid tier architecture
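To make the idea of a unified data layer concrete, the following is a minimal sketch of how a tenant-scoped, date-filtered similarity search might be expressed as a single PostgreSQL statement with pgvector. It is not the authors' implementation: the `documents` table, its column names, the toy 3-dimensional embedding, and the helper functions are all hypothetical. The `%s` placeholders assume a psycopg-style driver. The sketch only builds the SQL and its parameters, so it runs without a database.

```python
from datetime import date
from typing import Sequence, Tuple

# Hypothetical schema: metadata, tenant scope, and embedding live in one table,
# so there is no separate vector store to keep in sync.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS documents (
    id         bigserial PRIMARY KEY,
    tenant_id  text      NOT NULL,
    updated_at date      NOT NULL,
    content    text      NOT NULL,
    embedding  vector(3) NOT NULL  -- toy dimension, for illustration only
);
CREATE INDEX IF NOT EXISTS documents_embedding_hnsw
    ON documents USING hnsw (embedding vector_cosine_ops);
"""


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_scoped_query(
    tenant_id: str, since: date, embedding: Sequence[float], k: int = 5
) -> Tuple[str, tuple]:
    """Return (sql, params) for one query combining tenant scope,
    a date filter, and cosine-distance ordering (the <=> operator)."""
    sql = (
        "SELECT id, content FROM documents "
        "WHERE tenant_id = %s AND updated_at >= %s "
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    params = (tenant_id, since, to_vector_literal(embedding), k)
    return sql, params


if __name__ == "__main__":
    sql, params = build_scoped_query("tenant-a", date(2026, 1, 1), [0.1, 0.2, 0.3])
    print(sql)
    print(params)
```

With a live connection, the returned pair could be passed to a cursor's `execute(sql, params)`. Because the tenant and date predicates sit in the same statement as the vector ordering, no application-side join between a vector store and a metadata store is needed in this sketch.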

Data Curation & Synthetic Data | Recommendation & Information Retrieval

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
