NeedleDB is introduced as an open-source database system that uses generative AI to improve image retrieval accuracy for complex natural language queries. It synthesizes guide images from text queries and performs image-to-image search using multiple vision embedders, aggregated via a weighted rank-fusion strategy with Monte Carlo error bounds. Experiments show NeedleDB achieves up to 93% improvement in Mean Average Precision compared to CLIP-based methods, while maintaining sub-second query latency.
Generative AI can substantially improve image retrieval accuracy for complex queries: NeedleDB raises Mean Average Precision by up to 93% over contrastive-learning baselines such as CLIP.
We demonstrate NeedleDB, an open-source, deployment-ready database system for answering complex natural language queries over image data. Unlike existing approaches that rely on contrastive-learning embeddings (e.g., CLIP), which degrade on compositional or nuanced queries, NeedleDB leverages generative AI to synthesize guide images that represent the query in the visual domain, transforming the text-to-image retrieval problem into a more tractable image-to-image search. The system aggregates nearest-neighbor results across multiple vision embedders using a weighted rank-fusion strategy grounded in a Monte Carlo estimator with provable error bounds. NeedleDB ships with a full-featured command-line interface (needlectl), a browser-based Web UI, and a modular microservice architecture backed by PostgreSQL and Milvus. On challenging benchmarks, it improves Mean Average Precision by up to 93% over the strongest baseline while maintaining sub-second query latency. In our demonstration, attendees interact with NeedleDB through three hands-on scenarios that showcase its retrieval capabilities, data ingestion workflow, and pipeline configurability.
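To make the aggregation step concrete, the following is a minimal sketch of how ranked nearest-neighbor lists from several vision embedders could be combined with per-embedder weights. The abstract does not give NeedleDB's fusion formula, so this sketch substitutes a weighted form of reciprocal rank fusion as a stand-in. The function name `weighted_rank_fusion`, the embedder names, the weights, and the smoothing constant `k` are all assumptions chosen for illustration. The Monte Carlo estimator and its error bounds are not modeled here.

```python
from collections import defaultdict


def weighted_rank_fusion(rankings, weights, k=60, top_n=10):
    """Fuse ranked result lists from several embedders into one ranking.

    rankings: dict mapping embedder name -> list of image ids, best first.
    weights:  dict mapping embedder name -> non-negative weight.
    k:        smoothing constant of reciprocal rank fusion (assumed value).
    """
    scores = defaultdict(float)
    for embedder, ranked_ids in rankings.items():
        weight = weights.get(embedder, 0.0)
        for rank, image_id in enumerate(ranked_ids, start=1):
            # Each embedder contributes more for images it ranks higher.
            scores[image_id] += weight / (k + rank)
    # Sort by descending fused score; break ties by id for determinism.
    ordered = sorted(scores, key=lambda image_id: (-scores[image_id], image_id))
    return ordered[:top_n]


# Hypothetical nearest-neighbor results for one guide image.
rankings = {
    "embedder_a": ["img3", "img1", "img2"],
    "embedder_b": ["img1", "img3", "img4"],
}
weights = {"embedder_a": 0.6, "embedder_b": 0.4}

fused = weighted_rank_fusion(rankings, weights)
print(fused)  # ['img3', 'img1', 'img2', 'img4']
```

In this toy input, `img3` edges out `img1` because the more heavily weighted embedder ranks it first; swapping the two weights reverses their order. Images returned by only one embedder still appear in the fused list, but with lower scores.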