NAVER Labs · Apr 29, 2026 · arXiv:2604.26483

Efficient Listwise Reranking with Compressed Document Representations

Hervé Déjean, Stéphane Clinchant

AI Summary

The paper introduces RRK, a listwise reranker that compresses documents into fixed-size embeddings to improve efficiency. Trained via distillation, RRK achieves significant speedups (3x-18x) compared to smaller rerankers while maintaining or improving effectiveness, especially on long documents. This approach leverages rich compressed representations to enable efficient listwise reranking with large language models.

Key Contribution

RRK compresses each document into a fixed-size, multi-token embedding representation, letting an 8B-parameter listwise reranker run 3x-18x faster than smaller rerankers (0.6-4B parameters) while matching or outperforming them in effectiveness.

Abstract

Reranking, the process of refining the output from a first-stage retriever, is often considered computationally expensive, especially when using Large Language Models (LLMs). A common approach to mitigate this cost involves utilizing smaller LLMs or controlling input length. Inspired by recent advances in document compression for retrieval-augmented generation (RAG), we introduce RRK, an efficient and effective listwise reranker compressing documents into multi-token fixed-size embedding representations. Our simple training via distillation shows that this combination of rich compressed representations and listwise reranking yields a highly efficient and effective system. In particular, our 8B-parameter model runs 3x-18x faster than smaller rerankers (0.6-4B parameters) while matching or outperforming them in effectiveness. The efficiency gains are even more striking on long-document benchmarks, where RRK widens its advantage further.
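As a rough illustration of why fixed-size document representations can make listwise reranking cheaper, the sketch below compares the length of the reranker's input sequence when documents are fed as raw tokens with the length when each document is replaced by a fixed number of embedding slots. The function name, its parameters, and all numeric values are assumptions chosen for illustration and are not taken from the paper. The sketch ignores the cost of the compression step itself and does not model latency, so the ratio it prints is not a speedup estimate.

```python
from typing import Optional


def listwise_input_length(
    num_docs: int,
    tokens_per_doc: int,
    query_tokens: int,
    slots_per_doc: Optional[int] = None,
) -> int:
    """Length of the reranker's input sequence for one listwise call.

    If slots_per_doc is None, every document is fed as raw text tokens.
    Otherwise every document is replaced by a fixed number of embedding slots,
    independent of its original length.
    """
    per_doc = tokens_per_doc if slots_per_doc is None else slots_per_doc
    return query_tokens + num_docs * per_doc


# Hypothetical values for illustration only (not reported in the paper).
raw_length = listwise_input_length(num_docs=50, tokens_per_doc=512, query_tokens=32)
compressed_length = listwise_input_length(
    num_docs=50, tokens_per_doc=512, query_tokens=32, slots_per_doc=8
)

print("raw input length:", raw_length)
print("compressed input length:", compressed_length)
print("length ratio:", round(raw_length / compressed_length, 1))
```

Because the compressed length does not depend on `tokens_per_doc`, the gap between the two input lengths grows with document length, which is consistent with the abstract's remark that the efficiency gains are more striking on long-document benchmarks.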

Inference & Quantization · Natural Language Processing · Recommendation & Information Retrieval

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
