Mila, IDEA | Feb 23, 2026 | arXiv:2602.19969

ReAttn: Improving Attention-based Re-ranking via Attention Re-weighting

Yuxing Tian, Fengran Mo, Weixu Zhang, Yiyan Qi, Jian-Yun Nie

AI Summary

The paper introduces ReAttn, a post-hoc attention re-weighting strategy to improve the performance of attention-based re-ranking methods using LLMs. ReAttn addresses the limitations of concentrated attention and lexical bias by incorporating cross-document IDF weighting to down-weight query-overlapping tokens and entropy-based regularization to encourage a more balanced attention distribution. Experiments demonstrate that ReAttn enhances the effectiveness of attention-based re-ranking without requiring additional training or supervision.

Key Contribution

Attention-based re-ranking gets a boost: ReAttn's post-hoc re-weighting tames over-concentration and lexical bias, leading to more accurate and interpretable results without extra training.

Abstract

The strong capabilities of recent Large Language Models (LLMs) have made them highly effective for zero-shot re-ranking tasks. Attention-based re-ranking methods, which derive relevance scores directly from attention weights, offer an efficient and interpretable alternative to generation-based re-ranking methods. However, they still face two major limitations. First, attention signals are highly concentrated on a small subset of tokens within a few documents, making the remaining documents indistinguishable. Second, attention often overemphasizes phrases lexically similar to the query, yielding biased rankings in which irrelevant documents with mere lexical resemblance are regarded as relevant. In this paper, we propose ReAttn, a post-hoc re-weighting strategy for attention-based re-ranking methods. It first computes a cross-document IDF weighting to down-weight attention on query-overlapping tokens that frequently appear across the candidate documents, reducing lexical bias and emphasizing distinctive terms. It then employs entropy-based regularization to mitigate over-concentrated attention, encouraging a more balanced distribution across informative tokens. Both adjustments operate directly on existing attention weights without additional training or supervision. Extensive experiments demonstrate the effectiveness of our method.
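The abstract does not give the exact formulas, so the following is only a minimal sketch of how the two adjustments might be combined, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the smoothed IDF form `log(1 + N/df) / log(1 + N)`, the use of power-law flattening with an exponent tied to normalized entropy as a stand-in for "entropy-based regularization", the `min_alpha` parameter, and the input layout (one aggregated attention value per document token).

```python
import math


def cross_doc_idf_weights(query_tokens, docs):
    """Per-token multipliers in (0, 1] (assumed form, not from the paper).

    Only tokens that also occur in the query are down-weighted; the more
    candidate documents contain such a token, the smaller its multiplier.
    """
    n_docs = len(docs)
    query = set(query_tokens)
    df = {}  # document frequency over the candidate set only
    for doc in docs:
        for tok in set(doc):
            df[tok] = df.get(tok, 0) + 1
    norm = math.log(1 + n_docs)
    return [
        [math.log(1 + n_docs / df[tok]) / norm if tok in query else 1.0
         for tok in doc]
        for doc in docs
    ]


def entropy_flatten(values, min_alpha=0.5):
    """Flatten a concentrated distribution (hypothetical regularizer).

    The values are normalized to probabilities and raised to an exponent
    alpha <= 1 equal to their normalized entropy (floored at min_alpha),
    so a more peaked distribution is flattened more strongly.
    """
    total = sum(values)
    if total <= 0 or len(values) < 2:
        return list(values)
    probs = [v / total for v in values]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    alpha = min(1.0, max(min_alpha, entropy / math.log(len(probs))))
    powered = [p ** alpha for p in probs]
    z = sum(powered)
    return [p / z for p in powered]


def reweighted_doc_scores(query_tokens, docs, attn, min_alpha=0.5):
    """Score each document by the sum of its re-weighted token attention."""
    idf = cross_doc_idf_weights(query_tokens, docs)
    flat = [a * w
            for doc_a, doc_w in zip(attn, idf)
            for a, w in zip(doc_a, doc_w)]
    flat = entropy_flatten(flat, min_alpha)
    scores, start = [], 0
    for doc in docs:
        scores.append(sum(flat[start:start + len(doc)]))
        start += len(doc)
    return scores


# Toy input: the first document repeats query words but is off-topic.
query = ["solar", "panel", "efficiency"]
docs = [
    ["solar", "panel", "sale", "solar", "panel"],
    ["solar", "cell", "efficiency", "improves", "with", "coating"],
    ["panel", "discussion", "on", "solar", "policy"],
]
attn = [
    [0.30, 0.25, 0.01, 0.10, 0.05],
    [0.05, 0.04, 0.08, 0.03, 0.01, 0.03],
    [0.02, 0.01, 0.005, 0.01, 0.005],
]
scores = reweighted_doc_scores(query, docs, attn)
```

In this toy input, the first document holds most of the raw attention mass because it repeats "solar" and "panel". After re-weighting, its share of the total score shrinks and the second document's share grows, which is the qualitative effect the abstract describes. Both steps read only the existing attention values, so no training is involved.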

Architecture Design (Transformers, SSMs, MoE) | Natural Language Processing | Recommendation & Information Retrieval

Citation Metrics

Citations: 0

Influential citations: 0

References: 51

Year: 2026

Venue: N/A
