Recommendation & Information Retrieval
Applications: search systems, recommendation engines, retrieval-augmented generation, dense retrieval, and ranking models.
Recent Papers
The paper introduces SAGEO Arena, a realistic evaluation environment for Search-Augmented Generative Engine Optimization (SAGEO) that addresses limitations of existing benchmarks by incorporating a full generative search pipeline over a large-scale corpus of web documents with rich structural information. The authors demonstrate that existing optimization approaches are often impractical and degrade performance in retrieval and reranking stages under realistic conditions. The study highlights the importance of structural information and stage-specific optimization for effective SAGEO.
Introduces SAGEO Arena, a novel benchmark environment enabling realistic, stage-level evaluation of search-augmented generative engine optimization strategies.
This paper introduces TopoFair, a benchmarking framework for fair link prediction that focuses on the impact of diverse topological biases beyond homophily. They formalize a taxonomy of topological bias measures and develop a graph generation method that allows for controlled variation of these biases while maintaining real-world graph characteristics. Through empirical evaluation of link prediction models, including fairness-aware methods, they demonstrate the sensitivity of fairness interventions to these structural biases.
Introduces a novel benchmarking framework, TopoFair, to analyze the interplay between topological biases and fairness in link prediction.
This paper studies bandit learning in two-sided matching markets where agents and firms conduct interviews to learn preferences. The authors introduce strategic deferral, allowing firms to delay hiring decisions and recover from suboptimal matches, and model interviews as low-cost hints that reveal partial preference information. They develop novel algorithms for centralized and decentralized settings that achieve time-independent regret, improving upon logarithmic regret bounds for learning stable matchings without interviews.
Introduces strategic deferral for firms in matching markets, enabling decentralized learning and recovery from suboptimal hires.
The paper introduces RELATE, a reinforcement learning framework for end-to-end advertising text generation that directly optimizes for conversion-oriented metrics and compliance constraints. RELATE integrates performance and compliance objectives into the text generation process via policy learning, moving beyond the traditional two-stage generation and alignment paradigm. Experiments on industrial datasets and online deployment show that RELATE significantly improves click-through conversion rate (CTCVR) while adhering to policy constraints.
Introduces an end-to-end reinforcement learning framework, RELATE, that unifies advertising text generation with conversion-oriented objective alignment and compliance constraints.
The paper introduces IncompeBench, a new benchmark for Music Information Retrieval (MIR) consisting of 1,574 permissively licensed music snippets, 500 diverse queries, and over 125,000 relevance judgements. This benchmark addresses the lack of high-quality evaluation datasets in MIR, enabling more rigorous and reproducible research. High inter-annotator agreement was achieved through a multi-stage annotation pipeline, ensuring data quality.
Provides IncompeBench, a permissively licensed, fine-grained benchmark dataset to facilitate advancements in music information retrieval.
This paper introduces a subword embedding approach to detect lexical and orthographic variation in user-generated text, specifically addressing the challenges of "noisy" and low-resource settings without relying on normalization or predefined variant lists. The method trains subword embeddings on raw Luxembourgish user comments and clusters related forms using a combination of cosine similarity and n-gram similarity. The results demonstrate the effectiveness of distributional modeling in uncovering meaningful patterns of variation, aligning with existing dialectal and sociolinguistic research.
Introduces a novel subword embedding method that automatically discovers and clusters lexical variations in user-generated text, even in low-resource languages, without requiring prior normalization or predefined variant lists.
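The combination of cosine and n-gram similarity described above can be sketched in a few lines. This is a toy illustration, not the paper's method: hashed character-trigram vectors stand in for trained subword embeddings, and `alpha` and `threshold` are made-up weights; only the idea of clustering forms by a blended similarity score follows the summary.

```python
import math
from itertools import combinations

def char_ngrams(word, n=3):
    w = f"<{word}>"
    return {w[i:i + n] for i in range(len(w) - n + 1)}

def ngram_sim(a, b):
    # Jaccard overlap of character trigrams (the "n-gram similarity" part).
    A, B = char_ngrams(a), char_ngrams(b)
    return len(A & B) / len(A | B) if A | B else 0.0

def hash_vec(word, dim=64):
    # Toy stand-in for a trained subword embedding: hash trigrams into a vector.
    v = [0.0] * dim
    for g in char_ngrams(word):
        v[hash(g) % dim] += 1.0
    return v

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def cluster_variants(words, alpha=0.5, threshold=0.4):
    # Greedy single-link clustering on the combined similarity score,
    # merged with a tiny union-find.
    vecs = {w: hash_vec(w) for w in words}
    parent = {w: w for w in words}
    def find(w):
        while parent[w] != w:
            w = parent[w]
        return w
    for a, b in combinations(words, 2):
        score = alpha * cosine(vecs[a], vecs[b]) + (1 - alpha) * ngram_sim(a, b)
        if score >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for w in words:
        groups.setdefault(find(w), []).append(w)
    return list(groups.values())
```

For example, `cluster_variants(["haus", "hauss", "buch"])` groups the two spelling variants together while leaving the unrelated word apart.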
The paper introduces RI-Mamba, a rotation-invariant state-space model for text-to-shape retrieval that addresses the limitations of existing methods in handling objects with arbitrary orientations and diverse categories. RI-Mamba disentangles pose from geometry using global and local reference frames and Hilbert sorting to create rotation-invariant token sequences. The model incorporates orientational embeddings via feature-wise linear modulation and employs cross-modal contrastive learning with automated triplet generation for scalable training, achieving state-of-the-art results on the OmniObject3D benchmark.
Introduces a novel rotation-invariant state-space model, RI-Mamba, for robust text-to-shape retrieval by disentangling pose from geometry and incorporating orientational embeddings.
The paper introduces ULTRA, a transformer-based recommendation architecture for Urdu, a low-resource language, to improve personalized news retrieval. ULTRA employs a dual-embedding architecture with a query-length aware routing mechanism to handle varying query lengths, directing queries to either title/headline-level or full-content pipelines. Experiments on a large Urdu news corpus demonstrate that ULTRA achieves over 90% precision, outperforming single-pipeline baselines in recommendation relevance.
Introduces a query-adaptive dual-embedding architecture for semantic content recommendation in low-resource languages, dynamically routing queries based on length to optimize retrieval relevance.
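The routing idea can be sketched as follows. This is illustrative only: the 4-token threshold and the pipeline names are assumptions, not values from the paper, which presumably tunes or learns the routing decision.

```python
def route_query(query, short_max_tokens=4):
    # Short queries tend to match headline-level embeddings; longer,
    # sentence-like queries benefit from full-content embeddings.
    n_tokens = len(query.split())
    return "title" if n_tokens <= short_max_tokens else "content"

def retrieve(query, title_search, content_search):
    # Dispatch to the title/headline-level or full-content pipeline.
    pipeline = route_query(query)
    search = title_search if pipeline == "title" else content_search
    return pipeline, search(query)
```

A two-word query is routed to the title index, while a full question goes to the content index.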
The paper introduces Multi-Level Compression Cross Networks (MLCC) and its multi-channel extension (MC-MLCC) to efficiently model high-order feature interactions in recommender systems. MLCC uses hierarchical compression and dynamic composition to capture feature dependencies with favorable computational complexity, while MC-MLCC decomposes feature interactions into parallel subspaces for efficient horizontal scaling. Experiments on public and industrial datasets demonstrate that MLCC and MC-MLCC outperform DLRM-style baselines, achieving up to 0.52 AUC improvement and up to 26x reduction in parameters and FLOPs, and the approach has been adopted in Bilibili's advertising system.
Introduces a novel feature interaction architecture, MLCC, that uses hierarchical compression and dynamic composition to efficiently capture high-order feature interactions, along with its multi-channel extension, MC-MLCC, for improved scalability.
The paper introduces CitiLink-Minutes, a novel multilayer dataset of 120 European Portuguese municipal meeting minutes from six municipalities, designed to address the lack of annotated datasets for NLP and IR research in this domain. The dataset features over one million tokens with de-identified personal information and includes manual annotations across metadata, subjects of discussion, and voting outcomes. Experiments demonstrate the dataset's utility for downstream tasks like metadata extraction, topic classification, and vote labeling, facilitating transparent access to municipal decisions.
Contributes CitiLink-Minutes, a unique multilayer annotated dataset of municipal meeting minutes, enabling NLP and IR research on local governance.
The paper investigates how best to pretrain small language models (SLMs) to decide which tokens to predict directly and which to delegate to an external source via a special token. The authors find that loss alone is insufficient for determining optimal delegation, as some high-loss tokens represent acceptable alternative continuations. They introduce LaCy, a pretraining method that uses a spaCy grammar parser to augment the loss signal, enabling SLMs to learn when to delegate and resulting in improved FactScore in cascaded generation setups compared to other methods.
Introduces LaCy, a pretraining method that leverages a spaCy grammar parser to augment the loss signal, enabling SLMs to learn when to delegate token prediction to an external source.
The paper introduces "analytical search" as a new search paradigm tailored for complex analytical information needs, addressing the limitations of relevance-based ranking and retrieval-augmented generation (RAG) in tasks requiring trend analysis, causal inference, and verifiable conclusions. It proposes a system framework that integrates query understanding, recall-oriented retrieval, reasoning-aware fusion, and adaptive verification to support structured, multi-step inference. The authors argue that analytical search offers improved control over reasoning, evidence usage, and verifiability, leading to more accountable and utility-driven results compared to existing search paradigms.
Introduces and formalizes the concept of "analytical search" as a distinct search paradigm designed to address complex analytical information needs by emphasizing evidence-governed, process-oriented workflows.
The paper introduces SIGHT, a reinforcement learning framework designed to improve search-based reasoning in LLMs by mitigating redundancy and noise in search results. SIGHT uses Self-Evidence Support (SES) to distill search results into high-fidelity evidence and employs an Information Gain score to identify pivotal states for Dynamic Prompting Interventions like de-duplication and adaptive branching. By integrating SES and correctness rewards via Group Relative Policy Optimization, SIGHT achieves superior performance on single-hop and multi-hop QA benchmarks with fewer search steps compared to existing methods.
Introduces a novel reinforcement learning framework, SIGHT, that leverages self-evidence support and information-gain driven diverse branching to enhance search-based reasoning in LLMs.
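One standard way to instantiate an information-gain score like SIGHT's is as an entropy reduction over the model's answer distribution; the sketch below uses that generic definition, which may differ from the paper's exact formulation.

```python
import math

def entropy(dist):
    # Shannon entropy (bits) of a discrete distribution.
    return -sum(p * math.log2(p) for p in dist if p > 0)

def information_gain(prior, posterior):
    # Gain of a search step = how much it sharpens the answer distribution.
    # A near-zero gain flags a redundant step -- a natural trigger for
    # interventions such as de-duplication or adaptive branching.
    return entropy(prior) - entropy(posterior)
```

A step that moves a uniform belief toward one answer yields positive gain; a step that leaves the distribution unchanged yields zero.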
The paper introduces AlphaPROBE, a novel framework for alpha factor mining in quantitative finance that represents the factor pool as a Directed Acyclic Graph (DAG) to capture the evolutionary relationships between factors. AlphaPROBE employs a Bayesian Factor Retriever to identify promising seed factors and a DAG-aware Factor Generator to produce context-aware and non-redundant optimizations based on the full ancestral trace of factors. Experiments on Chinese stock market datasets demonstrate that AlphaPROBE outperforms existing methods in predictive accuracy, return stability, and training efficiency by leveraging the global evolutionary topology.
Introduces a DAG-based framework for alpha factor mining that explicitly models the evolutionary relationships between factors to improve search efficiency and factor diversity.
This paper investigates the phenomenon of "token overflow" in soft compression architectures for retrieval-augmented generation (RAG), where compressed token representations lose task-relevant information. The authors propose a methodology to characterize and detect token overflow, evaluating it within the xRAG framework. Their key finding is that lightweight probing classifiers, leveraging both query and context xRAG representations, achieve an average AUC-ROC of 0.72 in detecting overflow across HotpotQA, SQuADv2, and TriviaQA datasets, demonstrating the importance of query-aware detection.
Introduces a methodology using lightweight probing classifiers to detect token overflow in compressed token representations for retrieval-augmented generation by leveraging query and context information.
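A minimal version of such a probe needs no ML libraries. Everything below is a stand-in: two-dimensional synthetic features in place of concatenated query/context xRAG representations, and a hand-rolled logistic regression in place of whatever classifier the authors use; only the probe-plus-AUC evaluation pattern follows the summary.

```python
import math
import random

def train_probe(X, y, lr=0.5, epochs=200):
    # Hand-rolled logistic regression; in the query-aware setup each row of X
    # would concatenate the query and context representations.
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-max(min(z, 30.0), -30.0)))
            g = p - yi
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-max(min(z, 30.0), -30.0)))

def auc_roc(scores, labels):
    # AUC-ROC as P(random positive scores above random negative).
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic "overflow" data: positives drift along the first feature.
random.seed(0)
X = [[random.gauss(1, 0.5), random.gauss(0, 1)] for _ in range(40)] + \
    [[random.gauss(-1, 0.5), random.gauss(0, 1)] for _ in range(40)]
y = [1] * 40 + [0] * 40
w, b = train_probe(X, y)
auc = auc_roc([predict(w, b, x) for x in X], y)
```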
This paper introduces a French-focused benchmark for PDF-to-Markdown conversion using VLMs, addressing the lack of evaluation datasets for non-English documents and the over-penalization of formatting variations in existing benchmarks. The benchmark consists of challenging French documents selected via model-disagreement sampling and is evaluated using unit-test-style checks targeting specific failure modes like text presence and reading order, combined with category-specific normalization. Results across 15 models show that proprietary models exhibit higher robustness on handwriting and forms, while open-weight models are competitive on standard layouts.
Introduces a new French-language PDF-to-Markdown benchmark with targeted unit tests and category-specific normalization to more accurately assess VLM performance in RAG pipelines.
The paper introduces Hi-SAM, a novel multi-modal recommendation framework designed to address limitations in semantic ID-based approaches, specifically suboptimal tokenization and architecture-data mismatch. Hi-SAM employs a Disentangled Semantic Tokenizer (DST) that uses geometry-aware alignment and coarse-to-fine quantization to separate shared and modality-specific semantics, and a Hierarchical Memory-Anchor Transformer (HMAT) that incorporates hierarchical positional encoding and anchor tokens to better model user-item interactions. Experiments on real-world datasets and a large-scale social platform demonstrate that Hi-SAM outperforms state-of-the-art baselines, particularly in cold-start scenarios, achieving a 6.55% improvement in a core online metric.
Introduces a hierarchical structure-aware multi-modal framework, Hi-SAM, that disentangles cross-modal semantics and modality-specific details during tokenization and incorporates hierarchical positional encoding within a transformer architecture for improved recommendation performance.
The paper introduces Meta-Sel, a supervised meta-learning approach for efficient demonstration selection in in-context learning, which addresses the challenge of selecting optimal few-shot examples under a limited prompt budget. Meta-Sel learns a scoring function based on TF-IDF cosine similarity and length-compatibility ratio between candidate demonstrations and queries, trained on a meta-dataset constructed from training data using class agreement as supervision. Empirical evaluation across four intent datasets and five LLMs demonstrates that Meta-Sel achieves competitive accuracy and selection-time overhead compared to 12 other demonstration selection methods, especially benefiting smaller models.
Introduces Meta-Sel, a lightweight supervised meta-learning approach that learns a fast, interpretable scoring function for selecting demonstrations for in-context learning.
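The two features named in the summary can be computed directly. Note the hedge: Meta-Sel learns its scoring function from a meta-dataset, whereas the sketch below simply combines the two features with a fixed, made-up weight `alpha`.

```python
import math
from collections import Counter

def tfidf_vectors(texts):
    # Plain TF-IDF over whitespace tokens (smoothed idf).
    docs = [Counter(t.lower().split()) for t in texts]
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(d.keys())
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    return [{t: c * idf[t] for t, c in d.items()} for d in docs]

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def length_compat(query, demo):
    # Ratio of the shorter length to the longer, in [0, 1].
    lq, ld = len(query.split()), len(demo.split())
    return min(lq, ld) / max(lq, ld)

def score_demonstrations(query, candidates, alpha=0.7):
    # Rank candidate demonstrations for the prompt budget.
    vecs = tfidf_vectors(candidates + [query])
    qv = vecs[-1]
    return sorted(
        ((alpha * cosine(qv, dv) + (1 - alpha) * length_compat(query, demo), demo)
         for demo, dv in zip(candidates, vecs[:-1])),
        reverse=True,
    )
```

On a toy intent pool, the lexically closest demonstration ranks first.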
The authors introduce IntTravel, a large-scale dataset with 4.1 billion interactions for integrated travel recommendation, addressing the limitations of existing datasets that focus solely on next POI recommendation. To leverage this dataset, they propose a decoder-only generative framework that balances task collaboration and differentiation through information preservation, selection, and factorization. Experiments demonstrate state-of-the-art performance on IntTravel and another benchmark dataset, with a successful deployment on Amap resulting in a 1.09% CTR increase.
Introduces a large-scale dataset, IntTravel, and a novel generative framework for integrated multi-task travel recommendation, demonstrating improved performance and real-world impact.
The paper introduces LASER, a full-stack optimization framework for efficient long sequence modeling in recommendation systems, addressing I/O and computational bottlenecks. LASER incorporates SeqVault, a hybrid DRAM-SSD indexing strategy, to reduce retrieval latency, and Segmented Target Attention (STA), a novel attention mechanism with a sigmoid-based gating strategy and Global Stacked Target Attention (GSTA), to reduce computational complexity. Online A/B testing showed LASER achieved significant improvements in ADVV and revenue, demonstrating its practical impact.
Introduces a full-stack optimization framework, LASER, featuring SeqVault and Segmented Target Attention (STA), to achieve efficient long sequence modeling for recommendation systems.
This paper investigates the use of Time Series Foundation Models (TSFMs) for forecasting commencing student enrollments in data-sparse higher education settings. The authors introduce the Institutional Operating Conditions Index (IOCI), a novel covariate derived from time-stamped documentary evidence, and combine it with Google Trends data to improve forecast accuracy. Results from an expanding-window backtest demonstrate that covariate-conditioned TSFMs achieve performance comparable to classical benchmarks without institution-specific training, highlighting their potential for zero-shot enrollment forecasting.
Introduces the Institutional Operating Conditions Index (IOCI), a transferable covariate derived from documentary evidence, to enhance TSFM-based enrollment forecasting in data-sparse environments.
The paper introduces a query-focused and memory-aware reranking framework that leverages attention scores from selected heads in large language models to estimate passage-query relevance in a listwise manner. This approach generates continuous relevance scores, allowing training on diverse retrieval datasets and capturing holistic information from the candidate shortlist. Experiments show the method outperforms existing pointwise and listwise rerankers on Wikipedia, long narrative datasets, and the LoCoMo benchmark, achieving state-of-the-art results.
Introduces a novel reranking framework that utilizes attention scores from specific heads to estimate passage-query relevance in a listwise, memory-aware fashion.
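The aggregation step can be illustrated with toy data. The paper selects specific heads and trains on diverse retrieval datasets; the sketch below only shows how attention mass over per-passage token spans turns into continuous listwise scores, with a hand-written attention matrix in place of real LLM activations.

```python
def attention_scores(attn, passage_spans):
    # attn[i][j]: attention from query token i to candidate token j
    # (from selected LLM heads in the paper; toy numbers here).
    # Each passage's score is its mean received attention mass, giving
    # continuous relevance scores over the whole candidate shortlist.
    n_q = len(attn)
    scores = []
    for start, end in passage_spans:
        mass = sum(sum(row[start:end]) for row in attn)
        scores.append(mass / (n_q * (end - start)))
    return scores

def rerank(passages, attn, passage_spans):
    scores = attention_scores(attn, passage_spans)
    order = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [passages[i] for i in order]
```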
The authors introduce RokomariBG, a large-scale, multi-entity heterogeneous book graph dataset for personalized Bangla book recommendation, addressing the lack of resources in this low-resource language setting. They construct a knowledge graph comprising books, users, authors, categories, publishers, and reviews connected through eight relation types. Through benchmarking experiments on Top-N recommendation using collaborative filtering, matrix factorization, content-based methods, graph neural networks, and neural retrieval models, they demonstrate the dataset's utility and the importance of leveraging multi-relational structure and textual side information, achieving an NDCG@10 of 0.204 with neural retrieval models.
Introduces RokomariBG, a novel large-scale, multi-entity heterogeneous graph dataset for Bangla book recommendation, complete with benchmarking experiments.
The paper introduces Hydra, a repository-level code generation framework that moves away from treating code as natural language and instead leverages its structured nature. Hydra employs a structure-aware indexing strategy using hierarchical trees, a dependency-aware retriever (DAR) to identify true dependencies, and a hybrid retrieval mechanism. Experiments on DevEval and RepoExec benchmarks demonstrate that Hydra achieves state-of-the-art performance, surpassing existing methods by over 5% in Pass@1 and enabling smaller models to outperform larger ones.
Introduces a novel repository-level code generation framework, Hydra, that leverages structure-aware indexing and dependency-aware retrieval to improve performance on complex code generation tasks.
This paper investigates the influence of team dynamics on OSS project selection by surveying 198 OSS practitioners. The study reveals that communication-related team dynamics like responsiveness and clarity are consistently prioritized, but the relative importance varies based on contributor motivations such as gaining reputation or networking. The findings demonstrate that aligning team dynamics with contributor motivations is crucial for understanding project selection behavior and designing better project recommendation systems.
Empirically demonstrates that team dynamics, particularly communication-related aspects, significantly influence OSS project selection, with the relative importance of specific dynamics varying based on contributor motivations.
The paper addresses the cold-start problem in bundle recommendation by proposing EpicCBR, a multi-view contrastive learning framework that leverages user-item (UI) and bundle-item (BI) relations. EpicCBR constructs user profiles by mining item relations and characterizes new bundles using historical bundle information and user preferences. Experiments on three benchmarks demonstrate that EpicCBR significantly outperforms state-of-the-art methods, achieving up to 387% improvement in cold-start scenarios.
Introduces a novel item-relation-enhanced dual-scenario contrastive learning framework (EpicCBR) to improve cold-start bundle recommendation by explicitly modeling user-item and bundle-item relationships.
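Contrastive frameworks like EpicCBR are typically built from InfoNCE-style terms; the generic loss below is that building block only, with an illustrative temperature `tau`, and does not reproduce the paper's multi-view UI/BI objectives.

```python
import math

def info_nce(anchor, positive, negatives, tau=0.1):
    # Generic InfoNCE: pull the anchor toward its positive view and push it
    # away from negatives, via a softmax over temperature-scaled similarities.
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv) if nu and nv else 0.0
    logits = [cos(anchor, positive) / tau] + [cos(anchor, n) / tau for n in negatives]
    m = max(logits)                      # log-sum-exp with max-shift for stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_z)
```

The loss is near zero when the anchor matches its positive and large when it matches a negative instead.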
The paper introduces Rec2PM, a generative recommendation framework that compresses long user interaction histories into compact Preference Memory tokens to address the computational cost and noise accumulation challenges of full-attention models. Rec2PM uses a self-referential teacher-forcing strategy, generating reference memories from a global history view to supervise parallelized recurrent updates, enabling fully parallel training and iterative updates during inference. Experiments on large-scale benchmarks demonstrate that Rec2PM achieves superior accuracy with reduced inference latency and memory footprint, functioning as a denoising Information Bottleneck.
Introduces a novel self-referential teacher-forcing strategy for training recurrent preference memory in generative recommendation, enabling parallel training and efficient long-sequence modeling.
The paper introduces DiffusionRank, a novel generative learning-to-rank (LTR) approach based on denoising diffusion that models the joint distribution of feature vectors and relevance labels. This contrasts with traditional discriminative LTR methods that model the conditional probability of relevance given features. By learning the full data distribution, DiffusionRank aims to produce more robust ranking models, achieving significant improvements over discriminative counterparts.
Introduces DiffusionRank, a denoising diffusion-based generative model for learning-to-rank that outperforms discriminative methods.
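The generative-versus-discriminative contrast in this summary can be stated precisely. The identities below are the standard ones (notation mine, not necessarily the paper's):

```latex
% Discriminative LTR fits the conditional relevance distribution
p_\theta(y \mid \mathbf{x}),
% whereas a generative ranker models the joint distribution
p_\theta(\mathbf{x}, y),
% here learned by reversing the standard forward noising process of
% denoising diffusion applied to (feature, label) pairs z_0 = (x, y):
q(\mathbf{z}_t \mid \mathbf{z}_{t-1})
  = \mathcal{N}\big(\mathbf{z}_t;\ \sqrt{1-\beta_t}\,\mathbf{z}_{t-1},\ \beta_t \mathbf{I}\big).
```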
The paper introduces a RAG pipeline and a two-layer prompting strategy to extract actionable recommendations (ReACTs) for improving OSS sustainability from software engineering literature. The authors systematically explore open LLMs and prompting techniques to derive candidate ReACTs from ICSE and FSE papers, followed by a filtering and refinement stage to ensure quality and extract supporting evidence. The pipeline generates 1,922 ReACTs, with 1,312 meeting strict quality criteria, providing a structured and scalable approach to translate research findings into practical guidance for OSS projects.
Introduces a novel RAG pipeline leveraging LLMs to extract and structure evidence-based, actionable recommendations (ReACTs) from software engineering literature for improving OSS project sustainability.
This paper extends the Quantified Boolean Bayesian Network (QBBN) to incorporate negation and backward reasoning, completing Prawitz's simple elimination rules within a probabilistic factor graph framework. It introduces a typed logical language with role-labeled predicates and modal quantifiers, along with a typed slot grammar that deterministically compiles sentences to logical form. The authors demonstrate that while LLMs can assist in disambiguation, grammars are essential for structured parsing, and the QBBN architecture leverages LLMs for annotation and verification in logical information retrieval.
Introduces a complete logical information retrieval system combining LLMs, typed slot grammars, and a QBBN inference engine to reconcile formal semantics with modern language models.
The paper introduces P-GenRM, a personalized generative reward model that addresses limitations in existing personalized reward models by transforming preference signals into structured evaluation chains to derive adaptive personas and scoring rubrics. P-GenRM clusters users into User Prototypes and employs a dual-granularity scaling mechanism, scaling at both the individual and prototype levels to mitigate noise and enhance generalization. Experiments demonstrate state-of-the-art results on personalized reward model benchmarks, with a 2.31% average improvement and a 3% boost from test-time user-based scaling, indicating stronger personalized alignment.
Introduces a personalized generative reward model (P-GenRM) that leverages structured evaluation chains and dual-granularity scaling to improve personalization and generalization in reward modeling for LLMs.
The paper introduces AttentionRetriever, a novel retrieval model designed for long documents that addresses context-awareness, causal dependence, and scope of retrieval limitations in existing RAG systems. AttentionRetriever leverages attention mechanisms and entity-based retrieval to create context-aware embeddings for long documents and determine the relevant retrieval scope. Experiments demonstrate that AttentionRetriever significantly outperforms existing retrieval models on long document retrieval datasets while maintaining the efficiency of dense retrieval methods.
Introduces AttentionRetriever, a novel long document retrieval model using attention and entity-based retrieval to create context-aware embeddings.
The paper addresses the "uncertainty blindness" limitation in generative recommendation models, where models treat all outcomes as equally certain, leading to unstable training and unquantifiable risks. The authors introduce Uncertainty-aware Generative Recommendation (UGR), a framework that incorporates uncertainty as a signal for adaptive optimization via uncertainty-weighted rewards, difficulty-aware optimization, and explicit confidence alignment. Experiments show that UGR improves recommendation performance, stabilizes training, and enables risk-aware applications.
Introduces a unified framework, UGR, that leverages uncertainty signals to improve generative recommendation by adaptively optimizing training based on model confidence, sample difficulty, and explicit confidence alignment.
The paper introduces a reinforcement learning-based web crawling algorithm, SB-CLASSIFIER, designed to efficiently acquire statistical datasets (SDs) from websites. The algorithm addresses the challenge of inefficient or impossible SD retrieval at scale by learning which hyperlinks lead to pages that link to many targets, based on the paths leading to the links in their enclosing webpages. Experiments on large websites demonstrate that SB-CLASSIFIER can retrieve a high fraction of a site's targets while crawling only a small part of the website.
Introduces a novel reinforcement learning-based web crawler, SB-CLASSIFIER, that leverages sleeping bandits to efficiently identify and extract statistical datasets from large websites.
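A sleeping bandit restricts each round's choice to the arms currently available; in a crawler, those are the links present on the current page. The sketch below is illustrative only: SB-CLASSIFIER's arms, rewards, and path features are richer than the two hypothetical link classes used here.

```python
import math
import random

class SleepingUCB:
    """Toy sleeping-bandit link selector: each arm is a link-path pattern,
    and only the arms present on the current page are awake."""
    def __init__(self):
        self.counts, self.rewards, self.t = {}, {}, 0

    def select(self, awake):
        self.t += 1
        for a in awake:                      # play each awake arm once first
            if self.counts.get(a, 0) == 0:
                return a
        def ucb(a):                          # optimism bonus over awake arms only
            return (self.rewards[a] / self.counts[a]
                    + math.sqrt(2 * math.log(self.t) / self.counts[a]))
        return max(awake, key=ucb)

    def update(self, arm, reward):
        self.counts[arm] = self.counts.get(arm, 0) + 1
        self.rewards[arm] = self.rewards.get(arm, 0.0) + reward

# Simulation: "nav" links reach many statistical datasets, "footer" links few;
# footer links are sometimes the only ones on a page.
random.seed(1)
bandit = SleepingUCB()
for _ in range(500):
    awake = ["nav", "footer"] if random.random() < 0.8 else ["footer"]
    arm = bandit.select(awake)
    hit_rate = 0.9 if arm == "nav" else 0.1
    bandit.update(arm, 1.0 if random.random() < hit_rate else 0.0)
```

After a few hundred rounds the crawler concentrates its budget on the productive link class whenever it is available.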
The authors introduce KuaiSearch, a large-scale e-commerce search dataset derived from Kuaishou user interactions, designed to address limitations in existing datasets such as anonymization and single-stage coverage. KuaiSearch includes authentic user queries, natural product texts, and covers cold-start users/long-tail products across recall, ranking, and relevance stages of the search pipeline. Through comprehensive analysis and benchmark experiments, the authors demonstrate KuaiSearch's value for advancing research in real-world e-commerce search, particularly for LLM-based approaches.
Introduces KuaiSearch, a novel large-scale e-commerce search dataset built from real-world Kuaishou user interactions spanning recall, ranking, and relevance stages.
This paper investigates whether online linear optimization (OLO) algorithms are sufficient for achieving strategic robustness in repeated Bayesian first-price auctions. The authors demonstrate that sublinear linearized regret in OLO is sufficient for strategic robustness, enabling the construction of strategically robust no-regret bidding algorithms via black-box reductions. Their reductions yield improved regret bounds compared to prior work, achieving $O(\sqrt{T \log K})$ regret in the known value distribution case and $O(\sqrt{T(\log K + \log(T/\delta))})$ regret in the unknown case, while also removing a bounded density assumption.
Establishes that sublinear linearized regret in online linear optimization is sufficient for achieving strategic robustness in repeated Bayesian first-price auctions, enabling black-box reductions to strategically robust bidding algorithms.
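The linearization argument behind such reductions is standard and worth spelling out; the notation below is generic, not necessarily the paper's:

```latex
% Linearized regret of an OLO algorithm playing x_1,\dots,x_T against
% observed gradients g_t:
\widetilde{R}_T = \max_{u \in \mathcal{X}} \sum_{t=1}^{T} \langle g_t,\, x_t - u \rangle .
% For concave per-round utilities u_t with g_t = \nabla u_t(x_t),
% concavity gives
u_t(u) - u_t(x_t) \le \langle g_t,\, u - x_t \rangle ,
% so the true utility regret is bounded by \widetilde{R}_T: a sublinear
% bound on the linearized quantity transfers to the bidding problem.
```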
This paper addresses the challenge of adapting general-purpose Vision-Language Models (VLMs) to the specific demands of e-commerce product understanding, characterized by attribute-centric data, multiple images, and noise. The authors demonstrate that targeted adaptation of VLMs can significantly enhance e-commerce performance without compromising general multimodal capabilities. They also introduce a new evaluation suite designed for deep product understanding, instruction following, and dynamic attribute extraction.
Demonstrates a strategy for adapting general-purpose VLMs to e-commerce data that improves performance on product understanding tasks while maintaining general multimodal capabilities.
This paper introduces an attribution-guided query rewriting method to improve the robustness of neural retrievers when faced with underspecified or ambiguous queries. The approach computes gradient-based token attributions from the retriever to identify problematic query components and then uses these attributions to guide an LLM in rewriting the query. Experiments on BEIR collections demonstrate that this method consistently improves retrieval effectiveness compared to existing query rewriting and explainability-based techniques, especially for implicit or ambiguous information needs.
Introduces an attribution-guided query rewriting framework that leverages retriever feedback to improve query clarity and retrieval effectiveness.
This paper introduces MemFly, a framework for on-the-fly memory optimization in LLMs based on the information bottleneck principle. MemFly uses a gradient-free optimizer to minimize compression entropy while maximizing relevance entropy, creating a stratified memory structure. The framework incorporates a hybrid retrieval mechanism combining semantic, symbolic, and topological pathways, achieving superior performance in memory coherence, response fidelity, and accuracy compared to existing methods.
Introduces an information bottleneck-based framework, MemFly, for on-the-fly memory optimization in LLMs, enabling efficient compression and precise retrieval.
The paper introduces DREAM, a multi-round debate framework using LLM agents with opposing stances and iterative critique, to address the problem of incomplete relevance labels in IR benchmarks. DREAM achieves 95.2% labeling accuracy with only 3.5% human involvement by using agreement-based debate for accurate labeling and reliable AI-to-human escalation for uncertain cases. Using DREAM, the authors construct BRIDGE, a refined benchmark with 29,824 newly identified relevant chunks, demonstrating that incomplete labels distort retriever rankings and retrieval-generation alignment.
Introduces a multi-agent debate framework, DREAM, that leverages opposing LLM agents and iterative critique to improve the accuracy and scalability of relevance assessment for IR benchmarks.
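DREAM's decision rule can be distilled to a few lines. The actual protocol runs multiple debate rounds with opposing stances and iterative critique; this sketch keeps only the final escalation logic, with hypothetical verdict labels.

```python
def label_with_escalation(agent_verdicts, human_oracle):
    # Unanimous agent agreement is accepted automatically;
    # any disagreement is escalated to a human annotator.
    if len(set(agent_verdicts)) == 1:
        return agent_verdicts[0], "auto"
    return human_oracle(), "escalated"

def human_rate(decisions):
    # Fraction of items that needed a human -- the quantity the paper
    # reports as 3.5% involvement.
    escalated = sum(1 for _, route in decisions if route == "escalated")
    return escalated / len(decisions)
```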
The paper introduces AdNanny, a unified reasoning-centric LLM fine-tuned from a 671B DeepSeek-R1 checkpoint for various offline advertising tasks. The authors construct reasoning-augmented corpora with structured supervision and natural language explanations, and then use multi-task supervised fine-tuning with adaptive reweighting followed by reinforcement learning to align with online advertising objectives. Deployed in Bing Ads, AdNanny reduces manual labeling effort and improves accuracy, demonstrating a scalable and cost-effective solution by consolidating task-specific models.
The paper demonstrates that a single, reasoning-centric LLM, AdNanny, can effectively replace multiple task-specific models for offline advertising tasks, leading to improved accuracy and reduced manual effort.
The paper introduces LR-bench, a new benchmark for reviewer assignment comprising 1,055 expert-annotated paper-reviewer pairs from 2024-2025 AI/NLP manuscripts, each scored on a five-level self-assessed familiarity scale. It then proposes RATE, a reviewer-centric ranking framework that distills reviewer publications into keyword profiles and fine-tunes an embedding model using weak supervision from heuristic retrieval signals. Experiments on LR-bench and the CMU dataset demonstrate that RATE achieves state-of-the-art performance compared to strong embedding baselines.
Introduces a novel reviewer-centric ranking framework, RATE, that leverages keyword-based reviewer profiles and weak supervision to improve reviewer assignment.
This paper surveys recent advances in applying deep learning to information systems, contrasting them with classical pattern recognition techniques for text. It highlights the use of large language models and transformer architectures like BERT in digital assistants and various NLP tasks. The review covers post-training alignment, parsing, and reinforcement learning techniques used to improve these systems.
Synthesizes recent progress in applying deep learning, particularly large language models and transformers, to a range of information system tasks, providing context through classical pattern recognition methods.
This paper introduces a framework to study how source preferences influence LLMs' resolution of knowledge conflicts in retrieval-augmented generation. The authors evaluate 13 open-weight LLMs and find that they generally favor institutionally corroborated information (e.g., government, newspapers) over information from people and social media, but this preference can be overridden by repetition. They propose a novel method to reduce repetition bias, achieving up to 99.8% reduction while maintaining at least 88.8% of the original source preferences.
Introduces a novel framework and method to analyze and mitigate repetition bias in LLMs' source preferences when resolving knowledge conflicts.
The paper introduces Dynamic Tool Dependency Retrieval (DTDR), a retrieval method that conditions on both the initial query and the evolving execution context to address the limitations of static tool retrieval in function calling agents. DTDR models tool dependencies from function calling demonstrations, enabling adaptive retrieval as plans unfold and improving the selection of relevant tools. Experiments across multiple datasets and LLM backbones demonstrate that DTDR significantly improves function calling success rates, achieving gains between 23% and 104% compared to static retrievers.
Introduces a dynamic tool retrieval mechanism that leverages evolving execution context to model tool dependencies and improve function calling accuracy.
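The abstract states that DTDR conditions retrieval on both the query and the evolving execution context, with dependencies learned from function-calling demonstrations; a minimal sketch of that idea, using pairwise follow counts as the dependency model and a simple linear blend with static query relevance (the counting scheme, `alpha` blend, and all names are illustrative assumptions, not the paper's method):

```python
from collections import Counter

def learn_dependencies(demonstrations):
    """Count how often tool B directly follows tool A across
    demonstration call sequences."""
    follows = Counter()
    for seq in demonstrations:
        for a, b in zip(seq, seq[1:]):
            follows[(a, b)] += 1
    return follows

def retrieve_tools(query_scores, executed, follows, alpha=0.5, k=2):
    """Blend static query relevance with a dependency prior conditioned
    on the tools already executed in the current plan."""
    scored = []
    for tool, q_score in query_scores.items():
        if tool in executed:
            continue
        # Dependency prior: evidence that this tool follows what ran so far
        dep = sum(follows[(prev, tool)] for prev in executed)
        scored.append((alpha * q_score + (1 - alpha) * dep, tool))
    scored.sort(reverse=True)
    return [tool for _, tool in scored[:k]]
```

The sketch shows the core contrast with static retrieval: a tool that scores poorly against the initial query can still surface once the executed prefix makes it the likely next step.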
This paper presents a systematic review and meta-analysis of 98 publications from 2023-2025 to analyze the architectures and performance of generative AI-powered teaching assistants, focusing on Retrieval-Augmented Generation (RAG) systems. The study examines the fusion of Transformer-based LLMs and RAG across theory, architecture, mechanism, and application, identifying key technical improvement directions like domain knowledge base construction and hybrid retrieval optimization. Meta-analysis reveals that RAG-enhanced systems achieve significantly higher accuracy (87.3%) and learning effectiveness (Cohen's d = 0.68) compared to pure generative models, with Transformer and RAG integration becoming dominant architectures.
Systematically analyzes the architectural evolution and performance of RAG-enhanced generative AI systems in educational question-answering, quantifying the benefits of RAG and identifying key areas for future improvement.
This paper introduces FCDP, a credit default prediction model that combines an Enhanced Transformer module (ETransformer) for efficient feature filtering and long-range modeling, an Attention Guidance Prediction Module (AGPM) to enhance feature representation and suppress deep feature loss, and a Channel Attention Module (CAM) to learn channel importance. The model addresses limitations in existing credit default prediction research, such as reliance on manual feature engineering and insufficient feature extraction. Experiments on the Lending Club dataset demonstrate that FCDP outperforms six other forecasting models, suggesting its potential for improved risk assessment.
Introduces a novel credit default prediction model (FCDP) that integrates an Enhanced Transformer, Attention Guidance Prediction Module, and Channel Attention Module to improve prediction accuracy and computational efficiency.
The paper introduces CoSense-LLM, an edge-first framework that converts multimodal sensor data into semantic tokens and coordinates with LLMs while considering latency, energy, bandwidth, and privacy constraints. CoSense-LLM employs a lightweight encoder (SenseFusion), edge-based retrieval (Edge-RAG), cost-aware prompt routing, and secure execution to minimize data transmission and ensure privacy. Experiments across diverse environments demonstrate that CoSense-LLM achieves sub-second latency, reduces bandwidth costs through local retrieval, and preserves privacy by transmitting only discrete codes.
Introduces an edge-first framework, CoSense-LLM, that enables efficient and privacy-preserving integration of multimodal sensor data with large language models under resource constraints.
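The abstract mentions cost-aware prompt routing but not its decision rule; a minimal sketch of one plausible policy, keeping a request on the edge when local confidence is high or when the cloud call would blow the remaining budget (threshold, pricing, and signature are all hypothetical, not CoSense-LLM's actual router):

```python
def route_prompt(local_confidence, prompt_tokens, remaining_budget,
                 conf_threshold=0.8, price_per_token=0.001):
    """Cost-aware routing sketch: serve from the edge model when it is
    confident enough, or when sending the prompt to the cloud would
    exceed the remaining budget; otherwise escalate to the cloud LLM."""
    cloud_cost = prompt_tokens * price_per_token
    if local_confidence >= conf_threshold or cloud_cost > remaining_budget:
        return "edge"
    return "cloud"
```

A real router would also weigh latency and privacy constraints (the paper lists both), but the budget/confidence trade-off is the core of any cost-aware policy.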
This paper introduces a post-tool execution reflection mechanism that leverages LLM-based reflection and domain-specific RAG to repair failed tool calls in agentic systems. The approach uses a combination of tool-specific documentation and troubleshooting documents to identify and correct both syntactic and semantic errors that are only apparent after the tool's response is analyzed. Experiments using the kubectl command-line tool for Kubernetes management demonstrate that the RAG-based reflection improves the execution pass rate by 55% and the correctness of answers to user queries by 36% on average, with troubleshooting documents outperforming official documentation.
Introduces a novel post-tool execution reflection component that combines LLM-based reflection with domain-specific RAG to improve the reliability and accuracy of tool calls in agentic systems.
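The summary describes a loop (execute, inspect the response, retrieve troubleshooting docs, repair, retry) without giving its structure; a minimal sketch of that loop with the tool executor, retriever, and LLM reflection step injected as callables (the result schema and all names are assumptions for illustration, not the paper's API):

```python
def run_with_reflection(command, execute, retrieve_docs, reflect, max_retries=2):
    """Post-tool execution reflection: run a tool call, and on failure
    retrieve troubleshooting snippets for the observed error (the RAG
    step) and ask an LLM-backed `reflect` step to propose a repaired
    call, up to max_retries times."""
    result = execute(command)
    for _ in range(max_retries):
        if result["ok"]:
            break
        docs = retrieve_docs(result["error"])       # domain-specific RAG
        command = reflect(command, result["error"], docs)
        result = execute(result and command)
    return result
```

The key property the paper exploits is that both syntactic and semantic errors only become visible in `result["error"]`, i.e., after execution, which is why reflection runs post-hoc rather than at planning time.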
This paper introduces SQUARE, a training-free zero-shot composed image retrieval (ZS-CIR) framework that uses multimodal large language models (MLLMs) to improve retrieval accuracy. SQUARE employs Semantic Query-Augmented Fusion (SQAF) to enrich the query embedding with MLLM-generated captions, providing high-level semantic guidance. It also uses Efficient Batch Reranking (EBR), where an MLLM jointly reasons about top-ranked candidates presented as an image grid to refine the ranking in a single pass.
Introduces a two-stage training-free ZS-CIR framework, SQUARE, that leverages MLLMs for semantic query augmentation and efficient batch reranking to improve retrieval accuracy without task-specific training.
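The abstract says SQAF enriches the query embedding with MLLM-generated captions but not how the two signals are merged; a minimal sketch of one common fusion scheme, linear interpolation of the visual query embedding with the caption embedding followed by L2 normalization (the weight and function are illustrative assumptions, not necessarily SQUARE's exact formulation):

```python
def fuse_query(image_emb, caption_emb, weight=0.5):
    """Semantic query-augmented fusion (sketch): interpolate the visual
    query embedding with an MLLM caption embedding, then L2-normalize
    so the fused vector is comparable under cosine similarity."""
    fused = [weight * a + (1 - weight) * b
             for a, b in zip(image_emb, caption_emb)]
    norm = sum(x * x for x in fused) ** 0.5
    return [x / norm for x in fused] if norm else fused
```

Normalizing after fusion matters because the downstream retrieval step ranks candidates by cosine similarity, where unnormalized magnitudes would distort scores.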

