April 24 – May 1, 2026

Recommendation & Information Retrieval - Weekly Roundup

100 papers published across 8 labs.

Selected Labs publishing this week

Mila (4), Tsinghua AI (3), BAIR (2), CMU ML (1), Amazon Science (1)

Top Papers

Apr 28, 2026

3w ago·also Adobe Research, Paris-Saclay

Spectral bandits

Learn user preferences across thousands of items from just tens of node evaluations by exploiting graph smoothness in a new spectral bandit framework.

Tomáš Kocák, R. Munos, B. Kveton +312

Recommendation & Information Retrieval

3w ago·also Cyberspace Institute of Advanced Technology, Griffith, Guangdong Key Laboratory of Industrial, Guangzhou University +2

Personalized Multi-Interest Modeling for Cross-Domain Recommendation to Cold-Start Users

Forget one-size-fits-all recommendations: this model uses normalizing flows to capture the *multimodal* nature of individual user preferences, leading to better cold-start performance in cross-domain recommendation.

Xiaodong Li, Jiawei Sheng, Jiangxia Cao +7

Recommendation & Information Retrieval

May 1, 2026

Zi-qiang Zhao +1·3w ago

Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation

Tree-based RAG gets a major upgrade: $\Psi$-RAG's adaptive hierarchical index and multi-granular retrieval agent leapfrog existing methods on complex, cross-document reasoning tasks.

Zi-qiang Zhao, Menglin Yang

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

Massimo Rondelli +2·3w ago

BlenderRAG: High-Fidelity 3D Object Generation via Retrieval-Augmented Code Synthesis

LLMs can now generate 70% syntactically correct and geometrically consistent 3D objects from text, thanks to retrieval-augmented code synthesis.

Massimo Rondelli, Francesco Pivi, Maurizio Gabbrielli

Code Generation & Program Synthesis Multimodal Models Recommendation & Information Retrieval

Apr 30, 2026

University of Pisa & ISTI–CNR·3w ago

Efficient Multivector Retrieval with Token-Aware Clustering and Hierarchical Indexing

Token-aware clustering and hierarchical indexing can slash retrieval latency by an order of magnitude without sacrificing accuracy, making multivector retrieval practical at scale.

Silvio Martinico, Franco Maria Nardini +5

Inference & Quantization Natural Language Processing Recommendation & Information Retrieval

All Papers (100)

May 1, 2026

Zi-qiang Zhao +1·3w ago

Hierarchical Abstract Tree for Cross-Document Retrieval-Augmented Generation

Tree-based RAG gets a major upgrade: $\Psi$-RAG's adaptive hierarchical index and multi-granular retrieval agent leapfrog existing methods on complex, cross-document reasoning tasks.

Zi-qiang Zhao, Menglin Yang

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

Massimo Rondelli +2·3w ago

BlenderRAG: High-Fidelity 3D Object Generation via Retrieval-Augmented Code Synthesis

LLMs can now generate 70% syntactically correct and geometrically consistent 3D objects from text, thanks to retrieval-augmented code synthesis.

Massimo Rondelli, Francesco Pivi, Maurizio Gabbrielli

Code Generation & Program Synthesis Multimodal Models Recommendation & Information Retrieval

Apr 30, 2026

University of Pisa & ISTI–CNR·3w ago

Efficient Multivector Retrieval with Token-Aware Clustering and Hierarchical Indexing

Token-aware clustering and hierarchical indexing can slash retrieval latency by an order of magnitude without sacrificing accuracy, making multivector retrieval practical at scale.

Silvio Martinico, Franco Maria Nardini +5

Inference & Quantization Natural Language Processing Recommendation & Information Retrieval

Ta-Yang Wang +3·3w ago

TypeBandit: Type-Level Context Allocation and Reweighting for Effective Attribute Completion in Heterogeneous Graph Neural Networks

Stop wasting compute on uninformative node types: TypeBandit intelligently allocates sampling resources in heterogeneous graphs, boosting attribute completion accuracy without architectural changes.

Ta-Yang Wang, Rajgopal Kannan, Viktor Prasanna +1

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

xmemory·3w ago

From Unstructured Recall to Schema-Grounded Memory: Reliable AI Memory via Iterative, Schema-Aware Extraction

For AI agents needing reliable facts and stateful computation, *how* you structure memory beats simply scaling retrieval or model size.

Alex Petrov, A.V. Petrov, Alexander Gusak +3

Natural Language Processing Recommendation & Information Retrieval Tool Use & Agents

3w ago·also IU Bloomington, NTU

How Generative AI Disrupts Search: An Empirical Study of Google Search, Gemini, and AI Overviews

Google's AI Overviews favor Google-owned content and penalize sites blocking its AI crawler, raising serious questions about fairness and bias in the emerging generative search landscape.

Riley Grossman, Songjiang Liu, Songjia Liu +6

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

Zhongguancun Academy·3w ago·also USTC

Position-Aware Drafting for Inference Acceleration in LLM-Based Generative List-Wise Recommendation

LLMs can generate recommendations up to 3.1x faster by explicitly modeling token position within items and speculation depth during speculative decoding.

Jiaju Chen, Chongming Gao, Chenxiao Fan +4

Inference & Quantization Natural Language Processing Recommendation & Information Retrieval

3w ago·also Kyoto, MBZUAI, RIKEN, UTokyo

Why Mean Pooling Works: Quantifying Second-Order Collapse in Text Embeddings

Despite its simplicity, mean pooling works surprisingly well because modern text encoders concentrate token embeddings, preserving crucial information about their distribution.

Tomomasa Hara, Hiroto Kurita, Masaaki Imaizumi +2

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

Yildiz Technical University·3w ago

Learning When to Remember: Risk-Sensitive Contextual Bandits for Abstention-Aware Memory Retrieval in LLM-Based Coding Agents

LLMs can learn to safely leverage external memory for code debugging by explicitly modeling and penalizing the risk of false-positive memory injection.

Mehmet Iscan

Code Generation & Program Synthesis Recommendation & Information Retrieval Tool Use & Agents

Jipeng Tan +3·3w ago

Temporal and Content Coupling Analysis of Social Media User Behavior

Uncovered: news consumption rhythms follow a predictable hierarchy, from daily cycles to split-second actions, but historical interests still dominate user behavior.

Jipeng Tan, Mengye Yang, Zhanghao Li +1

Natural Language Processing Recommendation & Information Retrieval

BAIR·3w ago·also CMU ML, American University of Central Asia, UMich

Empire Amplifier: Uncovering and Contesting the Prioritization of Colonial Content on Platforms Through Community-Informed Algorithmic Auditing

YouTube's recommendation algorithm pushes Kyrgyz children towards Russian-language content, even when they signal a preference for their native tongue, effectively amplifying colonial influence.

Nel Escher, Bakyt Yrysov +4

Constitutional AI & AI Ethics Natural Language Processing Recommendation & Information Retrieval

3w ago

SST-Guard: Detecting and Characterizing Server-Side Google Analytics in the Wild

Server-side tracking thought it could hide, but this new browser extension spots Google Analytics even when it's sneakily relaying data through custom endpoints.

Muhammad Jazlan, Alexander Gamero-Garrido, Zubair Shafiq +1

Recommendation & Information Retrieval

Jipeng Tan +4·3w ago

Gender Bias in YouTube Exposure: Allocative and Structural Inequalities in Political Information Environments

YouTube's recommendation algorithm doesn't just show different political content to male and female-coded profiles, it steers them into structurally different information ecosystems.

Jipeng Tan, Weifeng Zhang, Ye Wu +2

Constitutional AI & AI Ethics Natural Language Processing Recommendation & Information Retrieval

3w ago

SimEval-IR: A Unified Toolkit and Benchmark Suite for Evaluating User Simulators and Search Sessions

The standard "human-likeness" test for user simulators is essentially useless for predicting whether they produce valid system rankings.

Saber Zerhoudi

Eval Frameworks & Benchmarks Recommendation & Information Retrieval

3w ago·also Macquarie, Meituan, UNSW

Purifying Multimodal Retrieval: Fragment-Level Evidence Selection for RAG

Stop drowning your MLLMs in irrelevant document noise: FES-RAG shows that carefully selecting multimodal fragments as evidence boosts performance by up to 27% while shrinking context length.

Xihang Wang, Zihan Wang, Chengkai Huang +4

Multimodal Models Natural Language Processing Recommendation & Information Retrieval

M. Rathee +4·3w ago

Reproducing Adaptive Reranking for Reasoning-Intensive IR

Iteratively exploring a corpus graph during reranking can substantially boost reasoning-intensive retrieval performance, even with weaker rerankers, offering a surprisingly effective alternative to computationally expensive retriever improvements.

Mandeep Rathee, V. Venktesh +2

Reasoning & Chain-of-Thought Recommendation & Information Retrieval

3w ago·also Interdisciplinary Transformation

NuggetIndex: Governed Atomic Retrieval for Maintainable RAG

Stop retrieving passages in your RAG system: NuggetIndex shows that retrieving and filtering atomic "nuggets" of information yields substantial gains in recall, temporal correctness, and reduced conflicts.

Saber Zerhoudi, Michael Granitzer, Jelena Mitrovic +1

Data Curation & Synthetic Data Eval Frameworks & Benchmarks Recommendation & Information Retrieval

Yujun Wu +13·3w ago

Intern-Atlas: A Methodological Evolution Graph as Research Infrastructure for AI Scientists

AI research agents can now reliably trace method evolution topologies thanks to a new methodological evolution graph, Intern-Atlas, that captures structured relationships between research methods.

Yujun Wu, Dongxu Zhang, Xinchen Li +11

Natural Language Processing Recommendation & Information Retrieval Scientific Discovery & Drug Design

3w ago

One Pass, Any Order: Position-Invariant Listwise Reranking for LLM-Based Recommendation

LLMs' ranking instability, where shuffling candidates changes recommendations, can be solved with a novel architecture that enforces permutation invariance.

Ethan Bito, Yongli Ren, Estrid He

Natural Language Processing Recommendation & Information Retrieval

Lei Li +8·3w ago·also Ickylin AI Team

ChipLingo: A Systematic Training Framework for Large Language Models in EDA

Domain-adapting LLMs for EDA requires explicit RAG scenario training to prevent performance degradation, and QA augmentation during corpus construction further boosts performance.

Lei Li, Xingwen Yu, Xing Yu +6

Code Generation & Program Synthesis Recommendation & Information Retrieval Training Efficiency & Optimization

3w ago·also Baidu, Brown

MM-StanceDet: Retrieval-Augmented Multi-modal Multi-agent Stance Detection

Achieve state-of-the-art multimodal stance detection by having multiple AI agents debate each other, complete with retrieval-augmented context and self-reflection.

Weihai Lu, Zhejun Zhao, Yanshu Li +1

Multimodal Models Natural Language Processing Recommendation & Information Retrieval

Mohit Dubey +2·3w ago

ObjectGraph: From Document Injection to Knowledge Traversal -- A Native File Format for the Agentic Era

Stop wasting tokens and context window space: ObjectGraph reimagines documents as knowledge graphs, slashing token usage by up to 95% without sacrificing task accuracy.

Mohit Dubey, Open Gigantic

Recommendation & Information Retrieval Tool Use & Agents

3w ago

NeocorRAG: Less Irrelevant Information, More Explicit Evidence, and More Effective Recall via Evidence Chains

Retrieval improvements don't always boost reasoning in RAG systems, but NeocorRAG's evidence chains can fix that, achieving SOTA with 20% fewer tokens.

Shiyao Peng, Qianhe Zheng, Zhuodi Hao +8

Eval Frameworks & Benchmarks Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Xupeng Chen +8·3w ago

Iterative Multimodal Retrieval-Augmented Generation for Medical Question Answering

Ditching text chunks for full document page images in medical RAG boosts QA accuracy by a full percentage point, proving that visual context matters.

Xupeng Chen, Binbin Shi, Chenqian Le +6

Multimodal Models Recommendation & Information Retrieval Scientific Discovery & Drug Design

3w ago

Contextual Agentic Memory is a Memo, Not True Memory

Today's AI agents aren't really "remembering" – they're just taking notes, which means they'll hit a wall on complex tasks and can be easily brainwashed.

Binyan Xu, Xilin Dai, Kehuan Zhang

Recommendation & Information Retrieval Tool Use & Agents

Hiroyuki Deguchi +2·3w ago

One Single Hub Text Breaks CLIP: Identifying Vulnerabilities in Cross-Modal Encoders via Hubness

A single, optimized text snippet can fool CLIP into thinking it's a good caption for almost any image, revealing a surprising vulnerability in cross-modal understanding.

Hiroyuki Deguchi, Katsuki Chousa, Yusuke Sakai

Multimodal Models Recommendation & Information Retrieval Red-Teaming & Adversarial Robustness

3w ago·also Amazon Science

From Unstructured to Structured: LLM-Guided Attribute Graphs for Entity Search and Ranking

LLMs can achieve better zero-shot product ranking with 57% less token usage by reasoning over structured attribute graphs instead of raw text.

Yilun Zhu, Nikhita Vedula, Shervin Malmasi +1

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

3w ago

EviMem: Evidence-Gap-Driven Iterative Retrieval for Long-Term Conversational Memory

Explicitly diagnosing what's missing from a retrieval set unlocks substantial gains in long-term conversational memory, boosting accuracy on temporal and multi-hop questions by up to 20% while simultaneously reducing latency.

Yuyang Li, Yime He, Yimeng He +2

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

3w ago·also BAIR, Mila, Toronto Metropolitan University, UofT

A Reproducibility Study of LLM-Based Query Reformulation

LLM-powered query reformulation, a hot topic in IR, often fails to translate gains from lexical to neural retrieval, and bigger models don't always help.

Amin Bigdeli, Radin Hamidi Rad, Hai Son Le +4

Eval Frameworks & Benchmarks Open-Source Models & Weights Recommendation & Information Retrieval

Md Hasan Saju +1·3w ago

Toward Autonomous SOC Operations: End-to-End LLM Framework for Threat Detection, Query Generation, and Resolution in Security Operations

LLMs, when carefully constrained and augmented with retrieval, can slash incident triage times from hours to minutes in real-world security operations.

Md Hasan Saju, Akramul Azim

Natural Language Processing Recommendation & Information Retrieval Tool Use & Agents

3w ago·also SMU

Tail-aware N-version Machine Learning Models for Reliable API Recommendation

Mitigating long-tail distributions in code datasets boosts API recommendation reliability by up to 10% using an ensemble of models that strategically reject low-confidence predictions.

Aoi Matsuda, Fumio Machida, David Lo

Code Generation & Program Synthesis Recommendation & Information Retrieval

Ji-Hyeon Kim +2·3w ago

ClipTBP: Clip-Pair based Temporal Boundary Prediction with Boundary-Aware Learning for Moment Retrieval

By explicitly modeling relationships between multiple relevant video segments, ClipTBP significantly improves video moment retrieval, especially when queries are ambiguous.

Ji-Hyeon Kim, Ho-Joong Kim, Seong-Whan Lee

Computer Vision Multimodal Models Recommendation & Information Retrieval

Naeem Rehmat +6·3w ago·also UMich

Iterative Definition Refinement for Zero-Shot Classification via LLM-Based Semantic Prototype Optimization

Zero-shot classification accuracy hinges more on the *definition* of a category than the model architecture itself.

Naeem Rehmat, Muhammad Saad Saeed +4

Natural Language Processing Recommendation & Information Retrieval

Apr 29, 2026

3w ago

LLM-Enhanced Topical Trend Detection at Snapchat

Snapchat's new trend detection system proves that LLMs can successfully consolidate multimodal signals at scale to surface emerging topics from short-form video, boosting content freshness and user engagement.

Hangqi Zhao, Jay Li, Abhiruchi Bhattacharya +6

Multimodal Models Natural Language Processing Recommendation & Information Retrieval

Federal University of Bahia·3w ago·also Western University

A Gated Hybrid Contrastive Collaborative Filtering Recommendation

Injecting review semantics into collaborative filtering via adaptive gating and contrastive learning substantially boosts top-N recommendation accuracy, outperforming existing review-aware methods.

Eduardo Ferreira da Silva, Mayki dos Santos Oliveira, Joel Machado Pires +6

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

3w ago·also Southwestern University of Finance and Economics

CARD: Non-Uniform Quantization of Visual Semantic Unit for Generative Recommendation

Skewed item distributions in recommendation systems can be tamed with a learnable non-uniform quantization, leading to better codebook utilization and more accurate generative recommendations.

Yibiao Wei, Jie Zou, Pengfei Zhang +4

Inference & Quantization Multimodal Models Recommendation & Information Retrieval

UC Santa Cruz·3w ago·also UESTC, UQ

ProMax: Exploring the Potential of LLM-derived Profiles with Distribution Shaping for Recommender Systems

LLM-derived user profiles can be powerfully leveraged for recommendation via a surprisingly simple distribution shaping approach, outperforming more complex fusion methods.

Yi Zhang, Yiwen Zhang, Kai Zheng +2

Natural Language Processing Recommendation & Information Retrieval

3w ago

LUCid: Redefining Relevance For Lifelong Personalization

Even state-of-the-art models like Gemini and Claude can completely miss critical user information when it's buried in semantically unrelated past interactions, tanking personalization performance.

Chimaobi Okite, Anika Misra, Joyce Chai +1

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

3w ago

RAQG-QPP: Query Performance Prediction with Retrieved Query Variants and Retrieval Augmented Query Generation

Forget term expansion: leveraging retrieved queries and LLMs to generate query variants boosts Query Performance Prediction by up to 30% on neural rankers.

Fangzheng Tian, Debasis Ganguly, Craig Macdonald

Natural Language Processing Recommendation & Information Retrieval

3w ago

TimeMM: Time-as-Operator Spectral Filtering for Dynamic Multimodal Recommendation

Forget static graphs: TimeMM dynamically reweights user-item interactions based on recency and modality, adapting to evolving user preferences in multimodal recommendations.

Wei Yang, Rui Zhong, Xiaodan Wang +3

Multimodal Models Recommendation & Information Retrieval

Know Center Research GmbH·3w ago·also Graz University of Technology, JKU

Meta-Learning and Targeted Differential Privacy to Improve the Accuracy-Privacy Trade-off in Recommendations

Stop blindly applying differential privacy: targeting stereotypical user data and using meta-learning can dramatically improve the accuracy of privacy-preserving recommender systems.

Peter Müllner, Dominik Kowald +5

Constitutional AI & AI Ethics Recommendation & Information Retrieval

Yuxuan Huang +8·3w ago

Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction

LLMs can achieve a 7.5x performance boost in web search and extraction by using a bi-level multi-agent architecture with iterative refinement and shared memory.

Yuxuan Huang, Yihang Chen, Zhiyuan He +6

Natural Language Processing Recommendation & Information Retrieval Tool Use & Agents

Federal University of São Carlos·3w ago

The Bandit's Blind Spot: The Critical Role of User State Representation in Recommender Systems

The secret to better bandit-based recommendations isn't always the bandit algorithm itself, but the way you represent user state.

Pedro R. Pires, Gregorio F. Azevedo, Rafael T. Sereicikas +2

Natural Language Processing Recommendation & Information Retrieval

Independent Researcher·3w ago·also Macquarie, Meituan, UNSW

Factorized Latent Reasoning for LLM-based Recommendation

LLMs can model user preferences more effectively by disentangling intent into multiple latent factors, leading to improved recommendation accuracy and interpretability.

Tianqi Gao, Lina Yao

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

3w ago·also Interdisciplinary Transformation

AgentSim: A Platform for Verifiable Agent-Trace Simulation

Forget synthetic QA datasets – AgentSim offers verifiable, step-by-step RAG traces, revealing how LLMs *actually* reason over documents.

Saber Zerhoudi, Michael Granitzer, Jelena Mitrovic

Reasoning & Chain-of-Thought Recommendation & Information Retrieval Tool Use & Agents

3w ago

Understanding DNNs in Feature Interaction Models: A Dimensional Collapse Perspective

DNNs in recommendation models don't just learn feature interactions, they fundamentally reshape embedding spaces by preventing dimensional collapse.

Jiancheng Wang, Mingjia Yin

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval

3w ago

Efficient Listwise Reranking with Compressed Document Representations

Forget slow reranking: this new method compresses documents into embeddings, letting an 8B parameter model run up to 18x faster than smaller models with better accuracy.

Hervé Déjean, Stéphane Clinchant +1

Inference & Quantization Natural Language Processing Recommendation & Information Retrieval

Saurabh K. Singh +2·3w ago

Benchmarking Complex Multimodal Document Processing Pipelines: A Unified Evaluation Framework for Enterprise AI

Document AI pipelines don't work the way you think: quality bottlenecks aren't where you expect, and components don't cascade quality.

Saurabh K. Singh, Sachin Raj

Eval Frameworks & Benchmarks Multimodal Models Recommendation & Information Retrieval

LinkedIn Corporation·3w ago·also NTU

Hierarchical Long-Term Semantic Memory for LinkedIn's Hiring Agent

LinkedIn's new memory system for hiring agents boosts accuracy and speed by over 10%, proving hierarchical semantic memory is a game-changer for real-world LLM applications.

Zhentao Xu, Shangjing Zhang, Emir Poyraz +7

Natural Language Processing Recommendation & Information Retrieval Scalable Oversight & Alignment Theory +1

3w ago·also HKU, Tsukuba, University of North Texas, Yonsei

OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory

LLM agents can now remember far more, far more accurately, by "seeing" their past experiences instead of just reading about them.

Jinze Li, Yang Zhang, Jiayi Qu +3

Multimodal Models Recommendation & Information Retrieval Tool Use & Agents

3w ago

Hypencoder Revisited: Reproducibility and Analysis of Non-Linear Scoring for First-Stage Retrieval

Non-linear scoring with Hypencoders boosts retrieval performance, but don't expect it to fix your speed or adversarial robustness problems.

Arne Eichholtz, Yongkang Li, Jutte Vijverberg +2

Natural Language Processing Open-Source Models & Weights Recommendation & Information Retrieval

3w ago·also Jahangirnagar University

HealthNLP_Retrievers at ArchEHR-QA 2026: Cascaded LLM Pipeline for Grounded Clinical Question Answering

Gemini 2.5 Pro shines at question interpretation within a cascaded pipeline, but struggles to generate answers and identify evidence as effectively.

Md Biplob Hosen, Md Alomgeer Hussein, Md Akmol Masud +3

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

Tsinghua AI·3w ago

Decoupling Knowledge and Task Subspaces for Composable Parametric Retrieval Augmented Generation

Untangling task-solving skills from factual knowledge in PRAG adapters makes them play better together, boosting performance when you combine multiple documents.

Weihang Su, Hanwen Zhang, Qingyao Ai +1

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

3w ago·also Stellaris AI Limited

When to Retrieve During Reasoning: Adaptive Retrieval for Large Reasoning Models

Injecting knowledge at the *right* moment during reasoning boosts accuracy by 10% while cutting retrieval calls in half, blowing away static RAG strategies.

Dongxin Guo, Jikun Wu, Siu Ming Yiu

Reasoning & Chain-of-Thought Recommendation & Information Retrieval Tool Use & Agents

Zhijun Li +8·3w ago

PRAG: End-to-End Privacy-Preserving Retrieval-Augmented Generation

Privacy-preserving RAG is now practical: PRAG achieves competitive recall and low latency while fully encrypting both documents and queries.

Zhijun Li, Minghui Xu, Huayi Qi +6

Natural Language Processing Recommendation & Information Retrieval

Guodong Fan +6·3w ago

When Model Editing Meets Service Evolution: A Knowledge-Update Perspective for Service Recommendation

Forget retraining: model editing and constrained decoding can keep service recommendations fresh and valid in ever-changing software ecosystems.

Guodong Fan, Cuiyun Gao, Chun Yong Chong +4

Natural Language Processing Recommendation & Information Retrieval

Apr 28, 2026

SequeL·3w ago·also INRIA, Paris-Saclay, UPF

Online learning with Erdős-Rényi side-observation graphs

Learning in multi-armed bandits gets a boost: even with only probabilistic side observations of other arms' losses, near-optimal regret is achievable without knowing the observation probability.

Tomáš Kocák, Michal Valko

Recommendation & Information Retrieval

Sharma Aditya +2·3w ago

G-Loss: Graph-Guided Fine-Tuning of Language Models

Fine-tuning language models with a graph-guided loss that captures global semantic relationships can significantly boost classification accuracy and convergence speed.

Sharma Aditya, Agarwal Vinti, Kumar Rajesh

Natural Language Processing Recommendation & Information Retrieval Training Efficiency & Optimization

3w ago·also Adobe Research, Paris-Saclay

Spectral bandits

Learn user preferences across thousands of items from just tens of node evaluations by exploiting graph smoothness in a new spectral bandit framework.

Tomáš Kocák, R. Munos, B. Kveton +312

Recommendation & Information Retrieval

3w ago

RADD: Retrieval-Augmented Discrete Diffusion for Multi-Modal Knowledge Graph Completion

Decoupling retrieval and reranking with a discrete diffusion model leaps ahead of monolithic embedding scorers for multi-modal knowledge graph completion.

Guanglin Niu

Multimodal Models Natural Language Processing Recommendation & Information Retrieval

3w ago

CORAL: Adaptive Retrieval Loop for Culturally-Aligned Multilingual RAG

Current multilingual RAG systems can miss culturally relevant answers, but CORAL's adaptive retrieval loop closes the gap, boosting accuracy by up to 3.58% on low-resource languages.

Nayeon Lee, Jiwoo Song, Byeongcheol Kang

Natural Language Processing Recommendation & Information Retrieval

Nitin Venkateswaran +3·3w ago

An Investigation of Linguistic Biases in LLM-Based Recommendations

LLMs exhibit surprising dialect-dependent biases when making recommendations, favoring certain cuisines and product categories based on the linguistic style of the prompt.

Nitin Venkateswaran, Jason Ang, D. Adhikari +1

Constitutional AI & AI Ethics Eval Frameworks & Benchmarks Recommendation & Information Retrieval

Hojae Han +3·3w ago

R$^3$-SQL: Ranking Reward and Resampling for Text-to-SQL

Text-to-SQL models can now achieve significantly higher accuracy by grouping and ranking SQL candidates based on execution results, then strategically resampling when the initial pool is lacking.

Hojae Han, Yeonseok Jeong, Zhewei Yao +1

Code Generation & Program Synthesis Natural Language Processing Recommendation & Information Retrieval

Li Ju +1·3w ago

Faithfulness-QA: A Counterfactual Entity Substitution Dataset for Training Context-Faithful RAG Models

RAG models struggle to ignore their pre-trained knowledge, even when it contradicts the provided context, but a new dataset can help them learn to be more faithful.

Li Ju, Junzhe Wang

Data Curation & Synthetic Data Natural Language Processing Recommendation & Information Retrieval

University College Dublin·3w ago

Navigating Global AI Regulation: A Multi-Jurisdictional Retrieval-Augmented Generation System

Forget searching through endless legal documents – a new RAG system achieves 87% faithfulness and 84% relevancy in answering complex, multi-jurisdictional AI regulation questions.

Courtney Ford, Ojas Rane, Susan Leavy

Constitutional AI & AI Ethics Natural Language Processing Recommendation & Information Retrieval

Mila·3w ago·also BJTU

CroSearch-R1: Better Leveraging Cross-lingual Knowledge for Retrieval-Augmented Generation

CroSearch-R1 reveals that integrating cross-lingual knowledge through a dynamic retrieval strategy can substantially enhance the performance of Retrieval-Augmented Generation systems.

Ruizhen Qi, Fengran Mo, Sijin Lu +3

Natural Language Processing Recommendation & Information Retrieval RLHF & Preference Learning

Graz University of Technology·3w ago·also Institute of Software Technology, UNiQUARE Software Development

Recommending Usability Improvements with Multimodal Large Language Models

MLLMs can now automatically identify and rank UI usability issues from screen recordings, offering actionable recommendations with minimal context.

Sebastian Lubos, Alexander Felfernig, Damian Garber +2

Multimodal Models Natural Language Processing Recommendation & Information Retrieval

University of Science·3w ago·also NICT

GeoSearch: Augmenting Worldwide Geolocalization with Web-Scale Reverse Image Search and Image Matching

Web-scale reverse image search, combined with a clever filtering mechanism, significantly boosts the accuracy of image geolocalization, even when reference databases lack relevant scenes.

Tung-Duong Le-Duc, Hoang-Quoc Nguyen-Son, Minh-Son Dao

Computer Vision Multimodal Models Recommendation & Information Retrieval

3w ago

From Citation Selection to Citation Absorption: A Measurement Framework for Generative Engine Optimization Across AI Search Platforms

ChatGPT extracts more value from each cited source than Google or Perplexity, suggesting that citation *quality* trumps citation *quantity* in generative search.

Zhang Kai, Jingang Yao

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

3w ago

K-CARE: Knowledge-driven Symmetrical Contextual Anchoring and Analogical Prototype Reasoning for E-commerce Relevance

LLMs struggle with e-commerce search relevance not because of reasoning limitations, but because they lack domain-specific knowledge, a problem K-CARE solves with external knowledge grounding.

Chen Yifei, Zhixing Tian +4

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

NJUST·3w ago·also Kuaishou

Harmonizing Generative Retrieval and Ranking in Chain-of-Recommendation

Bridging the gap between generative retrieval and ranking, RecoChain achieves superior Top-K recommendation performance without sacrificing generative strength.

Yu Liu, Jiangxia Cao

Natural Language Processing Recommendation & Information Retrieval

3w ago·also HKU, UCSD

From Local Indices to Global Identifiers: Generative Reranking for Recommender Systems via Global Action Space

Having the reranker generate global identifiers instead of local indices improves robustness and user satisfaction in recommender systems.

Pengyue Jia, Xiaobei Wang +28

Recommendation & Information Retrieval

Jongyoon Kim +5·3w ago

UnIte: Uncertainty-based Iterative Document Sampling for Domain Adaptation in Information Retrieval

UnIte reveals that incorporating uncertainty into document sampling can lead to substantial improvements in retrieval performance with fewer training samples.

Jongyoon Kim, Jongyoon Kim, Minseong Hwang +3

Data Curation & Synthetic Data Natural Language Processing Recommendation & Information Retrieval

Guosheng Zhang +5·3w ago

Combating Visual Neglect and Semantic Drift in Large Multimodal Models for Enhanced Cross-Modal Retrieval

LMMs struggle to ground text queries in the right parts of images, but explicitly modeling salient visual subjects can dramatically improve cross-modal retrieval.

Guosheng Zhang, Linkai Liu, Keyao Wang +3

Computer Vision Multimodal Models Recommendation & Information Retrieval

Shiva Ahir +1·3w ago·also Department of Electrical

RAG-Enhanced Kernel-Based Heuristic Synthesis (RKHS): A Structured Methodology Using Large Language Models for Hardware Design

LLMs can systematically generate effective hardware design heuristics, achieving an 11% reduction in scheduling latency with minimal overhead.

Shiva Ahir, A. Doboli

Code Generation & Program Synthesis Recommendation & Information Retrieval

Jangho Baik +4·3w ago

RecFlash: Fast Recommendation System on In-Storage Computing with Frequency-Based Data Mapping

RecFlash slashes recommendation inference latency by up to 81% and energy consumption by nearly 92% through smart data remapping in NAND flash memory.

Jangho Baik, Sunghyun Kim, Gisan Ji +2

Distributed Systems & Hardware Inference & Quantization Recommendation & Information Retrieval

3w ago

Break the Inaccessible Boundary: Distilling Post-Conversion Content for User Retention Modeling

Retention models can now harness the power of post-conversion content without risking feature leakage, leading to more accurate predictions of user engagement.

Tianbao Ma, Ruochen Yang, Chengen Li +8

Data Curation & Synthetic Data Natural Language Processing Recommendation & Information Retrieval

3w ago

Make Any Collection Navigable: Methods for Constructing and Evaluating Hypergraph of Text

Even basic TF-IDF methods can rival LLM-based approaches in creating navigable text structures, as shown by a new metric for evaluating hypergraphs.

Dean E. Alvarez, ChengXiang Zhai

Natural Language Processing Recommendation & Information Retrieval

3w ago

Action-Aware Generative Sequence Modeling for Short Video Recommendation

A2Gen transforms short video recommendations by treating user actions as dynamic sequences, resulting in substantial improvements in user engagement metrics.

Wenhao Li, Zihan Lin, Zhengxiao Guo +8

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval

Children's Hospital of Philadelphia·3w ago·also Google Research, UPenn

Health System Scale Semantic Search Across Unstructured Clinical Notes

Semantic search across hundreds of millions of clinical notes is not just feasible, but can slash chart review times by up to 89% while maintaining accuracy.

Faith Wavinya Mutinda, Spandana Makeneni +21

Natural Language Processing Recommendation & Information Retrieval

3w ago·also Mila, Gaoling AI, School of Mathematics, UvA

The Attention Market: Interpreting Online Fair Re-ranking as Manifold Optimization under Walrasian Equilibrium

ManifoldRank reveals that treating fairness as a taxation cost can significantly enhance the effectiveness of online fair re-ranking algorithms.

Chen Xu, Wei Chu, Wenyue Hu +5

Constitutional AI & AI Ethics Natural Language Processing Recommendation & Information Retrieval

Julián Urbano +1·3w ago

Stop Using the Wilcoxon Test: Myth, Misconception and Misuse in IR Research

Misapplying the Wilcoxon test in IR research could lead to a false sense of security, resulting in misleading outcomes that undermine the validity of findings.

Julián Urbano, Julián Urbano

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

Apr 27, 2026

Airbnb·3w ago

Closing the Loop: A Software Framework for AI to Support Business Decision Making

Unleash your AI agent's business acumen: this framework lets AI not just analyze experiments, but actively ideate, personalize, and optimize business strategies within a safe, unified software interface.

Jeffrey Wong, Antoine Creux

Recommendation & Information Retrieval Tool Use & Agents

Kushal Raj Bhandari +4·3w ago

Improving Robustness of Tabular Retrieval via Representational Stability

Seemingly innocuous choices in table serialization format (CSV vs. HTML) can drastically alter retrieval performance, but a simple centroid-based correction can restore semantic consistency.

Kushal Raj Bhandari, Adarsh Singh, Jianxi Gao +2

Eval Frameworks & Benchmarks Recommendation & Information Retrieval Red-Teaming & Adversarial Robustness

NVIDIA·3w ago·also Texas Tech University

CiteRadar: A Citation Intelligence Platform for Researcher Profiling and Geographic Visualization

See where your citations are coming from with a single command, thanks to CiteRadar's open-source platform that automatically generates interactive maps and detailed researcher profiles from your Google Scholar ID.

Chenxu Niu, Yiming Sun

Natural Language Processing Open-Source Models & Weights Recommendation & Information Retrieval

3w ago

Don't Stop Early: Scalable Enterprise Deep Research with Controlled Information Flow and Evidence-Aware Termination

Dependency-controlled context and explicit evidence sufficiency criteria are key to preventing premature stopping and improving the consistency of enterprise research outputs.

Prafulla Kumar Choubey, Kung-Hsiang Huang, P. Venkit +4

Reasoning & Chain-of-Thought Recommendation & Information Retrieval Tool Use & Agents

Mila·3w ago·also Capital One

Learning to Route Queries to Heads for Attention-based Re-ranking with Large Language Models

LLMs re-rank documents better when you learn to route each query to the specific attention heads that matter, instead of relying on static subsets or everything at once.

Yuxing Tian, Fengran Mo, Zhiqi Huang +2

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

Tsinghua AI·3w ago

Skill Retrieval Augmentation for Agentic AI

Explicitly enumerating skills in-context doesn't scale for agentic LLMs, but retrieving skills on demand can substantially improve performance – if the LLM can figure out when and which skill to load.

Weihang Su, Jianming Long, Qingyao Ai +4

Recommendation & Information Retrieval Tool Use & Agents

Hang Deng +1·3w ago

Information-Theoretic Distributed Point Functions with Shorter Keys

Asymptotically shorter secret keys in Information-Theoretic Distributed Point Functions are now possible, thanks to a novel construction leveraging private information retrieval.

Hang Deng, Liang Feng Zhang

Distributed Systems & Hardware Recommendation & Information Retrieval

Tsinghua AI·3w ago

MEMCoder: Multi-dimensional Evolving Memory for Private-Library-Oriented Code Generation

LLMs can bootstrap their understanding of private APIs by autonomously learning from their own coding attempts, outperforming retrieval-augmented generation by 16% on code generation tasks.

Mo Li, Tao Chen, Guowei Yang +1

Code Generation & Program Synthesis Recommendation & Information Retrieval

Driss Choukri +3·3w ago

Internet of Everything in the 6G Era: Paradigms, Enablers, Potentials and Future Directions

6G-enabled Internet of Everything promises a unified intelligent ecosystem, but faces critical scalability, security, and privacy challenges that demand innovative research.

Driss Choukri, Essaid Sabir, Elmahdi Driouh +1

Recommendation & Information Retrieval Robotics & Embodied AI Tool Use & Agents

Chen Feng +19·3w ago·also Nankai University, UC Santa Cruz

FreeScale: Distributed Training for Sequence Recommendation Models with Minimal Scaling Cost

Sequence recommendation models can achieve near-perfect scaling efficiency in distributed training, slashing wasted GPU cycles by up to 90%.

Chen Feng, Haoli Zhang, Sh. B. Ali-zade +17

Distributed Systems & Hardware Recommendation & Information Retrieval Training Efficiency & Optimization

Zhuoling Li +3·3w ago

XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation

GraphRAG's black-box reasoning gets a spotlight: XGRAG reveals how specific knowledge graph components influence LLM outputs, boosting explanation quality by 14.81% over standard RAG explainability methods.

Zhuoling Li, Ha Nguyen, Valeria Bladinieres +1

Interpretability & Mechanistic Interp Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Chenglong Chu +34·3w ago·also Kuaishou

Kwai Summary Attention Technical Report

Sub-linear attention is now possible without sacrificing complete long-range dependency retention, thanks to learnable summary tokens that compress context.

Chenglong Chu, Guorui Zhou, Guowang Zhang +32

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval Training Efficiency & Optimization

Wenxuan Yang +5·3w ago

Modeling Behavioral Intensity and Transitions for Generative Recommendation

Generative recommendation gets a boost: modeling behavior intensity and transitions yields 15-23% gains in prediction accuracy.

Wenxuan Yang, Xiaoyang Xu, Hanyu Zhang +3

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval

Meta AI·3w ago

Versioned Late Materialization for Ultra-Long Sequence Training in Recommendation Systems at Scale

Storing user interaction histories in a normalized, immutable tier and reconstructing sequences just-in-time slashes data infrastructure costs and unlocks the potential of ultra-long sequence DLRMs.

Liang Guo, Ge Song, Litao Deng +8

Distributed Systems & Hardware Recommendation & Information Retrieval Training Efficiency & Optimization

Esteban Rodríguez-Betancourt +1·3w ago

Geometric Analysis of Self-Supervised Vision Representations for Semantic Image Retrieval

Self-supervised vision models that ace linear probing can still flop at semantic image retrieval because of skewed latent space geometry that breaks approximate nearest neighbor search.

Esteban Rodríguez-Betancourt, Edgar Casasola-Murillo

Computer Vision Multimodal Models Recommendation & Information Retrieval

3w ago·also UC Santa Cruz, UQ

Disagreement as Signals: Dual-view Calibration for Sequential Recommendation Denoising

LLMs can denoise sequential recommendations by disagreeing with the recommendation model itself, leading to more robust performance against noisy user data.

Sijian Li, Min Gao, Zongwei Wang +3

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval

3w ago·also Macquarie, PKU, UNSW

MEG-RAG: Quantifying Multi-modal Evidence Grounding for Evidence Selection in RAG

Semantic grounding, not token probability, is the key to better multimodal RAG.

Xihang Wang, Chengkai Huang, Quan Z. Sheng +2

Multimodal Models Natural Language Processing Recommendation & Information Retrieval