April 20 – April 27, 2026

Recommendation & Information Retrieval - Weekly Roundup

100 papers published across 8 labs.

Selected Labs publishing this week

Tsinghua AI (3) · DAMO (2) · NVIDIA (1) · Mila (1) · Meta AI (1)

Top Papers

Apr 23, 2026

Hans Ole Hatzel +4 · Apr 23, 2026

SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning

LLM ensembles excel at classifying narrative similarity, but simpler embedding models can achieve comparable performance with clever pre- and post-processing.

Hans Ole Hatzel, Ekaterina Artemova, Haimo Stiemer +2

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

Apr 27, 2026

Airbnb · Apr 27, 2026

Closing the Loop: A Software Framework for AI to Support Business Decision Making

Unleash your AI agent's business acumen: this framework lets AI not just analyze experiments, but actively ideate, personalize, and optimize business strategies within a safe, unified software interface.

Jeffrey Wong, Antoine Creux

Recommendation & Information Retrieval Tool Use & Agents

Kushal Raj Bhandari +4 · Apr 27, 2026

Improving Robustness of Tabular Retrieval via Representational Stability

Seemingly innocuous choices in table serialization format (CSV vs. HTML) can drastically alter retrieval performance, but a simple centroid-based correction can restore semantic consistency.

Kushal Raj Bhandari, Adarsh Singh, Jianxi Gao +2

Eval Frameworks & Benchmarks Recommendation & Information Retrieval Red-Teaming & Adversarial Robustness
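
The roundup does not describe how the paper's centroid-based correction works, so the following is only a minimal sketch of one plausible reading: subtract each serialization format's mean embedding so that a constant, format-induced offset drops out. The function names (`centroid`, `center_by_format`, `cosine`) and the toy vectors are illustrative assumptions, not taken from the paper.

```python
from math import sqrt


def centroid(vectors):
    """Mean vector of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def center_by_format(embeddings_by_format):
    """Subtract each format's centroid from that format's embeddings.

    Assumption: the format effect is roughly a constant shift per format.
    """
    corrected = {}
    for fmt, vectors in embeddings_by_format.items():
        c = centroid(vectors)
        corrected[fmt] = [[x - m for x, m in zip(v, c)] for v in vectors]
    return corrected


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


# Toy data (hypothetical): the same two tables embedded from CSV and HTML text.
# The HTML vectors carry a constant offset standing in for a format shift.
embeddings = {
    "csv": [[1.0, 0.0], [0.0, 1.0]],
    "html": [[4.0, 3.0], [3.0, 4.0]],
}
corrected = center_by_format(embeddings)

before = cosine(embeddings["csv"][0], embeddings["html"][0])
after = cosine(corrected["csv"][0], corrected["html"][0])
print(round(before, 3), round(after, 3))
```

In this toy case the two serializations of the same table land on the same point once each format is centered; real embeddings would only approximately follow that pattern.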

NVIDIA · Apr 27, 2026 · also Texas Tech University

CiteRadar: A Citation Intelligence Platform for Researcher Profiling and Geographic Visualization

See where your citations are coming from with a single command, thanks to CiteRadar's open-source platform that automatically generates interactive maps and detailed researcher profiles from your Google Scholar ID.

Chenxu Niu, Yiming Sun

Natural Language Processing Open-Source Models & Weights Recommendation & Information Retrieval

Apr 27, 2026

Don't Stop Early: Scalable Enterprise Deep Research with Controlled Information Flow and Evidence-Aware Termination

Dependency-controlled context and explicit evidence sufficiency criteria are key to preventing premature stopping and improving the consistency of enterprise research outputs.

Prafulla Kumar Choubey, Kung-Hsiang Huang, P. Venkit +4

Reasoning & Chain-of-Thought Recommendation & Information Retrieval Tool Use & Agents

All Papers (100)

Apr 27, 2026

Airbnb · Apr 27, 2026

Closing the Loop: A Software Framework for AI to Support Business Decision Making

Jeffrey Wong, Antoine Creux

Recommendation & Information Retrieval Tool Use & Agents

Kushal Raj Bhandari +4 · Apr 27, 2026

Improving Robustness of Tabular Retrieval via Representational Stability

Seemingly innocuous choices in table serialization format (CSV vs. HTML) can drastically alter retrieval performance, but a simple centroid-based correction can restore semantic consistency.

Kushal Raj Bhandari, Adarsh Singh, Jianxi Gao +2

Eval Frameworks & Benchmarks Recommendation & Information Retrieval Red-Teaming & Adversarial Robustness

NVIDIA · Apr 27, 2026 · also Texas Tech University

CiteRadar: A Citation Intelligence Platform for Researcher Profiling and Geographic Visualization

Chenxu Niu, Yiming Sun

Natural Language Processing Open-Source Models & Weights Recommendation & Information Retrieval

Apr 27, 2026

Don't Stop Early: Scalable Enterprise Deep Research with Controlled Information Flow and Evidence-Aware Termination

Dependency-controlled context and explicit evidence sufficiency criteria are key to preventing premature stopping and improving the consistency of enterprise research outputs.

Prafulla Kumar Choubey, Kung-Hsiang Huang, P. Venkit +4

Reasoning & Chain-of-Thought Recommendation & Information Retrieval Tool Use & Agents

Mila · Apr 27, 2026 · also Capital One

Learning to Route Queries to Heads for Attention-based Re-ranking with Large Language Models

LLMs re-rank documents better when you learn to route each query to the specific attention heads that matter, instead of relying on static subsets or everything at once.

Yuxing Tian, Fengran Mo, Zhiqi Huang +2

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

Tsinghua AI · Apr 27, 2026

Skill Retrieval Augmentation for Agentic AI

Explicitly enumerating skills in-context doesn't scale for agentic LLMs, but retrieving skills on demand can substantially improve performance – if the LLM can figure out when and which skill to load.

Weihang Su, Jianming Long, Qingyao Ai +4

Recommendation & Information Retrieval Tool Use & Agents

Hang Deng +1 · Apr 27, 2026

Information-Theoretic Distributed Point Functions with Shorter Keys

Asymptotically shorter secret keys in Information-Theoretic Distributed Point Functions are now possible, thanks to a novel construction leveraging private information retrieval.

Hang Deng, Liang Feng Zhang

Distributed Systems & Hardware Recommendation & Information Retrieval

Tsinghua AI · Apr 27, 2026

MEMCoder: Multi-dimensional Evolving Memory for Private-Library-Oriented Code Generation

LLMs can bootstrap their understanding of private APIs by autonomously learning from their own coding attempts, outperforming retrieval-augmented generation by 16% on code generation tasks.

Mo Li, Tao Chen, Guowei Yang +1

Code Generation & Program Synthesis Recommendation & Information Retrieval

Driss Choukri +3 · Apr 27, 2026

Internet of Everything in the 6G Era: Paradigms, Enablers, Potentials and Future Directions

6G-enabled Internet of Everything promises a unified intelligent ecosystem, but faces critical scalability, security, and privacy challenges that demand innovative research.

Driss Choukri, Essaid Sabir, Elmahdi Driouh +1

Recommendation & Information Retrieval Robotics & Embodied AI Tool Use & Agents

Chen Feng +19 · Apr 27, 2026 · also Nankai University, UC Santa Cruz

FreeScale: Distributed Training for Sequence Recommendation Models with Minimal Scaling Cost

Sequence recommendation models can achieve near-perfect scaling efficiency in distributed training, slashing wasted GPU cycles by up to 90%.

Chen Feng, Haoli Zhang, Sh. B. Ali-zade +17

Distributed Systems & Hardware Recommendation & Information Retrieval Training Efficiency & Optimization

Zhuoling Li +3 · Apr 27, 2026

XGRAG: A Graph-Native Framework for Explaining KG-based Retrieval-Augmented Generation

GraphRAG's black-box reasoning gets a spotlight: XGRAG reveals how specific knowledge graph components influence LLM outputs, boosting explanation quality by 14.81% over standard RAG explainability methods.

Zhuoling Li, Ha Nguyen, Valeria Bladinieres +1

Interpretability & Mechanistic Interp Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Chenglong Chu +34 · Apr 27, 2026 · also Kuaishou, Nankai University

Kwai Summary Attention Technical Report

Sub-linear attention is now possible without sacrificing complete long-range dependency retention, thanks to learnable summary tokens that compress context.

Chenglong Chu, Guorui Zhou, Guowang Zhang +32

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval Training Efficiency & Optimization

Wenxuan Yang +5 · Apr 27, 2026

Modeling Behavioral Intensity and Transitions for Generative Recommendation

Generative recommendation gets a boost: modeling behavior intensity and transitions yields 15-23% gains in prediction accuracy.

Wenxuan Yang, Xiaoyang Xu, Hanyu Zhang +3

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval

Meta AI · Apr 27, 2026 · also School of Cybersecurity

Versioned Late Materialization for Ultra-Long Sequence Training in Recommendation Systems at Scale

Storing user interaction histories in a normalized, immutable tier and reconstructing sequences just-in-time slashes data infrastructure costs and unlocks the potential of ultra-long sequence DLRMs.

Liang Guo, Ge Song, Litao Deng +8

Distributed Systems & Hardware Recommendation & Information Retrieval Training Efficiency & Optimization

Esteban Rodríguez-Betancourt +1 · Apr 27, 2026

Geometric Analysis of Self-Supervised Vision Representations for Semantic Image Retrieval

Self-supervised vision models that ace linear probing can still flop at semantic image retrieval because of skewed latent space geometry that breaks approximate nearest neighbor search.

Esteban Rodríguez-Betancourt, Edgar Casasola-Murillo

Computer Vision Multimodal Models Recommendation & Information Retrieval

Apr 27, 2026 · also UC Santa Cruz, UQ

Disagreement as Signals: Dual-view Calibration for Sequential Recommendation Denoising

LLMs can denoise sequential recommendations by disagreeing with the recommendation model itself, leading to more robust performance against noisy user data.

Sijian Li, Min Gao, Zongwei Wang +3

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval

Apr 27, 2026 · also Macquarie, PKU, UNSW

MEG-RAG: Quantifying Multi-modal Evidence Grounding for Evidence Selection in RAG

Semantic grounding, not token probability, is the key to better multimodal RAG.

Xihang Wang, Chengkai Huang, Quan Z. Sheng +2

Multimodal Models Natural Language Processing Recommendation & Information Retrieval

Jiawei Wang +10 · Apr 27, 2026

DeepTaxon: An Interpretable Retrieval-Augmented Multimodal Framework for Unified Species Identification and Discovery

Species identification and discovery, traditionally treated as separate problems, can be unified into a single framework that leverages retrieval-augmented reasoning for improved accuracy and interpretability.

Jiawei Wang, Min Lei, Yaning Yang +8

Multimodal Models Recommendation & Information Retrieval Scientific Discovery & Drug Design

Theresia Veronika Rampisela · Apr 27, 2026

Offline Evaluation Measures of Fairness in Recommender Systems

Many recommender system fairness metrics are flawed, producing scores that are uninterpretable, inexpressive, or even incalculable in common scenarios.

Theresia Veronika Rampisela

Constitutional AI & AI Ethics Eval Frameworks & Benchmarks Recommendation & Information Retrieval

Apr 27, 2026 · also Fudan, Michigan State, XJTU, ZJU

SEARCH-R: Structured Entity-Aware Retrieval with Chain-of-Reasoning Navigator for Multi-hop Question Answering

Stop relying on LLMs to "hallucinate" reasoning paths – SEARCH-R uses a fine-tuned Llama3.1-8B model and dependency tree-based retrieval to navigate multi-hop question answering more reliably.

Yuqing Fu, Yimin Deng +14

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Taeyoon Kim +6 · Apr 27, 2026

SpotVista: Availability-Aware Recommendation System for Reliable and Cost-Efficient Multi-Node Spot Instances

Multi-node spot instance configurations recommended by SpotVista offer 81% greater availability and 26% more cost savings than current state-of-the-art and publicly available services.

Taeyoon Kim, Kyumi Kim, Kyunghwan Kim +4

Distributed Systems & Hardware Recommendation & Information Retrieval

Apr 26, 2026

Auburn University · Apr 26, 2026 · also UVA

PageGuide: Browser extension to assist users in navigating a webpage and locating information

Stop blindly trusting LLMs: PageGuide visually grounds AI answers directly in the webpage, slashing task times by up to 70% and boosting accuracy by 26%.

Tin Nguyen, Thang Truong, Run Zhou +4

Recommendation & Information Retrieval Tool Use & Agents

Pritesh Jha · Apr 26, 2026

RaV-IDP: A Reconstruction-as-Validation Framework for Faithful Intelligent Document Processing

By reconstructing extractions and comparing them to the original document, RaV-IDP offers a grounded, label-free quality signal that dramatically improves the fidelity of intelligent document processing pipelines.

Pritesh Jha

Computer Vision Natural Language Processing Recommendation & Information Retrieval

Apr 25, 2026

Tsinghua AI · Apr 25, 2026 · also Cambridge

AnalogRetriever: Learning Cross-Modal Representations for Analog Circuit Retrieval

Finding similar analog circuits across netlists, schematics, and descriptions just got way easier: a new model achieves 75% recall, unlocking better circuit design automation.

Yihan Wang, Lei Li, Yao Lai +2

Code Generation & Program Synthesis Multimodal Models Recommendation & Information Retrieval

Apr 24, 2026

Shaoang Li +12 · Apr 24, 2026

Learning Evidence Highlighting for Frozen LLMs

Highlighting pivotal evidence can boost LLM performance without altering the original context, leading to substantial improvements in reasoning tasks.

Shaoang Li, Yanhang Shi, Yufei Li +10

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Apr 23, 2026

Apr 23, 2026 · also SJTU

TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale

LLMs, when combined with efficient indexing, can extract actionable incidents from just a handful of noisy user descriptions in real-time, enabling rapid anomaly detection in large-scale cloud services.

Jun Wang, Ziyin Zhang, Rui Wang +3

Distributed Systems & Hardware Natural Language Processing Recommendation & Information Retrieval

Apr 23, 2026

Probably Approximately Consensus: On the Learning Theory of Finding Common Ground

Forget polling every user on every idea – this algorithm learns to find common ground by strategically asking for feedback on a few key statements.

Carter Blair, Ben Armstrong, Shiri Alouf-Heffetz +2

Natural Language Processing Recommendation & Information Retrieval

Ashley Abraham +4 · Apr 23, 2026

Large-Scale Data Parallelization of Product Quantization and Inverted Indexing Using Dask

Scale up your nearest neighbor search without blowing your budget: this work shows how to use Dask to parallelize Product Quantization and Inverted Indexing, achieving accuracy comparable to single-machine methods on much larger datasets.

Ashley Abraham, Andrew Strelzoff, Haley R. Dozier +2

Distributed Systems & Hardware Inference & Quantization Recommendation & Information Retrieval
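
For readers unfamiliar with Product Quantization, here is a minimal single-machine sketch of the encode and decode steps. It assumes the per-subspace codebooks have already been trained (normally with k-means) and it does not show the Dask parallelization or the inverted index that the paper covers. The helper names and toy codebooks are illustrative assumptions.

```python
def split(vector, m):
    """Split a vector into m equal-length subvectors."""
    if len(vector) % m != 0:
        raise ValueError("vector length must be divisible by m")
    step = len(vector) // m
    return [vector[i * step:(i + 1) * step] for i in range(m)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pq_encode(vector, codebooks):
    """Return one nearest-centroid index per subspace."""
    subvectors = split(vector, len(codebooks))
    return [
        min(range(len(book)), key=lambda j: sq_dist(sub, book[j]))
        for sub, book in zip(subvectors, codebooks)
    ]


def pq_decode(code, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    approx = []
    for index, book in zip(code, codebooks):
        approx.extend(book[index])
    return approx


# Hypothetical pre-trained codebooks: 2 subspaces, 2 centroids each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # subspace 0
    [[0.0, 1.0], [1.0, 0.0]],  # subspace 1
]
code = pq_encode([0.9, 1.2, 0.1, 0.8], codebooks)
print(code, pq_decode(code, codebooks))
```

Each vector is stored as a short list of centroid indices instead of its full coordinates, which is where the memory savings come from.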

N. Severin +10 · Apr 23, 2026 · also Sber AI Lab

Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation

Get LLM-boosted recommendations without the LLM latency: this distillation method lets you bake rich user profiles into efficient sequential recommenders.

N. Severin, Danil Kartushov, V. Urzhumov +8

Inference & Quantization Natural Language Processing Recommendation & Information Retrieval

S. Piccolo +1 · Apr 23, 2026

The CriticalSet problem: Identifying Critical Contributors in Bipartite Dependency Networks

A surprisingly simple, linear-time algorithm, MinCov, nearly matches the performance of much slower metaheuristics in identifying critical nodes in bipartite dependency networks.

S. Piccolo, Andrea Tagarelli

Natural Language Processing Recommendation & Information Retrieval

Robin Dey +1 · Apr 23, 2026

Spatial Metaphors for LLM Memory: A Critical Analysis of the MemPalace Architecture

MemPalace's impressive memory recall isn't due to its fancy "memory palace" spatial organization, but rather its simple "store everything verbatim" approach combined with a strong embedding model.

Robin Dey, Panyanon Viradecha

Architecture Design (Transformers, SSMs, MoE) Eval Frameworks & Benchmarks Recommendation & Information Retrieval

Paul Keuren +2 · Apr 23, 2026

Finding Meaning in Embeddings: Concept Separation Curves

Sentence embeddings can be objectively evaluated for conceptual stability without relying on downstream classifiers, revealing their true capacity to capture meaning.

Paul Keuren, M. Ponsen, Robert Ayoub Bagheri

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

Wenjie Fu +7 · Apr 23, 2026

CI-Work: Benchmarking Contextual Integrity in Enterprise LLM Agents

Enterprise LLM agents leak sensitive information in up to 50% of interactions, and surprisingly, performing better at tasks makes the problem *worse*.

Wenjie Fu, Xiaoting Qin, Jue Zhang +5

Constitutional AI & AI Ethics Eval Frameworks & Benchmarks Recommendation & Information Retrieval

J. Acuña · Apr 23, 2026

EngramaBench: Evaluating Long-Term Conversational Memory with Structured Graph Retrieval

Structured graph memory can outperform full-context prompting for cross-session LLM reasoning, but optimizing for specific reasoning skills can hurt overall performance.

J. Acuña

Eval Frameworks & Benchmarks Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Zixu Li +5 · Apr 23, 2026

TEMA: Anchor the Image, Follow the Text for Multi-Modification Composed Image Retrieval

Multi-modification image retrieval is now possible: TEMA handles complex, real-world instructions that go beyond simple changes, outperforming existing methods on new datasets M-FashionIQ and M-CIRR.

Zixu Li, Yupeng Hu, Zhiheng Fu +3

Computer Vision Multimodal Models Recommendation & Information Retrieval

University of Colorado · Apr 23, 2026

Multistakeholder Impacts of Profile Portability in a Recommender Ecosystem

Data portability in recommender systems doesn't guarantee better outcomes for users, as its impact varies significantly depending on the specific recommendation algorithm employed.

Anas Buhayh, Elizabeth McKinnie, Clement Canel +1

Constitutional AI & AI Ethics Recommendation & Information Retrieval

DAMO · Apr 23, 2026

ReaGeo: Reasoning-Enhanced End-to-End Geocoding with LLMs

LLMs can now directly predict geographic coordinates with high accuracy, even for vague locations and complex regions, bypassing the need for traditional geocoding pipelines.

Gong Wenbin

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Apr 23, 2026

WPGRec: Wavelet Packet Guided Graph Enhanced Sequential Recommendation

Achieve state-of-the-art sequential recommendations by aligning multi-resolution temporal dynamics with graph propagation at matching scales.

Peilin Liu, Zhiquan Ji, Gang Yan

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval

Apr 23, 2026 · also IBM Research

Towards Universal Tabular Embeddings: A Benchmark Across Data Tasks

Turns out, the best way to represent tabular data depends heavily on the task at hand, so a one-size-fits-all tabular foundation model may be a mirage.

Liane Vogel, Kavitha Srinivas +9

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

Apr 23, 2026 · also DAMO

Counterfactual Multi-task Learning for Delayed Conversion Modeling in E-commerce Sales Pre-Promotion

Predicting pre-promotion conversions in e-commerce gets a boost with a new model that understands how users "window shop" before sales actually start.

Kaiyuan Li

Natural Language Processing Recommendation & Information Retrieval

Shan Dong +5 · Apr 23, 2026

On Reasoning Behind Next Occupation Recommendation

Fine-tuning a single LLM to both reason about and predict future occupations surprisingly beats using two separate fine-tuned LLMs for each task.

Shan Dong, P. Achananuparp, Hieu-Hien Mai +3

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Apr 23, 2026 · also Sinequa by ChapsVision

From Tokens to Concepts: Leveraging SAE for SPLADE

SPLADE models can ditch their token-based vocabularies for a latent semantic space learned by Sparse Auto-Encoders, maintaining retrieval performance while boosting efficiency.

Yuxuan Zong, Mathias Vast, Basile Van Cooten +2

Interpretability & Mechanistic Interp Natural Language Processing Recommendation & Information Retrieval

Guojing Li +11 · Apr 23, 2026 · also HKU

Job Skill Extraction via LLM-Centric Multi-Module Framework

LLMs can now reliably extract job skills from text, even in low-resource settings, thanks to a novel framework that enforces output validity and reduces hallucinations.

Guojing Li, Zichuan Fu, J. Li +9

Natural Language Processing Recommendation & Information Retrieval Tool Use & Agents

Hans Ole Hatzel +4 · Apr 23, 2026

SemEval-2026 Task 4: Narrative Story Similarity and Narrative Representation Learning

LLM ensembles excel at classifying narrative similarity, but simpler embedding models can achieve comparable performance with clever pre- and post-processing.

Hans Ole Hatzel, Ekaterina Artemova, Haimo Stiemer +2

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

Minping Chen +6 · Apr 23, 2026

Enhancing Online Recruitment with Category-Aware MoE and LLM-based Data Augmentation

LLMs can rewrite bad job descriptions and category-aware MoEs can better match candidates, leading to a 19.4% boost in recruitment click-through rates and millions saved.

Minping Chen, Bingquan Xu, Zulong Chen +4

Architecture Design (Transformers, SSMs, MoE) Data Curation & Synthetic Data Recommendation & Information Retrieval

Apr 23, 2026

MiMIC: Mitigating Visual Modality Collapse in Universal Multimodal Retrieval While Avoiding Semantic Misalignment

Early fusion UMR models lean too heavily on text, while late fusion struggles to relate semantically similar content – MiMIC offers a fix.

Juanxi Li, Chuanghao Ding, Xujie Zhang +1

Computer Vision Multimodal Models Recommendation & Information Retrieval

Wang Hai +3 · Apr 23, 2026

Conjecture and Inquiry: Quantifying Software Performance Requirements via Interactive Retrieval-Augmented Preference Elicitation

Quantifying vague software requirements doesn't have to be a guessing game: this method slashes the ambiguity with interactive preference elicitation, achieving 40x better results.

Wang Hai, Wang Shi Hai, Chen Tao +1

Code Generation & Program Synthesis Natural Language Processing Recommendation & Information Retrieval

Apr 22, 2026

Andrew Klearman +5 · Apr 22, 2026

Coverage, Not Averages: Semantic Stratification for Trustworthy Retrieval Evaluation

Systematic coverage gaps in retrieval evaluations can lead to misleading assessments, but semantic stratification offers a clearer, more trustworthy framework for understanding retrieval performance.

Andrew Klearman, Radu Revutchi, Rohin Garg +3

Data Curation & Synthetic Data Eval Frameworks & Benchmarks Recommendation & Information Retrieval

Apr 22, 2026

Efficient Multi-Cohort Inference for Long-Term Effects and Lifetime Value in A/B Testing with User Learning

Short-term A/B test metrics can be misleading: this paper shows how to accurately estimate long-term value changes by modeling treatment effects as a decaying function learned from multiple cohorts.

Dario Simionato, Andrea Tonon, Mingxue Wang +3

Recommendation & Information Retrieval

Naizhong Xu · Apr 22, 2026

Self-Aware Vector Embeddings for Retrieval-Augmented Generation: A Neuroscience-Inspired Framework for Temporal, Confidence-Weighted, and Relational Knowledge

SmartVector nearly doubles the accuracy of retrieval-augmented generation systems by embedding temporal and relational context directly into vector representations.

Naizhong Xu

Natural Language Processing Recommendation & Information Retrieval

Ruihan Zhou +4 · Apr 22, 2026

Cold-Start Forecasting of New Product Life-Cycles via Conditional Diffusion Models

CDLF outperforms traditional forecasting methods by adapting to new product data in real-time, even in the absence of historical outcomes.

Ruihan Zhou, Zishi Zhang, Jinhui Han +2

Computer Vision Recommendation & Information Retrieval Scientific Discovery & Drug Design

Juntao Li +1 · Apr 22, 2026

ACT: Anti-Crosstalk Learning for Cross-Sectional Stock Ranking via Temporal Disentanglement and Structural Purification

Achieve up to 74% improvement in stock ranking accuracy by disentangling temporal trends and purifying structural relationships, sidestepping the crosstalk problem that plagues existing graph-based methods.

Juntao Li, Liang Zhang

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval Training Efficiency & Optimization

Anhalt University of Applied Sciences · Apr 22, 2026 · also CNRS, Université Grenoble Alpes

Lever: Inference-Time Policy Reuse under Support Constraints

Forget retraining: LEVER lets you snap together pre-trained RL policies at inference time, matching or beating from-scratch performance in some cases.

Ihor Vitenki, Noha Ibrahim, S. Amer-Yahia

Recommendation & Information Retrieval Robotics & Embodied AI

Apr 22, 2026 · also SCU

Where and What: Reasoning Dynamic and Implicit Preferences in Situated Conversational Recommendation

SiPeR reveals how integrating scene dynamics with Bayesian inference can dramatically enhance the relevance of conversational recommendations in real-world contexts.

Dongding Lin, Jian Wang, Wenjie Li

Multimodal Models Natural Language Processing Recommendation & Information Retrieval

Ioannis E. Livieris +4 · Apr 22, 2026

ORPHEAS: A Cross-Lingual Greek-English Embedding Model for Retrieval-Augmented Generation

ORPHEAS outperforms state-of-the-art multilingual models, proving that specialized fine-tuning can enhance retrieval capabilities for morphologically complex languages.

Ioannis E. Livieris, Athanasios Koursaris, Alexandra Apostolopoulou +2

Natural Language Processing Open-Source Models & Weights Recommendation & Information Retrieval

Google Research · Apr 22, 2026 · also Max Planck

Semantic Recall for Vector Search

Stop penalizing your ANN search algorithms for failing to retrieve irrelevant neighbors – Semantic Recall offers a more nuanced and effective way to measure retrieval quality.

Leonardo Kuffó, Ioanna Tsakalidou, Roberta De Viti +3

Eval Frameworks & Benchmarks Recommendation & Information Retrieval
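
To make the contrast in this entry concrete, the sketch below computes the classical recall@k used to score approximate nearest neighbor (ANN) search, next to a variant that only counts true neighbors judged relevant. The second function is a hypothetical reading of the summary, not the paper's definition of Semantic Recall, and the document IDs and relevance set are made up.

```python
def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the exact top-k neighbors found in the retrieved top-k."""
    truth = set(ground_truth[:k])
    if not truth:
        raise ValueError("ground_truth must not be empty")
    return len(truth & set(retrieved[:k])) / len(truth)


def relevance_filtered_recall(retrieved, ground_truth, relevant, k):
    """Recall over exact top-k neighbors that are also judged relevant.

    Assumption: `relevant` holds the IDs considered semantically relevant
    to the query. Returning 1.0 when none are relevant is a convention
    chosen for this sketch.
    """
    truth = set(ground_truth[:k]) & set(relevant)
    if not truth:
        return 1.0
    return len(truth & set(retrieved[:k])) / len(truth)


# Toy query: the exact top-4 neighbors are a, b, c, d, but only a and b
# are actually relevant. The ANN index returns a, b and two other items.
ground_truth = ["a", "b", "c", "d"]
retrieved = ["a", "b", "x", "y"]
relevant = {"a", "b"}

classical = recall_at_k(retrieved, ground_truth, 4)
filtered = relevance_filtered_recall(retrieved, ground_truth, relevant, 4)
print(classical, filtered)
```

Under classical recall the index is penalized for missing c and d even though neither is relevant; the filtered variant ignores those misses.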

Apr 22, 2026 · also La Trobe University

Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction

Achieve unbounded historical video association for popularity prediction without unbounded storage growth by clustering videos in a topology-aware memory bank and updating cluster features instead of storing individual videos.

Dali Wang, Yunyao Zhang, Junqing Yu +3

Computer Vision Recommendation & Information Retrieval

Zhejiang Angel Medical AI Technology Co. · Apr 22, 2026 · also Miti AI Technology Co.

Knowledge Capsules: Structured Nonparametric Memory Units for LLMs

Forget RAG's indirect knowledge injection – Knowledge Capsules let external knowledge directly influence LLM attention, boosting performance and stability in complex reasoning tasks.

Bin Ju, Shenfeng Weng, Danying Zhou +2

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

Deevashwer Rathee +4 · Apr 22, 2026

Onyx: Cost-Efficient Disk-Oblivious ANN Search

Leaking user queries through disk access patterns in sensitive ANN search? Onyx flips the script on prior work to achieve up to 9.9x cost reduction and 12.3x latency improvement.

Deevashwer Rathee, Jean Watson, Zirui Neil Zhao +2

Distributed Systems & Hardware Inference & Quantization Recommendation & Information Retrieval

Tong Zhao +3 · Apr 22, 2026

ATIR: Towards Audio-Text Interleaved Contextual Retrieval

Tong Zhao, Chenghao Zhang, Yutao Zhu +1

Multimodal Models Recommendation & Information Retrieval Speech & Audio

V. Srinivasan · Apr 22, 2026

Stateless Decision Memory for Enterprise AI Agents

Enterprise AI agents don't need stateful memory to be effective: a stateless architecture called Deterministic Projection Memory (DPM) actually *beats* stateful approaches in regulated domains when memory is constrained, while also being faster and more auditable.

V. Srinivasan

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval Tool Use & Agents

Apr 22, 2026 · also Tsinghua AI, XJU

Ask Only When Needed: Proactive Retrieval from Memory and Skills for Experience-Driven Lifelong Agents

Stop passively waiting for retrieval cues – ProactAgent proactively asks for information from its memory and skills, leading to significant gains in lifelong learning performance.

Yuxuan Cai, Qin Chen, Liang He

Recommendation & Information Retrieval Tool Use & Agents World Models & Planning

University of the Basque Country UPV/EHU · Apr 22, 2026

Effects of Cross-lingual Evidence in Multilingual Medical Question Answering

Turns out, the best external knowledge source for multilingual medical QA depends on whether you're working with a high- or low-resource language, and blindly adding PubMed might not be the answer.

Anar Yeginbergen, Maite Oronoz, Rodrigo Agerri

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

Luca Foppiano +5 · Apr 22, 2026

Construction of a Battery Research Knowledge Graph using a Global Open Catalog

Discover expertise and collaborators in battery research at a global scale, grounded in semantic understanding rather than just citations.

Luca Foppiano, Sae Dieb, Malik Zain +3

Data Curation & Synthetic Data Recommendation & Information Retrieval Scientific Discovery & Drug Design

Lei Zheng +3 · Apr 22, 2026

To Know is to Construct: Schema-Constrained Generation for Agent Memory

Retrieval-based memory is out: schema-constrained generation ensures agents recall contextually relevant information without hallucinating memory keys, leading to substantial performance gains.

Lei Zheng, Weinan Song, Daili Li +1

Natural Language Processing Recommendation & Information Retrieval Tool Use & Agents

Apr 22, 2026

Evolution of Research Method Usage Across the Academic Careers of Library and Information Science Scholars

LIS scholars get more basic as they age: bibliometric methods dominate the twilight of their careers.

Jiayi Hao, Chengzhi Zhang

Natural Language Processing Recommendation & Information Retrieval

Mingyu Zhang +1 · Apr 22, 2026 · also HIT

ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval

CIR models struggle with noisy data because "hard noise" breaks the small loss hypothesis, but ConeSep's novel unlearning approach overcomes this to achieve state-of-the-art results.

Mingyu Zhang, Liqiang Nie

Computer Vision Multimodal Models Recommendation & Information Retrieval

Apr 22, 2026 · also HKU, SUSTech

UniCVR: From Alignment to Reranking for Unified Zero-Shot Composed Visual Retrieval

Haokun Wen, Xuemeng Song, Weili Guan +1

Computer Vision Multimodal Models Recommendation & Information Retrieval

Thrust of Artificial Intelligence · Apr 22, 2026 · also NJU, OPPO

Discrete Preference Learning for Personalized Multimodal Generation

Quantizing user preferences into discrete tokens unlocks personalized multimodal content generation with improved consistency between modalities.

Yuting Zhang, Ying Sun, Dazhong Shen +3

Multimodal Models Recommendation & Information Retrieval RLHF & Preference Learning

Zhangchi Zhu · Apr 22, 2026

Break the Optimization Barrier of LLM-Enhanced Recommenders: A Theoretical Analysis and Practical Framework

LLM-enhanced recommenders stumble because of representation norm disparities and semantic misalignment, but a simple normalization and PCA-inspired alignment can unlock their potential.

Zhangchi Zhu

Natural Language Processing Recommendation & Information Retrieval Training Efficiency & Optimization

P. A. Bereuter +1 · Apr 22, 2026

Embedding-Based Intrusive Evaluation Metrics for Musical Source Separation Using MERT Representations

Ditch your old MSS evaluation metrics: MERT-based embeddings correlate far better with human perception.

P. A. Bereuter, Alois Sontacchi

Eval Frameworks & Benchmarks Recommendation & Information Retrieval Speech & Audio

Apr 22, 2026 · also Microsoft Research, Independent

From Hidden Profiles to Governable Personalization: Recommender Systems in the Age of LLM Agents

LLMs are poised to flip the script on personalization, giving users unprecedented control over their data and how it's used across platforms.

Jiahao Liu, Mingzhe Han, Guanming Liu +5

Recommendation & Information Retrieval Tool Use & Agents

Apr 22, 2026·also BIGAI, HKUST

Fast-then-Fine: A Two-Stage Framework with Multi-Granular Representation for Cross-Modal Retrieval in Remote Sensing

Achieve state-of-the-art remote sensing image-text retrieval without the computational burden of large-scale vision-language model pre-training, thanks to a novel two-stage approach.

Xi Chen, Xiangyang Jia, Xu Zhang +2

Computer Vision Multimodal Models Recommendation & Information Retrieval

Peng Peng +4·Apr 22, 2026

HaS: Accelerating RAG through Homology-Aware Speculative Retrieval

Speed up your RAG pipelines by up to 37% without sacrificing accuracy by speculatively retrieving documents based on query homology.

Peng Peng, Weiwei Lin, Wentai Wu +2

Inference & Quantization Recommendation & Information Retrieval

Biao Zhang +4·Apr 22, 2026·also Chongqing

AFMRL: Attribute-Enhanced Fine-Grained Multi-Modal Representation Learning in E-commerce

Forget generic image-text embeddings – teaching models to generate and reason about product *attributes* unlocks SOTA e-commerce retrieval.

Biao Zhang, Lixin Chen, Bin Zhang +2

Computer Vision Multimodal Models Recommendation & Information Retrieval

Apr 22, 2026

All Languages Matter: Understanding and Mitigating Language Bias in Multilingual RAG

Multilingual RAG systems are systematically suppressing "answer-critical" documents in non-English languages, crippling their ability to leverage global knowledge.

Guozhao Mo, Yafei Shi, Boxi Cao +6

Constitutional AI & AI Ethics Natural Language Processing Recommendation & Information Retrieval

Apr 21, 2026

Abdulmoneam Ali +1·Apr 21, 2026

FB-NLL: A Feature-Based Approach to Tackle Noisy Labels in Personalized Federated Learning

By clustering users based on the geometry of their feature spaces *before* training, FB-NLL sidesteps the vulnerability of existing federated learning methods to noisy labels and corrupted updates.

Abdulmoneam Ali, Ahmed Arafa

Data Curation & Synthetic Data Recommendation & Information Retrieval Training Efficiency & Optimization

Pierre Perrault +3·Apr 21, 2026·also INRIA, Paris-Saclay

Budgeted Online Influence Maximization

Forget picking influencers by headcount; this new framework lets you maximize influence based on your actual ad budget, and it even sharpens the math for the old way of doing things.

Pierre Perrault, Jennifer Healey, Zheng Wen +1

Natural Language Processing Recommendation & Information Retrieval

School of Computing·Apr 21, 2026·also UNSW

CAST: Modeling Semantic-Level Transitions for Complementary-Aware Sequential Recommendation

Achieve gains of up to 17.6% in recall and 16% in NDCG in sequential recommendation by modeling transitions directly in the discrete semantic code space, capturing fine-grained semantic dependencies that are often lost in aggregated item representations.

Qian Zhang, Lech Szymanski, Haibo Zhang +1

Recommendation & Information Retrieval

Apr 21, 2026

Revisiting Catastrophic Forgetting in Continual Knowledge Graph Embedding

CKGE benchmarks overestimate performance by up to 25% because they fail to account for "entity interference," a newly identified phenomenon where embeddings of new entities disrupt previously learned relationships.

Gerard Pons, Carlos Escolano, Besim Bilalli +1

Natural Language Processing Recommendation & Information Retrieval

Chenghao Zhang +5·Apr 21, 2026·also UB

FOCAL-Attention for Heterogeneous Multi-Label Prediction

FOCAL-Attention resolves the inherent coverage-anchoring conflict in heterogeneous graph learning, outperforming existing methods in multi-label node classification.

Chenghao Zhang, Qingqing Long, Ludi Wang +3

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

Apr 21, 2026

TabEmb: Joint Semantic-Structure Embedding for Table Annotation

LLMs can be effectively combined with graph-based methods to capture both semantic and structural information in tables, leading to state-of-the-art performance in table annotation tasks.

Ehsan Hoseinzade, Anandharaju Durai Raju

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

Apr 21, 2026·also Bristol

A-MAR: Agent-based Multimodal Art Retrieval for Fine-Grained Artwork Understanding

Forget relying on implicit reasoning: A-MAR's explicit reasoning plans unlock better artwork understanding by strategically retrieving relevant evidence.

Hongyi Zhu, Jia-Hong Huang, Yixian Shen +5

Multimodal Models Recommendation & Information Retrieval Tool Use & Agents

Eastern Institute of Technology·Apr 21, 2026·also ANU, CSIRO, Fuzhou University, HKU +2

EgoSelf: From Memory to Personalized Egocentric Assistant

Forget generic assistants – EgoSelf learns your habits from your first-person view data to predict your future interactions.

Yanshuo Wang, Xuesong Li, Jie Hong +3

Computer Vision Recommendation & Information Retrieval Tool Use & Agents

Hangzhou Dianzi University·Apr 21, 2026·also Central South University, HKUST, Shanghai AI Lab, SYSU

From Experience to Skill: Multi-Agent Generative Engine Optimization via Reusable Strategy Learning

Stop optimizing generative engines in isolation: MAGEO learns reusable editing strategies that dramatically improve visibility and citation fidelity across diverse engines.

Beining Wu, Fuyou Mao, Jiong Lin +6

Natural Language Processing Recommendation & Information Retrieval Tool Use & Agents

National Cheng Kung University·Apr 21, 2026·also Artificial Intelligence Research Center, Chang Gung University

SCURank: Ranking Multiple Candidate Summaries with Summary Content Units for Enhanced Summarization

Ditch ROUGE and unstable LLM rankings: SCURank leverages Summary Content Units to identify and select the most semantically rich summaries from diverse LLMs, boosting distillation performance.

Bo Wang, Ying-Jia Lin, Hung-Yu Kao

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

Apr 21, 2026

An Answer is just the Start: Related Insight Generation for Open-Ended Document-Grounded QA

Forget single-shot QA: this paper introduces a new task of generating follow-up insights that extend and improve initial answers, enabling richer, more iterative user interactions.

Saransh Sharma, Pritika Ramu, Aparna Garimella +1

Eval Frameworks & Benchmarks Natural Language Processing Recommendation & Information Retrieval

François Remy·Apr 21, 2026

Diagnosable ColBERT: Debugging Late-Interaction Retrieval Models Using a Learned Latent Space as Reference

Unlock the black box of late-interaction retrieval models: Diagnosable ColBERT lets you directly inspect what the model "understands" by aligning token embeddings to a clinically-grounded latent space.

François Remy

Interpretability & Mechanistic Interp Natural Language Processing Recommendation & Information Retrieval

Yinhao Xiao +2·Apr 21, 2026

EvoPatch-IoT: Evolution-Aware Cross-Architecture Vulnerability Retrieval and Patch-State Profiling for BusyBox-Based IoT Firmware

Forget relying on symbols or version strings – this new method pinpoints vulnerabilities in stripped IoT firmware across different architectures with impressive accuracy.

Yinhao Xiao, Huixi Li, Yongluo Shen

Code Generation & Program Synthesis Recommendation & Information Retrieval

College of Computer Science and Technology·Apr 21, 2026·also HKUST

iCoRe: An Iterative Correlation-Aware Retriever for Bug Reproduction Test Generation

Stop feeding your LLM-based bug reproduction tools irrelevant code: iCoRe's correlation-aware retrieval boosts test generation accuracy by up to 31.7%.

Junyi Wang, Jialun Cao, Zhongxin Liu

Code Generation & Program Synthesis Recommendation & Information Retrieval Tool Use & Agents

Apr 21, 2026

LoopCTR: Unlocking the Loop Scaling Power for Click-Through Rate Prediction

Train smarter, not bigger: LoopCTR unlocks state-of-the-art CTR prediction by decoupling computation from parameter growth through recursive layer reuse.

Jiakai Tang, Runfeng Zhang, Weiqiu Wang +8

Architecture Design (Transformers, SSMs, MoE) Recommendation & Information Retrieval Training Efficiency & Optimization

Apr 21, 2026·also NUS

Debating the Unspoken: Role-Anchored Multi-Agent Reasoning for Half-Truth Detection

Uncover misleading half-truths by pitting a Politician agent against a Scientist agent in a debate moderated by a Judge, revealing what's left unsaid.

Yixuan Tang, Hang Feng, Anthony K. H. Tung

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Qianyun Yang +1·Apr 21, 2026

Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval

MLLMs can be distilled into lightweight arbiters that dramatically improve the robustness of composed image retrieval by disentangling noisy training signals.

Qianyun Yang, Shiqi Zhang

Computer Vision Multimodal Models Recommendation & Information Retrieval

Huazhong Agricultural University·Apr 21, 2026·also ECNU, HUST

DINO Eats CLIP: Adapting Beyond Knowns for Open-set 3D Object Retrieval

DINO, not CLIP, might be the better foundation for open-set 3D object retrieval, especially when paired with dynamic view integration and virtual feature synthesis to avoid overfitting.

Xinwei He, Yansong Zheng, Qianru Han +7

Computer Vision Multimodal Models Recommendation & Information Retrieval

Yi Xiang +1·Apr 21, 2026

Enhancing Unsupervised Keyword Extraction in Academic Papers through Integrating Highlights with Abstract

Academic paper "highlights" sections are a surprisingly rich source of keywords, boosting unsupervised extraction when combined with abstracts.

Yi Xiang, Chengzhi Zhang

Natural Language Processing Recommendation & Information Retrieval

Saket Maganti·Apr 21, 2026

When Graph Structure Becomes a Liability: A Critical Re-Evaluation of Graph Neural Networks for Bitcoin Fraud Detection under Temporal Distribution Shift

The widely held belief that GNNs outperform feature-only methods for Bitcoin fraud detection crumbles under rigorous, leakage-free evaluation, which reveals that the graph structure can actually hurt performance.

Saket Maganti

Eval Frameworks & Benchmarks Recommendation & Information Retrieval

Gabriel Iturra-Bocaz +1·Apr 21, 2026·also University of Stavanger

A Reproducibility Study of Metacognitive Retrieval-Augmented Generation

Reproducibility crisis hits RAG: closed-source LLM updates, missing implementation details, and unreleased prompts make it hard to replicate MetaRAG's originally reported performance, even though the study confirms its relative gains.

Gabriel Iturra-Bocaz, Petra Galuščáková

Open-Source Models & Weights Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Apr 21, 2026·also Iowa State, UC Davis, UMN

GraphRAG-IRL: Personalized Recommendation with Graph-Grounded Inverse Reinforcement Learning and LLM Re-ranking

LLMs are unreliable ranking engines on their own, but fusing them with graph-grounded IRL creates a recommender system that's more than the sum of its parts, boosting NDCG@10 by up to 16.8%.

Siqi Liang, Xiawei Wang, Yudi Zhang +1

Natural Language Processing Recommendation & Information Retrieval Tool Use & Agents

Nico Baumgart +3·Apr 21, 2026·also Phoenix Contact GmbH & Co. KG Blomberg

ECLASS-Augmented Semantic Product Search for Electronic Components

Forget web search: dense retrieval augmented with hierarchical metadata achieves a 94% hit rate in semantic search for electronic components, blowing away traditional methods.

Nico Baumgart, Markus Lange-Hegermann, Janina Henze +1

Natural Language Processing Recommendation & Information Retrieval Tool Use & Agents

VNU University of Engineering and Technology·Apr 21, 2026·also TU Delft

From Top-1 to Top-K: A Reproducibility Study and Benchmarking of Counterfactual Explanations for Recommender Systems

Counterfactual explainers for recommender systems don't generalize as well as we thought: their effectiveness and sparsity depend heavily on the evaluation setting, and graph-based methods struggle to scale.

Quang-Huy Nguyen, Thanh-Hai Nguyen, Khac-Manh Thai +6

Eval Frameworks & Benchmarks Interpretability & Mechanistic Interp Recommendation & Information Retrieval