Benjamin Van Durme

Papers (5)

Apr 10, 2026 · also Apple ML

Many-Tier Instruction Hierarchy in LLM Agents

Even state-of-the-art LLMs struggle to follow complex instruction hierarchies, achieving only ~40% accuracy when navigating conflicts across a dozen privilege levels in agentic tasks.

Jingyu Zhang, Tianjian Li, William Jurayj +3

Constitutional AI & AI Ethics · Red-Teaming & Adversarial Robustness · Tool Use & Agents

Apr 6, 2026

DeonticBench: A Benchmark for Reasoning over Rules

LLMs struggle to navigate the nuances of real-world rules, achieving only ~45% accuracy on a new benchmark of legal and policy reasoning tasks.

Guangyao Dou, Luis Brena, Akhil Deo +5

Constitutional AI & AI Ethics · Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought

Mar 11, 2026

Does Reasoning Make Search More Fair? Comparing Fairness in Reasoning and Non-Reasoning Rerankers

Reasoning rerankers don't magically fix fairness issues in search, preserving the biases of their input rankings despite boosting relevance.

Saron Samuel, Benjamin Van Durme, Eugene Yang

Constitutional AI & AI Ethics · Reasoning & Chain-of-Thought · Recommendation & Information Retrieval

Mar 9, 2026 · also UMass, UMD

Beyond Relevance: On the Relationship Between Retrieval and RAG Information Coverage

Stop blindly optimizing for retrieval relevance in RAG pipelines: coverage-based retrieval metrics are better early indicators of the final generated response's information coverage.

Saron Samuel, Alexander Martin, Eugene Yang +5

Eval Frameworks & Benchmarks · Natural Language Processing · Recommendation & Information Retrieval

Feb 24, 2026

Multi-Vector Index Compression in Any Modality

Attention-guided clustering slashes the storage costs of multi-vector document representations for retrieval across text, images, and video, often *improving* performance compared to uncompressed indexes.

Hanxiang Qin, Alexander Martin +5

Inference & Quantization · Multimodal Models · Recommendation & Information Retrieval
