CU Boulder, May 28, 2026, arXiv:2605.29468

SciIntBench: Measuring LLM Compliance with Research Integrity Norms Under Adversarial Framing

Almene De Meran Meguimtsop, Maria Leonor Pacheco, Daniel E. Acuna

AI Summary

SciIntBench is a new benchmark that evaluates LLMs' adherence to Responsible Conduct of Research (RCR) norms using 810 prompts across ten RCR categories and three scientific domains. Each scenario is framed in three ways: Overt Adversarial, Covert Adversarial, and Benign. Across 16 LLMs, the study finds that models refuse explicit misconduct far more reliably than covert violations, and fail most often when misconduct is presented as a pressure-driven shortcut. This framing sensitivity varies across RCR categories, with weaker boundaries observed for transparency, plagiarism, and fabrication.

Key Contribution

LLMs are far more likely to endorse scientific misconduct when it is framed as a shortcut taken under pressure, revealing a gap in their alignment with research integrity norms.

Abstract

Large language models (LLMs) are increasingly used to support scientific work, but it is unclear whether they uphold responsible conduct of research (RCR) norms or help undermine them. We introduce SciIntBench, an adversarial benchmark of 810 prompts across ten RCR categories and three scientific domains. Each scenario appears as an Overt Adversarial, Covert Adversarial, and Benign version, allowing us to jointly measure framing-sensitive refusal of misconduct and helpfulness on legitimate requests. We evaluate 16 commercial and open-weight LLMs from six providers (2024–2026), producing 12,960 responses. We find that scientific integrity alignment is strongly framing-sensitive: models refuse explicit misconduct far more reliably than covert violations, especially failing when misconduct is presented as a pressure-driven shortcut. Refusals vary by RCR category, with weaker boundaries around transparency, plagiarism, and fabrication.
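The counts in the abstract can be checked with a little arithmetic, and the idea of "framing-sensitive refusal" can be illustrated with a small sketch. The following is only an illustration under stated assumptions, not the authors' evaluation code: it assumes the 810 prompts count all three framings of each scenario (which would give 270 underlying scenarios), and the function name `refusal_rate_by_framing`, the framing labels, and the toy records are hypothetical and do not come from the paper.

```python
from collections import defaultdict

N_MODELS = 16    # models evaluated, per the abstract
N_PROMPTS = 810  # prompts in the benchmark, per the abstract
N_FRAMINGS = 3   # Overt Adversarial, Covert Adversarial, Benign

# Every model answers every prompt once.
total_responses = N_MODELS * N_PROMPTS

# Assumption: the 810 prompts include all three framings of each scenario.
n_scenarios = N_PROMPTS // N_FRAMINGS


def refusal_rate_by_framing(records):
    """Return the fraction of refused responses for each framing.

    records: iterable of (framing, refused) pairs, where refused is a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # framing -> [refusals, total]
    for framing, refused in records:
        counts[framing][0] += int(refused)
        counts[framing][1] += 1
    return {f: refusals / total for f, (refusals, total) in counts.items()}


# Made-up toy labels for illustration only, NOT results from the paper.
toy_records = [
    ("overt", True), ("overt", True),
    ("covert", True), ("covert", False),
    ("benign", False), ("benign", False),
]
rates = refusal_rate_by_framing(toy_records)

# One possible summary of framing sensitivity: the drop in refusal rate
# when the same misconduct is requested covertly instead of overtly.
framing_gap = rates["overt"] - rates["covert"]
```

On the benign framing a low refusal rate is the desired outcome, so a fuller analysis would report helpfulness on benign prompts alongside refusals on the two adversarial framings.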

Constitutional AI & AI Ethics, Eval Frameworks & Benchmarks, Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 27

Year: 2026

Venue: N/A
