NUS, A*STAR, EPFL | Mar 5, 2026 | arXiv:2603.05189

Small Changes, Big Impact: Demographic Bias in LLM-Based Hiring Through Subtle Sociocultural Markers in Anonymised Resumes

Bryan Chen Zhengyu Tan, Shaun Khoo, Ngoc Bich Doan, Zheng Liu, Nancy F. Chen, Roy Ka-Wei Lee

AI Summary

This paper introduces a framework to stress-test LLM-based hiring pipelines for demographic bias arising from subtle sociocultural markers in anonymized resumes. The authors augment 100 neutral resumes into 4100 variants spanning four ethnicities and two genders, differing only in job-irrelevant markers, and evaluate 18 LLMs in two settings: direct comparison and score-and-shortlist. Results show that LLMs can infer demographic attributes from these markers and exhibit systematic disparities, favoring markers associated with Chinese and Caucasian males, and that prompting for explanations tends to amplify this bias.

Key Contribution

Even after names and other PII are removed, LLMs still exhibit systematic demographic disparities in resume screening, favoring candidates based on subtle sociocultural markers such as languages and hobbies.

Abstract

Large Language Models (LLMs) are increasingly deployed in resume screening pipelines. Although explicit PII (e.g., names) is commonly redacted, resumes typically retain subtle sociocultural markers (languages, co-curricular activities, volunteering, hobbies) that can act as demographic proxies. We introduce a generalisable stress-test framework for hiring fairness, instantiated in the Singapore context: 100 neutral job-aligned resumes are augmented into 4100 variants spanning four ethnicities and two genders, differing only in job-irrelevant markers. We evaluate 18 LLMs in two realistic settings: (i) Direct Comparison (1v1) and (ii) Score&Shortlist (top-scoring rate), each with and without rationale prompting. Even without explicit identifiers, models recover demographic attributes with high F1 and exhibit systematic disparities, with models favouring markers associated with Chinese and Caucasian males. Ablations show language markers suffice for ethnicity inference, whereas gender relies on hobbies and activities. Furthermore, prompting for explanations tends to amplify bias. Our findings suggest that seemingly innocuous markers surviving anonymisation can materially skew automated hiring outcomes.
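The two evaluation settings named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact metric definitions, so everything here is an assumption made for illustration: the function names, the data layout, the tie-handling rule, and the placeholder group labels are hypothetical and not taken from the paper. The sketch reads "top-scoring rate" as the fraction of base resumes for which a group's variant receives the highest score, and summarises Direct Comparison (1v1) as a per-group win rate.

```python
from collections import defaultdict


def top_scoring_rate(scores):
    """Assumed reading of the Score&Shortlist metric.

    scores: dict mapping (base_resume_id, group) -> numeric score.
    Returns a dict mapping group -> fraction of base resumes for which
    that group's variant attains the maximum score. Ties credit every
    tied group (an assumption, not the paper's stated rule).
    """
    by_base = defaultdict(dict)
    for (base_id, group), score in scores.items():
        by_base[base_id][group] = score

    wins = defaultdict(int)
    groups = set()
    for variants in by_base.values():
        best = max(variants.values())
        for group, score in variants.items():
            groups.add(group)
            if score == best:
                wins[group] += 1

    n_bases = len(by_base)
    return {g: wins[g] / n_bases for g in sorted(groups)}


def win_rate(outcomes):
    """Assumed summary of the Direct Comparison (1v1) setting.

    outcomes: list of (group_a, group_b, winner) tuples, one per pairing,
    where winner is either group_a or group_b.
    Returns a dict mapping group -> wins / appearances.
    """
    wins = defaultdict(int)
    seen = defaultdict(int)
    for group_a, group_b, winner in outcomes:
        seen[group_a] += 1
        seen[group_b] += 1
        wins[winner] += 1
    return {g: wins[g] / seen[g] for g in sorted(seen)}


# Toy data with placeholder labels; these are not results from the paper.
demo_scores = {
    (0, "group_x"): 8, (0, "group_y"): 7,
    (1, "group_x"): 6, (1, "group_y"): 6,
}
demo_outcomes = [
    ("group_x", "group_y", "group_x"),
    ("group_x", "group_y", "group_y"),
    ("group_x", "group_y", "group_x"),
]

print(top_scoring_rate(demo_scores))
print(win_rate(demo_outcomes))
```

Under this reading, a model that ignores job-irrelevant markers would give roughly equal rates across groups, so large gaps between groups flag the kind of disparity the abstract describes.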

Constitutional AI & AI Ethics | Eval Frameworks & Benchmarks | Natural Language Processing

Citation Metrics

Citations: 0
Influential citations: 0
References: 58
Year: 2026
Venue: N/A
