Small Changes, Large Consequences: Analyzing the Allocational Fairness of LLMs in Hiring Contexts

Preethi Seshadri, Hongyu Chen, Sameer Singh, Seraphina Goldfarb-Tarrant

Published: January 8, 2025

Abstract

Large language models (LLMs) are increasingly being deployed in high-stakes applications like hiring, yet their potential for unfair decision-making remains understudied in generative and retrieval settings. In this work, we examine the allocational fairness of LLM-based hiring systems through two tasks that reflect actual HR usage: resume summarization and applicant ranking. By constructing a synthetic resume dataset with controlled perturbations and curating job postings, we investigate whether model behavior differs across demographic groups. Our findings reveal that generated summaries exhibit meaningful differences more frequently for race than for gender perturbations. Models also display non-uniform retrieval selection patterns across demographic groups and exhibit high ranking sensitivity to both gender and race perturbations. Surprisingly, retrieval models can show comparable sensitivity to both demographic and non-demographic changes, suggesting that fairness issues may stem from broader model brittleness. Overall, our results indicate that LLM-based hiring systems, especially in the retrieval stage, can exhibit notable biases that lead to discriminatory outcomes in real-world contexts.
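To make the "controlled perturbations" concrete, below is a minimal sketch of the kind of perturbation the abstract describes: resume content is held fixed while a single demographically associated signal (here, the candidate's name, as in classic audit studies) is varied. The name lists, template, and function names are hypothetical illustrations, not the paper's actual dataset or code.

```python
# Hypothetical sketch: generate resume variants that are identical except for
# the candidate name, so any difference in model output is attributable to
# the demographic perturbation alone.

# Hypothetical first names associated with different demographic groups
# (placeholder lists, not the paper's dataset).
NAMES_BY_GROUP = {
    "group_a": ["Emily", "Greg"],
    "group_b": ["Lakisha", "Jamal"],
}

# Hypothetical resume template; everything except {name} is held constant.
RESUME_TEMPLATE = (
    "{name}\n"
    "Software Engineer, 5 years of experience in Python and SQL.\n"
    "B.S. in Computer Science."
)

def perturbed_resumes(template: str) -> dict[str, list[str]]:
    """Return otherwise-identical resume variants, one per name, keyed by group."""
    return {
        group: [template.format(name=name) for name in names]
        for group, names in NAMES_BY_GROUP.items()
    }

if __name__ == "__main__":
    for group, resumes in perturbed_resumes(RESUME_TEMPLATE).items():
        print(f"{group}: {len(resumes)} variants")
```

Each variant would then be fed to the summarization or ranking model, and outputs compared across groups; a non-demographic control (e.g., reformatting whitespace) could be perturbed the same way to test the brittleness hypothesis the abstract raises.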