Demographic-Reliant Algorithmic Fairness: Characterizing the Risks of Demographic Data Collection in the Pursuit of Fairness

Andrus, McKane; Villeneuve, Sarah

arXiv:2205.01038 (cs)

[Submitted on 18 Apr 2022 (v1), last revised 4 May 2022 (this version, v2)]

Authors: McKane Andrus, Sarah Villeneuve

Abstract: Most proposed algorithmic fairness techniques require access to data on a "sensitive attribute" or "protected category" (such as race, ethnicity, gender, or sexuality) in order to make performance comparisons and standardizations across groups. However, this data is largely unavailable in practice, hindering the widespread adoption of algorithmic fairness. Through this paper, we consider calls to collect more data on demographics to enable algorithmic fairness and challenge the notion that discrimination can be overcome with smart enough technical methods and sufficient data alone. We show how these techniques largely ignore broader questions of data governance and systemic oppression when categorizing individuals for the purpose of fairer algorithmic processing. In this work, we explore under what conditions demographic data should be collected and used to enable algorithmic fairness methods by characterizing a range of social risks to individuals and communities. For the risks to individuals, we consider the unique privacy risks associated with the sharing of sensitive attributes likely to be the target of fairness analysis, the possible harms stemming from miscategorizing and misrepresenting individuals in the data collection process, and the use of sensitive data beyond data subjects' expectations. Looking more broadly, the risks to entire groups and communities include the expansion of surveillance infrastructure in the name of fairness, the misrepresentation and mischaracterization of what it means to be part of a demographic group or to hold a certain identity, and the loss of communities' ability to define for themselves what constitutes biased or unfair treatment. We argue that, by confronting these questions before and during the collection of demographic data, algorithmic fairness methods are more likely to actually mitigate harmful treatment disparities without reinforcing systems of oppression.

Comments:	21 pages, accepted to FAccT 2022. Updated to camera-ready version: added a section defining how we use demographic data, clarified the distinction between sections 5.2 and 6.2, and made additional line edits
Subjects:	Computers and Society (cs.CY); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2205.01038 [cs.CY]
	(or arXiv:2205.01038v2 [cs.CY] for this version)
	https://doi.org/10.48550/arXiv.2205.01038

Submission history

From: McKane Andrus
[v1] Mon, 18 Apr 2022 04:50:09 UTC (732 KB)
[v2] Wed, 4 May 2022 17:25:56 UTC (394 KB)
