Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data

Decisions based on algorithmic, machine learning models can be unfair, reproducing biases in historical data used to train them. While computational techniques are emerging to address aspects of these concerns through communities such as discrimination-aware data mining (DADM) and fairness, accounta...


Bibliographic Details
Main Authors:	Michael Veale, Reuben Binns
Format:	Article
Language:	English
Published:	SAGE Publishing 2017-11-01
Series:	Big Data & Society
Online Access:	https://doi.org/10.1177/2053951717743530

id	doaj-182f318f5e9b47019277e3af5ef1f758
record_format	Article
spelling	doaj-182f318f5e9b47019277e3af5ef1f758 | 2020-11-25T03:12:24Z | eng | SAGE Publishing | Big Data & Society | 2053-9517 | 2017-11-01 | 4 | 10.1177/2053951717743530 | Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data | Michael Veale | Reuben Binns | https://doi.org/10.1177/2053951717743530
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Michael Veale, Reuben Binns
title	Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data
publisher	SAGE Publishing
series	Big Data & Society
issn	2053-9517
publishDate	2017-11-01
description	Decisions based on algorithmic, machine learning models can be unfair, reproducing biases in historical data used to train them. While computational techniques are emerging to address aspects of these concerns through communities such as discrimination-aware data mining (DADM) and fairness, accountability and transparency machine learning (FATML), their practical implementation faces real-world challenges. For legal, institutional or commercial reasons, organisations might not hold the data on sensitive attributes such as gender, ethnicity, sexuality or disability needed to diagnose and mitigate emergent indirect discrimination-by-proxy, such as redlining. Such organisations might also lack the knowledge and capacity to identify and manage fairness issues that are emergent properties of complex sociotechnical systems. This paper presents and discusses three potential approaches to deal with such knowledge and information deficits in the context of fairer machine learning. Trusted third parties could selectively store data necessary for performing discrimination discovery and incorporating fairness constraints into model-building in a privacy-preserving manner. Collaborative online platforms would allow diverse organisations to record, share and access contextual and experiential knowledge to promote fairness in machine learning systems. Finally, unsupervised learning and pedagogically interpretable algorithms might allow fairness hypotheses to be built for further selective testing and exploration. Real-world fairness challenges in machine learning are not abstract, constrained optimisation problems, but are institutionally and contextually grounded. Computational fairness tools are useful, but must be researched and developed in and with the messy contexts that will shape their deployment, rather than just for imagined situations. Not doing so risks real, near-term algorithmic harm.
url	https://doi.org/10.1177/2053951717743530


Similar Items