FAWOS: Fairness-Aware Oversampling Algorithm Based on Distributions of Sensitive Attributes

With the increased use of machine learning algorithms to make decisions which impact people's lives, it is of extreme importance to ensure that predictions do not prejudice subgroups of the population with respect to sensitive attributes such as race or gender. Discrimination occurs when the probability of a positive outcome changes across privileged and unprivileged groups defined by the sensitive attributes. It has been shown that this bias can originate from imbalanced data contexts, where one of the classes contains far fewer instances than the other classes. It is also important to identify the nature of the imbalanced data, including the characteristics of the minority classes' distribution. This paper presents FAWOS: a Fairness-Aware Oversampling algorithm which aims to attenuate unfair treatment by handling imbalance in the sensitive attributes. We categorize different types of datapoints according to their local neighbourhood with respect to the sensitive attributes, identifying which are more difficult for classifiers to learn. To balance the dataset, FAWOS oversamples the training data by creating new synthetic datapoints from the different types of datapoints identified. We test the impact of FAWOS on different learning classifiers and analyze which can better handle sensitive attribute imbalance. Empirically, we observe that the algorithm can effectively improve the fairness of the classifiers while not neglecting classification performance. Source code can be found at: https://github.com/teresalazar13/FAWOS

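To make the idea in the abstract concrete, below is a minimal, illustrative Python sketch of fairness-aware, SMOTE-style oversampling: it selects the unprivileged group with a positive outcome and interpolates between a point and one of its k nearest neighbours from the same subgroup. This is not the authors' implementation (see the linked GitHub repository for that); the function name fairness_aware_oversample, the single binary sensitive attribute, and the plain interpolation rule are simplifying assumptions made for illustration.

# Illustrative sketch only -- assumptions: one binary sensitive attribute s
# (0 = unprivileged), a binary outcome y (1 = positive), SMOTE-style interpolation.
import numpy as np

def fairness_aware_oversample(X, y, s, k=5, n_new=100, seed=None):
    """Create n_new synthetic datapoints for the unprivileged-positive subgroup
    (s == 0 and y == 1) by interpolating between a subgroup point and one of its
    k nearest neighbours inside the same subgroup."""
    rng = np.random.default_rng(seed)
    group = X[(s == 0) & (y == 1)]                        # under-represented subgroup
    if len(group) <= k:
        raise ValueError("Not enough datapoints in the target subgroup.")
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(group))
        dists = np.linalg.norm(group - group[i], axis=1)  # Euclidean distances
        neighbours = np.argsort(dists)[1:k + 1]           # skip the point itself
        j = rng.choice(neighbours)
        gap = rng.random()                                # interpolation factor in [0, 1)
        synthetic.append(group[i] + gap * (group[j] - group[i]))
    X_new = np.asarray(synthetic)
    X_out = np.vstack([X, X_new])
    y_out = np.concatenate([y, np.ones(n_new, dtype=y.dtype)])
    s_out = np.concatenate([s, np.zeros(n_new, dtype=s.dtype)])
    return X_out, y_out, s_out

# Toy usage with random data.
rng = np.random.default_rng(0)
X = rng.random((300, 4))
s = rng.integers(0, 2, 300)      # sensitive attribute (0 = unprivileged)
y = rng.integers(0, 2, 300)      # binary outcome (1 = positive)
X_bal, y_bal, s_bal = fairness_aware_oversample(X, y, s, k=5, n_new=60, seed=1)
print(X_bal.shape, y_bal.shape, s_bal.shape)   # (360, 4) (360,) (360,)

Note that the paper's algorithm additionally categorizes datapoints by their local neighbourhood with respect to the sensitive attributes and uses those categories to guide the oversampling; the sketch above omits that step for brevity.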

Bibliographic Details
Main Authors: Teresa Salazar, Miriam Seoane Santos, Helder Araujo, Pedro Henriques Abreu
Format: Article
Language: English
Published: IEEE, 2021-01-01
Series: IEEE Access
Subjects: Classification bias; fairness; imbalanced data; K-nearest neighborhood; oversampling
Online Access: https://ieeexplore.ieee.org/document/9442706/
DOAJ ID: doaj-9fb86190df9d412e890c49413c9a979f
Collection: DOAJ
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3084121
Citation: IEEE Access, vol. 9, pp. 81370-81379, 2021 (IEEE document 9442706)
Source Code: https://github.com/teresalazar13/FAWOS
Author Details:
Teresa Salazar (ORCID: https://orcid.org/0000-0003-2471-5783), Department of Informatics Engineering, Centre for Informatics and Systems, University of Coimbra, Coimbra, Portugal
Miriam Seoane Santos (ORCID: https://orcid.org/0000-0002-5912-963X), Department of Informatics Engineering, Centre for Informatics and Systems, University of Coimbra, Coimbra, Portugal
Helder Araujo (ORCID: https://orcid.org/0000-0002-9544-424X), Department of Electrical and Computer Engineering, University of Coimbra, Coimbra, Portugal
Pedro Henriques Abreu (ORCID: https://orcid.org/0000-0002-9278-8194), Department of Informatics Engineering, Centre for Informatics and Systems, University of Coimbra, Coimbra, Portugal