FAWOS: Fairness-Aware Oversampling Algorithm Based on Distributions of Sensitive Attributes
With the increased use of machine learning algorithms to make decisions that impact people’s lives, it is of the utmost importance to ensure that predictions do not prejudice subgroups of the population with respect to sensitive attributes such as race or gender. Discrimination occurs when the probability of a positive outcome changes across privileged and unprivileged groups defined by the sensitive attributes. It has been shown that this bias can originate from imbalanced data contexts, where one class contains far fewer instances than the others. It is also important to identify the nature of the imbalanced data, including the characteristics of the minority classes’ distribution. This paper presents FAWOS: a Fairness-Aware Oversampling algorithm that aims to attenuate unfair treatment by handling the imbalance of sensitive attributes. We categorize different types of datapoints according to their local neighbourhood with respect to the sensitive attributes, identifying which are more difficult for classifiers to learn. To balance the dataset, FAWOS oversamples the training data by creating new synthetic datapoints from the different types of datapoints identified. We test the impact of FAWOS on different learning classifiers and analyze which can better handle sensitive attribute imbalance. Empirically, we observe that the algorithm can effectively improve the fairness of the classifiers without neglecting classification performance. Source code is available at: https://github.com/teresalazar13/FAWOS
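The authors' actual implementation is at the GitHub link in this record. As a rough illustration only, the general recipe the abstract describes (categorize each point by how many of its nearest neighbours share its sensitive-group value, then create synthetic points by interpolation, as in SMOTE-style oversampling) might be sketched as below. The function names, the category thresholds, and the plain interpolation step are assumptions for illustration, not the paper's method.

```python
import numpy as np

def categorize(X, group, k=5):
    """Label each point by how many of its k nearest neighbours share its
    sensitive-group value. The safe/borderline/rare/outlier names follow the
    usual minority-type taxonomy; thresholds here are illustrative."""
    labels = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        nn = np.argsort(d)[1:k + 1]            # k nearest, skipping the point itself
        same = np.sum(group[nn] == group[i])   # neighbours from the same group
        if same >= 4:
            labels.append("safe")
        elif same >= 2:
            labels.append("borderline")
        elif same == 1:
            labels.append("rare")
        else:
            labels.append("outlier")
    return labels

def smote_like(X, n_new, rng):
    """Create n_new synthetic points, each interpolated between two randomly
    chosen points of the under-represented set X."""
    out = []
    for _ in range(n_new):
        i, j = rng.choice(len(X), size=2, replace=False)
        t = rng.random()                       # interpolation factor in [0, 1)
        out.append(X[i] + t * (X[j] - X[i]))
    return np.array(out)
```

A fairness-aware variant would then weight how many synthetic points to generate per category (e.g. more for borderline and rare points, which are harder to learn), rather than sampling uniformly as this sketch does.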
Main Authors: Teresa Salazar, Miriam Seoane Santos, Helder Araujo, Pedro Henriques Abreu
Format: Article
Language: English
Published: IEEE, 2021-01-01
Series: IEEE Access
Subjects: Classification bias; fairness; imbalanced data; K-nearest neighborhood; oversampling
Online Access: https://ieeexplore.ieee.org/document/9442706/
id: doaj-9fb86190df9d412e890c49413c9a979f
Citation: IEEE Access, vol. 9, pp. 81370-81379, 2021. ISSN 2169-3536. DOI: 10.1109/ACCESS.2021.3084121 (article 9442706).

Authors:
- Teresa Salazar (ORCID: 0000-0003-2471-5783), Department of Informatics Engineering, Centre for Informatics and Systems, University of Coimbra, Coimbra, Portugal
- Miriam Seoane Santos (ORCID: 0000-0002-5912-963X), Department of Informatics Engineering, Centre for Informatics and Systems, University of Coimbra, Coimbra, Portugal
- Helder Araujo (ORCID: 0000-0002-9544-424X), Department of Electrical and Computer Engineering, University of Coimbra, Coimbra, Portugal
- Pedro Henriques Abreu (ORCID: 0000-0002-9278-8194), Department of Informatics Engineering, Centre for Informatics and Systems, University of Coimbra, Coimbra, Portugal

Source Code: https://github.com/teresalazar13/FAWOS
Collection: DOAJ