A robust machine learning approach to SDG data segmentation

Abstract In the light of the recent technological advances in computing and data explosion, the complex interactions of the Sustainable Development Goals (SDG) present both a challenge and an opportunity to researchers and decision makers across fields and sectors. The deep and wide socio-economic,...

Bibliographic Details
Main Authors: Kassim S. Mwitondi, Isaac Munyakazi, Barnabas N. Gatsheni
Format: Article
Language: English
Published: SpringerOpen 2020-11-01
Series: Journal of Big Data
Subjects: Big Data; Consilience; Data randomness; Data Science; Development Science Framework; K-Means
Online Access: http://link.springer.com/article/10.1186/s40537-020-00373-y
id doaj-f142c964410b4b569b2ee954bdd22c9c
record_format Article
spelling doaj-f142c964410b4b569b2ee954bdd22c9c 2020-11-25T04:09:41Z eng SpringerOpen Journal of Big Data 2196-1115 2020-11-01 7 1 1 17 10.1186/s40537-020-00373-y
A robust machine learning approach to SDG data segmentation
Kassim S. Mwitondi (0), Isaac Munyakazi (1), Barnabas N. Gatsheni (2)
(0) College of Business, Technology and Engineering, Sheffield Hallam University
(1) Ministry of Education
(2) Department of Applied Information Systems, University of Johannesburg
http://link.springer.com/article/10.1186/s40537-020-00373-y
Big Data; Consilience; Data randomness; Data Science; Development Science Framework; K-Means
collection DOAJ
language English
format Article
sources DOAJ
author Kassim S. Mwitondi
Isaac Munyakazi
Barnabas N. Gatsheni
title A robust machine learning approach to SDG data segmentation
publisher SpringerOpen
series Journal of Big Data
issn 2196-1115
publishDate 2020-11-01
description Abstract In the light of the recent technological advances in computing and data explosion, the complex interactions of the Sustainable Development Goals (SDG) present both a challenge and an opportunity to researchers and decision makers across fields and sectors. The deep and wide socio-economic, cultural and technological variations across the globe entail a unified understanding of the SDG project. The complexity of SDGs interactions and the dynamics through their indicators align naturally to technical and application specifics that require interdisciplinary solutions. We present a consilient approach to expounding triggers of SDG indicators. Illustrated through data segmentation, it is designed to unify our understanding of the complex overlap of the SDGs by utilising data from different sources. The paper treats each SDG as a Big Data source node, with the potential to contribute towards a unified understanding of applications across the SDG spectrum. Data for five SDGs was extracted from the United Nations SDG indicators data repository and used to model spatio-temporal variations in search of robust and consilient scientific solutions. Based on a number of pre-determined assumptions on socio-economic and geo-political variations, the data is subjected to sequential analyses, exploring distributional behaviour, component extraction and clustering. All three methods exhibit pronounced variations across samples, with initial distributional and data segmentation patterns isolating South Africa from the remaining five countries. Data randomness is dealt with via a specially developed algorithm for sampling, measuring and assessing, based on repeated samples of different sizes. Results exhibit consistent variations across samples, based on socio-economic, cultural and geo-political variations entailing a unified understanding, across disciplines and sectors. The findings highlight novel paths towards attaining informative patterns for a unified understanding of the triggers of SDG indicators and open new paths to interdisciplinary research.
topic Big Data
Consilience
Data randomness
Data Science
Development Science Framework
K-Means
url http://link.springer.com/article/10.1186/s40537-020-00373-y