Investigating autism etiology and heterogeneity by decision tree algorithm

Autism spectrum disorder (ASD) is a neurodevelopmental disorder that causes deficits in cognition, communication and social skills. ASD, however, is a highly heterogeneous disorder. This heterogeneity has made identifying the etiology of ASD a particularly difficult challenge, as patients exhibit a...


Bibliographic Details
Main Authors: Mariam M. Hassan, Hoda M.O. Mokhtar
Format: Article
Language: English
Published: Elsevier 2019-01-01
Series: Informatics in Medicine Unlocked
Online Access: http://www.sciencedirect.com/science/article/pii/S2352914819300541
id doaj-bce3abb24ec3483492549f6b6e25baf4
record_format Article
spelling doaj-bce3abb24ec3483492549f6b6e25baf4; 2020-11-25T02:13:59Z; eng; Elsevier; Informatics in Medicine Unlocked; 2352-9148; 2019-01-01; 16; Investigating autism etiology and heterogeneity by decision tree algorithm; Mariam M. Hassan (corresponding author; Faculty of Computers and Information, Cairo University, Egypt); Hoda M.O. Mokhtar (Faculty of Computers and Information, Cairo University, Egypt)
collection DOAJ
language English
format Article
sources DOAJ
author Mariam M. Hassan
Hoda M.O. Mokhtar
title Investigating autism etiology and heterogeneity by decision tree algorithm
publisher Elsevier
series Informatics in Medicine Unlocked
issn 2352-9148
publishDate 2019-01-01
description Autism spectrum disorder (ASD) is a neurodevelopmental disorder that causes deficits in cognition, communication and social skills. ASD, however, is a highly heterogeneous disorder. This heterogeneity has made identifying the etiology of ASD a particularly difficult challenge, as patients exhibit a wide spectrum of symptoms without any unifying genetic or environmental factors to account for the disorder. For better understanding of ASD, it is paramount to identify potential genetic and environmental risk factors that are comorbid with it. Identifying such factors is of great importance to determine potential causes for the disorder, and understand its heterogeneity. Existing large-scale datasets offer an opportunity for computer scientists to undertake this task by utilizing machine learning to reliably and efficiently obtain insight about potential ASD risk factors, which would in turn assist in guiding research in the field. In this study, decision tree algorithms were utilized to analyze related factors in datasets obtained from the National Database for Autism Research (NDAR) consisting of nearly 3000 individuals. We were able to identify 15 medical conditions that were highly associated with ASD diagnoses in patients; furthermore, we extended our analysis to the family medical history of patients and we report six potentially hereditary medical conditions associated with ASD. Associations reported had a 90% accuracy. Meanwhile, gender comparisons highlighted conditions that were unique to each gender and others that overlapped. Those findings were validated by the academic literature, thus opening the way for new directions for the use of decision tree algorithms to further understand the etiology of autism. Keywords: Autism spectrum disorder, Decision tree, Feature selection
url http://www.sciencedirect.com/science/article/pii/S2352914819300541