Investigating autism etiology and heterogeneity by decision tree algorithm
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that causes deficits in cognition, communication and social skills. ASD, however, is a highly heterogeneous disorder. This heterogeneity has made identifying the etiology of ASD a particularly difficult challenge, as patients exhibit a...
Main Authors: | Mariam M. Hassan, Hoda M.O. Mokhtar |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2019-01-01 |
Series: | Informatics in Medicine Unlocked |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2352914819300541 |
id |
doaj-bce3abb24ec3483492549f6b6e25baf4 |
---|---|
spelling |
doaj-bce3abb24ec3483492549f6b6e25baf4; 2020-11-25T02:13:59Z; eng; Elsevier; Informatics in Medicine Unlocked; 2352-9148; 2019-01-01; 16; Investigating autism etiology and heterogeneity by decision tree algorithm; Mariam M. Hassan (corresponding author, Faculty of Computers and Information, Cairo University, Egypt); Hoda M.O. Mokhtar (Faculty of Computers and Information, Cairo University, Egypt); http://www.sciencedirect.com/science/article/pii/S2352914819300541 |
collection |
DOAJ |
sources |
DOAJ |
author |
Mariam M. Hassan Hoda M.O. Mokhtar |
title |
Investigating autism etiology and heterogeneity by decision tree algorithm |
issn |
2352-9148 |
description |
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that causes deficits in cognition, communication and social skills. ASD, however, is a highly heterogeneous disorder. This heterogeneity has made identifying the etiology of ASD a particularly difficult challenge, as patients exhibit a wide spectrum of symptoms without any unifying genetic or environmental factors to account for the disorder. For better understanding of ASD, it is paramount to identify potential genetic and environmental risk factors that are comorbid with it. Identifying such factors is of great importance to determine potential causes for the disorder, and understand its heterogeneity. Existing large-scale datasets offer an opportunity for computer scientists to undertake this task by utilizing machine learning to reliably and efficiently obtain insight about potential ASD risk factors, which would in turn assist in guiding research in the field. In this study, decision tree algorithms were utilized to analyze related factors in datasets obtained from the National Database for Autism Research (NDAR) consisting of nearly 3000 individuals. We were able to identify 15 medical conditions that were highly associated with ASD diagnoses in patients; furthermore, we extended our analysis to the family medical history of patients and we report six potentially hereditary medical conditions associated with ASD. Associations reported had a 90% accuracy. Meanwhile, gender comparisons highlighted conditions that were unique to each gender and others that overlapped. Those findings were validated by the academic literature, thus opening the way for new directions for the use of decision tree algorithms to further understand the etiology of autism. Keywords: Autism spectrum disorder, Decision tree, Feature selection |
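
The description above says decision tree algorithms were used to find medical conditions associated with ASD diagnoses, but this record does not give the implementation. As a rough illustration only, the sketch below shows the split criterion a decision tree typically applies at each node: it scores every binary feature by the decrease in Gini impurity it produces and ranks the features by that score. The feature names (`condition_a`, `condition_b`), the toy rows, and all function names are hypothetical; none of them come from the article or from NDAR.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_gain(rows, labels, feature):
    """Impurity decrease obtained by splitting on one binary feature."""
    present = [y for x, y in zip(rows, labels) if x[feature]]
    absent = [y for x, y in zip(rows, labels) if not x[feature]]
    n = len(labels)
    weighted = (len(present) / n) * gini(present) + (len(absent) / n) * gini(absent)
    return gini(labels) - weighted


def rank_features(rows, labels):
    """Return (feature, gain) pairs sorted from most to least informative."""
    scores = {f: gini_gain(rows, labels, f) for f in rows[0]}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented toy data: 1 = condition reported, 0 = not reported.
rows = [
    {"condition_a": 1, "condition_b": 0},
    {"condition_a": 1, "condition_b": 1},
    {"condition_a": 1, "condition_b": 0},
    {"condition_a": 0, "condition_b": 1},
    {"condition_a": 0, "condition_b": 0},
    {"condition_a": 0, "condition_b": 1},
]
labels = [1, 1, 1, 0, 0, 0]  # invented diagnosis flags, not real data

ranking = rank_features(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.4f}")
```

In this toy set `condition_a` separates the two classes perfectly, so it ranks first. A full decision tree repeats this scoring recursively on each resulting subset, and real analyses would also need held-out evaluation before any association is reported.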