Nonparametric approaches for population structure analysis

Abstract The analysis of population structure has many applications in medical and population genetic research. Such analysis is used to provide clear insight into the underlying genetic population substructure and is a crucial prerequisite for any analysis of genetic data. The analysis involves gro...


Bibliographic Details
Main Authors: Luluah Alhusain, Alaaeldin M. Hafez
Format: Article
Language: English
Published: BMC 2018-05-01
Series: Human Genomics
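Subjects: Population structure analysis; Clustering; Dimension reduction; Principal component analysis; Allele-sharing distance; Genetic data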
Online Access: http://link.springer.com/article/10.1186/s40246-018-0156-4
id doaj-0ea09800178647a2bcac68331d64faab
record_format Article
spelling doaj-0ea09800178647a2bcac68331d64faab
2020-11-25T00:41:11Z
eng
BMC
Human Genomics
1479-7364
2018-05-01
12
1
1
12
10.1186/s40246-018-0156-4
Nonparametric approaches for population structure analysis
Luluah Alhusain (College of Computer and Information Sciences, King Saud University)
Alaaeldin M. Hafez (College of Computer and Information Sciences, King Saud University)
collection DOAJ
language English
format Article
sources DOAJ
author Luluah Alhusain
Alaaeldin M. Hafez
title Nonparametric approaches for population structure analysis
publisher BMC
series Human Genomics
issn 1479-7364
publishDate 2018-05-01
description Abstract The analysis of population structure has many applications in medical and population genetic research. Such analysis is used to provide clear insight into the underlying genetic population substructure and is a crucial prerequisite for any analysis of genetic data. The analysis involves grouping individuals into subpopulations based on shared genetic variations. The most widely used markers to study the variation of DNA sequences between populations are single nucleotide polymorphisms. Data preprocessing is a necessary step to assess the quality of the data and to determine which markers or individuals can reasonably be included in the analysis. After preprocessing, several methods can be utilized to uncover population substructure, which can be categorized into two broad approaches: parametric and nonparametric. Parametric approaches use statistical models to infer population structure and assign individuals into subpopulations. However, these approaches suffer from many drawbacks that make them impractical for large datasets. In contrast, nonparametric approaches do not suffer from these drawbacks, making them more viable than parametric approaches for analyzing large datasets. Consequently, nonparametric approaches are increasingly used to reveal population substructure. Thus, this paper reviews and discusses the nonparametric approaches that are available for population structure analysis along with some implications to resolve challenges.
topic Population structure analysis
Clustering
Dimension reduction
Principal component analysis
Allele-sharing distance
Genetic data
url http://link.springer.com/article/10.1186/s40246-018-0156-4