Nonparametric approaches for population structure analysis
Abstract The analysis of population structure has many applications in medical and population genetic research. Such analysis is used to provide clear insight into the underlying genetic population substructure and is a crucial prerequisite for any analysis of genetic data. The analysis involves gro...
Main Authors: | Luluah Alhusain, Alaaeldin M. Hafez |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2018-05-01 |
Series: | Human Genomics |
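Subjects: | Population structure analysis; Clustering; Dimension reduction; Principal component analysis; Allele-sharing distance; Genetic data |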
Online Access: | http://link.springer.com/article/10.1186/s40246-018-0156-4 |
id | doaj-0ea09800178647a2bcac68331d64faab |
---|---|
record_format | Article |
spelling | doaj-0ea09800178647a2bcac68331d64faab; 2020-11-25T00:41:11Z; eng; BMC; Human Genomics; 1479-7364; 2018-05-01; 12(1):1-12; 10.1186/s40246-018-0156-4; Nonparametric approaches for population structure analysis; Luluah Alhusain (College of Computer and Information Sciences, King Saud University); Alaaeldin M. Hafez (College of Computer and Information Sciences, King Saud University); Abstract The analysis of population structure has many applications in medical and population genetic research. Such analysis is used to provide clear insight into the underlying genetic population substructure and is a crucial prerequisite for any analysis of genetic data. The analysis involves grouping individuals into subpopulations based on shared genetic variations. The most widely used markers to study the variation of DNA sequences between populations are single nucleotide polymorphisms. Data preprocessing is a necessary step to assess the quality of the data and to determine which markers or individuals can reasonably be included in the analysis. After preprocessing, several methods can be utilized to uncover population substructure, which can be categorized into two broad approaches: parametric and nonparametric. Parametric approaches use statistical models to infer population structure and assign individuals into subpopulations. However, these approaches suffer from many drawbacks that make them impractical for large datasets. In contrast, nonparametric approaches do not suffer from these drawbacks, making them more viable than parametric approaches for analyzing large datasets. Consequently, nonparametric approaches are increasingly used to reveal population substructure. Thus, this paper reviews and discusses the nonparametric approaches that are available for population structure analysis along with some implications to resolve challenges.; http://link.springer.com/article/10.1186/s40246-018-0156-4; Population structure analysis; Clustering; Dimension reduction; Principal component analysis; Allele-sharing distance; Genetic data |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Luluah Alhusain; Alaaeldin M. Hafez |
spellingShingle | Luluah Alhusain; Alaaeldin M. Hafez; Nonparametric approaches for population structure analysis; Human Genomics; Population structure analysis; Clustering; Dimension reduction; Principal component analysis; Allele-sharing distance; Genetic data |
author_facet | Luluah Alhusain; Alaaeldin M. Hafez |
author_sort | Luluah Alhusain |
title | Nonparametric approaches for population structure analysis |
title_short | Nonparametric approaches for population structure analysis |
title_full | Nonparametric approaches for population structure analysis |
title_fullStr | Nonparametric approaches for population structure analysis |
title_full_unstemmed | Nonparametric approaches for population structure analysis |
title_sort | nonparametric approaches for population structure analysis |
publisher | BMC |
series | Human Genomics |
issn | 1479-7364 |
publishDate | 2018-05-01 |
description | Abstract The analysis of population structure has many applications in medical and population genetic research. Such analysis is used to provide clear insight into the underlying genetic population substructure and is a crucial prerequisite for any analysis of genetic data. The analysis involves grouping individuals into subpopulations based on shared genetic variations. The most widely used markers to study the variation of DNA sequences between populations are single nucleotide polymorphisms. Data preprocessing is a necessary step to assess the quality of the data and to determine which markers or individuals can reasonably be included in the analysis. After preprocessing, several methods can be utilized to uncover population substructure, which can be categorized into two broad approaches: parametric and nonparametric. Parametric approaches use statistical models to infer population structure and assign individuals into subpopulations. However, these approaches suffer from many drawbacks that make them impractical for large datasets. In contrast, nonparametric approaches do not suffer from these drawbacks, making them more viable than parametric approaches for analyzing large datasets. Consequently, nonparametric approaches are increasingly used to reveal population substructure. Thus, this paper reviews and discusses the nonparametric approaches that are available for population structure analysis along with some implications to resolve challenges. |
topic | Population structure analysis; Clustering; Dimension reduction; Principal component analysis; Allele-sharing distance; Genetic data |
url | http://link.springer.com/article/10.1186/s40246-018-0156-4 |
work_keys_str_mv | AT luluahalhusain nonparametricapproachesforpopulationstructureanalysis AT alaaeldinmhafez nonparametricapproachesforpopulationstructureanalysis |
_version_ | 1725286842965688320 |