LDkit: a parallel computing toolkit for linkage disequilibrium analysis
Abstract Background Linkage disequilibrium (LD) analysis is widely used in genetics to study evolutionary and demographic history, and it helps geneticists identify genes associated with inherited traits of interest, such as diseases. There are some tools for linkage disequilibrium analysis...
Main Authors: | You Tang, Zhuo Li, Chao Wang, Yuxin Liu, Helong Yu, Aoxue Wang, Yao Zhou |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2020-10-01 |
Series: | BMC Bioinformatics |
Subjects: | Population genetics, Parallel computing, Graphical user interface, Linkage disequilibrium |
Online Access: | http://link.springer.com/article/10.1186/s12859-020-03754-5 |
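
For orientation, LD between two biallelic sites is often summarized as r², the squared correlation between the two sites. This record does not say which statistic LDkit reports or how it computes it, so the following is only a minimal sketch under the assumption of unphased genotypes coded as 0/1/2 allele counts; the function name `genotype_r2` is hypothetical and is not part of LDkit.

```python
def genotype_r2(x, y):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2.

    Assumption for this sketch: a missing call is represented as None, and
    individuals missing at either site are skipped.
    Returns None when no individuals remain or when either site shows no
    variation (the correlation is undefined in that case).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return None
    return cov * cov / (var_x * var_y)


if __name__ == "__main__":
    site_a = [0, 1, 2, 1]
    site_b = [0, 1, 2, 1]
    print(genotype_r2(site_a, site_b))
```

Two sites with identical genotype vectors give an r² of 1, and statistically independent sites give a value near 0.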
id |
doaj-7663138ae83948b0b4ec622dea27c097 |
---|---|
record_format |
Article |
spelling |
doaj-7663138ae83948b0b4ec622dea27c097; 2020-11-25T02:46:18Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2020-10-01; 21118; 10.1186/s12859-020-03754-5
LDkit: a parallel computing toolkit for linkage disequilibrium analysis
Authors: You Tang (0), Zhuo Li (1), Chao Wang (2), Yuxin Liu (3), Helong Yu (4), Aoxue Wang (5), Yao Zhou (6)
Affiliations:
(0) Electrical and Information Engineering College, Jilin Agricultural Science and Technology University
(1) Electrical and Information Engineering College, Jilin Agricultural Science and Technology University
(2) Key Laboratory of Crop Biotechnology Breeding of the Ministry of Agriculture, Beidahuang Kenfeng Seed Co., Ltd.
(3) College of Horticulture and Landscape Architecture, Northeast Agricultural University
(4) Information Technology Academy, Jilin Agricultural University
(5) College of Horticulture and Landscape Architecture, Northeast Agricultural University
(6) Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture; Genome Analysis Laboratory of the Ministry of Agriculture; Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences
http://link.springer.com/article/10.1186/s12859-020-03754-5
Population genetics; Parallel computing; Graphical user interface; Linkage disequilibrium |
collection |
DOAJ |
sources |
DOAJ |
author |
You Tang Zhuo Li Chao Wang Yuxin Liu Helong Yu Aoxue Wang Yao Zhou |
title |
LDkit: a parallel computing toolkit for linkage disequilibrium analysis |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2020-10-01 |
description |
Abstract Background Linkage disequilibrium (LD) analysis is widely used in genetics to study evolutionary and demographic history, and it helps geneticists identify genes associated with inherited traits of interest, such as diseases. Several tools for LD analysis exist, either as local programs or as online services; however, none has supported both a graphical user interface (GUI) and parallel computing. Results We developed LDkit, a GUI program for LD analysis that supports parallel computing. LDkit accepts both variant call format (VCF) and PLINK ‘ped + map’ input. Users can also restrict the analysis to a subset of individuals from the whole population. LDkit reads the data in blocks and parallelizes the computation by monitoring process usage. An assessment on human 1000 Genomes data showed that, for the chromosome 22 dataset with 1,103,547 SNPs and 2504 individuals, running with 32 threads reduced the running time from ~77 minutes to less than 6 minutes. Conclusions LDkit can be used to calculate and plot LD decay and LD blocks, and to analyze LD between a site and a given region. Both a GUI version and stand-alone packages are available for users’ convenience. LDkit is written in Java and runs across platforms. |
topic |
Population genetics Parallel computing Graphical user interface Linkage disequilibrium |
url |
http://link.springer.com/article/10.1186/s12859-020-03754-5 |
_version_ |
1724759252488159232 |
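
The description above says that the data are read in blocks and that the computation is then run in parallel. The record gives no implementation detail, so the following is a rough sketch of that general pattern only, not LDkit's code (which is written in Java). The names `blocks`, `ld_within_block`, and `parallel_block_ld` are hypothetical, genotypes are assumed to be complete 0/1/2 vectors, and only SNP pairs inside the same block are compared, which is a simplification made for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def r2(x, y):
    """Squared Pearson correlation of two equal-length 0/1/2 genotype vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    # Undefined when either site shows no variation.
    return None if vx == 0 or vy == 0 else cov * cov / (vx * vy)


def blocks(snps, size):
    """Yield (start_index, block) for consecutive blocks of at most `size` SNPs."""
    for start in range(0, len(snps), size):
        yield start, snps[start:start + size]


def ld_within_block(start, block):
    """Pairwise r^2 for all SNP pairs in one block, keyed by global SNP indices."""
    return {
        (start + i, start + j): r2(block[i], block[j])
        for i, j in combinations(range(len(block)), 2)
    }


def parallel_block_ld(snps, block_size=2, workers=4):
    """Submit one task per block to a thread pool and merge the results."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(ld_within_block, start, block)
            for start, block in blocks(snps, block_size)
        ]
        for future in futures:
            results.update(future.result())
    return results


if __name__ == "__main__":
    # Four SNPs (rows) genotyped in four individuals (columns); toy data.
    snps = [[0, 1, 2, 1], [0, 1, 2, 1], [0, 0, 2, 2], [0, 2, 0, 2]]
    print(parallel_block_ld(snps, block_size=2))
```

In CPython, threads mainly illustrate the task structure here; CPU-bound work like this would need processes or native code to see a real speed-up, and the timings quoted in the description apply to LDkit itself, not to this sketch.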