LDkit: a parallel computing toolkit for linkage disequilibrium analysis

Abstract. Background: Linkage disequilibrium (LD) analysis is widely used in genetics to study evolutionary and demographic history and to help geneticists identify genes associated with inherited traits of interest, such as diseases. There are some tools for linkage disequilibrium analysis...


Bibliographic Details
Main Authors: You Tang, Zhuo Li, Chao Wang, Yuxin Liu, Helong Yu, Aoxue Wang, Yao Zhou
Format: Article
Language: English
Published: BMC 2020-10-01
Series: BMC Bioinformatics
Subjects: Population genetics; Parallel computing; Graphical user interface; Linkage disequilibrium
Online Access: http://link.springer.com/article/10.1186/s12859-020-03754-5
id doaj-7663138ae83948b0b4ec622dea27c097
record_format Article
spelling doaj-7663138ae83948b0b4ec622dea27c097 2020-11-25T02:46:18Z eng BMC BMC Bioinformatics 1471-2105 2020-10-01 21118 10.1186/s12859-020-03754-5
LDkit: a parallel computing toolkit for linkage disequilibrium analysis
You Tang (0): Electrical and Information Engineering College, Jilin Agricultural Science and Technology University
Zhuo Li (1): Electrical and Information Engineering College, Jilin Agricultural Science and Technology University
Chao Wang (2): Key Laboratory of Crop Biotechnology Breeding of the Ministry of Agriculture, Beidahuang Kenfeng Seed Co., Ltd.
Yuxin Liu (3): College of Horticulture and Landscape Architecture, Northeast Agricultural University
Helong Yu (4): Information Technology Academy, Jilin Agricultural University
Aoxue Wang (5): College of Horticulture and Landscape Architecture, Northeast Agricultural University
Yao Zhou (6): Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture; Genome Analysis Laboratory of the Ministry of Agriculture; Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences
http://link.springer.com/article/10.1186/s12859-020-03754-5
Population genetics
Parallel computing
Graphical user interface
Linkage disequilibrium
collection DOAJ
language English
format Article
sources DOAJ
author You Tang
Zhuo Li
Chao Wang
Yuxin Liu
Helong Yu
Aoxue Wang
Yao Zhou
title LDkit: a parallel computing toolkit for linkage disequilibrium analysis
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2020-10-01
description Abstract. Background: Linkage disequilibrium (LD) analysis is widely used in genetics to study evolutionary and demographic history and to help geneticists identify genes associated with inherited traits of interest, such as diseases. Several tools for LD analysis exist, either local or online; however, none supports both a graphical user interface (GUI) and parallel computing. Results: We developed LDkit, a GUI software package for LD analysis that supports parallel computing. LDkit accepts both variant call format (VCF) and PLINK ‘ped + map’ input. Users can also restrict the analysis to a subset of individuals from the whole population. LDkit reads the data block by block and parallelizes the computation by monitoring process usage. An assessment on the human 1000 Genomes data showed that, with 32 threads, the running time fell from ~77 minutes to less than 6 minutes on the chromosome 22 dataset of 1,103,547 SNPs and 2504 individuals. Conclusions: LDkit can be used effectively to calculate and plot LD decay, LD blocks, and LD between a site and a given region. Most importantly, both a GUI and a stand-alone package are available for users’ convenience. LDkit is written in Java and runs cross-platform.
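The abstract describes reading genotypes by block and parallelizing the LD computation. The following is a minimal Java sketch of that general idea, not LDkit's actual code: the class `LdBlockSketch`, the methods `rSquared` and `blockR2`, the toy genotypes, and the block size are all hypothetical. It assumes genotypes coded as 0/1/2 allele counts and uses the squared Pearson correlation of those counts, a common stand-in for r^2 when haplotype phase is unknown. Only pairs within a block are computed; pairs spanning two blocks are ignored here.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class LdBlockSketch {

    /** Squared Pearson correlation between two genotype vectors coded 0/1/2. */
    static double rSquared(int[] a, int[] b) {
        int n = a.length;
        double sumA = 0, sumB = 0, sumAB = 0, sumAA = 0, sumBB = 0;
        for (int i = 0; i < n; i++) {
            sumA += a[i];
            sumB += b[i];
            sumAB += a[i] * b[i];
            sumAA += a[i] * a[i];
            sumBB += b[i] * b[i];
        }
        // Unnormalized covariance and variances; the 1/n factors cancel in the ratio.
        double cov = sumAB - sumA * sumB / n;
        double varA = sumAA - sumA * sumA / n;
        double varB = sumBB - sumB * sumB / n;
        if (varA == 0 || varB == 0) {
            return Double.NaN; // monomorphic site: correlation is undefined
        }
        return (cov * cov) / (varA * varB);
    }

    /** Pairwise r^2 for all SNPs in one block (rows = SNPs, columns = individuals). */
    static double[][] blockR2(int[][] block) {
        int m = block.length;
        double[][] r2 = new double[m][m];
        // Each row index i is handled by one task; every cell is written by exactly
        // one task (the one with the smaller index), so no locking is needed.
        IntStream.range(0, m).parallel().forEach(i -> {
            for (int j = i; j < m; j++) {
                double v = rSquared(block[i], block[j]);
                r2[i][j] = v;
                r2[j][i] = v;
            }
        });
        return r2;
    }

    public static void main(String[] args) {
        // Toy data (hypothetical, not from the paper): 4 SNPs x 4 individuals.
        int[][] genotypes = {
            {0, 1, 2, 1},
            {0, 1, 2, 1},
            {2, 1, 0, 1},
            {0, 2, 0, 2}
        };
        int blockSize = 2; // hypothetical block size
        for (int start = 0; start < genotypes.length; start += blockSize) {
            int end = Math.min(start + blockSize, genotypes.length);
            double[][] r2 = blockR2(Arrays.copyOfRange(genotypes, start, end));
            System.out.println("SNPs " + start + ".." + (end - 1) + ": "
                    + Arrays.deepToString(r2));
        }
    }
}
```

Processing one block at a time keeps only a slice of the SNP matrix in memory, while the parallel stream spreads the pairwise work over the available cores. A real tool would also need to handle missing genotypes and pairs that cross block boundaries, which this sketch leaves out.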
topic Population genetics
Parallel computing
Graphical user interface
Linkage disequilibrium
url http://link.springer.com/article/10.1186/s12859-020-03754-5