Approximate Bayesian neural networks in genomic prediction

Bibliographic Details
Main Author: Patrik Waldmann
Format: Article
Language: eng
Published: BMC 2018-12-01
Series:Genetics Selection Evolution
Online Access:http://link.springer.com/article/10.1186/s12711-018-0439-1
id doaj-6e52ea0bb23349618bd7aa905d77eeea
record_format Article
spelling doaj-6e52ea0bb23349618bd7aa905d77eeea 2020-11-25T02:55:10Z eng BMC Genetics Selection Evolution 1297-9686 2018-12-01 10.1186/s12711-018-0439-1 Approximate Bayesian neural networks in genomic prediction Patrik Waldmann (Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences (SLU)) http://link.springer.com/article/10.1186/s12711-018-0439-1
collection DOAJ
language eng
format Article
sources DOAJ
author Patrik Waldmann
spellingShingle Patrik Waldmann
Approximate Bayesian neural networks in genomic prediction
Genetics Selection Evolution
author_facet Patrik Waldmann
author_sort Patrik Waldmann
title Approximate Bayesian neural networks in genomic prediction
title_short Approximate Bayesian neural networks in genomic prediction
title_full Approximate Bayesian neural networks in genomic prediction
title_fullStr Approximate Bayesian neural networks in genomic prediction
title_full_unstemmed Approximate Bayesian neural networks in genomic prediction
title_sort approximate bayesian neural networks in genomic prediction
publisher BMC
series Genetics Selection Evolution
issn 1297-9686
publishDate 2018-12-01
description Abstract Background Genome-wide marker data are used both in phenotypic genome-wide association studies (GWAS) and genome-wide prediction (GWP). Typically, such studies include high-dimensional data with thousands to millions of single nucleotide polymorphisms (SNPs) recorded in hundreds to a few thousand individuals. Different machine-learning approaches have been used effectively in GWAS and GWP, but the use of neural networks (NN) and deep learning is still scarce. This study presents an NN model for genomic SNP data. Results We show, using both simulated and real pig data, that regularization obtained with weight decay and dropout results in an approximate Bayesian neural network (ABNN) model that can be used to obtain model-averaged posterior predictions. The ABNN model is implemented in mxnet and shown to yield better prediction accuracy than genomic best linear unbiased prediction and Bayesian LASSO. The mean squared error was reduced by at least 6.5% in the simulated data and by at least 1% in the real data. Moreover, by comparing NNs of different complexities, our results confirm that a shallow model with one layer, one neuron, one-hot encoding and a linear activation function performs better than more complex models. Conclusions The ABNN model provides a computationally efficient approach with good prediction performance, and its weight components can also provide information on the importance of the SNPs. Hence, ABNN is suitable for both GWP and GWAS.
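
The description outlines the core mechanism: a very shallow network (one layer, one neuron) regularized with weight decay and dropout, where keeping dropout active at prediction time and averaging many stochastic forward passes yields approximate model-averaged posterior predictions. The article implements this in mxnet; the sketch below is only a framework-agnostic NumPy illustration of that idea, not the author's code. The function names (fit_abnn, predict_mc), hyperparameters, and the simulated genotype matrix are illustrative assumptions, and additive 0/1/2 genotype coding is used here instead of the one-hot encoding reported in the article.

import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for SNP data: n individuals, p markers coded 0/1/2 (illustrative only).
n, p = 200, 500
X = rng.integers(0, 3, size=(n, p)).astype(float)
true_w = np.zeros(p)
true_w[rng.choice(p, 10, replace=False)] = rng.normal(0, 1, 10)
y = X @ true_w + rng.normal(0, 1, n)

# Standardize markers before fitting.
X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

def fit_abnn(X, y, n_epochs=200, lr=1e-3, weight_decay=1e-2, p_drop=0.1):
    """One-layer, one-neuron linear 'network' trained by stochastic gradient
    descent with weight decay (L2 penalty) and dropout on the inputs."""
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            # Inverted dropout: keep each input with probability 1 - p_drop, rescale.
            mask = (rng.random(p) > p_drop) / (1.0 - p_drop)
            xi = X[i] * mask
            err = xi @ w + b - y[i]
            w -= lr * (err * xi + weight_decay * w)
            b -= lr * err
    return w, b, p_drop

def predict_mc(w, b, p_drop, Xnew, n_samples=100):
    """Keep dropout active at prediction time and average the stochastic forward
    passes: an approximation to model-averaged posterior predictions."""
    preds = []
    for _ in range(n_samples):
        mask = (rng.random(Xnew.shape[1]) > p_drop) / (1.0 - p_drop)
        preds.append((Xnew * mask) @ w + b)
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

w, b, p_drop = fit_abnn(X[:150], y[:150])
yhat, ysd = predict_mc(w, b, p_drop, X[150:])
print("test MSE:", np.mean((yhat - y[150:]) ** 2))

In this single-neuron linear setting the fitted weight vector plays the role of per-SNP effect sizes, which is why the description notes that the weight components can also be inspected for marker importance in a GWAS-style analysis; the standard deviation returned by predict_mc gives a rough measure of predictive uncertainty from the dropout ensemble.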
url http://link.springer.com/article/10.1186/s12711-018-0439-1
work_keys_str_mv AT patrikwaldmann approximatebayesianneuralnetworksingenomicprediction
_version_ 1724717842314559488