Efficient differentially private learning improves drug sensitivity prediction

Abstract. Background: Users of a personalised recommendation system face a dilemma: recommendations can be improved by learning from data, but only if other users are willing to share their private information. Good personalised predictions are vitally important in precision medicine, but genomic info...

Bibliographic Details
Main Authors: Antti Honkela, Mrinal Das, Arttu Nieminen, Onur Dikmen, Samuel Kaski
Format: Article
Language: English
Published: BMC 2018-02-01
Series: Biology Direct
Subjects: Differential privacy; Linear regression; Drug sensitivity prediction; Machine learning
Online Access: http://link.springer.com/article/10.1186/s13062-017-0203-4
id doaj-6369e40b6752481290e96031b8b1f470
record_format Article
spelling doaj-6369e40b6752481290e96031b8b1f470 2020-11-24T21:33:24Z eng BMC Biology Direct 1745-6150 2018-02-01 13 1 1 12 10.1186/s13062-017-0203-4
Efficient differentially private learning improves drug sensitivity prediction
Antti Honkela [0], Mrinal Das [1], Arttu Nieminen [2], Onur Dikmen [3], Samuel Kaski [4]
[0] Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki
[1] Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University
[2] Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki
[3] Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki
[4] Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University
http://link.springer.com/article/10.1186/s13062-017-0203-4
Differential privacy; Linear regression; Drug sensitivity prediction; Machine learning
collection DOAJ
language English
format Article
sources DOAJ
author Antti Honkela
Mrinal Das
Arttu Nieminen
Onur Dikmen
Samuel Kaski
title Efficient differentially private learning improves drug sensitivity prediction
publisher BMC
series Biology Direct
issn 1745-6150
publishDate 2018-02-01
description Abstract. Background: Users of a personalised recommendation system face a dilemma: recommendations can be improved by learning from data, but only if other users are willing to share their private information. Good personalised predictions are vitally important in precision medicine, but genomic information on which the predictions are based is also particularly sensitive, as it directly identifies the patients and hence cannot easily be anonymised. Differential privacy has emerged as a potentially promising solution: privacy is considered sufficient if presence of individual patients cannot be distinguished. However, differentially private learning with current methods does not improve predictions with feasible data sizes and dimensionalities. Results: We show that useful predictors can be learned under powerful differential privacy guarantees, and even from moderately-sized data sets, by demonstrating significant improvements in the accuracy of private drug sensitivity prediction with a new robust private regression method. Our method matches the predictive accuracy of the state-of-the-art non-private lasso regression using only 4x more samples under relatively strong differential privacy guarantees. Good performance with limited data is achieved by limiting the sharing of private information by decreasing the dimensionality and by projecting outliers to fit tighter bounds, therefore needing to add less noise for equal privacy. Conclusions: The proposed differentially private regression method combines theoretical appeal and asymptotic efficiency with good prediction accuracy even with moderate-sized data. As already the simple-to-implement method shows promise on the challenging genomic data, we anticipate rapid progress towards practical applications in many fields. Reviewers: This article was reviewed by Zoltan Gaspari and David Kreil.
topic Differential privacy
Linear regression
Drug sensitivity prediction
Machine learning
url http://link.springer.com/article/10.1186/s13062-017-0203-4
_version_ 1725953387848007680
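
Illustrative sketch (not part of the catalogue record). The abstract attributes the method's accuracy to two ideas: using few dimensions, and projecting outliers into tighter bounds so that less noise is needed for the same privacy level. The Python sketch below illustrates that trade-off with a generic technique, Laplace perturbation of the regression sufficient statistics followed by a ridge solve. It is an assumption-laden toy, not the authors' published algorithm: the function names, the even split of epsilon between the two statistics, the add/remove-one-record neighbouring relation, and the ridge term are all choices made here for illustration, and the code is not a vetted privacy implementation.

```python
import random


def clip(value, bound):
    """Project a value into the interval [-bound, bound]."""
    return max(-bound, min(bound, value))


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def private_ridge(xs, ys, epsilon, bound_x, bound_y, ridge=1e-3, seed=None):
    """Hypothetical helper: ridge regression on noisy sufficient statistics.

    Assumes neighbouring data sets differ by adding or removing one record,
    and splits epsilon evenly between the two released statistics.
    """
    rng = random.Random(seed)
    d = len(xs[0])
    # Project outliers into the bounds; this caps each record's influence.
    xs = [[clip(v, bound_x) for v in row] for row in xs]
    ys = [clip(v, bound_y) for v in ys]
    # Sufficient statistics: sum of x x^T and sum of x y.
    xx = [[sum(r[i] * r[j] for r in xs) for j in range(d)] for i in range(d)]
    xy = [sum(r[i] * y for r, y in zip(xs, ys)) for i in range(d)]
    # L1 sensitivities: one record changes each entry by at most the product
    # of the bounds; only the upper triangle of xx is released.
    scale_xx = (d * (d + 1) / 2 * bound_x ** 2) / (epsilon / 2)
    scale_xy = (d * bound_x * bound_y) / (epsilon / 2)
    for i in range(d):
        for j in range(i, d):
            xx[i][j] = xx[j][i] = xx[i][j] + laplace(scale_xx, rng)
        xx[i][i] += ridge  # regulariser, applied after the noise
        xy[i] += laplace(scale_xy, rng)
    return solve(xx, xy)
```

Both noise scales grow with the number of features d and with the clipping bounds, which is why fewer dimensions and tighter bounds mean less noise at a fixed epsilon. The price of tight bounds is bias: clipped targets shrink the fitted weights. A call such as `private_ridge(xs, ys, epsilon=1.0, bound_x=1.0, bound_y=2.0, seed=0)` returns a list of d weights.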