NoPeek: Information leakage reduction to share activations in distributed deep learning

Bibliographic Details
Main Authors: Vepakomma, Praneeth (Author), Singh, Abhishek (Author), Gupta, Otkrist (Author), Raskar, Ramesh (Author)
Other Authors: Program in Media Arts and Sciences (Massachusetts Institute of Technology) (Contributor), Massachusetts Institute of Technology. Media Laboratory (Contributor)
Format: Article
Language: English
Published: IEEE, 2022-07-18T14:30:57Z.
Description
Summary: For distributed machine learning with sensitive data, we demonstrate how minimizing distance correlation between raw data and intermediate representations reduces leakage of sensitive raw data patterns across client communications while maintaining model accuracy. Leakage, measured as the distance correlation between the input and its intermediate representations, is the risk that raw data can be reconstructed by inverting those representations. This risk can deter client entities that hold sensitive data from using distributed deep learning services. Our method reduces the distance correlation between raw data and learned representations during both training and inference, and we demonstrate on image datasets that it is resilient to such reconstruction attacks. It prevents reconstruction of the raw data while retaining the information needed for good classification accuracy.
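
The leakage measure named in the summary, distance correlation between inputs and intermediate representations, can be illustrated with a short example. What follows is a minimal NumPy sketch of the standard sample distance correlation (the biased estimator built from double-centered pairwise distance matrices), not the authors' implementation. The function names `_centered_distances` and `distance_correlation`, the array shapes, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def _centered_distances(v):
    """Pairwise Euclidean distance matrix of the rows of v, double-centered."""
    # Flatten each sample (e.g. an image or activation map) to a vector.
    v = np.asarray(v, dtype=float).reshape(len(v), -1)
    diff = v[:, None, :] - v[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract row and column means, then add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, z):
    """Sample distance correlation between paired samples x and z (value in [0, 1])."""
    a = _centered_distances(x)
    b = _centered_distances(z)
    dcov2 = (a * b).mean()    # squared sample distance covariance
    dvar2_x = (a * a).mean()  # squared sample distance variance of x
    dvar2_z = (b * b).mean()  # squared sample distance variance of z
    denom = np.sqrt(dvar2_x * dvar2_z)
    if denom == 0.0:
        return 0.0
    # Clamp at zero to guard against tiny negative values from rounding.
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Hypothetical data: x stands in for raw inputs, the second argument for activations.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
leak_high = distance_correlation(x, 2.0 * x)                    # activations are a rescaled copy of x
leak_low = distance_correlation(x, rng.normal(size=(200, 4)))   # activations independent of x
```

In this sketch a representation that is only a rescaled copy of the input scores at the maximum, while a representation drawn independently of the input scores lower. A training procedure in the spirit of the summary would add such a term to the task loss so that it is minimized alongside the classification objective; how the two terms are weighted is not stated in this record.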