Information Bottleneck Classification in Extremely Distributed Systems

Bibliographic Details
Main Authors: Denis Ullmann, Shideh Rezaeifar, Olga Taran, Taras Holotyak, Brandon Panos, Slava Voloshynovskiy
Format: Article
Language: English
Published: MDPI AG 2020-10-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/22/11/1237
Description
Summary: We present a new decentralized classification system based on a distributed architecture. The system consists of distributed nodes, each possessing its own dataset and computing module, along with a centralized server, which sends probes to the nodes for classification and aggregates their responses for a final decision. Each node has access to its own training dataset of a given class and is trained as an auto-encoder consisting of a fixed data-independent encoder, a pre-trained quantizer and a class-dependent decoder. Hence, each auto-encoder is highly dependent on the class probability distribution for which its reconstruction distortion is minimized. Conversely, when an encoding–quantizing–decoding node observes data from a different distribution, unseen at training, there is a mismatch: the decoding is no longer optimal, and the reconstruction distortion increases significantly. The final classification is performed at the centralized classifier, which votes for the class with the minimum reconstruction distortion. Beyond its applicability to settings that face big-data communication problems and/or require private classification, the distributed scheme creates a theoretical bridge to the information bottleneck principle. The proposed system demonstrates very promising performance on basic datasets such as MNIST and FashionMNIST.
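The decision rule described in the summary (each node reconstructs the probe with its own class-dependent decoder, and the server picks the class with the smallest distortion) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the encoder is a fixed random projection, the pre-trained quantizer is replaced by a plain uniform scalar quantizer, the decoder is a least-squares linear map, and the data are synthetic low-dimensional subspaces rather than MNIST. The names `Node`, `encode_quantize`, `classify` and `make_class_sampler` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, CODE_DIM, LATENT_DIM, STEP = 16, 4, 3, 0.1

# Assumption: the fixed, data-independent encoder is a random projection
# shared by all nodes. It is never trained.
ENCODER = rng.standard_normal((DIM, CODE_DIM))


def encode_quantize(x):
    """Encode, then apply a uniform scalar quantizer (a stand-in for the
    pre-trained quantizer mentioned in the summary)."""
    return np.round(x @ ENCODER / STEP) * STEP


class Node:
    """One node: holds data of a single class and fits a class-dependent decoder."""

    def __init__(self, train_x):
        codes = encode_quantize(train_x)
        # Assumption: linear decoder fitted by least squares on this class only.
        self.decoder, *_ = np.linalg.lstsq(codes, train_x, rcond=None)

    def distortion(self, probes):
        """Per-probe mean squared reconstruction error."""
        recon = encode_quantize(probes) @ self.decoder
        return np.mean((probes - recon) ** 2, axis=1)


def classify(nodes, probes):
    """Server side: collect distortions and pick the class with the minimum."""
    distortions = np.stack([node.distortion(probes) for node in nodes], axis=1)
    return np.argmin(distortions, axis=1)


def make_class_sampler():
    """Synthetic class: points near a random low-dimensional subspace."""
    basis = rng.standard_normal((LATENT_DIM, DIM))

    def sample(n):
        latent = rng.standard_normal((n, LATENT_DIM))
        return latent @ basis + 0.01 * rng.standard_normal((n, DIM))

    return sample


samplers = [make_class_sampler() for _ in range(3)]
nodes = [Node(sample(500)) for sample in samplers]        # one node per class
test_x = np.concatenate([sample(100) for sample in samplers])
test_y = np.repeat(np.arange(3), 100)

predictions = classify(nodes, test_x)
accuracy = float(np.mean(predictions == test_y))
print(f"accuracy: {accuracy:.3f}")
```

In this sketch the nodes return only a scalar distortion per probe, so no training data leaves a node, which mirrors the privacy and communication motivation stated in the summary.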
ISSN: 1099-4300