Distributed SignSGD With Improved Accuracy and Network-Fault Tolerance
This paper proposes DropSignSGD, a communication-efficient and network-fault-tolerant algorithm for training deep neural networks in a distributed, synchronous fashion. In DropSignSGD, every numerical element communicated between machines is either 1 or -1 and can therefore be represented by a single bit. More impor...
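The one-bit representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the DropSignSGD algorithm itself, only a generic example of packing sign values at one bit per element. The function names `pack_signs` and `unpack_signs`, the sample gradient, and the choice to map zero to +1 are assumptions made for illustration and do not come from the paper.

```python
def pack_signs(values):
    """Encode the sign of each value as one bit (bit set = +1, bit clear = -1)."""
    packed = bytearray((len(values) + 7) // 8)  # 8 signs per byte, rounded up
    for i, v in enumerate(values):
        if v >= 0:  # assumption: zero is sent as +1
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_signs(packed, n):
    """Decode the first n bits back into a list of +1 / -1 values."""
    return [1 if (packed[i // 8] >> (i % 8)) & 1 else -1 for i in range(n)]


# Hypothetical gradient with nine elements.
gradient = [0.3, -1.2, 0.0, -0.7, 2.5, -0.1, 0.9, -4.0, 1.1]
payload = pack_signs(gradient)                # what would be sent over the network
signs = unpack_signs(payload, len(gradient))  # what the receiver would recover
```

Here nine gradient elements fit into two bytes, whereas nine 32-bit floats would occupy 36 bytes, which is roughly where the communication saving of sign-based methods comes from.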
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
Online Access: | https://ieeexplore.ieee.org/document/9234512/ |