Communication Usage Optimization of Gradient Sparsification with Aggregation

Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 106 === Communication usage is a bottleneck when scaling the number of workers in distributed deep learning. One solution is to compress the exchanged gradients into a sparse format through gradient sparsification. We found that the send cost of the server, which is the aggregated size of the sparse gradients, can be reduced through the gradient selection performed at the workers. Following the observation that only a few gradients are significantly large, and only for a short period of time, we propose several gradient selection algorithms based on different metrics. Experiments show that our proposed method reduces the aggregated size at the server, and the resulting reduction in time per iteration makes convergence faster than traditional sparsification.
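To make the setting in the abstract concrete, below is a minimal sketch, in Python with NumPy, of top-k gradient sparsification at the workers and aggregation at a parameter server. The function names (top_k_sparsify, aggregate_sparse) and the top-k selection rule are illustrative assumptions, not the selection algorithms proposed in the thesis; the sketch only demonstrates why the server's send cost equals the size of the union of the indices the workers select, which is the quantity the thesis aims to shrink.

```python
import numpy as np


def top_k_sparsify(grad, k):
    """Keep only the k largest-magnitude entries; return (indices, values)."""
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    return idx, grad[idx]


def aggregate_sparse(sparse_grads, dim):
    """Server-side aggregation of sparse gradients from all workers.

    The server must send back every index selected by at least one worker,
    so its send cost is the size of the union of the workers' index sets.
    """
    total = np.zeros(dim)
    for idx, vals in sparse_grads:
        total[idx] += vals
    nonzero = np.flatnonzero(total)
    return nonzero, total[nonzero]


# Toy example: 4 workers, a 1000-dimensional gradient, top-1% sparsification.
rng = np.random.default_rng(0)
dim, k = 1000, 10
worker_grads = [rng.standard_normal(dim) for _ in range(4)]
sparse = [top_k_sparsify(g, k) for g in worker_grads]
agg_idx, agg_vals = aggregate_sparse(sparse, dim)

# Without coordination the union can be as large as 4 * k = 40 indices;
# selection metrics that make workers pick overlapping indices shrink this
# union and therefore the server's aggregated send size.
print("indices the server must send back:", len(agg_idx))
```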


Bibliographic Details
Main Authors: Sheng-Ping Wang, 王盛平
Other Authors: Pangfeng Liu
Format: Others
Language: en_US
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/ppyyqb
id ndltd-TW-106NTU05392055
record_format oai_dc
spelling ndltd-TW-106NTU05392055 2019-07-25T04:46:48Z http://ndltd.ncl.edu.tw/handle/ppyyqb Communication Usage Optimization of Gradient Sparsification with Aggregation 對鬆散化梯度的資料聚集進行通訊量優化 Sheng-Ping Wang 王盛平 Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 106 === Communication usage is a bottleneck when scaling the number of workers in distributed deep learning. One solution is to compress the exchanged gradients into a sparse format through gradient sparsification. We found that the send cost of the server, which is the aggregated size of the sparse gradients, can be reduced through the gradient selection performed at the workers. Following the observation that only a few gradients are significantly large, and only for a short period of time, we propose several gradient selection algorithms based on different metrics. Experiments show that our proposed method reduces the aggregated size at the server, and the resulting reduction in time per iteration makes convergence faster than traditional sparsification. Pangfeng Liu 劉邦鋒 2018 學位論文 ; thesis 33 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 106 === Communication usage is a bottleneck when scaling the number of workers in distributed deep learning. One solution is to compress the exchanged gradients into a sparse format through gradient sparsification. We found that the send cost of the server, which is the aggregated size of the sparse gradients, can be reduced through the gradient selection performed at the workers. Following the observation that only a few gradients are significantly large, and only for a short period of time, we propose several gradient selection algorithms based on different metrics. Experiments show that our proposed method reduces the aggregated size at the server, and the resulting reduction in time per iteration makes convergence faster than traditional sparsification.
author2 Pangfeng Liu
author_facet Pangfeng Liu
Sheng-Ping Wang
王盛平
author Sheng-Ping Wang
王盛平
spellingShingle Sheng-Ping Wang
王盛平
Communication Usage Optimization of Gradient Sparsification with Aggregation
author_sort Sheng-Ping Wang
title Communication Usage Optimization of Gradient Sparsification with Aggregation
title_short Communication Usage Optimization of Gradient Sparsification with Aggregation
title_full Communication Usage Optimization of Gradient Sparsification with Aggregation
title_fullStr Communication Usage Optimization of Gradient Sparsification with Aggregation
title_full_unstemmed Communication Usage Optimization of Gradient Sparsification with Aggregation
title_sort communication usage optimization of gradient sparsification with aggregation
publishDate 2018
url http://ndltd.ncl.edu.tw/handle/ppyyqb
work_keys_str_mv AT shengpingwang communicationusageoptimizationofgradientsparsificationwithaggregation
AT wángshèngpíng communicationusageoptimizationofgradientsparsificationwithaggregation
AT shengpingwang duìsōngsànhuàtīdùdezīliàojùjíjìnxíngtōngxùnliàngyōuhuà
AT wángshèngpíng duìsōngsànhuàtīdùdezīliàojùjíjìnxíngtōngxùnliàngyōuhuà
_version_ 1719229974010920960