Communication Usage Optimization of Gradient Sparsification with Aggregation (對鬆散化梯度的資料聚集進行通訊量優化)
Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === Academic Year 106 === Communication usage is a bottleneck when scaling the number of workers in distributed deep learning. One solution is to compress the exchanged gradients into a sparse format through gradient sparsification. We found that the send cost of the server, i.e., the aggregated size of the sparse gradients, can be reduced by the gradient selection performed at the workers. Following the observation that only a few gradients are significantly large, and each only for a short period of time, we proposed several gradient selection algorithms based on different metrics. Experiments showed that the proposed methods reduce the aggregated size at the server, and that the resulting reduction in time per iteration yields faster convergence than traditional sparsification.
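The abstract describes a two-step pipeline: each worker selects and sends a sparse subset of its gradients, and the server aggregates these sparse messages, so the server's send cost is the size of the aggregated (union) index set. The sketch below is a minimal illustration of that pipeline only, assuming a parameter-server setup and a simple top-k magnitude rule as the selection metric; the function names, the NumPy implementation, and the toy sizes are illustrative assumptions, not the selection algorithms proposed in the thesis.

```python
import numpy as np

def sparsify_topk(grad, k):
    """Worker side: keep only the k largest-magnitude gradient entries and
    send them as (index, value) pairs (an illustrative selection metric)."""
    flat = grad.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]   # indices of the k largest |g_i|
    return idx, flat[idx]

def aggregate_sparse(sparse_grads, num_params):
    """Server side: sum the sparse contributions from all workers. The server's
    send cost is the size of the union of the selected indices, so worker
    selections that overlap shrink what the server must broadcast back."""
    total = np.zeros(num_params)
    for idx, vals in sparse_grads:
        total[idx] += vals
    agg_idx = np.nonzero(total)[0]                 # union of all selected indices
    return agg_idx, total[agg_idx]

# Toy run: 4 workers, 1000 parameters, 1% of entries kept per worker.
rng = np.random.default_rng(0)
worker_msgs = [sparsify_topk(rng.normal(size=1000), k=10) for _ in range(4)]
agg_idx, agg_vals = aggregate_sparse(worker_msgs, num_params=1000)
print("aggregated (server send) size:", agg_idx.size)   # at most 4 * 10 = 40
```

Under this framing, the thesis's contribution lies in how the workers choose which gradients to send so that the aggregated index set stays small; the top-k rule above is only a stand-in for those metrics.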
Main Authors: | Sheng-Ping Wang 王盛平 |
---|---|
Other Authors: | Pangfeng Liu 劉邦鋒 |
Format: | Others (thesis, 33 pages) |
Language: | en_US |
Published: | 2018 |
Online Access: | http://ndltd.ncl.edu.tw/handle/ppyyqb |
id | ndltd-TW-106NTU05392055 |
---|---|
record_format | oai_dc |
collection | NDLTD |
language | en_US |
format | Others |
sources | NDLTD |
author2 | Pangfeng Liu |
author | Sheng-Ping Wang 王盛平 |
title | Communication Usage Optimization of Gradient Sparsification with Aggregation |
publishDate | 2018 |
url | http://ndltd.ncl.edu.tw/handle/ppyyqb |