Pipelined Training with Stale Weights in Deep Convolutional Neural Networks
The growth in size and complexity of convolutional neural networks (CNNs) is forcing the partitioning of a network across multiple accelerators during training and pipelining of backpropagation computations over these accelerators. Pipelining results in the use of stale weights. Existing approaches to pipelined training avoid or limit the use of stale weights with techniques that either underutilize accelerators or increase training memory footprint. This paper contributes a pipelined backpropagation scheme that uses stale weights to maximize accelerator utilization and keep memory overhead modest. It explores the impact of stale weights on statistical efficiency and performance using 4 CNNs (LeNet-5, AlexNet, VGG, and ResNet) and shows that when pipelining is introduced in early layers, training with stale weights converges and results in models with inference accuracies comparable to those of nonpipelined training (a drop in accuracy of 0.4%, 4%, 0.83%, and 1.45% for the 4 networks, respectively). However, when pipelining is deeper in the network, inference accuracies drop significantly (up to 12% for VGG and 8.5% for ResNet-20). The paper also contributes a hybrid training scheme that combines pipelined with nonpipelined training to address this drop. The potential for performance improvement of the proposed scheme is demonstrated with a proof-of-concept pipelined backpropagation implementation in PyTorch on 2 GPUs using ResNet-56/110/224/362, achieving speedups of up to 1.8X over a 1-GPU baseline.
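As a rough illustration only, the following single-process PyTorch sketch mimics the weight staleness that such a pipeline introduces: a small CNN is split into an "early" and a "late" stage, the early stage runs its forward and backward passes with weights that lag the most recent update by a fixed number of steps, and the resulting gradients are applied to the newer, live weights. It is not the authors' implementation, and it does not reproduce the two-GPU overlap that yields the reported speedup; names such as `STALENESS`, `stage1`, and the toy random data are assumptions made for the example.

```python
# Minimal sketch of stale-weight pipelined backpropagation (illustrative only).
# The early stage is evaluated with a weight version that is STALENESS updates
# old, and the gradients it produces are applied to the live weights, which is
# the kind of mismatch that arises when a pipeline is never drained.

import copy
from collections import deque

import torch
import torch.nn as nn

STALENESS = 2  # assumed pipeline depth: how far forward/backward lag the live weights
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Split a small CNN into an "early" and a "late" stage, standing in for
# partitioning a larger network (e.g., a ResNet) across two accelerators.
stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                       nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                       nn.AdaptiveAvgPool2d(4)).to(device)
stage2 = nn.Sequential(nn.Flatten(), nn.Linear(32 * 4 * 4, 10)).to(device)

opt = torch.optim.SGD(list(stage1.parameters()) + list(stage2.parameters()), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

history = deque(maxlen=STALENESS + 1)   # recent weight versions of the early stage
stale_stage1 = copy.deepcopy(stage1)    # copy that runs with the lagged weights

for step in range(50):
    # Record the current live weights; the pipeline "fills" during the first
    # STALENESS steps, after which the forward pass lags by STALENESS updates.
    history.append(copy.deepcopy(stage1.state_dict()))
    stale_stage1.load_state_dict(history[0])

    x = torch.randn(8, 3, 32, 32, device=device)      # toy micro-batch
    y = torch.randint(0, 10, (8,), device=device)

    out = stage2(stale_stage1(x))   # early layers use stale weights; the last stage does not
    loss = loss_fn(out, y)

    opt.zero_grad()
    for p in stale_stage1.parameters():
        p.grad = None
    loss.backward()

    # Gradients computed against the stale weights are applied to the live ones.
    for live, stale in zip(stage1.parameters(), stale_stage1.parameters()):
        live.grad = stale.grad
    opt.step()
```

In an actual pipelined deployment the two stages would sit on separate accelerators and process different micro-batches concurrently, which is where the speedup reported in the paper comes from; this sketch only reproduces the numerics of training with stale weights.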
Main Authors: | Lifu Zhang, Tarek S. Abdelrahman |
---|---|
Affiliation: | Edward S. Rogers Sr. Department of Electrical and Computer Engineering |
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2021-01-01 |
Series: | Applied Computational Intelligence and Soft Computing |
ISSN: | 1687-9732 |
Online Access: | http://dx.doi.org/10.1155/2021/3839543 |