Multi-Task Learning Using Gradient Balance and Clipping with an Application in Joint Disparity Estimation and Semantic Segmentation

In this paper, we propose a novel multi-task learning (MTL) strategy from the gradient optimization perspective that automatically learns the optimal gradient across different tasks. In contrast with current multi-task learning methods, which rely on careful network architecture adjustment or elaborate loss function optimization, the proposed gradient-based MTL is simple and flexible. Specifically, we introduce a multi-task stochastic gradient descent optimization (MTSGD) to learn task-specific and shared representations in a deep neural network. In MTSGD, we decompose the total gradient into multiple task-specific sub-gradients and find the optimal sub-gradient via gradient balance and clipping operations. In this way, the learned network can meet the performance requirements of each specific task while maintaining the shared representation. We take the joint learning of semantic segmentation and disparity estimation as an exemplar to verify the effectiveness of the proposed method. Extensive experimental results on a large-scale dataset show that our proposed algorithm outperforms the baseline methods by a large margin. Meanwhile, we perform a series of ablation studies to analyze gradient descent for MTL in depth.


Bibliographic Details
Main Authors: Guo, Y. (Author), Wei, C. (Author)
Format: Article
Language: English
Published: MDPI 2022
Subjects: clipping, gradient balance, multi-task learning
Online Access: View Fulltext in Publisher
LEADER 01944nam a2200181Ia 4500
001 10.3390-electronics11081217
008 220425s2022 CNT 000 0 eng d
022 |a 2079-9292 (ISSN) 
245 1 0 |a Multi-Task Learning Using Gradient Balance and Clipping with an Application in Joint Disparity Estimation and Semantic Segmentation 
260 0 |b MDPI  |c 2022 
856 |z View Fulltext in Publisher  |u https://doi.org/10.3390/electronics11081217 
520 3 |a In this paper, we propose a novel multi-task learning (MTL) strategy from the gradient optimization perspective that automatically learns the optimal gradient across different tasks. In contrast with current multi-task learning methods, which rely on careful network architecture adjustment or elaborate loss function optimization, the proposed gradient-based MTL is simple and flexible. Specifically, we introduce a multi-task stochastic gradient descent optimization (MTSGD) to learn task-specific and shared representations in a deep neural network. In MTSGD, we decompose the total gradient into multiple task-specific sub-gradients and find the optimal sub-gradient via gradient balance and clipping operations. In this way, the learned network can meet the performance requirements of each specific task while maintaining the shared representation. We take the joint learning of semantic segmentation and disparity estimation as an exemplar to verify the effectiveness of the proposed method. Extensive experimental results on a large-scale dataset show that our proposed algorithm outperforms the baseline methods by a large margin. Meanwhile, we perform a series of ablation studies to analyze gradient descent for MTL in depth. © 2022 by the authors. Licensee MDPI, Basel, Switzerland. 
650 0 4 |a clipping 
650 0 4 |a gradient balance 
650 0 4 |a multi-task learning 
700 1 |a Guo, Y.  |e author 
700 1 |a Wei, C.  |e author 
773 |t Electronics (Switzerland)