Knowledge Distillation for Semantic Segmentation Using Channel and Spatial Correlations and Adaptive Cross Entropy

In this paper, we propose an efficient knowledge distillation method to train light networks using heavy networks for semantic segmentation. Most semantic segmentation networks that exhibit good accuracy are based on computationally expensive networks. These networks are not suitable for mobile appl...

Bibliographic Details
Main Authors:	Sangyong Park, Yong Seok Heo
Format:	Article
Language:	English
Published:	MDPI AG 2020-08-01
Series:	Sensors
Subjects:	semantic segmentation, knowledge distillation, channel and spatial correlation loss, adaptive cross entropy loss
Online Access:	https://www.mdpi.com/1424-8220/20/16/4616

id	doaj-0de9c53d3f7248d38d2d13b926df31d3
record_format	Article
spelling	doaj-0de9c53d3f7248d38d2d13b926df31d3; 2020-11-25T03:40:02Z; eng; MDPI AG; Sensors; 1424-8220; 2020-08-01; 20; 4616; 4616; 10.3390/s20164616; Knowledge Distillation for Semantic Segmentation Using Channel and Spatial Correlations and Adaptive Cross Entropy; Sangyong Park (0); Yong Seok Heo (1); Department of Electrical and Computer Engineering, Ajou University, Suwon 16449, Korea; Department of Electrical and Computer Engineering, Ajou University, Suwon 16449, Korea; https://www.mdpi.com/1424-8220/20/16/4616; semantic segmentation; knowledge distillation; channel and spatial correlation loss; adaptive cross entropy loss
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Sangyong Park, Yong Seok Heo
author_facet	Sangyong Park, Yong Seok Heo
author_sort	Sangyong Park
title	Knowledge Distillation for Semantic Segmentation Using Channel and Spatial Correlations and Adaptive Cross Entropy
publisher	MDPI AG
series	Sensors
issn	1424-8220
publishDate	2020-08-01
description	In this paper, we propose an efficient knowledge distillation method to train light networks using heavy networks for semantic segmentation. Most semantic segmentation networks that exhibit good accuracy are based on computationally expensive networks. These networks are not suitable for mobile applications using vision sensors, because computational resources are limited in these environments. In this regard, knowledge distillation, which transfers knowledge from heavy networks acting as teachers to light networks acting as students, is a suitable methodology. Although previous knowledge distillation approaches have been shown to improve the performance of student networks, most of them have two limitations. First, they tend to use only the spatial correlation of feature maps and ignore the relational information of their channels. Second, they can transfer false knowledge when the results of the teacher networks are not perfect. To address these two problems, we propose two loss functions: a channel and spatial correlation (CSC) loss function and an adaptive cross entropy (ACE) loss function. The former computes the full relationship of both the channel and spatial information in the feature map, and the latter adaptively exploits one-hot encodings using the ground truth labels and the probability maps predicted by the teacher network. To evaluate our method, we conduct experiments on two scene parsing datasets: Cityscapes and CamVid. Our method presents significantly better performance than previous methods.
topic	semantic segmentation, knowledge distillation, channel and spatial correlation loss, adaptive cross entropy loss
url	https://www.mdpi.com/1424-8220/20/16/4616
work_keys_str_mv	AT sangyongpark knowledgedistillationforsemanticsegmentationusingchannelandspatialcorrelationsandadaptivecrossentropy AT yongseokheo knowledgedistillationforsemanticsegmentationusingchannelandspatialcorrelationsandadaptivecrossentropy
_version_	1724536834923429888
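
The record's description names two losses without giving their formulas. The sketch below is one possible reading, not the authors' implementation. It assumes that the adaptive cross entropy target is the teacher's probability vector where the teacher's top class matches the ground truth and the one-hot label otherwise, and it stands in for the channel and spatial correlation loss with a mean squared difference between channel and spatial Gram matrices. All function names (`ace_targets`, `cross_entropy`, `gram`, `correlation_loss`) are hypothetical, and feature maps are assumed to be plain C x N lists with N = H*W.

```python
import math


def ace_targets(teacher_probs, labels):
    """Per-pixel targets (assumed rule): keep the teacher's distribution when
    its top class equals the ground truth, else fall back to a one-hot label."""
    targets = []
    for probs, label in zip(teacher_probs, labels):
        predicted = max(range(len(probs)), key=probs.__getitem__)
        if predicted == label:
            targets.append(list(probs))
        else:
            targets.append([1.0 if k == label else 0.0 for k in range(len(probs))])
    return targets


def cross_entropy(student_probs, targets, eps=1e-12):
    """Mean cross entropy between target and student distributions."""
    total = 0.0
    for q, t in zip(student_probs, targets):
        total -= sum(tk * math.log(qk + eps) for tk, qk in zip(t, q))
    return total / len(targets)


def gram(rows):
    """Inner products between every pair of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]


def correlation_loss(student_feat, teacher_feat):
    """Simplified stand-in for a channel and spatial correlation loss:
    MSE between channel (C x C) and spatial (N x N) Gram matrices.
    Both inputs are C x N lists and must share C and N."""
    def mse(a, b):
        count = len(a) * len(a[0])
        return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / count

    def transpose(m):
        return [list(col) for col in zip(*m)]

    channel = mse(gram(student_feat), gram(teacher_feat))
    spatial = mse(gram(transpose(student_feat)), gram(transpose(teacher_feat)))
    return channel + spatial
```

Under these assumptions, a pixel the teacher misclassifies contributes a hard ground-truth target, so the student is not trained toward the teacher's mistake, which is the failure mode the description calls transferring false knowledge.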

Similar Items