Semantic Relation Model and Dataset for Remote Sensing Scene Understanding

A deep understanding of our visual world requires more than the isolated perception of a series of objects; the relationships between them also carry rich semantic information. Especially for satellite remote sensing images, the span is so large that the various objects are always of different...

Bibliographic Details
Main Authors: Peng Li, Dezheng Zhang, Aziguli Wulamu, Xin Liu, Peng Chen
Format: Article
Language: English
Published: MDPI AG 2021-07-01
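Series: ISPRS International Journal of Geo-Information
Subjects: remote sensing scene understanding; semantic relation cognition; scene graph generation; multi-scale semantic fusion; attentional mechanism; graph convolutional network
Online Access: https://www.mdpi.com/2220-9964/10/7/488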
id doaj-a8f6728050b345a390e7fd61d73166a9
record_format Article
spelling doaj-a8f6728050b345a390e7fd61d73166a9; 2021-07-23T13:45:05Z; eng; MDPI AG; ISPRS International Journal of Geo-Information; ISSN 2220-9964; 2021-07-01; 10; 488; 488; DOI 10.3390/ijgi10070488
Peng Li (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China)
Dezheng Zhang (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China)
Aziguli Wulamu (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China)
Xin Liu (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China)
Peng Chen (FINTECH Innovation Division, Postal Savings Bank of China, Beijing 100808, China)
collection DOAJ
language English
format Article
sources DOAJ
author Peng Li
Dezheng Zhang
Aziguli Wulamu
Xin Liu
Peng Chen
title Semantic Relation Model and Dataset for Remote Sensing Scene Understanding
publisher MDPI AG
series ISPRS International Journal of Geo-Information
issn 2220-9964
publishDate 2021-07-01
description A deep understanding of our visual world requires more than the isolated perception of a series of objects; the relationships between them also carry rich semantic information. This is especially true of satellite remote sensing images, whose span is so large that the objects they contain vary widely in size and form complex spatial compositions. The recognition of semantic relations therefore helps strengthen the understanding of remote sensing scenes. In this paper, we propose a novel multi-scale semantic fusion network (MSFN). In this framework, dilated convolution is introduced into a graph convolutional network (GCN) based on an attentional mechanism to fuse and refine multi-scale semantic context, which is crucial to strengthening the cognitive ability of our model. Furthermore, based on the mapping between visual features and semantic embeddings, we design a sparse relationship extraction module that removes meaningless connections among entities and improves the efficiency of scene graph generation. Meanwhile, to further promote research on scene understanding in the remote sensing field, this paper also proposes a remote sensing scene graph dataset (RSSGD). We carry out extensive experiments, and the results show that our model significantly outperforms previous methods on scene graph generation. In addition, RSSGD effectively bridges the huge semantic gap between low-level perception and high-level cognition of remote sensing images.
topic remote sensing scene understanding
semantic relation cognition
scene graph generation
multi-scale semantic fusion
attentional mechanism
graph convolutional network
url https://www.mdpi.com/2220-9964/10/7/488