Two-Stage Spatial Mapping for Multimodal Data Fusion in Mobile Crowd Sensing


Bibliographic Details
Main Authors: Jiancun Zhou, Tao Xu, Sheng Ren, Kehua Guo
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9094630/
Description
Summary: Human-driven Edge Computing (HEC) integrates humans, devices, the Internet, and information, and mobile crowd sensing has become an important means of data collection in this setting. In HEC, the data collected through large-scale sensing usually spans a variety of modalities. Each modality carries unique, often complementary information, so combining data from multiple modalities yields more information than any single modality alone. However, current deep learning methods typically handle only bimodal data. For artificial intelligence to make further progress in understanding the real world, it must be able to process data of different modalities jointly; the key step is mapping these different modalities into the same space. To process multimodal data better, we propose a fusion and classification method for multimodal data. First, a multimodal data space is constructed, and data of different modalities are mapped into this space to obtain a unified representation of each modality. Then, the representations of the different modalities are fused through bilinear pooling, and the fused vectors are used for the classification task. Experiments on a multimodal dataset verify that the fused multimodal representation is effective and that it classifies more accurately than single-modal data.
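The fusion step the abstract describes (bilinear pooling of two modality representations already mapped into a common space) can be sketched as follows. This is an illustrative sketch of generic bilinear pooling with the common signed-square-root and L2 normalization steps, not the paper's exact implementation; the function name, embedding dimensions, and toy vectors are assumptions for demonstration.

```python
import numpy as np

def bilinear_pool(x, y):
    """Fuse two modality vectors via bilinear pooling: take the outer
    product to capture all pairwise feature interactions, flatten it,
    then apply signed square root and L2 normalization (a common
    post-processing choice, assumed here)."""
    z = np.outer(x, y).ravel()            # (len(x) * len(y),) interactions
    z = np.sign(z) * np.sqrt(np.abs(z))   # signed-sqrt normalization
    norm = np.linalg.norm(z)
    return z / norm if norm > 0 else z    # L2-normalize the fused vector

# Toy example: a 4-d "image" embedding and a 3-d "text" embedding,
# assumed to already live in the shared multimodal space.
img = np.array([0.2, -0.5, 0.1, 0.7])
txt = np.array([0.9, 0.3, -0.4])
fused = bilinear_pool(img, txt)
print(fused.shape)  # (12,)
```

The fused 12-dimensional vector would then be fed to a classifier (e.g. a softmax layer). The outer product makes the fused dimensionality the product of the input dimensions, which is why compact variants of bilinear pooling are often used in practice.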
ISSN: 2169-3536