Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection
With the development of the Mobile Internet, more and more users publish multi-modal posts on social media platforms. Fake news detection has become an increasingly challenging task. Although there are many works using deep schemes to extract and combine textual and visual representation in the post...
Main Authors: | Long Ying, Hui Yu, Jinguang Wang, Yongze Ji, Shengsheng Qian |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
Subjects: | Multi-level neural networks, fake news detection, multi-modal fusion |
Online Access: | https://ieeexplore.ieee.org/document/9541113/ |
id |
doaj-df526dbacd1e4535985ce41bc0e8fd39 |
---|---|
record_format |
Article |
spelling |
id: doaj-df526dbacd1e4535985ce41bc0e8fd39; record timestamp: 2021-10-01T23:01:10Z; language: eng; publisher: IEEE; series: IEEE Access; ISSN: 2169-3536; published: 2021-01-01; volume: 9; pages: 132363-132373; DOI: 10.1109/ACCESS.2021.3114093; document number: 9541113; title: Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection; authors: Long Ying (https://orcid.org/0000-0001-6834-5441; School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, China), Hui Yu (School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, China), Jinguang Wang (School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China), Yongze Ji (School of Information Science and Engineering, China University of Petroleum, Beijing, China), Shengsheng Qian (https://orcid.org/0000-0001-9488-2208; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China); abstract: With the development of the mobile Internet, more and more users publish multi-modal posts on social media platforms, and fake news detection has become an increasingly challenging task. Although many works use deep schemes to extract and combine the textual and visual representations of a post, most existing methods do not sufficiently exploit the complementary multi-modal information, which contains semantic concepts and entities, to complement and enhance each modality. Moreover, these methods do not model and incorporate the rich multi-level semantics of text information to improve fake news detection. In this paper, we propose a novel end-to-end Multi-level Multi-modal Cross-attention Network (MMCN), which exploits the multi-level semantics of textual content and jointly integrates the relationships within the same modality and across different modalities (textual and visual) of social multimedia posts in a unified framework. Pre-trained BERT and ResNet models are employed to generate high-quality representations for text words and image regions, respectively. A multi-modal cross-attention network is then designed to fuse the feature embeddings of the text words and image regions by simultaneously considering data relationships within the same modality and across different modalities. Specifically, because different layers of the transformer architecture have different feature representations, we employ a multi-level encoding network to capture the rich multi-level semantics and enhance the representations of posts. Extensive experiments on two public datasets (WEIBO and PHEME) demonstrate that the proposed MMCN performs favorably compared with state-of-the-art models.; url: https://ieeexplore.ieee.org/document/9541113/; topics: Multi-level neural networks, fake news detection, multi-modal fusion |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Long Ying Hui Yu Jinguang Wang Yongze Ji Shengsheng Qian |
spellingShingle |
Long Ying Hui Yu Jinguang Wang Yongze Ji Shengsheng Qian Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection IEEE Access Multi-level neural networks fake news detection multi-modal fusion |
author_facet |
Long Ying Hui Yu Jinguang Wang Yongze Ji Shengsheng Qian |
author_sort |
Long Ying |
title |
Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection |
title_short |
Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection |
title_full |
Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection |
title_fullStr |
Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection |
title_full_unstemmed |
Multi-Level Multi-Modal Cross-Attention Network for Fake News Detection |
title_sort |
multi-level multi-modal cross-attention network for fake news detection |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2021-01-01 |
description |
With the development of the mobile Internet, more and more users publish multi-modal posts on social media platforms, and fake news detection has become an increasingly challenging task. Although many works use deep schemes to extract and combine the textual and visual representations of a post, most existing methods do not sufficiently exploit the complementary multi-modal information, which contains semantic concepts and entities, to complement and enhance each modality. Moreover, these methods do not model and incorporate the rich multi-level semantics of text information to improve fake news detection. In this paper, we propose a novel end-to-end Multi-level Multi-modal Cross-attention Network (MMCN), which exploits the multi-level semantics of textual content and jointly integrates the relationships within the same modality and across different modalities (textual and visual) of social multimedia posts in a unified framework. Pre-trained BERT and ResNet models are employed to generate high-quality representations for text words and image regions, respectively. A multi-modal cross-attention network is then designed to fuse the feature embeddings of the text words and image regions by simultaneously considering data relationships within the same modality and across different modalities. Specifically, because different layers of the transformer architecture have different feature representations, we employ a multi-level encoding network to capture the rich multi-level semantics and enhance the representations of posts. Extensive experiments on two public datasets (WEIBO and PHEME) demonstrate that the proposed MMCN performs favorably compared with state-of-the-art models. |
topic |
Multi-level neural networks fake news detection multi-modal fusion |
url |
https://ieeexplore.ieee.org/document/9541113/ |
work_keys_str_mv |
AT longying multilevelmultimodalcrossattentionnetworkforfakenewsdetection AT huiyu multilevelmultimodalcrossattentionnetworkforfakenewsdetection AT jinguangwang multilevelmultimodalcrossattentionnetworkforfakenewsdetection AT yongzeji multilevelmultimodalcrossattentionnetworkforfakenewsdetection AT shengshengqian multilevelmultimodalcrossattentionnetworkforfakenewsdetection |
_version_ |
1716860718564769792 |
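
The description field in this record mentions fusing text-word and image-region features with attention and combining several encoder levels. The following is only a rough illustrative sketch of that general idea, not the authors' MMCN: it assumes plain scaled dot-product attention over the concatenated word and region vectors, omits all learned projections, uses random numbers in place of BERT and ResNet features, and every function name (`joint_attention`, `multi_level_fuse`, `random_feats`) is hypothetical.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def joint_attention(text_feats, image_feats):
    """Attend over the concatenated word and region vectors.

    Because every token attends to every other token, the weights cover
    word-word, region-region, word-region and region-word pairs at once.
    Assumption: no learned query/key/value projections are applied.
    """
    tokens = text_feats + image_feats  # list concatenation
    d = len(tokens[0])
    scale = math.sqrt(d)
    weights = [softmax([dot(q, k) / scale for k in tokens]) for q in tokens]
    fused = [
        [sum(w * tok[j] for w, tok in zip(row, tokens)) for j in range(d)]
        for row in weights
    ]
    return fused, weights


def mean_pool(vectors):
    """Average a list of vectors into a single vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def multi_level_fuse(text_levels, image_feats):
    """Fuse each text level with the image regions and concatenate the results.

    text_levels stands in for word features taken from several encoder
    layers (an assumption made for illustration).
    """
    post_vector = []
    for text_feats in text_levels:
        fused, _ = joint_attention(text_feats, image_feats)
        post_vector.extend(mean_pool(fused))
    return post_vector


def random_feats(n, d):
    """Random stand-in features: n vectors of dimension d."""
    return [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


random.seed(0)
text_levels = [random_feats(6, 8) for _ in range(3)]  # 3 levels, 6 words, dim 8
image_feats = random_feats(4, 8)                      # 4 regions, dim 8
post_vector = multi_level_fuse(text_levels, image_feats)
print(len(post_vector))  # 24
```

In this sketch the final vector has one pooled segment per text level (3 levels of dimension 8), which a classifier could then consume; a real system would add trained projection matrices and a classification head.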