Abstraction and Association: Cross-Modal Retrieval Based on Consistency between Semantic Structures
Main Authors: | |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2020-01-01 |
Series: | Mathematical Problems in Engineering |
Online Access: | http://dx.doi.org/10.1155/2020/2503137 |
Summary: | Cross-modal retrieval aims to find relevant data across different modalities, such as images and text. To bridge the modality gap, most existing methods require a large number of coupled sample pairs as training data. To reduce this demand for training data, we propose a cross-modal retrieval framework that utilizes both coupled and uncoupled samples. The framework consists of two parts: Abstraction, which provides high-level single-modal representations from uncoupled samples, and Association, which links different modalities through a few coupled training samples. Under this framework, we implement a cross-modal retrieval method based on the consistency between the semantic structures of multiple modalities. First, both images and text are encoded with the semantic structure-based representation, which represents each sample by its similarity to reference points generated from single-modal clustering. Then, the reference points of different modalities are aligned through an active learning strategy. Finally, the cross-modal similarity is measured by the consistency between the semantic structures. The experimental results demonstrate that, given a proper abstraction of single-modal data, the relationship between different modalities can be simplified, and even limited coupled cross-modal training data are sufficient for satisfactory retrieval accuracy. |
ISSN: | 1024-123X, 1563-5147 |
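The summary describes the pipeline concretely enough to sketch. Below is a minimal, illustrative Python sketch of the semantic structure-based representation and the consistency-based cross-modal similarity. The feature dimensions, the choice of k-means clustering and cosine similarity, and the assumption that the reference points of the two modalities are already aligned are assumptions made here for illustration; the paper itself performs the alignment with an active learning strategy over a few coupled pairs, which is not reproduced in this sketch.

```python
import numpy as np
from sklearn.cluster import KMeans

def semantic_structure(features, reference_points):
    """Represent each sample by its cosine similarity to the reference points."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    r = reference_points / np.linalg.norm(reference_points, axis=1, keepdims=True)
    return f @ r.T  # shape: (n_samples, n_reference_points)

def cross_modal_similarity(query_struct, gallery_struct):
    """Consistency between semantic structures: cosine similarity of the
    similarity-to-reference vectors across modalities."""
    q = query_struct / np.linalg.norm(query_struct, axis=1, keepdims=True)
    g = gallery_struct / np.linalg.norm(gallery_struct, axis=1, keepdims=True)
    return q @ g.T

# Abstraction: uncoupled single-modal samples -> reference points via clustering.
# Placeholder features; dimensions are illustrative, not from the paper.
n_ref = 16
image_feats = np.random.rand(500, 128)
text_feats = np.random.rand(600, 64)

image_refs = KMeans(n_clusters=n_ref, n_init=10).fit(image_feats).cluster_centers_
text_refs = KMeans(n_clusters=n_ref, n_init=10).fit(text_feats).cluster_centers_

# Association: assumed here that image_refs[k] and text_refs[k] describe the
# same semantic concept (the paper aligns them via active learning on a few
# coupled pairs).
img_struct = semantic_structure(image_feats, image_refs)
txt_struct = semantic_structure(text_feats, text_refs)

# Retrieve the most relevant text for the first image.
scores = cross_modal_similarity(img_struct[:1], txt_struct)
print("best matching text index:", scores.argmax())
```

Under these assumptions, once the reference points are aligned, retrieval reduces to comparing similarity-profile vectors of fixed length across modalities, which is consistent with the summary's claim that only a small number of coupled training pairs is needed.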