Short Text Understanding Combining Text Conceptualization and Transformer Embedding

Short text understanding is a key task and an active topic in natural language processing. Because short texts are characterized by sparsity and limited semantics, traditional search methods that analyze only the literal semantics of the text are of limited use for short text understanding and similarity matching. In this paper, we propose a combined method based on knowledge-based conceptualization and a transformer encoder. Specifically, for each term in a short text, we obtain its concepts and enrich the short text with information from a knowledge base based on co-occurring terms and concepts, construct a convolutional neural network (CNN) to capture local context information, and introduce a subnetwork structure based on a transformer embedding encoder. Then, we embed these concepts into a low-dimensional vector space so that the transformer can attend to them more effectively. Finally, the concept space and the transformer encoder space are combined to construct the understanding model. Experiments show that the proposed method effectively captures more of the semantics of short texts and can be applied to a variety of applications, such as short text information retrieval and short text classification.
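
The abstract outlines a dual-branch architecture: term embeddings passed through a CNN for local context, and knowledge-base concept embeddings passed through a transformer encoder, with the two spaces combined for the final prediction. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' implementation; the class name, layer sizes, pooling choices, and classification head are all assumptions.

```python
# Hypothetical sketch of a concept + transformer model for short texts (PyTorch).
# Dimensions and layer counts are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class ConceptTransformerModel(nn.Module):
    def __init__(self, vocab_size, concept_vocab_size, emb_dim=128,
                 cnn_channels=100, kernel_size=3, num_heads=4,
                 num_layers=2, num_classes=2):
        super().__init__()
        # Embeddings for surface terms and for knowledge-base concepts
        self.term_emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.concept_emb = nn.Embedding(concept_vocab_size, emb_dim, padding_idx=0)
        # CNN branch: captures local context over the term sequence
        self.conv = nn.Conv1d(emb_dim, cnn_channels, kernel_size,
                              padding=kernel_size // 2)
        # Transformer branch: self-attention over the concept sequence
        enc_layer = nn.TransformerEncoderLayer(d_model=emb_dim, nhead=num_heads,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=num_layers)
        self.classifier = nn.Linear(cnn_channels + emb_dim, num_classes)

    def forward(self, term_ids, concept_ids):
        # term_ids, concept_ids: (batch, seq_len) integer id tensors
        t = self.term_emb(term_ids).transpose(1, 2)            # (batch, emb, seq)
        local = torch.relu(self.conv(t)).max(dim=2).values     # max-pool over time
        c = self.encoder(self.concept_emb(concept_ids))        # (batch, seq, emb)
        concept_repr = c.mean(dim=1)                           # average pooling
        return self.classifier(torch.cat([local, concept_repr], dim=1))

# Toy usage: in practice, concept_ids would come from conceptualizing each term
# against a knowledge base (e.g., Probase); here they are random placeholder ids.
model = ConceptTransformerModel(vocab_size=5000, concept_vocab_size=2000)
terms = torch.randint(1, 5000, (1, 6))
concepts = torch.randint(1, 2000, (1, 6))
print(model(terms, concepts).shape)  # torch.Size([1, 2])
```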

Bibliographic Details
Main Authors: Jun Li, Guimin Huang, Jianheng Chen, Yabing Wang
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access
Subjects: Short text understanding; Text conceptualization; Transformer encoder
Online Access: https://ieeexplore.ieee.org/document/8819947/
Record ID: doaj-68a2f981ff39408bb7ba17c9e4c6c602
Collection: DOAJ
ISSN: 2169-3536
Volume/Pages: IEEE Access, vol. 7, pp. 122183-122191
DOI: 10.1109/ACCESS.2019.2938303
Article Number: 8819947
Author Affiliations: Jun Li (ORCID: https://orcid.org/0000-0001-5591-721X), School of Information and Communication, Guilin University of Electronic Technology, Guilin, China; Guimin Huang, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China; Jianheng Chen, School of Information and Communication, Guilin University of Electronic Technology, Guilin, China; Yabing Wang, School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China