Uniform attribute-content model

There is a growing need for text processing tasks such as classification, retrieval and clustering. The foundation of such processing is extracting the features that best describe the text. Great progress has been made in text modelling. However, most text modelling methods are based only on...


Bibliographic Details
Main Authors: Yingzhuo Xiang, Jikun Yan, Ling You, Pu An
Format: Article
Language: English
Published: Wiley 2019-05-01
Series: The Journal of Engineering
Subjects:
Online Access: https://digital-library.theiet.org/content/journals/10.1049/joe.2018.5135
id doaj-f878c4e50a7f465886940f4b6c1e9aaf
record_format Article
spelling doaj-f878c4e50a7f465886940f4b6c1e9aaf 2021-04-02T15:23:38Z eng Wiley The Journal of Engineering 2051-3305 2019-05-01 10.1049/joe.2018.5135 JOE.2018.5135 Uniform attribute-content model Yingzhuo Xiang; Jikun Yan; Ling You; Pu An (all authors: National Key Laboratory of Science and Technology on Blind Signal Processing) https://digital-library.theiet.org/content/journals/10.1049/joe.2018.5135
collection DOAJ
language English
format Article
sources DOAJ
author Yingzhuo Xiang
Jikun Yan
Ling You
Pu An
author_facet Yingzhuo Xiang
Jikun Yan
Ling You
Pu An
author_sort Yingzhuo Xiang
title Uniform attribute-content model
publisher Wiley
series The Journal of Engineering
issn 2051-3305
publishDate 2019-05-01
description There is a growing need for text processing tasks such as classification, retrieval and clustering. The foundation of such processing is extracting the features that best describe the text. Great progress has been made in text modelling. However, most text modelling methods are based only on the content or only on the attributes. Although some combined models have been proposed in recent years, their lack of universality limits them. In this study, the authors propose a uniform attribute-content model, which uses the attributes to influence the content feature extraction process. They design the attributes as a special filter applied to each feature extracted from the content. The mixed features thus contain both content information and attribute information and can describe the text more precisely. They also propose a Monte Carlo method to solve this model. Experimental results on the Enron email dataset demonstrate the effectiveness of the authors’ proposed models.
topic feature extraction
information retrieval
text analysis
monte carlo methods
uniform attribute-content model
text processing
text modelling methods
content feature extraction process
content information
attribute information
monte carlo method
url https://digital-library.theiet.org/content/journals/10.1049/joe.2018.5135