Uniform attribute-content model

Bibliographic Details
Main Authors: Yingzhuo Xiang, Jikun Yan, Ling You, Pu An
Format: Article
Language: English
Published: Wiley 2019-05-01
Series: The Journal of Engineering
Subjects:
Online Access: https://digital-library.theiet.org/content/journals/10.1049/joe.2018.5135
Description
Summary: There is a growing need for text processing tasks such as classification, retrieval and clustering. The foundation of such processing is extracting the features that best describe the text. Great progress has been made in text modelling. However, most text modelling methods are based only on the content or only on the attributes. Although some combined models have been proposed in recent years, their lack of universality limits them. In this study, the authors propose a uniform attribute-content model, which uses the attributes to influence the content feature extraction process. They design the attributes as a special filter applied to each feature extracted from the content. Thus the mixed features contain both content information and attribute information, and can describe the text more precisely. They also propose a Monte Carlo method to solve this model. Experimental results on the Enron email dataset demonstrate the effectiveness of the authors’ proposed models.
ISSN:2051-3305
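
Illustrative note: the summary describes attributes acting as a filter on each feature extracted from the content. The following is a minimal sketch of that general idea only, not the authors' model. It assumes bag-of-words counts as the content features and a per-feature weight vector chosen by a single attribute (a hypothetical sender role). The vocabulary, the roles, and the weights are all invented for illustration, and the Monte Carlo solution step mentioned in the summary is not sketched.

```python
from collections import Counter

def content_features(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def apply_attribute_filter(features, weights):
    """Scale each content feature by the weight its attribute assigns to it."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return [f * w for f, w in zip(features, weights)]

# Hypothetical vocabulary and attribute filters (assumptions, not from the paper).
vocabulary = ["meeting", "contract", "lunch"]
attribute_filters = {
    "legal": [0.5, 1.0, 0.0],      # emphasises "contract", suppresses "lunch"
    "assistant": [1.0, 0.25, 1.0],
}

email = "Contract review meeting then lunch then contract signing"
features = content_features(email, vocabulary)
mixed = apply_attribute_filter(features, attribute_filters["legal"])
print(features, mixed)
```

In this sketch the same email text yields different mixed features depending on the attribute value, which is the sense in which the mixed features carry both content and attribute information.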