Incorporating Name Entity Recognition Rules in News Topic Model
Master's === National Yunlin University of Science and Technology === Department of Information Management === 104 === Unstructured information is growing rapidly. Topic models have been widely used to identify topics in unstructured corpora. It is also known that purely unsupervised models often produce topics that are hard to interpret in applications. In recent years, a nu...
Main Authors: | HSIAO, WEI-CHING, 蕭維慶 |
---|---|
Other Authors: | HUANG, CHUEN-MIN |
Format: | Others |
Language: | zh-TW |
Published: | 2016 |
Online Access: | http://ndltd.ncl.edu.tw/handle/249v39 |
id |
ndltd-TW-104YUNT0396049 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-104YUNT0396049 2019-05-15T22:53:47Z http://ndltd.ncl.edu.tw/handle/249v39 Incorporating Name Entity Recognition Rules in News Topic Model 命名實體辨識規則應用於主題模型特徵詞萃取研究 HSIAO, WEI-CHING 蕭維慶 Master's National Yunlin University of Science and Technology Department of Information Management 104 Unstructured information is growing rapidly. Topic models have been widely used to identify topics in unstructured corpora. It is also known that purely unsupervised models often produce topics that are hard to interpret in applications. In recent years, a number of knowledge-based models have been proposed, which allow the user to input prior domain knowledge to produce more coherent and meaningful topics. The disadvantage of these knowledge-based topic models is that they require the user to know the domain well, which is rarely the case in practice. In most cases, people want to use a topic model to discover latent topics. Knowledge-based topic models also have difficulty handling large amounts of data. This study uses syntactic extraction rules to extract named entities as LDA feature terms, and uses the Coherence Measure, UMass Topic Coherence, and efficiency testing as evaluation methods to compare the approach with Unigram-LDA, Compound-LDA, and Mixture LDA. The results show that the named-entity LDA runs more efficiently than the other models and that its topics are interpretable. The named-entity LDA scores slightly below the other LDA models on the UMass topic coherence measure. The Coherence Measure of the named-entity LDA is 0.97. HUANG, CHUEN-MIN 黃純敏 2016 Degree thesis ; thesis 58 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Yunlin University of Science and Technology === Department of Information Management === 104 === Unstructured information is growing rapidly. Topic models have been widely used to identify topics in unstructured corpora. It is also known that purely unsupervised models often produce topics that are hard to interpret in applications. In recent years, a number of knowledge-based models have been proposed, which allow the user to input prior domain knowledge to produce more coherent and meaningful topics. The disadvantage of these knowledge-based topic models is that they require the user to know the domain well, which is rarely the case in practice. In most cases, people want to use a topic model to discover latent topics. Knowledge-based topic models also have difficulty handling large amounts of data. This study uses syntactic extraction rules to extract named entities as LDA feature terms, and uses the Coherence Measure, UMass Topic Coherence, and efficiency testing as evaluation methods to compare the approach with Unigram-LDA, Compound-LDA, and Mixture LDA. The results show that the named-entity LDA runs more efficiently than the other models and that its topics are interpretable. The named-entity LDA scores slightly below the other LDA models on the UMass topic coherence measure. The Coherence Measure of the named-entity LDA is 0.97. |
author2 |
HUANG, CHUEN-MIN |
author_facet |
HUANG, CHUEN-MIN HSIAO, WEI-CHING 蕭維慶 |
author |
HSIAO, WEI-CHING 蕭維慶 |
spellingShingle |
HSIAO, WEI-CHING 蕭維慶 Incorporating Name Entity Recognition Rules in News Topic Model |
author_sort |
HSIAO, WEI-CHING |
title |
Incorporating Name Entity Recognition Rules in News Topic Model |
publishDate |
2016 |
url |
http://ndltd.ncl.edu.tw/handle/249v39 |
_version_ |
1719137057902690304 |
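
The abstract names UMass Topic Coherence as one of its evaluation measures but does not spell it out. As a rough illustration only, the sketch below computes the commonly cited form of that score: for a topic's top words ordered from most to least probable, it sums log((D(w_m, w_l) + 1) / D(w_l)) over every pair with l < m, where D counts the documents containing the given word or words. The function name `umass_coherence`, its parameters, and the toy corpus are hypothetical, and the thesis may use a different variant or normalization.

```python
import math


def umass_coherence(top_words, documents):
    """UMass-style coherence for one topic (illustrative sketch).

    top_words: a topic's top terms, ordered from most to least probable.
    documents: an iterable of tokenized documents (lists of terms).
    Scores closer to zero indicate terms that co-occur more often.
    """
    # Only presence or absence in a document matters, so use sets.
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every one of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                raise ValueError(f"term not found in corpus: {top_words[l]!r}")
            d_ml = doc_freq(top_words[m], top_words[l])
            # The +1 smoothing avoids log(0) for pairs that never co-occur.
            score += math.log((d_ml + 1) / d_l)
    return score


# Toy corpus, invented for illustration (not data from the thesis).
toy_docs = [
    ["taipei", "mayor", "election"],
    ["taipei", "mayor"],
    ["taipei", "typhoon"],
]
toy_score = umass_coherence(["taipei", "mayor", "election"], toy_docs)
```

Because each term is a log of a ratio that rarely exceeds one, the score is typically negative, so a model reported as slightly below the others on this measure would have a somewhat more negative value.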