Segment Similarity Based on Text Similarity: A Case Study of Four Gospels
Master's === National Central University === Department of Information Management === 105 === Text mining is the analysis of documents using data mining techniques. Its main purpose is to determine the relevance between texts and, from that analysis, to classify, compare, and discriminate among them. Over the past decade, search engines have...
Main Authors: | Han-Wen Chi, 紀涵文 |
---|---|
Other Authors: | Yen-Liang Chen |
Format: | Others |
Language: | zh-TW |
Published: | 2017 |
Online Access: | http://ndltd.ncl.edu.tw/handle/7qvxt8 |
id |
ndltd-TW-105NCU05396076 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-105NCU05396076 2019-05-16T00:08:08Z http://ndltd.ncl.edu.tw/handle/7qvxt8 Segment Similarity Based on Text Similarity: A Case Study of Four Gospels 以文本相似度為基礎的段落相似度分析:聖經四福音書之案例研究 Han-Wen Chi 紀涵文 Master's National Central University Department of Information Management 105 Text mining is the analysis of documents using data mining techniques. Its main purpose is to determine the relevance between texts and, from that analysis, to classify, compare, and discriminate among them. Over the past decade, search engines have emerged, and text search techniques have been applied more effectively to create new business value. As the Internet keeps changing, the accumulation of information online has accelerated the development of search engines and has greatly changed data retrieval. Text similarity is the degree of similarity between texts, calculated by weighting (distance). Computing it makes it possible to obtain information, perform classification or binary judgments, and extract valuable information from the analysis of a large number of articles. In this research, we propose a new method of similarity calculation. We treat any run of consecutive sentences in a document as a segment. We compare this segment with other sentences to obtain scores, and then use the rank and distribution of those scores to find similar target segments in the same document. We use the four Gospels of the Holy Bible as a case study, which demonstrates the operation of the algorithm and the expected results. Yen-Liang Chen 陳彥良 2017 thesis 48 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Central University === Department of Information Management === 105 === Text mining is the analysis of documents using data mining techniques. Its main purpose is to determine the relevance between texts and, from that analysis, to classify, compare, and discriminate among them. Over the past decade, search engines have emerged, and text search techniques have been applied more effectively to create new business value. As the Internet keeps changing, the accumulation of information online has accelerated the development of search engines and has greatly changed data retrieval.
Text similarity is the degree of similarity between texts, calculated by weighting (distance). Computing it makes it possible to obtain information, perform classification or binary judgments, and extract valuable information from the analysis of a large number of articles.
In this research, we propose a new method of similarity calculation. We treat any run of consecutive sentences in a document as a segment. We compare this segment with other sentences to obtain scores, and then use the rank and distribution of those scores to find similar target segments in the same document. We use the four Gospels of the Holy Bible as a case study, which demonstrates the operation of the algorithm and the expected results.
|
author2 |
Yen-Liang Chen |
author_facet |
Yen-Liang Chen Han-Wen Chi 紀涵文 |
author |
Han-Wen Chi 紀涵文 |
spellingShingle |
Han-Wen Chi 紀涵文 Segment Similarity Based on Text Similarity: A Case Study of Four Gospels |
author_sort |
Han-Wen Chi |
title |
Segment Similarity Based on Text Similarity: A Case Study of Four Gospels |
title_short |
Segment Similarity Based on Text Similarity: A Case Study of Four Gospels |
title_full |
Segment Similarity Based on Text Similarity: A Case Study of Four Gospels |
title_fullStr |
Segment Similarity Based on Text Similarity: A Case Study of Four Gospels |
title_full_unstemmed |
Segment Similarity Based on Text Similarity: A Case Study of Four Gospels |
title_sort |
segment similarity based on text similarity: a case study of four gospels |
publishDate |
2017 |
url |
http://ndltd.ncl.edu.tw/handle/7qvxt8 |
work_keys_str_mv |
AT hanwenchi segmentsimilaritybasedontextsimilarityacasestudyoffourgospels AT jìhánwén segmentsimilaritybasedontextsimilarityacasestudyoffourgospels AT hanwenchi yǐwénběnxiāngshìdùwèijīchǔdeduànluòxiāngshìdùfēnxīshèngjīngsìfúyīnshūzhīànlìyánjiū AT jìhánwén yǐwénběnxiāngshìdùwèijīchǔdeduànluòxiāngshìdùfēnxīshèngjīngsìfúyīnshūzhīànlìyánjiū |
_version_ |
1719160923508178944 |
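
The segment-matching idea summarized in the description field can be illustrated with a small Python sketch. This is not the thesis's algorithm, which this record does not specify. It assumes Jaccard overlap of word sets as the sentence score and a fixed score threshold as a stand-in for "rank and distribution". The function names (`tokenize`, `jaccard`, `score_sentences`, `candidate_segments`), the threshold value, and the toy sentences are all hypothetical.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_sentences(sentences, start, end):
    """Score each sentence outside sentences[start:end] against that segment.

    Returns (index, score) pairs sorted by descending score.
    """
    segment_tokens = set()
    for s in sentences[start:end]:
        segment_tokens |= tokenize(s)
    scores = [
        (i, jaccard(segment_tokens, tokenize(s)))
        for i, s in enumerate(sentences)
        if not (start <= i < end)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def candidate_segments(ranked, threshold=0.2):
    """Group consecutive sentence indices scoring >= threshold into runs.

    Each run is a (first_index, last_index) pair, a candidate target segment.
    The threshold is an assumed simplification, not taken from the thesis.
    """
    hits = sorted(i for i, score in ranked if score >= threshold)
    runs = []
    for i in hits:
        if runs and i == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], i)  # extend the current run
        else:
            runs.append((i, i))  # start a new run
    return runs


# Invented toy sentences, used only to exercise the functions.
sentences = [
    "The sower went out to sow his seed.",
    "Some seed fell by the wayside.",
    "The weather was warm that day.",
    "A merchant sold fine pearls.",
    "A sower went out to sow.",
    "Some seed fell on the wayside and birds came.",
]
ranked = score_sentences(sentences, 0, 2)  # source segment = sentences 0 and 1
print(candidate_segments(ranked))
```

On this toy input, sentences 4 and 5 score highest against the source segment and are adjacent, so they are grouped into one candidate segment. A weighted measure such as TF-IDF cosine similarity could replace Jaccard overlap without changing the overall structure.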