SMURF: A Cross-lingual Co-derivative Detection System
Master's === National Tsing Hua University === Institute of Technology Management === 95 === An automatic approach to detecting content overlap would reduce the repetitive and tedious work of manually checking the originality of a large pool of documents. The objective of this research is to design and evaluate a novel algorithm,...
Main Authors: | Jose P. Gonzalez-Brenes, 鞏和平 |
---|---|
Other Authors: | Fu-Ren Lin |
Format: | Others |
Language: | en_US |
Published: | 2007 |
Online Access: | http://ndltd.ncl.edu.tw/handle/51824082648164983618 |
id |
ndltd-TW-095NTHU5230016 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-095NTHU52300162015-10-13T16:51:13Z http://ndltd.ncl.edu.tw/handle/51824082648164983618 SMURF: A Cross-lingual Co-derivative Detection System 一個偵測跨語言內容相互引用的系統 Jose P. Gonzalez-Brenes 鞏和平 碩士 國立清華大學 科技管理研究所 95 An automatic approach to detect content overlapping will mitigate the workload on the repetitiveness and tedious nature of manually checking the originality of a large pool of documents. The objective of this research is to design and evaluate a novel algorithm, SMURF –Semantic MUltilingual Related-Document Finder, aimed to find pairs of documents in different languages that share a common source (co-derivative) which may be used to facilitate the protection of intellectual property. We demonstrate SMURF on identifying English co-derivatives on the Web of Spanish documents on several textual domains with a sentence-level precision of 88.75%. Although SMURF’s design focused on English and Spanish, the concepts applied could be easily implemented on other languages where the constituent technologies have been studied. Fu-Ren Lin 林福仁 2007 學位論文 ; thesis 46 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Tsing Hua University === Institute of Technology Management === 95 === An automatic approach to detecting content overlap would reduce the repetitive and tedious work of manually checking the originality of a large pool of documents. The objective of this research is to design and evaluate a novel algorithm, SMURF (Semantic MUltilingual Related-Document Finder), that finds pairs of documents in different languages sharing a common source (co-derivatives), which may help protect intellectual property. We demonstrate SMURF by identifying English co-derivatives on the Web of Spanish documents across several textual domains, with a sentence-level precision of 88.75%. Although SMURF's design focuses on English and Spanish, the concepts applied could be readily implemented for other languages in which the constituent technologies have been studied. |
author2 |
Fu-Ren Lin |
author_facet |
Fu-Ren Lin Jose P. Gonzalez-Brenes 鞏和平 |
author |
Jose P. Gonzalez-Brenes 鞏和平 |
spellingShingle |
Jose P. Gonzalez-Brenes 鞏和平 SMURF: A Cross-lingual Co-derivative Detection System |
author_sort |
Jose P. Gonzalez-Brenes |
title |
SMURF: A Cross-lingual Co-derivative Detection System |
title_short |
SMURF: A Cross-lingual Co-derivative Detection System |
title_full |
SMURF: A Cross-lingual Co-derivative Detection System |
title_fullStr |
SMURF: A Cross-lingual Co-derivative Detection System |
title_full_unstemmed |
SMURF: A Cross-lingual Co-derivative Detection System |
title_sort |
smurf: a cross-lingual co-derivative detection system |
publishDate |
2007 |
url |
http://ndltd.ncl.edu.tw/handle/51824082648164983618 |
work_keys_str_mv |
AT josepgonzalezbrenes smurfacrosslingualcoderivativedetectionsystem AT gǒnghépíng smurfacrosslingualcoderivativedetectionsystem AT josepgonzalezbrenes yīgèzhēncèkuàyǔyánnèiróngxiānghùyǐnyòngdexìtǒng AT gǒnghépíng yīgèzhēncèkuàyǔyánnèiróngxiānghùyǐnyòngdexìtǒng |
_version_ |
1717775324161245184 |