SMURF: A Cross-lingual Co-derivative Detection System

Master's === National Tsing Hua University === Institute of Technology Management === 95 === An automatic approach to detecting content overlap would reduce the repetitive and tedious work of manually checking the originality of a large pool of documents. The objective of this research is to design and evaluate a novel algorithm,...


Bibliographic Details
Main Authors: Jose P. Gonzalez-Brenes, 鞏和平
Other Authors: Fu-Ren Lin
Format: Others
Language: en_US
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/51824082648164983618
id ndltd-TW-095NTHU5230016
record_format oai_dc
spelling ndltd-TW-095NTHU5230016 2015-10-13T16:51:13Z http://ndltd.ncl.edu.tw/handle/51824082648164983618 SMURF: A Cross-lingual Co-derivative Detection System 一個偵測跨語言內容相互引用的系統 Jose P. Gonzalez-Brenes 鞏和平 Master's National Tsing Hua University Institute of Technology Management 95 An automatic approach to detecting content overlap would reduce the repetitive and tedious work of manually checking the originality of a large pool of documents. The objective of this research is to design and evaluate a novel algorithm, SMURF (Semantic MUltilingual Related-Document Finder), aimed at finding pairs of documents in different languages that share a common source (co-derivatives); it may be used to help protect intellectual property. We demonstrate SMURF by identifying English co-derivatives on the Web of Spanish documents across several textual domains, with a sentence-level precision of 88.75%. Although SMURF's design focuses on English and Spanish, the concepts applied could readily be implemented for other languages in which the constituent technologies have been studied. Fu-Ren Lin 林福仁 2007 學位論文 ; thesis 46 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Tsing Hua University === Institute of Technology Management === 95 === An automatic approach to detecting content overlap would reduce the repetitive and tedious work of manually checking the originality of a large pool of documents. The objective of this research is to design and evaluate a novel algorithm, SMURF (Semantic MUltilingual Related-Document Finder), aimed at finding pairs of documents in different languages that share a common source (co-derivatives); it may be used to help protect intellectual property. We demonstrate SMURF by identifying English co-derivatives on the Web of Spanish documents across several textual domains, with a sentence-level precision of 88.75%. Although SMURF's design focuses on English and Spanish, the concepts applied could readily be implemented for other languages in which the constituent technologies have been studied.
author2 Fu-Ren Lin
author_facet Fu-Ren Lin
Jose P. Gonzalez-Brenes
鞏和平
author Jose P. Gonzalez-Brenes
鞏和平
spellingShingle Jose P. Gonzalez-Brenes
鞏和平
SMURF: A Cross-lingual Co-derivative Detection System
author_sort Jose P. Gonzalez-Brenes
title SMURF: A Cross-lingual Co-derivative Detection System
title_short SMURF: A Cross-lingual Co-derivative Detection System
title_full SMURF: A Cross-lingual Co-derivative Detection System
title_fullStr SMURF: A Cross-lingual Co-derivative Detection System
title_full_unstemmed SMURF: A Cross-lingual Co-derivative Detection System
title_sort smurf: a cross-lingual co-derivative detection system
publishDate 2007
url http://ndltd.ncl.edu.tw/handle/51824082648164983618
work_keys_str_mv AT josepgonzalezbrenes smurfacrosslingualcoderivativedetectionsystem
AT gǒnghépíng smurfacrosslingualcoderivativedetectionsystem
AT josepgonzalezbrenes yīgèzhēncèkuàyǔyánnèiróngxiānghùyǐnyòngdexìtǒng
AT gǒnghépíng yīgèzhēncèkuàyǔyánnèiróngxiānghùyǐnyòngdexìtǒng
_version_ 1717775324161245184