Summary: | Master's === National Tsing Hua University === Institute of Technology Management === 96 === Document-category integration (category integration for short) is fundamental to many e-commerce applications, including information aggregation by intermediaries and supply chain management integration. With globalization, the need for category integration has extended from monolingual to poly-lingual settings. Poly-lingual category integration (PLCI) aims to integrate two document catalogs, each consisting of documents written in a mix of languages. Several category integration techniques have been proposed in the literature, but they address only monolingual category integration, not PLCI. In this study, we first propose a feature-reinforcement-based PLCI technique (FR-PLCI) that takes into account the master documents of all languages when integrating a source catalog into the master catalog. We also develop two extended FR-PLCI techniques (referred to as DE-PLCI and PE-PLCI) that iteratively incorporate integrated source documents into the FR-PLCI process. With the monolingual category integration technique (MnCI) as the performance benchmark, our empirical evaluation shows that the proposed FR-PLCI technique achieves higher integration accuracy than MnCI in both English and Chinese category integration. In addition, the extended FR-PLCI techniques (i.e., DE-PLCI and PE-PLCI) generally improve on the performance of FR-PLCI in the homogeneous and comparable scenario.
|