WordNet-based Semantic Classification for Auction Commodity Titles

Master's thesis === National Pingtung Institute of Commerce === Department of Computer Science and Information Engineering === 102 === This research aims to automatically classify merchandise on an auction website according to the Chinese titles of the merchandise. To address the shortcomings of traditional article classification, this thesis proposes four methods to improve the automati...

Bibliographic Details
Main Authors: Wei-Jun Liu, 劉瑋竣
Other Authors: Cheng-Huang Dong
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/94199008472969980675
id ndltd-TW-102NPC05392009
record_format oai_dc
spelling ndltd-TW-102NPC05392009 2016-03-04T04:14:52Z http://ndltd.ncl.edu.tw/handle/94199008472969980675 WordNet-based Semantic Classification for Auction Commodity Titles 使用WordNet語意之拍賣商品標題自動分類 Wei-Jun Liu 劉瑋竣 Master's thesis, National Pingtung Institute of Commerce, Department of Computer Science and Information Engineering, 102 Cheng-Huang Dong 董呈煌 2014 thesis 109 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis === National Pingtung Institute of Commerce === Department of Computer Science and Information Engineering === 102 === This research aims to automatically classify merchandise on an auction website according to the Chinese titles of the merchandise. To address the shortcomings of traditional article classification, this thesis proposes four methods to improve the automatic classification of merchandise titles. The first method trains the keywords for each class of merchandise and then extracts keywords from each test title by exact matching. The extracted feature vector is then classified by an SVM. The second method extracts keywords from each test title by loose matching. The third method uses machine translation of Chinese keywords into English together with the semantic structure of WordNet. In the training phase, it derives the best semantic sense for the English translations of each Chinese keyword. In the testing phase, it uses the semantic information of the keywords and the titles to extract extra keywords for Chinese titles that still have no matched keywords after the second method has been applied. The fourth method uses similar keywords, that is, keywords separated by a short semantic distance. Building on the third method, it extends the matched keywords to their similar keywords, so each extracted feature vector may contain more keywords with similar meanings. Finally, a feature vector can be transformed by a feature extraction method, such as the FFT or DCT, into a new, smaller feature vector. This reduces the storage space and running time of the SVM while retaining recognition performance. The reported experimental results show that the proposed methods perform well.
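The first method and the final size-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the keyword list, the sample title, the substring-based notion of an "exact" match, and the choice to keep the first two DCT coefficients are all assumptions made for illustration. The resulting vectors are the kind of input that would then be passed to an SVM.

```python
import math

def keyword_vector(title, keywords):
    """Binary feature vector: 1.0 where the keyword occurs in the title.

    Assumption: "exact" extraction is modeled as a plain substring match.
    """
    return [1.0 if kw in title else 0.0 for kw in keywords]

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    size = len(x)
    return [
        sum(x[n] * math.cos(math.pi / size * (n + 0.5) * k) for n in range(size))
        for k in range(size)
    ]

def reduce_features(vec, keep):
    """Shrink a feature vector by keeping its first `keep` DCT coefficients."""
    return dct2(vec)[:keep]

# Hypothetical trained keywords: phone, wallet, earphones, jacket.
keywords = ["手機", "皮夾", "耳機", "外套"]
# Hypothetical auction title: "Brand-new Bluetooth earphones, fits phones".
title = "全新藍牙耳機 手機適用"

vec = keyword_vector(title, keywords)   # one entry per trained keyword
reduced = reduce_features(vec, keep=2)  # shorter vector for the classifier
```

Truncating the transformed vector trades some detail for a smaller input. How many coefficients to keep is a tuning choice that the abstract does not specify.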