WordNet-based Semantic Classification for Auction Commodity Titles
Master's === National Pingtung Institute of Commerce === Department of Computer Science and Information Engineering === 102 === This research aims to automatically classify merchandise on an auction website according to the Chinese titles of the merchandise. Because of the shortcomings of traditional article classification, this paper proposes four methods to improve the automati...
Main Authors: | Wei-Jun Liu, 劉瑋竣 |
---|---|
Other Authors: | Cheng-Huang Dong |
Format: | Others |
Language: | zh-TW |
Published: | 2014 |
Online Access: | http://ndltd.ncl.edu.tw/handle/94199008472969980675 |
id |
ndltd-TW-102NPC05392009 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-102NPC05392009 2016-03-04T04:14:52Z http://ndltd.ncl.edu.tw/handle/94199008472969980675 WordNet-based Semantic Classification for Auction Commodity Titles 使用WordNet語意之拍賣商品標題自動分類 Wei-Jun Liu 劉瑋竣 Master's National Pingtung Institute of Commerce Department of Computer Science and Information Engineering 102 Cheng-Huang Dong 董呈煌 2014 thesis 109 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Pingtung Institute of Commerce === Department of Computer Science and Information Engineering === 102 === This research aims to automatically classify merchandise on an auction website according to the Chinese titles of the merchandise. Because of the shortcomings of traditional article classification, this paper proposes four methods to improve the automatic classification of merchandise titles. The first method trains keywords for each class of merchandise and then extracts exactly matching keywords from each test title; the extracted feature vector is then classified by an SVM. The second method extracts keywords from each test title by loose matching. The third method uses machine translation of Chinese keywords into English together with the semantic structure of WordNet. In the training phase, it derives the best semantic sense for the English translation of each Chinese keyword. In the testing phase, it uses the semantic information of the keywords and the titles to extract extra keywords for Chinese titles that still have no matched keywords after the second method has been applied. The fourth method uses similar keywords, that is, keywords separated by a short semantic distance. Building on the third method, it extends the matched keywords to their similar keywords, so each extracted feature vector may contain more keywords with similar meanings. Finally, a feature vector can be transformed by a feature extraction method such as the FFT or DCT into a new, smaller feature vector, which reduces the SVM's storage space and running time while retaining recognition performance. Experimental results show the excellent performance of the proposed methods. |
author2 |
Cheng-Huang Dong |
author_facet |
Cheng-Huang Dong Wei-Jun Liu 劉瑋竣 |
author |
Wei-Jun Liu 劉瑋竣 |
title |
WordNet-based Semantic Classification for Auction Commodity Titles |
publishDate |
2014 |
url |
http://ndltd.ncl.edu.tw/handle/94199008472969980675 |
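
The keyword-extraction and feature-reduction steps named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The abstract does not define how "loose" extraction works, so subsequence matching is used here purely as an assumption; the keyword list, the sample title, and all function names are hypothetical; the DCT is a plain unnormalized DCT-II; and the SVM and WordNet steps are omitted.

```python
import math


def exact_features(title, keywords):
    """Binary vector: 1.0 where a keyword occurs verbatim in the title."""
    return [1.0 if kw in title else 0.0 for kw in keywords]


def is_subsequence(keyword, title):
    """True if the keyword's characters appear in the title in order."""
    remaining = iter(title)
    # `ch in remaining` advances the iterator, so character order is enforced.
    return all(ch in remaining for ch in keyword)


def loose_features(title, keywords):
    """ASSUMPTION: 'loose' extraction is modeled as subsequence matching."""
    return [1.0 if is_subsequence(kw, title) else 0.0 for kw in keywords]


def dct2(vector):
    """Unnormalized DCT-II of a real-valued sequence."""
    n = len(vector)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(vector))
        for k in range(n)
    ]


def reduce_features(vector, keep):
    """Keep only the first `keep` (lowest-frequency) DCT coefficients."""
    return dct2(vector)[:keep]


# Hypothetical keyword list and auction title, for illustration only.
keywords = ["手機殼", "藍牙耳機", "充電線", "保護貼"]
title = "全新藍牙無線耳機 附充電線"

exact = exact_features(title, keywords)
loose = loose_features(title, keywords)
reduced = reduce_features(loose, keep=2)
print(exact, loose, reduced)
```

In this example the exact matcher misses 藍牙耳機 because 無線 interrupts it in the title, while the assumed loose matcher still picks it up. The shortened vector returned by `reduce_features` is what would be passed to a classifier in place of the full keyword vector.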