Extending Attribute Information to Improve Classification Performance for Small Data Sets

Doctoral === National Cheng Kung University === Department of Industrial and Information Management (Master's and Doctoral Program) === 98 === Learning from small data sets is fundamentally difficult. In many domains, such as gene data in medicine or scheduling data from the early stages of a manufacturing process, data sets are not only small but also high-dimensional. Generally, a too small...

Bibliographic Details
Main Authors: Chiao-Wen Liu, 劉巧雯
Other Authors: Der-Chiang Li
Format: Others
Language: en_US
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/78366346167845175793
id ndltd-TW-098NCKU5041009
record_format oai_dc
spelling ndltd-TW-098NCKU5041009 2015-10-13T18:25:53Z http://ndltd.ncl.edu.tw/handle/78366346167845175793 Extending Attribute Information to Improve Classification Performance for Small Data Sets 擴充屬性資訊以提升小樣本分類之效果 Chiao-Wen Liu 劉巧雯 Doctoral National Cheng Kung University Department of Industrial and Information Management (Master's and Doctoral Program) 98 Learning from small data sets is fundamentally difficult. In many domains, such as gene data in medicine or scheduling data from the early stages of a manufacturing process, data sets are not only small but also high-dimensional. Generally, too small a sample reduces modeling accuracy, and too many attributes reduce the efficiency of the analysis. This research proposes an attribute-analysis method to improve both the efficiency and the accuracy of analysis for small data sets. The proposed method comprises two techniques. The first, the class possibility method, uses a fuzzy membership function to compute a class possibility value for each data point in every attribute. The second, attribute construction, creates hidden attributes non-linearly and combines highly correlated attributes into principal attributes. Three data sets (an early-stage flexible manufacturing system data set, the Pima Indians diabetes data set, and the Wisconsin breast cancer data set) are used to demonstrate that the proposed method achieves better classification performance than other studies. Der-Chiang Li 利德江 2009 thesis 83 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Cheng Kung University === Department of Industrial and Information Management (Master's and Doctoral Program) === 98 === Learning from small data sets is fundamentally difficult. In many domains, such as gene data in medicine or scheduling data from the early stages of a manufacturing process, data sets are not only small but also high-dimensional. Generally, too small a sample reduces modeling accuracy, and too many attributes reduce the efficiency of the analysis. This research proposes an attribute-analysis method to improve both the efficiency and the accuracy of analysis for small data sets. The proposed method comprises two techniques. The first, the class possibility method, uses a fuzzy membership function to compute a class possibility value for each data point in every attribute. The second, attribute construction, creates hidden attributes non-linearly and combines highly correlated attributes into principal attributes. Three data sets (an early-stage flexible manufacturing system data set, the Pima Indians diabetes data set, and the Wisconsin breast cancer data set) are used to demonstrate that the proposed method achieves better classification performance than other studies.
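The class possibility idea in the abstract can be illustrated with a small sketch. This record does not say which fuzzy membership function the thesis uses, so the triangular shape below (feet at the class minimum and maximum, peak at the class mean) is an assumption made only for illustration, and the names `triangular_membership`, `fit_class_triangles`, and `extend_attributes` are hypothetical. The sketch shows how each original attribute value might be extended with one possibility value per class; it does not cover the attribute construction step.

```python
from statistics import mean


def triangular_membership(x, low, peak, high):
    """Triangular fuzzy membership: 1 at peak, falling linearly to 0 at low and high.

    Assumed shape; the thesis's actual membership function is not given in this record.
    """
    if x == peak:
        return 1.0
    if x <= low or x >= high:
        return 0.0
    if x < peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


def fit_class_triangles(rows, labels):
    """For every (class, attribute) pair, store (min, mean, max) of the observed values."""
    triangles = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        for j in range(len(rows[0])):
            column = [r[j] for r in members]
            triangles[(label, j)] = (min(column), mean(column), max(column))
    return triangles


def extend_attributes(row, triangles):
    """Append one class possibility value per (class, attribute) pair to the row."""
    extra = [
        triangular_membership(row[j], *triangles[(label, j)])
        for (label, j) in sorted(triangles)
    ]
    return list(row) + extra


# Toy data: one attribute, two classes.
rows = [[1.0], [3.0], [5.0], [10.0], [12.0], [14.0]]
labels = ["a", "a", "a", "b", "b", "b"]
triangles = fit_class_triangles(rows, labels)
extended = extend_attributes([2.0], triangles)
```

With this toy data, the value 2.0 lies halfway between the minimum and the mean of class "a" and outside the range of class "b", so the extended row carries a possibility of 0.5 for "a" and 0.0 for "b". A classifier would then be trained on the extended rows instead of the raw ones.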
author2 Der-Chiang Li
author_facet Der-Chiang Li
Chiao-Wen Liu
劉巧雯
author Chiao-Wen Liu
劉巧雯
spellingShingle Chiao-Wen Liu
劉巧雯
Extending Attribute Information to Improve Classification Performance for Small Data Sets
author_sort Chiao-Wen Liu
title Extending Attribute Information to Improve Classification Performance for Small Data Sets
publishDate 2009
url http://ndltd.ncl.edu.tw/handle/78366346167845175793