Feature selection for high-dimensional imbalanced microarray data
Master's === National Chengchi University === Department of Statistics === 107 === Imbalanced data are common in many fields, for example novelty detection, risk management, and medical diagnosis. In such data, the minority class is usually the main target of study. In this study, we focus on microarray data. Microarray...
Main Authors: | Tung, Chen 董承 |
---|---|
Other Authors: | CHOU, PEI-TING 周珮婷 |
Format: | Others |
Language: | zh-TW |
Published: | 2019 |
Online Access: | http://ndltd.ncl.edu.tw/handle/s4h27m |
id |
ndltd-TW-107NCCU5337013 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-107NCCU5337013 2019-08-27T03:42:56Z http://ndltd.ncl.edu.tw/handle/s4h27m Feature selection for high-dimensional imbalanced microarray data 高維不平衡基因資料的變數選取 Tung, Chen 董承 Master's National Chengchi University Department of Statistics 107 Imbalanced data are common in many fields, for example novelty detection, risk management, and medical diagnosis. In such data, the minority class is usually the main target of study. In this study, we focus on microarray data. Microarray data are obtained by using biochips to extract gene expression levels, which are then analyzed. Such data are characterized by a small sample size and a very high dimension. To address these problems, this study selects features of high-dimensional imbalanced microarray data using the concept of a biclustering algorithm, and compares the approach with the F-test method, Cho's method, and using all variables. The proposed method performs similarly to the F-test method and better than Cho's method and using all variables. CHOU, PEI-TING 周珮婷 2019 thesis 43 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chengchi University === Department of Statistics === 107 === Imbalanced data are common in many fields, for example novelty detection, risk management, and medical diagnosis. In such data, the minority class is usually the main target of study. In this study, we focus on microarray data. Microarray data are obtained by using biochips to extract gene expression levels, which are then analyzed. Such data are characterized by a small sample size and a very high dimension. To address these problems, this study selects features of high-dimensional imbalanced microarray data using the concept of a biclustering algorithm, and compares the approach with the F-test method, Cho's method, and using all variables. The proposed method performs similarly to the F-test method and better than Cho's method and using all variables. |
author2 |
CHOU, PEI-TING |
author_facet |
CHOU, PEI-TING Tung, Chen 董承 |
author |
Tung, Chen 董承 |
spellingShingle |
Tung, Chen 董承 Feature selection for high-dimensional imbalanced microarray data |
author_sort |
Tung, Chen |
title |
Feature selection for high-dimensional imbalanced microarray data |
publishDate |
2019 |
url |
http://ndltd.ncl.edu.tw/handle/s4h27m |
work_keys_str_mv |
AT tungchen featureselectionforhighdimensionalimbalancedmicroarraydata AT dǒngchéng featureselectionforhighdimensionalimbalancedmicroarraydata AT tungchen gāowéibùpínghéngjīyīnzīliàodebiànshùxuǎnqǔ AT dǒngchéng gāowéibùpínghéngjīyīnzīliàodebiànshùxuǎnqǔ |
_version_ |
1719237602592161792 |