Feature selection for high-dimensional imbalanced microarray data

Master's === National Chengchi University (國立政治大學) === Department of Statistics === 107 === Imbalanced data are common in many fields, for example novelty detection, risk management, and medical diagnosis. In such data, the minority class is usually the main target of study. In this study, we focus on microarray data. Microarray...


Bibliographic Details
Main Authors: Tung, Chen, 董承
Other Authors: CHOU, PEI-TING
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/s4h27m
id ndltd-TW-107NCCU5337013
record_format oai_dc
spelling ndltd-TW-107NCCU5337013 2019-08-27T03:42:56Z http://ndltd.ncl.edu.tw/handle/s4h27m Feature selection for high-dimensional imbalanced microarray data 高維不平衡基因資料的變數選取 Tung, Chen 董承 Master's National Chengchi University (國立政治大學) Department of Statistics 107 Imbalanced data are common in many fields, for example novelty detection, risk management, and medical diagnosis. In such data, the minority class is usually the main target of study. In this study, we focus on microarray data. Microarray data are obtained by using biochips to extract gene expression, which is then analyzed. Such data are characterized by a small sample size and a very high dimension. To address these problems, this study selects features of high-dimensional imbalanced microarray data using the concept of a biclustering algorithm, and compares this approach with the F-test method, Cho's method, and using all variables. The proposed method performs similarly to the F-test method and better than both Cho's method and using all variables. CHOU, PEI-TING 周珮婷 2019 thesis 43 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chengchi University (國立政治大學) === Department of Statistics === 107 === Imbalanced data are common in many fields, for example novelty detection, risk management, and medical diagnosis. In such data, the minority class is usually the main target of study. In this study, we focus on microarray data. Microarray data are obtained by using biochips to extract gene expression, which is then analyzed. Such data are characterized by a small sample size and a very high dimension. To address these problems, this study selects features of high-dimensional imbalanced microarray data using the concept of a biclustering algorithm, and compares this approach with the F-test method, Cho's method, and using all variables. The proposed method performs similarly to the F-test method and better than both Cho's method and using all variables.
author2 CHOU, PEI-TING
author_facet CHOU, PEI-TING
Tung, Chen
董承
author Tung, Chen
董承
spellingShingle Tung, Chen
董承
Feature selection for high-dimensional imbalanced microarray data
author_sort Tung, Chen
title Feature selection for high-dimensional imbalanced microarray data
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/s4h27m