An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem
Minority oversampling techniques have played a pivotal role in the field of imbalanced learning. However, traditional oversampling algorithms can cause problems such as intra-class imbalance of samples, neglect of important information in boundary samples, and high similarity between new and old samples....
Main Authors: | Chao-Ran Wang, Xin-Hui Shao |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
Subjects: | Oversampling; boundary; minority sample; SMOTE; BIRCH; imbalanced learning |
Online Access: | https://ieeexplore.ieee.org/document/9311147/ |
id |
doaj-3b33d7e789e14729a5a475e0d7fbcf12 |
---|---|
record_format |
Article |
spelling |
doaj-3b33d7e789e14729a5a475e0d7fbcf12; 2021-03-30T15:16:46Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; 9; 5069; 5082; 10.1109/ACCESS.2020.3047923; 9311147; An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem; Chao-Ran Wang; 0; https://orcid.org/0000-0002-2071-2633; Xin-Hui Shao; 1; https://orcid.org/0000-0002-4120-8428; College of Sciences, Northeastern University, Shenyang, China; College of Sciences, Northeastern University, Shenyang, China; Minority oversampling techniques have played a pivotal role in the field of imbalanced learning. However, traditional oversampling algorithms can cause problems such as intra-class imbalance of samples, neglect of important information in boundary samples, and high similarity between new and old samples. To address these problems, we propose a new oversampling method, BIRCH and Boundary Midpoint Centroid Synthetic Minority Over-Sampling Technique (BI-BMCSMOTE). First, the algorithm uses the BIRCH clustering method to cluster the minority samples quickly. After identifying and removing noise, it marks the boundary minority samples by probability. Second, it generates a density function for each sample cluster, calculates the cluster's density and sampling weight, performs midpoint composite sampling between the probability-marked minority samples and the other minority samples in each cluster, and then calculates and analyzes the specific value of composite sampling to improve the accuracy of the model. The experimental results indicate that the algorithm is valid.; https://ieeexplore.ieee.org/document/9311147/; Oversampling; boundary; minority sample; SMOTE; BIRCH; imbalanced learning |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Chao-Ran Wang Xin-Hui Shao |
spellingShingle |
Chao-Ran Wang Xin-Hui Shao An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem IEEE Access Oversampling boundary minority sample SMOTE BIRCH imbalanced learning |
author_facet |
Chao-Ran Wang Xin-Hui Shao |
author_sort |
Chao-Ran Wang |
title |
An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem |
title_short |
An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem |
title_full |
An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem |
title_fullStr |
An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem |
title_full_unstemmed |
An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem |
title_sort |
improving majority weighted minority oversampling technique for imbalanced classification problem |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2021-01-01 |
description |
Minority oversampling techniques have played a pivotal role in the field of imbalanced learning. However, traditional oversampling algorithms can cause problems such as intra-class imbalance of samples, neglect of important information in boundary samples, and high similarity between new and old samples. To address these problems, we propose a new oversampling method, BIRCH and Boundary Midpoint Centroid Synthetic Minority Over-Sampling Technique (BI-BMCSMOTE). First, the algorithm uses the BIRCH clustering method to cluster the minority samples quickly. After identifying and removing noise, it marks the boundary minority samples by probability. Second, it generates a density function for each sample cluster, calculates the cluster's density and sampling weight, performs midpoint composite sampling between the probability-marked minority samples and the other minority samples in each cluster, and then calculates and analyzes the specific value of composite sampling to improve the accuracy of the model. The experimental results indicate that the algorithm is valid. |
topic |
Oversampling boundary minority sample SMOTE BIRCH imbalanced learning |
url |
https://ieeexplore.ieee.org/document/9311147/ |
work_keys_str_mv |
AT chaoranwang animprovingmajorityweightedminorityoversamplingtechniqueforimbalancedclassificationproblem AT xinhuishao animprovingmajorityweightedminorityoversamplingtechniqueforimbalancedclassificationproblem AT chaoranwang improvingmajorityweightedminorityoversamplingtechniqueforimbalancedclassificationproblem AT xinhuishao improvingmajorityweightedminorityoversamplingtechniqueforimbalancedclassificationproblem |
_version_ |
1724179733677080576 |
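
The description field above outlines the sampling stage only in words. As a rough illustration, the following Python sketch shows one way the density-weighted midpoint step could look. It is not the authors' implementation: it assumes the minority samples already carry cluster labels from a BIRCH step and boolean boundary flags from the marking step, and the density formula, the inverse-density weighting, and the names `cluster_weights` and `midpoint_oversample` are hypothetical choices made for this example.

```python
import math
import random
from collections import defaultdict


def cluster_weights(points, labels):
    """Return (members, weights); sparser clusters get larger weights (an assumption)."""
    members = defaultdict(list)
    for i, lab in enumerate(labels):
        members[lab].append(i)
    inverse_density = {}
    for lab, idx in members.items():
        dim = len(points[idx[0]])
        centroid = [sum(points[i][d] for i in idx) / len(idx) for d in range(dim)]
        mean_dist = sum(math.dist(points[i], centroid) for i in idx) / len(idx)
        # Hypothetical density: members per unit of mean distance to the centroid.
        density = len(idx) / (mean_dist + 1e-12)
        inverse_density[lab] = 1.0 / density
    total = sum(inverse_density.values())
    return members, {lab: v / total for lab, v in inverse_density.items()}


def midpoint_oversample(points, labels, is_boundary, n_new, seed=0):
    """Create n_new synthetic points as midpoints of same-cluster minority pairs."""
    rng = random.Random(seed)
    members, weights = cluster_weights(points, labels)
    # Only clusters with at least two members can supply a pair.
    eligible = sorted(lab for lab, idx in members.items() if len(idx) >= 2)
    if not eligible:
        return []
    synthetic = []
    for _ in range(n_new):
        # Pick a cluster in proportion to its (assumed) sampling weight.
        lab = rng.choices(eligible, weights=[weights[l] for l in eligible])[0]
        idx = members[lab]
        # Prefer a boundary-marked sample as the first endpoint.
        boundary_idx = [i for i in idx if is_boundary[i]] or idx
        a = rng.choice(boundary_idx)
        b = rng.choice([i for i in idx if i != a])
        synthetic.append(tuple((x + y) / 2 for x, y in zip(points[a], points[b])))
    return synthetic


if __name__ == "__main__":
    pts = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 14.0)]
    print(midpoint_oversample(pts, [0, 0, 0, 1, 1],
                              [True, False, False, False, True], n_new=4))
```

Taking the midpoint rather than a random interpolation point keeps each synthetic sample equidistant from both parents, which is one simple way to limit similarity to existing samples. In practice the cluster labels could come from a library BIRCH implementation, and the weighting would follow the paper's own density function.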