An Improving Majority Weighted Minority Oversampling Technique for Imbalanced Classification Problem

Bibliographic Details
Main Authors: Chao-Ran Wang, Xin-Hui Shao
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9311147/
Description
Summary: Minority oversampling techniques play a pivotal role in imbalanced learning. However, traditional oversampling algorithms can cause problems such as intra-class imbalance of samples, neglect of important information carried by boundary samples, and high similarity between new and old samples. To address these problems, we propose a new oversampling method, BIRCH and Boundary Midpoint Centroid Synthetic Minority Over-Sampling Technique (BI-BMCSMOTE). First, the algorithm uses the BIRCH clustering method to cluster the minority samples quickly. After identifying and removing noise, it labels the boundary minority samples according to probability. Second, it generates a density function for each sample cluster, calculates the cluster's density and sampling weight, and performs midpoint composite sampling between the probability-marked boundary samples and the other minority samples in each cluster. It then calculates and analyzes the specific value of composite sampling to improve the accuracy of the model. The experimental results show that the algorithm is effective.
ISSN: 2169-3536
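
The density-weighted midpoint sampling step described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cluster_weights` and `midpoint_oversample` are hypothetical, the density proxy (points per unit of mean distance to the centroid) and the inverse-density weighting are assumptions, and the noise removal and probability-based boundary marking are omitted. The minority clusters are assumed to come from a prior clustering step such as BIRCH (for example, scikit-learn's `sklearn.cluster.Birch`).

```python
import random
from math import dist


def cluster_weights(clusters):
    """Return one sampling weight per cluster, summing to 1.

    Assumption: density = n_points / mean distance to the centroid, and
    sparser clusters receive larger weights (inverse density). The paper's
    actual density function is not given in the summary.
    """
    inverse_densities = []
    for points in clusters:
        dim = len(points[0])
        centroid = [sum(p[i] for p in points) / len(points) for i in range(dim)]
        spread = sum(dist(p, centroid) for p in points) / len(points)
        density = len(points) / (spread + 1e-12)  # guard against zero spread
        inverse_densities.append(1.0 / density)
    total = sum(inverse_densities)
    return [v / total for v in inverse_densities]


def midpoint_oversample(clusters, n_new, seed=0):
    """Generate n_new synthetic minority samples as midpoints of point pairs.

    Simplification: both points of a pair are drawn at random from the same
    cluster, instead of pairing probability-marked boundary samples with
    other minority samples as the summary describes.
    """
    rng = random.Random(seed)
    usable = [c for c in clusters if len(c) >= 2]  # a pair needs two points
    weights = cluster_weights(usable)
    synthetic = []
    for _ in range(n_new):
        points = rng.choices(usable, weights=weights, k=1)[0]
        a, b = rng.sample(points, 2)
        synthetic.append([(x + y) / 2 for x, y in zip(a, b)])
    return synthetic


if __name__ == "__main__":
    # Two toy minority clusters: one compact, one spread out.
    toy_clusters = [
        [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
        [[10.0, 10.0], [14.0, 10.0]],
    ]
    print(cluster_weights(toy_clusters))
    print(midpoint_oversample(toy_clusters, n_new=5, seed=1))
```

Because each synthetic point is the midpoint of two samples from the same cluster, new points stay inside the cluster's convex hull, which is one plausible way to limit overlap with the majority class. Whether the published method weights clusters in this direction should be checked against the article itself.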