How to balance the bioinformatics data: pseudo-negative sampling

Abstract Background Imbalanced datasets are commonly encountered in bioinformatics classification problems, that is, the number of negative samples is much larger than that of positive samples. Particularly, the data imbalance phenomena will make us underestimate the performance of the minority clas...

Full description

Bibliographic Details
Main Authors: Yongqing Zhang, Shaojie Qiao, Rongzhao Lu, Nan Han, Dingxiang Liu, Jiliu Zhou
Format: Article
Language:English
Published: BMC 2019-12-01
Series:BMC Bioinformatics
Subjects:
Online Access:https://doi.org/10.1186/s12859-019-3269-4