A Recursive Ensemble Learning Approach With Noisy Labels or Unlabeled Data
For many tasks, the successful application of deep learning relies on having large amounts of training data, labeled to a high standard. But much of the data in real-world applications suffers from label noise. Data annotation is much more expensive and resource-consuming than data collection, somewhat restricting the successful deployment of deep learning to applications where very large, well-labeled datasets exist. To address this problem, we propose a recursive ensemble learning approach that maximizes the utilization of data. A disagreement-based annotation method and different voting strategies are the core ideas of the proposed method. We also provide guidelines for choosing the most suitable among many candidate neural networks, with a pruning strategy for convenience. The approach is especially effective when the original dataset contains significant label noise. We conducted experiments on the Cats versus Dogs dataset, in which significant amounts of label noise were present, and on the CIFAR-10 dataset, achieving promising results.
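The core idea named in the abstract — using agreement among an ensemble to assign labels to noisy or unlabeled samples — can be sketched roughly as follows. This is a hypothetical NumPy illustration, not the authors' implementation: the function name, the majority-vote rule, and the 80% agreement threshold are all assumptions for the sake of the example.

```python
import numpy as np

def relabel_by_majority_vote(member_preds, agree_threshold=0.8):
    """Assign labels to unlabeled/noisy samples by ensemble voting.

    member_preds: (n_members, n_samples) int array of class predictions,
                  one row per ensemble member.
    Returns (labels, keep_mask): the majority label per sample, and a mask
    selecting samples where at least `agree_threshold` of the members agree.
    Samples with too much disagreement are left out, to be revisited in a
    later round of the recursion.
    """
    n_members, n_samples = member_preds.shape
    labels = np.empty(n_samples, dtype=int)
    agreement = np.empty(n_samples)
    for i in range(n_samples):
        votes = np.bincount(member_preds[:, i])  # tally votes per class
        labels[i] = votes.argmax()               # majority class wins
        agreement[i] = votes.max() / n_members   # fraction agreeing
    keep_mask = agreement >= agree_threshold
    return labels, keep_mask

# Example: 5 hypothetical ensemble members voting on 4 samples
preds = np.array([
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 2, 1, 0],
])
labels, keep = relabel_by_majority_vote(preds, agree_threshold=0.8)
# samples 0, 1 and 3 reach the 80% agreement threshold; sample 2 does not
```

In a recursive scheme like the one the abstract describes, only the confidently relabeled subset would be fed back into the training set, and the ensemble retrained before voting again on the remainder.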
Main Authors: | Yuchen Wang, Yang Yang, Yun-Xia Liu, Anil Anthony Bharath |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2019-01-01 |
Series: | IEEE Access |
Subjects: | Noisy labels; pruning strategy; semi-supervised learning; ensemble learning; deep learning; neural networks |
Online Access: | https://ieeexplore.ieee.org/document/8664574/ |
id | doaj-7d2d7ebdf1eb408293ac1112c2cfa798 |
---|---|
record_format | Article |
doi | 10.1109/ACCESS.2019.2904403 |
article_number | 8664574 |
citation | IEEE Access, vol. 7, pp. 36459-36470, 2019 |
author_affiliations | Yuchen Wang (ORCID: 0000-0003-1151-7163), School of Information Science and Engineering, Shandong University, Qingdao, China; Yang Yang, School of Information Science and Engineering, Shandong University, Qingdao, China; Yun-Xia Liu, School of Information Science and Engineering, University of Jinan, Jinan, China; Anil Anthony Bharath, Department of Biomedical Engineering, Imperial College London, London, U.K. |
collection | DOAJ |
sources | DOAJ |
author | Yuchen Wang; Yang Yang; Yun-Xia Liu; Anil Anthony Bharath |
title | A Recursive Ensemble Learning Approach With Noisy Labels or Unlabeled Data |
publisher | IEEE |
series | IEEE Access |
issn | 2169-3536 |
publishDate | 2019-01-01 |
description | For many tasks, the successful application of deep learning relies on having large amounts of training data, labeled to a high standard. But much of the data in real-world applications suffers from label noise. Data annotation is much more expensive and resource-consuming than data collection, somewhat restricting the successful deployment of deep learning to applications where very large, well-labeled datasets exist. To address this problem, we propose a recursive ensemble learning approach that maximizes the utilization of data. A disagreement-based annotation method and different voting strategies are the core ideas of the proposed method. We also provide guidelines for choosing the most suitable among many candidate neural networks, with a pruning strategy for convenience. The approach is especially effective when the original dataset contains significant label noise. We conducted experiments on the Cats versus Dogs dataset, in which significant amounts of label noise were present, and on the CIFAR-10 dataset, achieving promising results. |
topic | Noisy labels; pruning strategy; semi-supervised learning; ensemble learning; deep learning; neural networks |
url | https://ieeexplore.ieee.org/document/8664574/ |