A Recursive Ensemble Learning Approach With Noisy Labels or Unlabeled Data

For many tasks, the successful application of deep learning relies on large amounts of training data, labeled to a high standard. However, much of the data in real-world applications suffers from label noise. Data annotation is far more expensive and resource-consuming than data collection, which restricts the successful deployment of deep learning to applications with very large, well-labeled datasets. To address this problem, we propose a recursive ensemble learning approach that maximizes the utilization of data. A disagreement-based annotation method and different voting strategies are the core ideas of the proposed method. We also provide guidelines for choosing the most suitable among many candidate neural networks, together with a pruning strategy for convenience. The approach is especially effective when the original dataset contains significant label noise. We conducted experiments on the Cats versus Dogs dataset, in which significant amounts of label noise were present, and on the CIFAR-10 dataset, achieving promising results.
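The disagreement-based annotation with voting described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the function names, the agreement threshold, and the choice to defer low-agreement samples to the next round are all assumptions.

```python
from collections import Counter

def majority_vote(predictions):
    """Return (label, agreement) for one sample given each model's prediction."""
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)

def relabel_dataset(samples, labels, models, threshold=0.8):
    """One recursion step of ensemble relabeling (illustrative sketch).

    `models` are callables sample -> label, standing in for trained networks.
    Samples whose ensemble agreement meets `threshold` receive the voted
    label; the rest are returned separately, to be treated as unlabeled
    or revisited in the next round.
    """
    clean, uncertain = [], []
    for x, y in zip(samples, labels):
        voted, agreement = majority_vote([m(x) for m in models])
        if agreement >= threshold:
            clean.append((x, voted))   # ensemble consensus overrides the noisy label
        else:
            uncertain.append((x, y))   # models disagree: defer this sample
    return clean, uncertain
```

In a full recursive scheme, the models would be retrained on the `clean` set and the step repeated until the `uncertain` pool stops shrinking; that loop, and the pruning of weak ensemble members, are omitted here.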


Bibliographic Details
Main Authors: Yuchen Wang, Yang Yang, Yun-Xia Liu, Anil Anthony Bharath
Format: Article
Language:English
Published: IEEE 2019-01-01
Series:IEEE Access
Subjects: Noisy labels; pruning strategy; semi-supervised learning; ensemble learning; deep learning; neural networks
Online Access:https://ieeexplore.ieee.org/document/8664574/
DOI: 10.1109/ACCESS.2019.2904403
ISSN: 2169-3536
Volume/Pages: Vol. 7 (2019), pp. 36459-36470, Article 8664574
Affiliations:
Yuchen Wang (ORCID 0000-0003-1151-7163): School of Information Science and Engineering, Shandong University, Qingdao, China
Yang Yang: School of Information Science and Engineering, Shandong University, Qingdao, China
Yun-Xia Liu: School of Information Science and Engineering, University of Jinan, Jinan, China
Anil Anthony Bharath: Department of Biomedical Engineering, Imperial College London, London, U.K.