An Efficient High-Quality Medical Lesion Image Data Labeling Method Based on Active Learning

The rapid development of artificial intelligence has allowed deep learning technology to change our lives and has brought considerable convenience, but deep learning cannot succeed without a sufficient quantity and quality of data. In medical systems, due to the special nature of medical data resour...

Bibliographic Details
Main Authors: Jiancun Zhou, Rui Cao, Jian Kang, Kehua Guo, Yangting Xu
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: High-quality data; biomedical engineering; active learning; deep learning
Online Access: https://ieeexplore.ieee.org/document/9159118/
id doaj-e5fc07eaa1594174ae76328759127528
record_format Article
spelling doaj-e5fc07eaa1594174ae76328759127528
2021-03-30T04:52:12Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
volume 8, pages 144331-144342
10.1109/ACCESS.2020.3014355
9159118
An Efficient High-Quality Medical Lesion Image Data Labeling Method Based on Active Learning
Jiancun Zhou (0)
Rui Cao (1)
Jian Kang (2)
Kehua Guo (3) https://orcid.org/0000-0003-4143-6399
Yangting Xu (4)
(0) All-Solid-State Energy Storage Materials and Devices Key Laboratory of Hunan Province, Hunan City University, Yiyang, China
(1) School of Computer Science and Engineering, Central South University, Changsha, China
(2) Department of Dermatology, Third Xiangya Hospital, Central South University, Changsha, China
(3) School of Computer Science and Engineering, Central South University, Changsha, China
(4) Department of Dermatology, Third Xiangya Hospital, Central South University, Changsha, China
The rapid development of artificial intelligence has allowed deep learning technology to change our lives and has brought considerable convenience, but deep learning cannot succeed without a sufficient quantity and quality of data. In medical systems, due to the special nature of medical data resources, labeling and screening require professional input from doctors at considerable cost. However, if these data cannot be used effectively, then resources are wasted. To solve this problem, this paper proposes an effective high-quality medical lesion image data labeling method based on active learning, which labels the most representative and high-quality medical image data with artificial assistance. First, we generated subregions for all unlabeled images and predicted their classifications. Second, multifactor calculations were performed on all images. Finally, the values of multiple factors were used to sort all images, and the top-ranked images were selected and labeled with artificial assistance. The above steps were repeated until a suitable number of datasets had been labeled. The experimental results showed that a model trained on the labeled high-quality dataset could achieve the same quality as the model trained on all the data and save a considerable amount of time on manual labeling, which demonstrates the effectiveness of the method. The method ensures that the labeled data are valuable, high quality and rich in information to reduce the labeling workload and avoid wasting data resources.
https://ieeexplore.ieee.org/document/9159118/
High-quality data
biomedical engineering
active learning
deep learning
collection DOAJ
language English
format Article
sources DOAJ
author Jiancun Zhou
Rui Cao
Jian Kang
Kehua Guo
Yangting Xu
title An Efficient High-Quality Medical Lesion Image Data Labeling Method Based on Active Learning
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description The rapid development of artificial intelligence has allowed deep learning technology to change our lives and has brought considerable convenience, but deep learning cannot succeed without a sufficient quantity and quality of data. In medical systems, due to the special nature of medical data resources, labeling and screening require professional input from doctors at considerable cost. However, if these data cannot be used effectively, then resources are wasted. To solve this problem, this paper proposes an effective high-quality medical lesion image data labeling method based on active learning, which labels the most representative and high-quality medical image data with artificial assistance. First, we generated subregions for all unlabeled images and predicted their classifications. Second, multifactor calculations were performed on all images. Finally, the values of multiple factors were used to sort all images, and the top-ranked images were selected and labeled with artificial assistance. The above steps were repeated until a suitable number of datasets had been labeled. The experimental results showed that a model trained on the labeled high-quality dataset could achieve the same quality as the model trained on all the data and save a considerable amount of time on manual labeling, which demonstrates the effectiveness of the method. The method ensures that the labeled data are valuable, high quality and rich in information to reduce the labeling workload and avoid wasting data resources.
topic High-quality data
biomedical engineering
active learning
deep learning
url https://ieeexplore.ieee.org/document/9159118/
_version_ 1724181160999780352
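
The abstract in this record outlines a loop: predict classes for subregions of each unlabeled image, compute several factors per image, rank the images, have an expert label the top-ranked ones, and repeat. The following is a minimal Python sketch of that loop, not the authors' implementation. The record does not say which factors the paper uses, so the two factors here (normalized entropy of the mean subregion prediction, and disagreement among subregions) and their equal weights are assumptions chosen only for illustration. The names `predict_regions`, `ask_expert`, `batch_size`, and `budget` are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Probs = Sequence[float]  # class probabilities for one subregion


def entropy(p: Probs) -> float:
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)


def image_score(region_probs: Sequence[Probs], weights=(0.5, 0.5)) -> float:
    """Combine two ASSUMED factors into one ranking score (higher = label first).

    Assumes at least one subregion and at least two classes.
    """
    n = len(region_probs)
    avg = [sum(col) / n for col in zip(*region_probs)]
    # Assumed factor 1: uncertainty of the averaged prediction, scaled to [0, 1].
    uncertainty = entropy(avg) / math.log(len(avg))
    # Assumed factor 2: share of subregions whose top class differs from the majority.
    tops = [max(range(len(p)), key=p.__getitem__) for p in region_probs]
    majority = max(set(tops), key=tops.count)
    disagreement = 1 - tops.count(majority) / n
    return weights[0] * uncertainty + weights[1] * disagreement


def select_top_k(pool: Dict[str, Sequence[Probs]], k: int) -> List[str]:
    """Sort image ids by score, highest first, and keep the first k."""
    return sorted(pool, key=lambda img: image_score(pool[img]), reverse=True)[:k]


def active_labeling(
    unlabeled: Sequence[str],
    predict_regions: Callable[[str], Sequence[Probs]],  # hypothetical model hook
    ask_expert: Callable[[str], str],                   # hypothetical labeling hook
    batch_size: int,
    budget: int,
) -> Dict[str, str]:
    """Repeat predict, score, rank, and label until the budget is used up."""
    labeled: Dict[str, str] = {}
    remaining = list(unlabeled)
    while remaining and len(labeled) < budget:
        pool = {img: predict_regions(img) for img in remaining}
        k = min(batch_size, budget - len(labeled))
        for img in select_top_k(pool, k):
            labeled[img] = ask_expert(img)
            remaining.remove(img)
        # A real system would retrain the classifier on `labeled` here,
        # so the next round's predictions reflect the new labels.
    return labeled
```

In practice `predict_regions` would wrap the current classifier and `ask_expert` would route the image to a clinician; the sketch keeps both as plain callables so the selection logic can be read on its own.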