Yielding Multi-Fold Training Strategy for Image Classification of Imbalanced Weeds

An imbalanced dataset is a significant challenge when training a deep neural network (DNN) model for deep learning problems, such as weeds classification. An imbalanced dataset may result in a model that behaves robustly on major classes and is overly sensitive to minor classes. This article proposes a yielding multi-fold training (YMufT) strategy to train a DNN model on an imbalanced dataset.

Bibliographic Details
Main Authors: Vo Hoang Trong, Yu GwangHyun, Kim JinYoung, Pham The Bao
Format: Article
Language: English
Published: MDPI AG 2021-04-01
Series: Applied Sciences
Subjects: imbalanced dataset; deep neural network; weeds classification; Grad-CAM
Online Access: https://www.mdpi.com/2076-3417/11/8/3331
id doaj-45707c8174c44c7abaae4d2d8ad6eac2
record_format Article
spelling doaj-45707c8174c44c7abaae4d2d8ad6eac2
2021-04-07T23:06:23Z
eng
MDPI AG
Applied Sciences
2076-3417
2021-04-01
11 3331 3331
10.3390/app11083331
Yielding Multi-Fold Training Strategy for Image Classification of Imbalanced Weeds
Vo Hoang Trong (0)
Yu GwangHyun (1)
Kim JinYoung (2)
Pham The Bao (3)
Department of ICT Convergence System Engineering, Chonnam National University, Gwangju 61186, Korea
Department of ICT Convergence System Engineering, Chonnam National University, Gwangju 61186, Korea
Department of ICT Convergence System Engineering, Chonnam National University, Gwangju 61186, Korea
Faculty of Information Technology, Saigon University, Ho Chi Minh City 72710, Vietnam
https://www.mdpi.com/2076-3417/11/8/3331
imbalanced dataset
deep neural network
weeds classification
Grad-CAM
collection DOAJ
language English
format Article
sources DOAJ
author Vo Hoang Trong
Yu GwangHyun
Kim JinYoung
Pham The Bao
author_sort Vo Hoang Trong
title Yielding Multi-Fold Training Strategy for Image Classification of Imbalanced Weeds
title_sort yielding multi-fold training strategy for image classification of imbalanced weeds
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2021-04-01
description An imbalanced dataset is a significant challenge when training a deep neural network (DNN) model for deep learning problems, such as weeds classification. An imbalanced dataset may result in a model that behaves robustly on major classes and is overly sensitive to minor classes. This article proposes a yielding multi-fold training (YMufT) strategy to train a DNN model on an imbalanced dataset. This strategy reduces the bias in training through a min-class-max-bound procedure (MCMB), which divides samples in the training set into multiple folds. The model is consecutively trained on each one of these folds. In practice, we experiment with our proposed strategy on two small (PlantSeedlings, small PlantVillage) and two large (Chonnam National University (CNU), large PlantVillage) weeds datasets. With the same training configurations and approximate training steps used in conventional training methods, YMufT helps the DNN model to converge faster, thus requiring less training time. Despite a slight decrease in accuracy on the large dataset, YMufT increases the F1 score in the NASNet model to 0.9708 on the CNU dataset and 0.9928 when using the Mobilenet model training on the large PlantVillage dataset. YMufT shows outstanding performance in both accuracy and F1 score on small datasets, with values of (0.9981, 0.9970) using the Mobilenet model for training on small PlantVillage dataset and (0.9718, 0.9689) using Resnet to train on the PlantSeedlings dataset. Grad-CAM visualization shows that conventional training methods mainly concentrate on high-level features and may capture insignificant features. In contrast, YMufT guides the model to capture essential features on the leaf surface and properly localize the weeds targets.
topic imbalanced dataset
deep neural network
weeds classification
Grad-CAM
url https://www.mdpi.com/2076-3417/11/8/3331
work_keys_str_mv AT vohoangtrong yieldingmultifoldtrainingstrategyforimageclassificationofimbalancedweeds
AT yugwanghyun yieldingmultifoldtrainingstrategyforimageclassificationofimbalancedweeds
AT kimjinyoung yieldingmultifoldtrainingstrategyforimageclassificationofimbalancedweeds
AT phamthebao yieldingmultifoldtrainingstrategyforimageclassificationofimbalancedweeds
_version_ 1721535493654446080
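
The description field above says that the training set is divided into multiple folds and that the model is then trained on each fold in turn. The record does not give the details of the min-class-max-bound (MCMB) procedure, so the following is only a minimal sketch of the general idea of class-bounded folds, not the authors' method. The function name `balanced_folds`, the `bound` parameter, and the default of tying the bound to the smallest class size are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def balanced_folds(labels, bound=None, seed=0):
    """Split sample indices into folds with a per-class cap in every fold.

    Hypothetical helper, not the MCMB procedure from the article.
    `bound` is the maximum number of samples a class may contribute
    to one fold; by default (an assumption) it is the smallest class size.
    """
    rng = random.Random(seed)

    # Group sample indices by class label and shuffle within each class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)

    if bound is None:
        bound = min(len(indices) for indices in by_class.values())

    # Enough folds that the largest class is fully covered.
    n_folds = max(
        -(-len(indices) // min(bound, len(indices)))  # ceiling division
        for indices in by_class.values()
    )

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        take = min(bound, len(indices))
        for k in range(n_folds):
            # Walk through the class cyclically, so small classes are
            # reused across folds while large classes are spread out.
            start = (k * take) % len(indices)
            folds[k].extend(
                indices[(start + j) % len(indices)] for j in range(take)
            )
    return folds


# Toy imbalanced label list: 10 samples of class "a", 3 of class "b".
labels = ["a"] * 10 + ["b"] * 3
folds = balanced_folds(labels)
for k, fold in enumerate(folds):
    print(k, Counter(labels[i] for i in fold))
```

In this sketch every fold holds the same number of samples per class, so a model trained on the folds one after another sees the minority class as often as the majority class. Reusing minority samples across folds is a design choice of the sketch and may differ from what the article does.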