Managing imbalanced training data by sequential segmentation in machine learning
Imbalanced training data is a common problem in machine learning applications. This problem refers to datasets in which the foreground pixels are significantly fewer than the background pixels. By training a machine learning model with imbalanced data, the result is typically a model that classifies al...
Main Author: | Bardolet Pettersson, Susana |
---|---|
Format: | Others |
Language: | English |
Published: | Linköpings universitet, Avdelningen för medicinsk teknik, 2019 |
Subjects: | Medical Image Processing; Medicinsk bildbehandling |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-155091 |
id |
ndltd-UPSALLA1-oai-DiVA.org-liu-155091 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-liu-155091 2019-03-20T17:36:33Z Managing imbalanced training data by sequential segmentation in machine learning eng Bardolet Pettersson, Susana Linköpings universitet, Avdelningen för medicinsk teknik 2019 Medical Image Processing Medicinsk bildbehandling Imbalanced training data is a common problem in machine learning applications. This problem refers to datasets in which the foreground pixels are significantly fewer than the background pixels. By training a machine learning model with imbalanced data, the result is typically a model that classifies all pixels as the background class. A result that indicates no presence of a specific condition when it is actually present is particularly undesired in medical imaging applications. This project proposes a sequential system of two fully convolutional neural networks to tackle the problem. Semantic segmentation of lung nodules in thoracic computed tomography images has been performed to evaluate the performance of the system. The imbalanced data problem is present in the training dataset used in this project, where the average percentage of pixels belonging to the foreground class is 0.0038 %. The sequential system achieved a sensitivity of 83.1 % representing an increase of 34 % compared to the single system. The system only missed 16.83 % of the nodules but had a Dice score of 21.6 % due to the detection of multiple false positives. This method shows considerable potential to be a solution to the imbalanced data problem with continued development. Student thesis info:eu-repo/semantics/bachelorThesis text http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-155091 application/pdf info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Medical Image Processing Medicinsk bildbehandling |
spellingShingle |
Medical Image Processing Medicinsk bildbehandling Bardolet Pettersson, Susana Managing imbalanced training data by sequential segmentation in machine learning |
description |
Imbalanced training data is a common problem in machine learning applications. This problem refers to datasets in which the foreground pixels are significantly fewer than the background pixels. By training a machine learning model with imbalanced data, the result is typically a model that classifies all pixels as the background class. A result that indicates no presence of a specific condition when it is actually present is particularly undesired in medical imaging applications. This project proposes a sequential system of two fully convolutional neural networks to tackle the problem. Semantic segmentation of lung nodules in thoracic computed tomography images has been performed to evaluate the performance of the system. The imbalanced data problem is present in the training dataset used in this project, where the average percentage of pixels belonging to the foreground class is 0.0038 %. The sequential system achieved a sensitivity of 83.1 % representing an increase of 34 % compared to the single system. The system only missed 16.83 % of the nodules but had a Dice score of 21.6 % due to the detection of multiple false positives. This method shows considerable potential to be a solution to the imbalanced data problem with continued development. |
author |
Bardolet Pettersson, Susana |
author_facet |
Bardolet Pettersson, Susana |
author_sort |
Bardolet Pettersson, Susana |
title |
Managing imbalanced training data by sequential segmentation in machine learning |
title_sort |
managing imbalanced training data by sequential segmentation in machine learning |
publisher |
Linköpings universitet, Avdelningen för medicinsk teknik |
publishDate |
2019 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-155091 |
work_keys_str_mv |
AT bardoletpetterssonsusana managingimbalancedtrainingdatabysequentialsegmentationinmachinelearning |
_version_ |
1719004452050960384 |