An investigation of XGBoost-based algorithm for breast cancer classification
Breast cancer is one of the leading cancers affecting women around the world. A Computer-Aided Diagnosis (CAD) system is a powerful tool that assists pathologists in diagnosing cancer by identifying the presence of cancerous cells. A standard CAD system includes proc...
Main Authors: | Xin Yu Liew, Nazia Hameed, Jeremie Clos |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2021-12-01 |
Series: | Machine Learning with Applications |
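Subjects: | Deep learning; Extreme gradient boosting; XGBoost; Machine learning; Computer-aided diagnosis; Breast cancer |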
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666827021000773 |
id |
doaj-37db60bede1e4475a25952f903db01cd |
---|---|
record_format |
Article |
spelling |
doaj-37db60bede1e4475a25952f903db01cd; 2021-09-13T04:15:18Z; eng; Elsevier; Machine Learning with Applications; 2666-8270; 2021-12-01; volume 6; article 100154; An investigation of XGBoost-based algorithm for breast cancer classification; Xin Yu Liew (corresponding author), Nazia Hameed, Jeremie Clos; all authors: University of Nottingham, Jubilee Campus, Wollaton Road, Nottingham, NG8 1BB, United Kingdom; Breast cancer is one of the leading cancers affecting women around the world. A Computer-Aided Diagnosis (CAD) system is a powerful tool that assists pathologists in diagnosing cancer by identifying the presence of cancerous cells. A standard CAD system comprises pre-processing, feature extraction, feature selection and classification. In this paper, we propose an enhanced breast cancer classification technique called Deep Learning and eXtreme Gradient Boosting (DLXGB) for histopathology breast cancer images, using the BreaKHis dataset. The method first applies data augmentation and stain normalization for pre-processing; a pre-trained DenseNet201 then automatically learns features within an image and is combined with a powerful gradient boosting classifier. The proposed technique is designed to classify breast cancer histology images into the binary classes benign and malignant, and additionally into one of eight non-overlapping/overlapping categories: Adenosis (A), Fibroadenoma (F), Phyllodes Tumour (PT), and Tubular Adenoma (TA); Ductal Carcinoma (DC), Lobular Carcinoma (LC), Mucinous Carcinoma (MC), and Papillary Carcinoma (PC). With DLXGB, we obtained an accuracy of 97% for both binary and multi-class classification, improving on the existing work done by researchers using the BreaKHis dataset. The results indicate that this method can produce a powerful prediction for breast cancer image classification.; http://www.sciencedirect.com/science/article/pii/S2666827021000773; Deep learning; Extreme gradient boosting; XGBoost; Machine learning; Computer-aided diagnosis; Breast cancer |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xin Yu Liew; Nazia Hameed; Jeremie Clos |
spellingShingle |
Xin Yu Liew; Nazia Hameed; Jeremie Clos; An investigation of XGBoost-based algorithm for breast cancer classification; Machine Learning with Applications; Deep learning; Extreme gradient boosting; XGBoost; Machine learning; Computer-aided diagnosis; Breast cancer |
author_facet |
Xin Yu Liew; Nazia Hameed; Jeremie Clos |
author_sort |
Xin Yu Liew |
title |
An investigation of XGBoost-based algorithm for breast cancer classification |
title_short |
An investigation of XGBoost-based algorithm for breast cancer classification |
title_full |
An investigation of XGBoost-based algorithm for breast cancer classification |
title_fullStr |
An investigation of XGBoost-based algorithm for breast cancer classification |
title_full_unstemmed |
An investigation of XGBoost-based algorithm for breast cancer classification |
title_sort |
investigation of xgboost-based algorithm for breast cancer classification |
publisher |
Elsevier |
series |
Machine Learning with Applications |
issn |
2666-8270 |
publishDate |
2021-12-01 |
description |
Breast cancer is one of the leading cancers affecting women around the world. A Computer-Aided Diagnosis (CAD) system is a powerful tool that assists pathologists in diagnosing cancer by identifying the presence of cancerous cells. A standard CAD system comprises pre-processing, feature extraction, feature selection and classification. In this paper, we propose an enhanced breast cancer classification technique called Deep Learning and eXtreme Gradient Boosting (DLXGB) for histopathology breast cancer images, using the BreaKHis dataset. The method first applies data augmentation and stain normalization for pre-processing; a pre-trained DenseNet201 then automatically learns features within an image and is combined with a powerful gradient boosting classifier. The proposed technique is designed to classify breast cancer histology images into the binary classes benign and malignant, and additionally into one of eight non-overlapping/overlapping categories: Adenosis (A), Fibroadenoma (F), Phyllodes Tumour (PT), and Tubular Adenoma (TA); Ductal Carcinoma (DC), Lobular Carcinoma (LC), Mucinous Carcinoma (MC), and Papillary Carcinoma (PC). With DLXGB, we obtained an accuracy of 97% for both binary and multi-class classification, improving on the existing work done by researchers using the BreaKHis dataset. The results indicate that this method can produce a powerful prediction for breast cancer image classification. |
topic |
Deep learning; Extreme gradient boosting; XGBoost; Machine learning; Computer-aided diagnosis; Breast cancer |
url |
http://www.sciencedirect.com/science/article/pii/S2666827021000773 |
work_keys_str_mv |
AT xinyuliew aninvestigationofxgboostbasedalgorithmforbreastcancerclassification AT naziahameed aninvestigationofxgboostbasedalgorithmforbreastcancerclassification AT jeremieclos aninvestigationofxgboostbasedalgorithmforbreastcancerclassification AT xinyuliew investigationofxgboostbasedalgorithmforbreastcancerclassification AT naziahameed investigationofxgboostbasedalgorithmforbreastcancerclassification AT jeremieclos investigationofxgboostbasedalgorithmforbreastcancerclassification |
_version_ |
1717381493978824704 |
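
The description field in this record names two label sets for the same images: a binary benign/malignant split and eight tumour subtypes. As a rough illustration of how the two relate, the sketch below maps each subtype abbreviation from the abstract to its binary class and to an integer index for multi-class training. The benign/malignant grouping follows the order in which the abstract lists the subtypes, which matches the BreaKHis dataset's taxonomy. The names `BENIGN`, `MALIGNANT`, `SUBTYPES`, `binary_label` and `class_index` are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: abbreviations come from the abstract; all
# identifiers here are hypothetical and not the authors' code.

# First four subtypes listed in the abstract (benign tumours).
BENIGN = {
    "A": "Adenosis",
    "F": "Fibroadenoma",
    "PT": "Phyllodes Tumour",
    "TA": "Tubular Adenoma",
}

# Last four subtypes listed in the abstract (malignant tumours).
MALIGNANT = {
    "DC": "Ductal Carcinoma",
    "LC": "Lobular Carcinoma",
    "MC": "Mucinous Carcinoma",
    "PC": "Papillary Carcinoma",
}

# Fixed ordering of the eight subtypes, used for integer class indices.
SUBTYPES = list(BENIGN) + list(MALIGNANT)


def binary_label(subtype: str) -> str:
    """Return 'benign' or 'malignant' for a subtype abbreviation."""
    code = subtype.strip().upper()
    if code in BENIGN:
        return "benign"
    if code in MALIGNANT:
        return "malignant"
    raise ValueError(f"unknown subtype abbreviation: {subtype!r}")


def class_index(subtype: str) -> int:
    """Return the position of the subtype in SUBTYPES (0 to 7)."""
    code = subtype.strip().upper()
    if code not in SUBTYPES:
        raise ValueError(f"unknown subtype abbreviation: {subtype!r}")
    return SUBTYPES.index(code)


if __name__ == "__main__":
    for code in SUBTYPES:
        print(code, binary_label(code), class_index(code))
```

Keeping both mappings in one place means the binary task and the eight-class task can share the same annotations without the two label sets drifting apart.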