Using Ensemble Machine Learning Methods in Estimating Software Development Effort

Background: Software Development Effort Estimation is a process that focuses on estimating the required effort to develop a software project with a minimal budget. Estimating effort includes interpretation of required manpower, resources, time and schedule. Project managers are responsible for estim...

Bibliographic Details
Main Author: Kanneganti, Alekhya
Format: Others
Language: English
Published: Blekinge Tekniska Högskola, Institutionen för datavetenskap 2020
Subjects:
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20691
id ndltd-UPSALLA1-oai-DiVA.org-bth-20691
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-bth-20691 2020-11-26T05:26:58Z Student thesis info:eu-repo/semantics/bachelorThesis text application/pdf info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic Software Development Effort
Ensemble
Ensemble Learning
Stacking Ensemble
Software Development Effort Estimation
Machine Learning
Estimation of Software Development Effort
Effort Estimation
Computer Sciences
Datavetenskap (datalogi)
description Background: Software development effort estimation is the process of estimating the effort required to develop a software project with a minimal budget. Estimating effort involves assessing the required manpower, resources, time and schedule. Project managers are responsible for estimating the required effort. A model that predicts software development effort efficiently comes in handy: it acts as a decision support system that helps project managers estimate effort more precisely. The aim of this study is therefore to make the estimation of software development effort more efficient.
Objective: The main objective of this thesis is to identify an effective ensemble method and to build and implement it for estimating software development effort. In addition, parameter tuning is applied to improve the performance of the model. Finally, we compare the results of the developed model with those of existing models.
Method: We adopted two research methods in this thesis. First, a literature review was conducted to gain knowledge of the existing studies, machine learning techniques, datasets and ensemble methods previously used in estimating software development effort. Then a controlled experiment was conducted to build an ensemble model and evaluate its performance, in order to determine whether the developed model performs better than the existing models.
Results: Based on the literature review and the evidence collected, we decided to build and implement the stacked generalization ensemble method in this thesis, using individual machine learning techniques as base learners: Support Vector Regressor (SVR), K-Nearest Neighbors Regressor (KNN), Decision Tree Regressor (DTR), Linear Regressor (LR), Multi-Layer Perceptron Regressor (MLP), Random Forest Regressor (RFR), Gradient Boosting Regressor (GBR), AdaBoost Regressor (ABR) and XGBoost Regressor (XGB). We also decided to use Randomized Parameter Optimization for parameter tuning and the SelectKBest function for feature selection. The COCOMO81, MAXWELL, ALBRECHT and DESHARNAIS datasets were used. The results of the experiment show that the developed ensemble model performs best on three of the four datasets.
Conclusion: After evaluating and analyzing the results, we conclude that the developed model works well with datasets that have continuous, numeric values. We also conclude that the developed ensemble model outperforms the other existing models on the COCOMO81, MAXWELL and ALBRECHT datasets.
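The approach named in the abstract (stacked generalization over several regressors, SelectKBest feature selection and randomized parameter search) can be sketched with scikit-learn as below. This is only an illustrative sketch, not the thesis implementation: the data is synthetic (`make_regression` stands in for an effort dataset), only four of the nine base learners are included, and the choice of linear regression as meta-learner, the `f_regression` scoring function, the scaling step and every parameter range are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for an effort dataset (assumption: the real datasets are not used here).
X, y = make_regression(n_samples=120, n_features=10, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Level-0 (base) learners: a subset of those listed in the abstract.
base_learners = [
    ("svr", SVR()),
    ("knn", KNeighborsRegressor()),
    ("dtr", DecisionTreeRegressor(random_state=0)),
    ("rfr", RandomForestRegressor(n_estimators=50, random_state=0)),
]

# Level-1 (meta) learner is trained on out-of-fold predictions of the base learners.
# Linear regression as meta-learner is an assumption for this sketch.
stack = StackingRegressor(estimators=base_learners,
                          final_estimator=LinearRegression(), cv=3)

# Scale features, keep the k best-scoring ones, then fit the stacked ensemble.
model = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_regression)),
    ("stack", stack),
])

# Illustrative search space; the ranges are hypothetical, not taken from the thesis.
param_distributions = {
    "select__k": [3, 5, 8, 10],
    "stack__svr__C": [0.1, 1, 10, 100, 1000],
    "stack__knn__n_neighbors": [3, 5, 7],
    "stack__dtr__max_depth": [2, 4, 8, None],
}

# Randomized parameter optimization: sample a few settings instead of a full grid.
search = RandomizedSearchCV(model, param_distributions, n_iter=5, cv=3,
                            scoring="neg_mean_absolute_error", random_state=0)
search.fit(X_train, y_train)

predictions = search.predict(X_test)
mae = float(np.mean(np.abs(y_test - predictions)))
print("best parameters:", search.best_params_)
print("mean absolute error on held-out data:", mae)
```

Placing SelectKBest inside the pipeline means feature scores are recomputed on each training fold during the search, which avoids leaking held-out data into the selection step.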
author Kanneganti, Alekhya
title Using Ensemble Machine Learning Methods in Estimating Software Development Effort
publisher Blekinge Tekniska Högskola, Institutionen för datavetenskap
publishDate 2020
url http://urn.kb.se/resolve?urn=urn:nbn:se:bth-20691