Assessment of data-driven Bayesian networks in software effort prediction

Software prediction is a difficult but important task that can aid managers in decision making, potentially saving time and resources and yielding higher software quality, among other benefits. One approach proposed for this task is the application o...


Bibliographic Details
Main Author: Tierno, Ivan Alexandre Paiz
Other Authors: Nunes, Daltro José
Format: Others
Language: English
Published: 2013
Subjects:
Online Access: http://hdl.handle.net/10183/71952
id ndltd-IBICT-oai-lume56.ufrgs.br-10183-71952
record_format oai_dc
spelling ndltd-IBICT-oai-lume56.ufrgs.br-10183-71952 2018-09-30T04:14:33Z 2013-05-25T01:46:34Z 2013 info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/masterThesis http://hdl.handle.net/10183/71952 000881231 eng info:eu-repo/semantics/openAccess application/pdf reponame:Biblioteca Digital de Teses e Dissertações da UFRGS instname:Universidade Federal do Rio Grande do Sul instacron:UFRGS
collection NDLTD
language English
format Others
sources NDLTD
topic Redes bayesianas [Bayesian networks]
Aprendizagem : Maquina [Learning : Machine]
Redes : Computadores [Networks : Computers]
Engenharia : Software [Engineering : Software]
Software effort prediction
Bayesian networks
Machine learning
Data mining
description Software prediction is a difficult but important task that can aid managers in decision making, potentially saving time and resources and yielding higher software quality, among other benefits. One approach proposed for this task is the application of machine learning techniques. Among these techniques are Bayesian Networks, which have been promoted for software project management because of their special features. However, the pre-processing procedures related to their application remain mostly neglected in this field. In this context, this study presents an assessment of automatic Bayesian Networks (i.e., Bayesian Networks based solely on data) on three public data sets and discusses data pre-processing procedures and the validation approach. We compared automatic Bayesian Networks against mean and median baseline models and against ordinary least squares regression with a logarithmic transformation, which a recent comprehensive study ranked as a top performer in accuracy. The results, obtained through careful validation procedures, indicate that automatic Bayesian Networks can be competitive with other techniques but still need improvement to match the accuracy of linear regression models. Some current limitations of Bayesian Networks are highlighted and possible improvements are discussed. Furthermore, this study provides guidelines on data exploration, which can be useful for any Bayesian Network that uses data for model learning. Finally, this study also confirms the potential benefits of feature selection in software effort prediction.
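The comparison models named in the description (mean and median baselines, and ordinary least squares regression with a logarithmic transformation) can be illustrated with a minimal sketch. This is not the thesis's code. The data are synthetic, the single predictor `size` is hypothetical, and leave-one-out validation is an assumption, since the description only mentions "careful validation procedures". All function names are invented for illustration.

```python
import math
import random
import statistics


def mean_baseline(train, size):
    """Predict the mean training effort, ignoring project size."""
    return statistics.mean(effort for _, effort in train)


def median_baseline(train, size):
    """Predict the median training effort, ignoring project size."""
    return statistics.median(effort for _, effort in train)


def log_ols(train, size):
    """Fit log(effort) = a + b * log(size) by ordinary least squares,
    then back-transform the prediction to the original effort scale."""
    xs = [math.log(s) for s, _ in train]
    ys = [math.log(e) for _, e in train]
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return math.exp(intercept + slope * math.log(size))


def loo_mae(projects, predict):
    """Mean absolute error under leave-one-out validation (assumed scheme)."""
    errors = []
    for i, (size, effort) in enumerate(projects):
        train = projects[:i] + projects[i + 1:]  # hold out project i
        errors.append(abs(effort - predict(train, size)))
    return statistics.mean(errors)


def synthetic_projects(n=60, seed=0):
    """Hypothetical (size, effort) pairs; NOT one of the thesis data sets."""
    rng = random.Random(seed)
    projects = []
    for _ in range(n):
        size = rng.uniform(50, 2000)
        effort = 3.0 * size ** 0.9 * math.exp(rng.gauss(0.0, 0.3))
        projects.append((size, effort))
    return projects


projects = synthetic_projects()
for name, model in [("mean", mean_baseline),
                    ("median", median_baseline),
                    ("log-OLS", log_ols)]:
    print(f"{name:8s} leave-one-out MAE = {loo_mae(projects, model):.1f}")
```

A data-driven Bayesian Network would enter this comparison as one more `predict(train, size)` function. Each model is refit inside every validation fold so that the held-out project never influences training.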
author2 Nunes, Daltro José
author Tierno, Ivan Alexandre Paiz
title Assessment of data-driven Bayesian networks in software effort prediction
publishDate 2013
url http://hdl.handle.net/10183/71952