Local versus Global Models for Just-In-Time Software Defect Prediction
Just-in-time software defect prediction (JIT-SDP) is an active topic in software defect prediction, which aims to identify defect-inducing changes. Recently, some studies have found that the variability of defect data sets can affect the performance of defect predictors. By using local models, it ca...
Main Authors: | Xingguang Yang, Huiqun Yu, Guisheng Fan, Kai Shi, Liqiong Chen |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2019-01-01 |
Series: | Scientific Programming |
Online Access: | http://dx.doi.org/10.1155/2019/2384706 |
id |
doaj-26bbabbd2cb9478fa48369145564d9d4 |
---|---|
record_format |
Article |
spelling |
doaj-26bbabbd2cb9478fa48369145564d9d4; 2021-07-02T14:15:53Z; eng; Hindawi Limited; Scientific Programming; 1058-9244; 1875-919X; 2019-01-01; 2019; 10.1155/2019/2384706; 2384706; Local versus Global Models for Just-In-Time Software Defect Prediction; Xingguang Yang (0), Huiqun Yu (1), Guisheng Fan (2), Kai Shi (3), Liqiong Chen (4); (0)-(3) Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China; (4) Department of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China; http://dx.doi.org/10.1155/2019/2384706 |
collection |
DOAJ |
sources |
DOAJ |
author |
Xingguang Yang Huiqun Yu Guisheng Fan Kai Shi Liqiong Chen |
title |
Local versus Global Models for Just-In-Time Software Defect Prediction |
publisher |
Hindawi Limited |
series |
Scientific Programming |
issn |
1058-9244 1875-919X |
publishDate |
2019-01-01 |
description |
Just-in-time software defect prediction (JIT-SDP) is an active topic in software defect prediction that aims to identify defect-inducing changes. Recent studies have found that the variability of defect data sets can affect the performance of defect predictors, and that using local models can help improve the performance of prediction models. However, previous studies have focused on module-level defect prediction. Whether local models are still valid in the context of JIT-SDP is an important issue. To this end, we compare the performance of local and global models through a large-scale empirical study based on six open-source projects with 227,417 changes. The experiment considers three evaluation scenarios: cross-validation, cross-project validation, and timewise cross-validation. To build local models, the experiment uses the k-medoids algorithm to divide the training set into several homogeneous regions. In addition, logistic regression and effort-aware linear regression (EALR) are used to build classification models and effort-aware prediction models, respectively. The empirical results show that local models perform worse than global models in classification performance. However, local models have significantly better effort-aware prediction performance than global models in the cross-validation and cross-project validation scenarios. In particular, local models achieve their best effort-aware prediction performance when the number of clusters k is set to 2. Therefore, local models are promising for effort-aware JIT-SDP. |
url |
http://dx.doi.org/10.1155/2019/2384706 |
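The abstract describes building local models by dividing the training set with k-medoids and fitting one classifier per region. A minimal sketch of that idea in pure Python might look like the following. It is not the authors' implementation: the helper names, the single toy feature, the labels, the gradient-descent settings, and the choice to express each change relative to its cluster medoid are all assumptions made for illustration.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(medoids, x):
    """Index of the medoid closest to x."""
    return min(range(len(medoids)), key=lambda i: dist(medoids[i], x))


def shift(x, m):
    """Express x relative to medoid m (an assumed normalisation step)."""
    return [xi - mi for xi, mi in zip(x, m)]


def k_medoids(points, k, iters=20, seed=0):
    """Alternating k-medoids: assign to nearest medoid, then re-pick each medoid."""
    medoids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in medoids]
        for p in points:
            clusters[nearest(medoids, p)].append(p)
        # New medoid = member with the smallest total distance to its cluster.
        updated = [
            min(c, key=lambda cand: sum(dist(cand, q) for q in c)) if c else m
            for c, m in zip(clusters, medoids)
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids


def predict_proba(w, x):
    """Logistic probability; w holds one weight per feature plus a trailing bias."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression with plain batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def train_local(X, y, k=2):
    """Cluster the training set, then fit one classifier per cluster.

    Assumes distinct training points, so that no cluster ends up empty.
    """
    medoids = k_medoids(X, k)
    models = []
    for i, m in enumerate(medoids):
        rows = [(shift(x, m), t) for x, t in zip(X, y) if nearest(medoids, x) == i]
        models.append(train_logistic([r[0] for r in rows], [r[1] for r in rows]))
    return medoids, models


def predict_local(medoids, models, x):
    """Route a change to its nearest medoid and use that cluster's model."""
    i = nearest(medoids, x)
    return int(predict_proba(models[i], shift(x, medoids[i])) >= 0.5)


# Toy data (hypothetical): one change metric per row, 1 = defect-inducing.
# The two regions have opposite trends, which a single global linear
# boundary on this one feature cannot capture.
X_toy = [[v] for v in (0, 1, 2, 4, 5, 6, 20, 21, 22, 24, 25, 26)]
y_toy = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]
medoids, models = train_local(X_toy, y_toy, k=2)
```

A global model in this sketch would be a single classifier fitted on the whole training set, whereas `predict_local` first picks the region and then applies that region's classifier. The value `k=2` mirrors the setting the abstract reports as best for effort-aware prediction; it is used here only to keep the toy example small.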