Local versus Global Models for Just-In-Time Software Defect Prediction

Just-in-time software defect prediction (JIT-SDP) is an active topic in software defect prediction, which aims to identify defect-inducing changes. Recently, some studies have found that the variability of defect data sets can affect the performance of defect predictors. By using local models, it ca...

Bibliographic Details
Main Authors:	Xingguang Yang, Huiqun Yu, Guisheng Fan, Kai Shi, Liqiong Chen
Format:	Article
Language:	English
Published:	Hindawi Limited 2019-01-01
Series:	Scientific Programming
Online Access:	http://dx.doi.org/10.1155/2019/2384706

id	doaj-26bbabbd2cb9478fa48369145564d9d4
record_format	Article
spelling	doaj-26bbabbd2cb9478fa48369145564d9d4; 2021-07-02T14:15:53Z; eng; Hindawi Limited; Scientific Programming; 1058-9244; 1875-919X; 2019-01-01; 2019; 10.1155/2019/2384706; 2384706
	Xingguang Yang, Huiqun Yu, Guisheng Fan, Kai Shi: Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
	Liqiong Chen: Department of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China
collection	DOAJ
issn	1058-9244 1875-919X
description	Just-in-time software defect prediction (JIT-SDP) is an active topic in software defect prediction that aims to identify defect-inducing changes. Recent studies have found that the variability of defect data sets can affect the performance of defect predictors, and that using local models can help improve prediction performance. However, previous studies have focused on module-level defect prediction. Whether local models remain valid in the context of JIT-SDP is an important open question. To this end, we compare the performance of local and global models through a large-scale empirical study based on six open-source projects with 227,417 changes. The experiment considers three evaluation scenarios: cross-validation, cross-project validation, and timewise cross-validation. To build local models, the experiment uses the k-medoids algorithm to divide the training set into several homogeneous regions. In addition, logistic regression and effort-aware linear regression (EALR) are used to build classification models and effort-aware prediction models, respectively. The empirical results show that local models perform worse than global models in classification performance. However, local models achieve significantly better effort-aware prediction performance than global models in the cross-validation and cross-project validation scenarios. In particular, local models obtain their best effort-aware prediction performance when the number of clusters k is set to 2. Therefore, local models are promising for effort-aware JIT-SDP.

Similar Items