Clustering Classes in Packages for Program Comprehension
During software maintenance and evolution, one of the important tasks faced by developers is to understand a system quickly and accurately. With the increasing size and complexity of an evolving system, program comprehension becomes an increasingly difficult activity. Given a target system for compr...
Main Authors: | Xiaobing Sun, Xiangyue Liu, Bin Li, Bixin Li, David Lo, Lingzhi Liao |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2017-01-01 |
Series: | Scientific Programming |
Online Access: | http://dx.doi.org/10.1155/2017/3787053 |
id |
doaj-e6f041937f874a71bd1633b1a0c7ae2b |
---|---|
record_format |
Article |
spelling |
doaj-e6f041937f874a71bd1633b1a0c7ae2b; 2021-07-02T05:42:26Z; eng; Hindawi Limited; Scientific Programming; 1058-9244; 1875-919X; 2017-01-01; 2017; 10.1155/2017/3787053; 3787053; Clustering Classes in Packages for Program Comprehension; Xiaobing Sun (0); Xiangyue Liu (1); Bin Li (2); Bixin Li (3); David Lo (4); Lingzhi Liao (5); School of Information Engineering, Yangzhou University, Yangzhou, China; School of Information Engineering, Yangzhou University, Yangzhou, China; School of Information Engineering, Yangzhou University, Yangzhou, China; School of Computer Science and Engineering, Southeast University, Nanjing, China; School of Information Systems, Singapore Management University, Singapore; Nanjing University of Information Science & Technology, Nanjing, China; http://dx.doi.org/10.1155/2017/3787053 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Xiaobing Sun Xiangyue Liu Bin Li Bixin Li David Lo Lingzhi Liao |
title |
Clustering Classes in Packages for Program Comprehension |
publisher |
Hindawi Limited |
series |
Scientific Programming |
issn |
1058-9244 1875-919X |
publishDate |
2017-01-01 |
description |
During software maintenance and evolution, one of the important tasks faced by developers is to understand a system quickly and accurately. As an evolving system grows in size and complexity, program comprehension becomes increasingly difficult. Given a target system, developers may first focus on comprehending its packages, which differ in size. Small-sized packages are easy for developers to comprehend, but large-sized packages are difficult to understand. In this article, we focus on these large-sized packages and propose a novel program comprehension approach that uses the Latent Dirichlet Allocation (LDA) model to cluster the classes they contain. The large-sized packages are thus separated into small-sized clusters, which are easier for developers to comprehend. Empirical studies on four real-world software projects demonstrate the effectiveness of our approach. The results show that our approach is more effective than Latent Semantic Indexing- (LSI-) and Probabilistic Latent Semantic Analysis- (PLSA-) based clustering approaches. In addition, we find that the topic that labels each cluster is useful for program comprehension. |
url |
http://dx.doi.org/10.1155/2017/3787053 |