Mining Key Classes in Java Projects by Examining a Very Small Number of Classes: A Complex Network-Based Approach

Key classes have become excellent starting points for developers to understand unknown software systems. Up to now, a variety of approaches have been proposed to mine key classes in a software project. Many of them are based on a network representation (namely, software networks) of the software pro...

Full description

Bibliographic Details
Main Authors: Hao Li, Tian Wang, Weifeng Pan, Muchou Wang, Chunlai Chai, Pengyu Chen, Jiale Wang, Jing Wang
Format: Article
Language:English
Published: IEEE 2021-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/9351963/
Description
Summary:Key classes have become excellent starting points for developers to understand unknown software systems. Up to now, a variety of approaches have been proposed to mine key classes in a software project. Many of them are based on a network representation (namely, software networks) of the software projects. However, the software networks they used are usually un-weighted and un-directed, which is not consistent with the reality in a real software project where the coupling actually has direction and strength. Worse still, the number of key class candidates returned by existing approaches is usually very large. Thus, it is usually infeasible for developers to start the comprehension process from these classes, especially when there are tight time and resource constraints. To tackle these problems, in this paper, we propose an approach named MinClass, to Mine key Classes in Java projects by examining a very small number of classes. First, the software structure at the class level is represented by a weighted directed software network, which considers both the coupling strength and direction between every pair of classes. Second, we propose a new metric, OSE (One-order Structural Entropy), and use it to calculate the importance of each class in the system. Finally, we sort classes in descending order according to their OSE values, and a small number of top-ranked classes are treated as the key class candidates identified by MinClass. Experiments are performed on six open-source Java projects, and comparison studies with other eight state-of-the-art approaches are also performed. Results show that, although no one method performs best in all software systems, MinClass is the most promising one. It performs best in the whole set of software systems according to the average ranking of the Friedman test. Thus, MinClass is a valuable technique that can be used to mine the key classes.
ISSN:2169-3536