Obtaining Architectural Descriptions from Legacy Systems: The Architectural Synthesis Process (ASP)

Bibliographic Details
Main Author: Waters, Robert Lee
Format: Others
Language: en_US
Published: Georgia Institute of Technology 2005
Subjects:
Online Access: http://hdl.handle.net/1853/4832
Description
Summary: A majority of software development today involves maintenance or evolution of legacy systems. Evolving these legacy systems while maintaining good software design principles is a significant challenge. Research has shown the benefits of using software architecture as an abstraction to analyze quality attributes of proposed designs. Unfortunately, for most legacy systems, a documented software architecture does not exist. Developing a good architectural description frequently requires extensive experience on the part of the developer trying to recover the legacy system's architecture. This work first describes a four-phase process that provides a framework within which architectural recovery activities can be automated. The phases are extraction (obtaining a subset of information about the legacy system from a single source), classification (partitioning the information based upon its viewpoint), union (combining all the information in a particular viewpoint into a candidate view), and fusion (cross-checking all candidate views for consistency). The work then concentrates on the major problem facing automated architectural recovery: the concept assignment problem. To overcome this problem, a technique called semantic approximation is presented and validated via experimental results. Semantic approximation uses a combination of text data mining and a mathematical technique called concept analysis to build a lattice of similar concepts between higher-level domain information and low-level code concepts. The experimental data reveal that while semantic approximation does improve results over the more traditional lexical and topological approaches, it does not yet fully solve the concept assignment problem.
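
Illustration (not part of the original record): the summary mentions concept analysis as the mathematical basis for building a lattice that relates domain terms to code concepts. The following is a minimal sketch of formal concept analysis in general, not the thesis's implementation. The context data, the identifier names, and the function names are all hypothetical, and the brute-force enumeration is only practical for very small inputs.

```python
from itertools import combinations

# Hypothetical toy context: code identifiers (objects) mapped to the domain
# terms (attributes) that some text-mining step associated with them.
CONTEXT = {
    "open_account": {"account", "customer"},
    "close_account": {"account"},
    "print_invoice": {"invoice", "customer"},
}


def common_attributes(objects, context):
    """Return the attributes shared by every object in `objects`.

    For an empty object set this is the full attribute set, by convention.
    """
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(attributes, context):
    """Return the objects that carry every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every object subset.

    This is exponential in the number of objects, so it is a teaching aid
    rather than a scalable algorithm.
    """
    concepts = set()
    objects = sorted(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the most specific (smallest extent) to the most general.
    for extent, intent in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(extent), "<->", sorted(intent))
```

Each resulting pair groups a set of code identifiers with exactly the domain terms they all share. Ordering the pairs by extent inclusion gives the lattice structure that the summary refers to.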