Integrating event- and state-based approaches to the debugging of parallel programs

Bibliographic Details
Main Author: Kundu, Joydip
Language: English
Published: ScholarWorks@UMass Amherst 1996
Online Access: https://scholarworks.umass.edu/dissertations/AAI9638985
Description
Summary: This dissertation presents a comprehensive solution to the problem of debugging parallel programs. The proposed strategy employs an event-based modeling phase at the highest level, where gross patterns of process interactions are investigated for anomalous behaviors. The results of the modeling phase are used to guide the investigation of the local states of the processes by stopping the computation at a consistent global state specified in terms of abstract events. The primary contributions of the dissertation are: (1) a simple modeling language that is capable of providing precise feedback on an error; (2) a scalable visual feedback mechanism that allows the user to control the balance between the scale and the precision of the feedback with the help of a query language; and (3) a novel abstract event-based breakpoint specification scheme that integrates two disparate strategies for debugging parallel programs. Parallel programs are hard to debug. State-based techniques lack mechanisms to accommodate the sparsity of consistent global states, the large volume of trace data, and the asynchronous execution that characterize parallel program behaviors. Event-based techniques provide abstraction mechanisms to contend with large quantities of data, and can use logical time to filter out the perturbations due to asynchrony. However, they do not provide sufficient feedback on match failures, do not scale well to massive parallelism, and cannot track errors back to the source code. Our approach is to integrate the two techniques into a meaningful whole: event-based modeling is used for its abstraction facilities, and state-based techniques are used to explore local process behaviors in detail. The user is thus presented with a unified debugging framework in which an anomaly detected at the level of abstract behaviors can be traced back to the offending source code.
The need for such an integrated environment has long been recognized, and this dissertation introduces a method by which it can be implemented.