Record and vPlay: Problem Determination with Virtual Replay Across Heterogeneous Systems

Application downtime is one of the major causes of revenue loss in the modern enterprise. While aggressive release schedules cause frail software to be released, application failures occurring in the field cost technical support organizations millions in personnel time. Since developers us...

Bibliographic Details
Main Author: Subhraveti, Dinesh Kumar
Language: English
Published: 2012
Subjects: Computer science
Online Access: https://doi.org/10.7916/D8SQ96C7
id ndltd-columbia.edu-oai-academiccommons.columbia.edu-10.7916-D8SQ96C7
record_format oai_dc
collection NDLTD
language English
sources NDLTD
topic Computer science
description Application downtime is one of the major causes of revenue loss in the modern enterprise. While aggressive release schedules cause frail software to be released, application failures occurring in the field cost technical support organizations millions in personnel time. Since developers usually don't have direct access to the field environment for a variety of privacy and security reasons, problems are reproduced, analyzed and fixed in very different lab environments. However, the complexity and diversity of application environments make it difficult to accurately replicate the production environment. The indiscriminate collection of data provided by bug reports often overwhelms or even misleads the developer. A typical issue requires time-consuming rounds of clarification and interaction with the end user, even after which the issue may not manifest. This dissertation introduces vPlay, a software problem determination system that captures software bugs as they occur in the field into small, self-contained recordings and allows them to be deterministically reproduced across different operating systems and heterogeneous environments. vPlay makes two key advances over the state of the art. First, the recorded bug can be reproduced in a completely different operating system environment without any dependency on the source. vPlay packages every piece of data necessary to correctly reproduce the bug on any stateless target machine in the developer environment, without the application, its binaries, or other support data. Second, the data captured by vPlay is small, typically amounting to a few megabytes. vPlay achieves this without requiring changes to the applications, base kernel or hardware.
vPlay employs a recording mechanism that provides data-level independence between the application and its source environment by adopting a state machine model of the application to capture every piece of state the application accesses. vPlay minimizes the size of the recording through a new technique called partial checkpointing, which efficiently captures the partial intermediate state of the application required to replay just the last few moments of its execution prior to the failure. The recorded state is saved as a partial checkpoint along with metadata representing information specific to the source environment, such as the calling convention used for system calls on the source system, to make it portable across operating systems. A partial checkpoint is loaded by a partial checkpoint loader, which itself is designed to be portable across different operating systems. Partial checkpointing is combined with a logging mechanism, which monitors the application to identify and record relevant accessed state for root cause analysis and to record the application's nondeterministic events. vPlay introduces a new type of virtualization abstraction called the vPlay Container to natively replay an application built for one operating system on another. The vPlay Container relies on the self-contained recording produced by vPlay to decouple the application from the target operating system environment in three key areas. The application is decoupled from (1) the address space and its content, by transparently fulfilling its memory accesses, (2) the instructions and the processor MMU structures such as segment descriptor tables, through a binary translation technique designed specifically for user application code, and (3) the operating system interface and its services, by abstracting the system call interface through emulation and replay.
To facilitate root cause analysis, the vPlay Container integrates with a standard debugger so that the user can set breakpoints and single-step the replayed execution of the application to examine the contents of variables and other program state at each source line. We have implemented a vPlay prototype which can record unmodified Linux applications and natively replay them on different versions of Linux as well as Windows. Experiments with several applications including Apache and MySQL show that vPlay can reproduce real bugs and be used in production with modest recording overhead.
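The two recording ideas in the abstract, saving only the state the application actually touches and logging the results of nondeterministic events, can be illustrated with a toy sketch. This is not vPlay's implementation, which works at the level of processes, memory pages and system calls. The names `Recorder`, `Replayer`, `app` and `record_then_replay` are hypothetical, and a Python dictionary stands in for the application's address space.

```python
import random
import time


class Recorder:
    """Source side: runs the application live and records what it observes."""

    def __init__(self, memory):
        self._memory = memory   # full source-side state (assumed: a plain dict)
        self.checkpoint = {}    # partial checkpoint: only addresses that were read
        self.event_log = []     # results of nondeterministic calls, in order

    def read(self, addr):
        # Save a value the first time it is accessed; untouched state is never copied.
        if addr not in self.checkpoint:
            self.checkpoint[addr] = self._memory[addr]
        return self._memory[addr]

    def syscall(self, name, fn):
        # Execute the real call and log its result for later replay.
        result = fn()
        self.event_log.append((name, result))
        return result


class Replayer:
    """Target side: needs only the recording, not the source environment."""

    def __init__(self, checkpoint, event_log):
        self._checkpoint = checkpoint
        self._events = iter(event_log)

    def read(self, addr):
        # Memory accesses are fulfilled from the partial checkpoint.
        return self._checkpoint[addr]

    def syscall(self, name, fn):
        # The real call is never executed; the logged result is returned instead.
        logged_name, result = next(self._events)
        if logged_name != name:
            raise RuntimeError(f"replay diverged: expected {logged_name}, got {name}")
        return result


def app(env):
    """Toy application whose output depends on state and nondeterministic input."""
    base = env.read("config.base")
    noise = env.syscall("random", lambda: random.randint(0, 1000))
    stamp = env.syscall("time", time.time)
    return (base + noise, stamp)


def record_then_replay(memory):
    """Run the app once live, then replay it from the recording alone."""
    recorder = Recorder(memory)
    live = app(recorder)
    replayed = app(Replayer(recorder.checkpoint, recorder.event_log))
    return live, replayed, recorder
```

Calling `record_then_replay({"config.base": 10, "unused.blob": "x" * 1000})` returns identical live and replayed results, and the checkpoint contains only `config.base`, since `unused.blob` was never read.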
author Subhraveti, Dinesh Kumar
title Record and vPlay: Problem Determination with Virtual Replay Across Heterogeneous Systems
publishDate 2012
url https://doi.org/10.7916/D8SQ96C7
spelling ndltd-columbia.edu-oai-academiccommons.columbia.edu-10.7916-D8SQ96C7 2019-05-09T15:13:42Z Record and vPlay: Problem Determination with Virtual Replay Across Heterogeneous Systems Subhraveti, Dinesh Kumar 2012 Theses Computer science English https://doi.org/10.7916/D8SQ96C7