Qualitative Performance Analysis for Large-Scale Scientific Workflows

Today, large-scale scientific applications are both data-driven and distributed. To support the scale and inherent distribution of these applications, significant heterogeneous and geographically distributed resources are required over long periods of time to ensure adequate performance. F...


Bibliographic Details
Main Author:	Buneci, Emma
Other Authors:	Reed, Daniel A
Format:	Others
Language:	en_US
Published:	2008
Subjects:	Computer Science; feature selection; time series analysis; performance analysis; signatures; scientific applications
Online Access:	http://hdl.handle.net/10161/695

id	ndltd-DUKE-oai-dukespace.lib.duke.edu-10161-695
record_format	oai_dc
spelling	ndltd-DUKE-oai-dukespace.lib.duke.edu-10161-695; 2013-01-07T20:07:07Z; Qualitative Performance Analysis for Large-Scale Scientific Workflows; Buneci, Emma; Computer Science; feature selection; time series analysis; performance analysis; signatures; scientific applications; Dissertation; Reed, Daniel A; 2008-05-30; Dissertation; 4788576 bytes; application/pdf; http://hdl.handle.net/10161/695; en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
topic	Computer Science; feature selection; time series analysis; performance analysis; signatures; scientific applications
description	<p>Today, large-scale scientific applications are both data-driven and distributed. To support their scale and inherent distribution, significant heterogeneous and geographically distributed resources are required over long periods of time to ensure adequate performance. Furthermore, the behavior of these applications depends on a large number of factors related to the application, the system software, the underlying hardware, and other running applications, as well as potential interactions among these factors.</p> <p>Most Grid application users are primarily concerned with obtaining the result of the application as fast as possible, without worrying about the details involved in monitoring and understanding the factors that affect application performance. In this work, we aim to provide application users with a simple and intuitive performance evaluation mechanism for use during the execution of their long-running Grid applications or workflows. Our performance evaluation mechanism provides a qualitative and periodic assessment of the application's behavior by informing the user whether the application's performance is expected or unexpected. Furthermore, it can help improve overall application performance by informing and guiding fault-tolerance services when the application exhibits persistent unexpected performance behaviors.</p> <p>This thesis addresses the hypotheses that, in order to qualitatively assess application behavioral states in long-running scientific Grid applications, (1) it is necessary to extract temporal information from performance time series data, and (2) it is sufficient to extract variance and pattern as specific examples of temporal information. Evidence supporting these hypotheses can lead to the ability to qualitatively assess the overall behavior of the application and, if needed, to offer a most likely diagnosis of the underlying problem.</p> <p>To test the stated hypotheses, we develop and evaluate a general <em>qualitative performance analysis</em> framework that incorporates (a) techniques from time series analysis and machine learning to extract and learn, from data, the structural and temporal features associated with application performance in order to reach a qualitative interpretation of the application's behavior, and (b) mechanisms and policies to reason over time and across the distributed resource space about the behavior of the application.</p> <p>Experiments with two scientific applications, from meteorology and astronomy, compare signatures generated from instantaneous values of performance data with those generated from temporal characteristics. The results support the first hypothesis: temporal information must be extracted from performance time series data to accurately interpret the behavior of these applications. Furthermore, temporal signatures incorporating variance and pattern information generated for these applications have distinct characteristics during well-performing versus poorly performing executions. This leads to the framework's accurate classification of instances of similar behaviors, which is supporting evidence for the second hypothesis. The proposed framework's ability to generate a qualitative assessment of performance behavior for scientific applications using temporal information present in performance time series data represents a step towards simplifying and improving the quality of service for Grid applications.</p> === Dissertation
author2	Reed, Daniel A
author_facet	Reed, Daniel A Buneci, Emma
author	Buneci, Emma
author_sort	Buneci, Emma
title	Qualitative Performance Analysis for Large-Scale Scientific Workflows
title_short	Qualitative Performance Analysis for Large-Scale Scientific Workflows
title_full	Qualitative Performance Analysis for Large-Scale Scientific Workflows
title_fullStr	Qualitative Performance Analysis for Large-Scale Scientific Workflows
title_full_unstemmed	Qualitative Performance Analysis for Large-Scale Scientific Workflows
title_sort	qualitative performance analysis for large-scale scientific workflows
publishDate	2008
url	http://hdl.handle.net/10161/695
work_keys_str_mv	AT buneciemma qualitativeperformanceanalysisforlargescalescientificworkflows
_version_	1716473372421914624

