Memory Optimizations for Distributed Stream-based Applications

Distributed stream-based applications manage large quantities of data and exhibit unique production and consumption patterns that set them apart from general-purpose applications. This dissertation examines possible ways of creating more efficient memory management schemes. Specifically, it looks at...

Bibliographic Details
Main Author: Harel, Nissim
Format: Others
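Language: en_US
Published: Georgia Institute of Technology 2007
Subjects: Simulations; Resource management; Streaming applications; Garbage collection; Memory management; Streaming technology (Telecommunications); Computer simulation; Electronic data processing
Online Access: http://hdl.handle.net/1853/13988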
id ndltd-GATECH-oai-smartech.gatech.edu-1853-13988
record_format oai_dc
spelling ndltd-GATECH-oai-smartech.gatech.edu-1853-13988
2013-01-07T20:16:22Z
Memory Optimizations for Distributed Stream-based Applications
Harel, Nissim
Simulations
Resource management
Streaming applications
Garbage collection
Memory management
Streaming technology (Telecommunications)
Computer simulation
Electronic data processing
Georgia Institute of Technology
2007-03-27T18:11:39Z
2007-03-27T18:11:39Z
2006-11-01
Dissertation
1920704 bytes
application/pdf
http://hdl.handle.net/1853/13988
en_US
collection NDLTD
language en_US
format Others
sources NDLTD
topic Simulations
Resource management
Streaming applications
Garbage collection
Memory management
Streaming technology (Telecommunications)
Computer simulation
Electronic data processing
description Distributed stream-based applications manage large quantities of data and exhibit unique production and consumption patterns that set them apart from general-purpose applications. This dissertation examines ways of creating more efficient memory management schemes for such applications, focusing on the memory reclamation problem. It takes advantage of special traits of streaming applications to extend the definition of the garbage collection problem so that it includes not only data items that are unreachable but also items that have no effect on the final outcome of the application. Streaming applications typically fully process only a portion of the data, and resources directed towards the remaining data items (i.e., those that don't affect the final outcome) can be viewed as wasted resources that should be minimized. Two complementary approaches are suggested: (1) Garbage Identification and (2) Adaptive Resource Utilization.
Garbage Identification analyzes dynamic data dependencies to infer which items the application will no longer access. Several garbage identification algorithms are examined. Each algorithm uses a set of application properties (possibly distinct from those used by the others) to reduce the memory consumption of the application. The performance of these algorithms is compared to that of an ideal garbage collector, using a novel logging/post-mortem analyzer. The results indicate that the algorithms that achieve a low memory footprint (close to that of an ideal garbage collector) make their garbage identification decisions locally; however, they base these decisions on best-effort global information obtained from other components of the distributed application.
The Adaptive Resource Utilization (ARU) algorithm analyzes the dynamic relationships between the production and consumption of data items. It uses this information to infer the capacity of the system to process data items and adjusts data generation accordingly. The ARU algorithm makes local capacity decisions based on best-effort global information. This algorithm is found to be as effective as the most successful garbage identification algorithm in reducing the memory footprint of stream-based applications, thus confirming the observation that using best-effort global information to make local decisions is fundamental to reducing memory consumption for stream-based applications.
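The ARU idea summarized in the description (infer processing capacity from observed consumption, then adjust data generation locally) can be illustrated with a small sketch. This is not the dissertation's algorithm: the class `AdaptiveProducer`, its method names, the per-consumer "seconds per item" reports, and the `smoothing` factor are all hypothetical choices made for illustration. The sketch assumes only that a producer occasionally receives best-effort, possibly stale, reports from downstream consumers and paces itself to the slowest one it knows about.

```python
class AdaptiveProducer:
    """Hypothetical producer that paces data generation to reported consumption."""

    def __init__(self, initial_interval=0.1, smoothing=0.5):
        # Seconds the producer currently waits between generated items.
        self.interval = initial_interval
        # How strongly each adjustment moves toward the inferred capacity (0..1).
        self.smoothing = smoothing
        # Latest best-effort report per consumer: seconds needed per item.
        self.reports = {}

    def report_consumption(self, consumer_id, seconds_per_item):
        """Record a best-effort report; reports may arrive late or not at all."""
        if seconds_per_item > 0:
            self.reports[consumer_id] = seconds_per_item

    def next_interval(self):
        """Make a local pacing decision from whatever global information exists."""
        if not self.reports:
            # No information yet: keep the current pace.
            return self.interval
        # The slowest known consumer bounds what the pipeline can process.
        target = max(self.reports.values())
        # Move part of the way toward the target to avoid oscillation.
        self.interval += self.smoothing * (target - self.interval)
        return self.interval


if __name__ == "__main__":
    producer = AdaptiveProducer(initial_interval=0.1, smoothing=0.5)
    producer.report_consumption("tracker", 0.3)
    producer.report_consumption("display", 0.2)
    for _ in range(3):
        print(round(producer.next_interval(), 4))
```

Generating items no faster than they can be consumed means fewer items are produced only to be discarded, which is the memory effect the description attributes to ARU. The smoothing step is one design choice among many; a real system would also need to expire stale reports.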
author Harel, Nissim
title Memory Optimizations for Distributed Stream-based Applications
publisher Georgia Institute of Technology
publishDate 2007
url http://hdl.handle.net/1853/13988