Resource allocation in streaming environments

The proliferation of the Internet and sensor networks has fueled the development of applications that process, analyze, and react to continuous data streams in a near-real-time manner. Examples of such stream applications include network traffic monitoring, intrusion detection, financial services, l...

Full description

Bibliographic Details
Main Author: Tian, Lu
Format: Others
Published: 2006
Online Access:https://thesis.library.caltech.edu/2114/1/thesis.pdf
Tian, Lu (2006) Resource allocation in streaming environments. Master's thesis, California Institute of Technology. doi:10.7907/35Y5-H853. https://resolver.caltech.edu/CaltechETD:etd-05262006-165801 <https://resolver.caltech.edu/CaltechETD:etd-05262006-165801>
Description
Summary:The proliferation of the Internet and sensor networks has fueled the development of applications that process, analyze, and react to continuous data streams in a near-real-time manner. Examples of such stream applications include network traffic monitoring, intrusion detection, financial services, large-scale reconnaissance, and surveillance. Unlike tasks in traditional scheduling problems, these stream processing applications are interacting repeating tasks, where iterations of computation are triggered by the arrival of new inputs. Furthermore, these repeated tasks are elastic in the quality of service, and the economic value of a computation depends on the time taken to execute it; for example, an arbitrage opportunity can disappear in seconds. Given limited resources, it is not possible to process all streams without delay. The more resource available to a computation, the less time it takes to process the input, and thus the more value it generates. Therefore, efficiently utilizing a network of limited distributed resources to optimize the net economic value of computations forms a new paradigm in the well-studied field of resource allocation. We propose using a new performance model and resource reservation system as the solution space, and present two scheduling/resource allocation heuristics for processing streams in a distributed heterogenous computing environment to optimize economic value. Both heuristics are based on market mechanisms; one uses a centralized market and the other decentralized markets. We prove bounds on performance and present measurements to show that the performances of these two heuristics are near-optimal and significantly better than straightforward load-balancing heuristics.