Scheduling in distributed stream processing systems

Stream processing systems receive continuous streams of messages with relatively raw information and produce streams of messages with processed information. The utility of a stream-processing system depends, in part, on the accuracy and timeliness of the output. Streams in complex event processing s...

Bibliographic Details
Main Author: Khorlin, Andrey
Format: Others
Published: 2006
Online Access: https://thesis.library.caltech.edu/2012/1/thesis.pdf
Khorlin, Andrey (2006) Scheduling in distributed stream processing systems. Master's thesis, California Institute of Technology. doi:10.7907/4MH9-9104. https://resolver.caltech.edu/CaltechETD:etd-05242006-175006
id ndltd-CALTECH-oai-thesis.library.caltech.edu-2012
record_format oai_dc
spelling ndltd-CALTECH-oai-thesis.library.caltech.edu-2012 2019-12-22T03:06:58Z Thesis NonPeerReviewed application/pdf https://thesis.library.caltech.edu/2012/
collection NDLTD
format Others
sources NDLTD
description Stream processing systems receive continuous streams of messages with relatively raw information and produce streams of messages with processed information. The utility of a stream-processing system depends, in part, on the accuracy and timeliness of the output. Streams in complex event processing systems are processed on distributed systems; several steps are taken on different processors to process each incoming message, and messages may be enqueued between steps. This work explores the problem of distributed dynamic control of streams to optimize the total utility provided by the system. A system can be controlled using central control or distributed control. In the former case, a single central controller maintains the state of the entire system and controls the operation of all processors. In the latter, each processor controls itself based on its own state and information from other processors. A challenge of distributed control is that the timeliness of output depends only on the total end-to-end time and is otherwise independent of the delay at each separate processor, whereas the controller for each processor acts only on the steps running on that processor and cannot directly control the entire network. In this work, we discuss a framework for the design and analysis of control-based scheduling algorithms for a distributed stream processing system and illustrate the framework with two concrete scheduling algorithms.
author Khorlin, Andrey
title Scheduling in distributed stream processing systems
publishDate 2006
url https://thesis.library.caltech.edu/2012/1/thesis.pdf