Optimization of proactive services with uncertain predictions

Bibliographic Details
Published:
Online Access:http://hdl.handle.net/2047/D20399924
id ndltd-NEU--neu-bz60fb405
record_format oai_dc
collection NDLTD
sources NDLTD
description The Internet faces significant challenges from the dramatic growth in traffic and computation workload generated by highly diverse applications. With the evolution of technologies such as machine learning and data science, proactive services aided by predictive information have been recognized as a promising way to exploit network bandwidth, storage, and computation resources to improve user experience. To better understand the challenges facing the Internet, we introduce the background in Chapter 1, followed by a thorough survey of the existing literature on prediction algorithms, proactive caching, and proactive computing. Specifically, we discuss analytical work on optimization problems in proactive algorithms, which is closely related to our own. Our primary goal is to investigate the fundamental performance improvement that proactive services can achieve under uncertain predictions. We aim to analyze the queueing behavior of a proactive system and to design proactive strategies that optimize system performance in terms of the limiting fraction of proactive work and the average delay.
In Chapter 2, we study a proactive caching system where files can be served partially under uncertain predictions. We first propose a potential request process in which each request is realized with a certain probability, characterizing the uncertainty in predictions. For a general proactive system, we derive an upper bound on the average amount of proactive service per request that the system can support. We then analyze the behavior of a family of threshold-based proactive strategies and show that the average amount of proactive service per request can be maximized by properly selecting the threshold. Finally, we propose the UNIFORM strategy, which is the optimal threshold-based strategy, and show that it outperforms the most commonly used Earliest-Deadline-First (EDF) type proactive strategies in terms of delay. We perform extensive numerical experiments to demonstrate the influence of the threshold on delay performance under threshold-based strategies, and we compare the EDF and UNIFORM strategies to verify our results.
In Chapter 3, we study a more general proactive service problem under uncertain predictions. We propose a more general service model in which service times follow an exponential distribution and services cannot be partially finished. As in Chapter 2, we derive an upper bound on the fraction of services that can be finished proactively under uncertain predictions in a general proactive service system. Specifically, we analyze a family of fixed-probability (FIXP) proactive strategies in two proactive systems: the Genie-Aided system and the Realistic Proactive system. We obtain the optimal FIXP strategies in both systems and prove that they maximize the limiting fraction of proactive service among all proactive strategies and minimize the average delay among FIXP strategies. Extensive numerical experiments show how the FIXP parameter affects the limiting fraction of proactive work and the average delay in both proactive systems, and verify our theoretical results in multiple scenarios.
As a complementary chapter, Chapter 4 introduces our work on the SDN-Aided NDN for Data-Intensive Experiments (SANDIE) project, which aims to apply the novel Named-Data Networking (NDN) architecture as a networking solution for data-intensive scientific applications. As background, we first introduce the data-intensive high-energy physics applications of CMS at CERN, the Named-Data Networking architecture, the VIP joint caching and forwarding framework, and an overview of the SANDIE project. We then elaborate on our designs and achievements in this project in chronological order. We first study the CMS experimental data formats and workflow from several data analysis systems at CERN to understand CMS traffic patterns. We then present results of our packet-level network simulations of the VIP framework on the CMS network topology: one from an early stage of the project, exploring potentially beneficial caching locations, and an updated one, based on our analysis of CMS datasets and workflows, verifying the potential system performance improvement from VIP algorithms. To accommodate the specific patterns in the CMS network, we modified our VIP framework to address the challenges of CMS applications. In the last section, we introduce the deployment of the continental SANDIE testbed, implementations of the VIP framework in two NDN forwarders, a milestone demonstration at SuperComputing '19, and an overview of the latest progress and future directions of the SANDIE project.
title Optimization of proactive services with uncertain predictions
publishDate
url http://hdl.handle.net/2047/D20399924
_version_ 1719406526375919616
spelling ndltd-NEU--neu-bz60fb405 2021-05-26T05:11:05Z