Summary: | Communication services such as telephony, broadband and TV are increasingly migrating to Internet Protocol (IP) based networks because of the consolidation of telephone and data networks. Meanwhile, the increasingly wide application of Cloud Computing enables the accommodation of tens of thousands of applications from the general public or enterprise users, which make use of Cloud services on demand through IP networks such as the Internet. Real-Time services over IP (RTIP) have also become increasingly significant due to the convergence of network services, and the real-time needs of the Internet of Things (IoT) will strengthen this trend. Such Real-Time applications have strict Quality of Service (QoS) constraints, posing a major challenge for IP networks. The Cognitive Packet Network (CPN) has been designed as a QoS-driven protocol that addresses user-oriented QoS demands by adaptively routing packets based on online sensing and measurement. Thus, in this thesis we first describe our design for a novel ''Real-Time (RT) traffic over CPN'' protocol which uses QoS goals that match the needs of voice packet delivery in the presence of other background traffic under varied traffic conditions; we present its experimental evaluation via measurements of key QoS metrics such as packet delay, delay variation (jitter) and packet loss ratio. Pursuing our investigation of packet routing in the Internet, we then propose a novel Big Data and Machine Learning approach for real-time Internet-scale Route Optimisation based on Quality of Service using an overlay network, and evaluate its performance. Based on data sampled every $2$ minutes over a large number of source-destination pairs, we observe that intercontinental IP paths are far from optimal with respect to metrics such as end-to-end round-trip delay.
On the other hand, our machine learning based overlay network routing scheme exploits large-scale data collected from communicating node pairs to select overlay paths, while using IP between neighbouring overlay nodes. Measurements from a week-long experiment with several million data points show substantially better end-to-end QoS than is observed with pure IP routing. Pursuing the machine learning approach, we then address the challenging problem of dispatching incoming tasks to servers in Cloud systems so as to offer the best QoS and reliable job execution; we present an experimental system that we have developed (the Task Allocation Platform) and use it to compare several task allocation schemes, including a model-driven algorithm, a reinforcement learning based scheme, and a ''sensible'' allocation algorithm that assigns tasks to sub-systems that are observed to provide lower response time. These schemes are compared via measurements, both among themselves and against a standard round-robin scheduler, on two architectures (with homogeneous hosts and with heterogeneous hosts having different processing capacities), and the conditions under which the different schemes offer better QoS are discussed. Since Cloud systems include both locally based servers at user premises and remote servers and multiple Clouds that can be reached over the Internet, we also describe a smart distributed system that combines local and remote Cloud facilities, allocates tasks dynamically to the service that offers the best overall QoS, and includes a routing overlay that minimises network delay for data transfer between Clouds. The Internet-scale experiments that we report demonstrate the effectiveness of our approach in adaptively distributing workload across multiple Clouds.
|