Experimental Investigation of Container-based Virtualization Platforms For a Cassandra Cluster
Context. Cloud computing is growing fast and has established itself as the next-generation software infrastructure. A major role in cloud computing is the virtualization of hardware to isolate systems from each other. This virtualization is often done with Virtual Machines that emulate both hardware an...
Main Authors: | Sulewski, Patryk; Jesper, Hallborg |
---|---|
Format: | Others |
Language: | English |
Published: | 2017 |
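Subjects: | Container Virtualization; Cassandra; Docker; LXC; Big data; Microservices; Linux distributions; Computer Systems; Datorsystem |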
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:bth-14544 |
id |
ndltd-UPSALLA1-oai-DiVA.org-bth-14544 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-bth-14544; 2017-06-22T05:34:59Z; Experimental Investigation of Container-based Virtualization Platforms For a Cassandra Cluster; eng; Sulewski, Patryk; Jesper, Hallborg; 2017; Container Virtualization; Cassandra; Docker; LXC; Big data; Microservices; Linux distributions; Computer Systems; Datorsystem; Student thesis; info:eu-repo/semantics/bachelorThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:bth-14544; application/pdf; info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Container Virtualization Cassandra Docker LXC Big data Microservices Linux distributions Computer Systems Datorsystem |
spellingShingle |
Container Virtualization Cassandra Docker LXC Big data Microservices Linux distributions Computer Systems Datorsystem Sulewski, Patryk Jesper, Hallborg Experimental Investigation of Container-based Virtualization Platforms For a Cassandra Cluster |
description |
Context. Cloud computing is growing fast and has established itself as the next-generation software infrastructure. Virtualization of hardware plays a major role in cloud computing by isolating systems from each other. This virtualization is often done with virtual machines that emulate both hardware and software, which in turn makes process isolation expensive. New techniques, known as microservices or containers, have been developed to deal with the overhead. The infrastructure is coupled with storing, processing, and serving vast, unstructured data sets. The overall cloud system needs to have high performance while providing scalability and easy deployment. Microservices can be introduced for all kinds of applications in a cloud computing network, and may be a better fit for certain products. Objectives. In this study we investigate how a small system consisting of a Cassandra cluster performs while encapsulated in LXC and Docker containers, compared to a non-virtualized structure. A specific loader is built to stress the cluster and find the limits of the containers. Methods. We constructed an experiment on a three-node Cassandra cluster. Test data is sent by the Cassandra-loader from another server in the network. The Cassandra processes are then deployed in the different architectures and tested. During these tests, CPU, disk I/O, and network I/O metrics are monitored on the four servers. The metric data is used in statistical analysis to find significant deviations. Results. Three experiments were conducted and monitored. The cluster test showed that isolated Docker containers exhibit major latency during disk reads. A local stress test further confirmed those results. The step-wise test, in turn, implied that the disk read latencies occurred because isolated Docker containers need to read more data to handle these requests. All microservices introduce some overhead, but fall behind the most for read requests. Conclusions. The results of this study show that virtualizing Cassandra nodes in a cluster introduces latency for write operations compared to a non-virtualized solution. However, those latencies can be neglected if scalability of the system is the main focus. For read operations, all microservices had reduced performance, and isolated Docker containers had the highest overhead. This is due to the file system used in those containers, which makes disk I/O slower compared to the other structures. If a Cassandra cluster is to be launched in a container environment, we recommend a Docker container with mounted disks to bypass Docker's file system, or an LXC solution. |
author |
Sulewski, Patryk Jesper, Hallborg |
author_facet |
Sulewski, Patryk Jesper, Hallborg |
author_sort |
Sulewski, Patryk |
title |
Experimental Investigation of Container-based Virtualization Platforms For a Cassandra Cluster |
title_short |
Experimental Investigation of Container-based Virtualization Platforms For a Cassandra Cluster |
title_full |
Experimental Investigation of Container-based Virtualization Platforms For a Cassandra Cluster |
title_fullStr |
Experimental Investigation of Container-based Virtualization Platforms For a Cassandra Cluster |
title_full_unstemmed |
Experimental Investigation of Container-based Virtualization Platforms For a Cassandra Cluster |
title_sort |
experimental investigation of container-based virtualization platforms for a cassandra cluster |
publishDate |
2017 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:bth-14544 |
work_keys_str_mv |
AT sulewskipatryk experimentalinvestigationofcontainerbasedvirtualizationplatformsforacassandracluster AT jesperhallborg experimentalinvestigationofcontainerbasedvirtualizationplatformsforacassandracluster |
_version_ |
1718461968867655680 |