Summary: | 碩士 === 國立虎尾科技大學 === 資訊工程系碩士班 === 105 === In recent years, internet and network platform fast-growing, such as social networks, network traffic, search engine, online audio and video content, online transactions and more data to generate the rapid growth of a large amount of data, called Big Data. If we want top process and analytics, it need to use cloud computing, and the Apache Software Foundation of Hadoop project is by far the most commonly used for macros of data analysis of open source cluster computing framework, Hadoop its full infrastructure for the cloud computing offers distributed computing technologies.
The virtualizations a cloud computing architecture of the key technology, dynamic resource adapter to the virtual machine by using virtualization, the resources of dispatch greater flexibility, and fault tolerance can avoid hardware in the planning of downtime and result in a disruption of service.
In this paper, we will use VMware vSphere virtualization software, deploy Hadoop ecosystem on the virtualization platform and design the network architecture for server cluster and storage cluster. Install the Hadoop nodes on the virtual machine, and deploy multi Hadoop cluster of different nodes, executive program for computing and analytics by using dataset and evaluate the cluster performance.
|