Using parallelization techniques to solve the one-to-one shortest path problem

Bibliographic Details
Main Authors: Yung-Hsin Ho, 何永欣
Other Authors: Shih-Chieh Wei
Format: Others
Language: zh-TW
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/81474601782401380327
Description
Summary: Master's === Tamkang University === Master's Program, Department of Information Management === 101 === In recent years, as more mobile devices make it easier for users to get online, demand has grown for more powerful cloud services that can serve more requests with shorter response times. In the past, servers usually handled requests on the CPU. As CPUs are equipped with more cores and threads, parallel computation can provide greater speedup. Currently, OpenMP is the most common parallelization technique for utilizing CPU threads. Meanwhile, general-purpose GPUs, which evolved from graphics display cards, have developed rapidly into a mature parallel computing technology. Currently, CUDA is the most popular compute architecture for exploiting powerful and inexpensive GPU cores. This work uses these parallelization techniques to accelerate the computation of one-to-one shortest paths in a clustered map. To make full use of CPU and GPU hardware resources, each request is classified as an inter-cluster task or an intra-cluster task according to the clusters of its start and end locations. Because inter-cluster tasks take a long time to compute, the GPU, programmed with CUDA, is used to accelerate them. Each intra-cluster task is simply served by one CPU thread running the Dijkstra algorithm, with the goal of maximizing the number of tasks served per second. Our experiments are performed on a real Taiwan road map with 275,195 nodes and 381,172 edges, where clusters are formed by the METIS clustering method. On a platform with two Intel Xeon E5620 CPUs and two NVIDIA Tesla C2050 GPUs, we measure speedup relative to a single-threaded counterpart. For inter-cluster tasks only, we obtain a speedup of 7.7x, or 0.08 ms, when computing only the shortest distance. When the shortest path is also computed, we obtain a speedup of 5.2x, or 0.13 ms. For the whole spectrum of tasks, including both inter-cluster and intra-cluster tasks, we can serve 25,756 tasks per second when computing only the shortest distance. When the shortest path is also computed, we can serve 24,128 tasks per second. For comparison with maps without clusters, several parallel Dijkstra algorithms are also tested. When the shortest path is also computed, our parallel algorithm obtains a speedup of 5.2x on the real Taiwan road map and a speedup of 8x on a random-graph map.
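The task split described in the summary can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not give the CUDA algorithm used for inter-cluster tasks, so the sketch falls back to the same sequential Dijkstra for them, and a Python thread pool stands in for the OpenMP threads (it shows the structure only; Python threads give no real speedup for CPU-bound work). The names build_graph, classify, serve, EDGES and CLUSTER_OF, and the six-node toy graph, are all assumptions made for illustration. The sketch also runs Dijkstra over the whole graph, since the abstract does not say whether intra-cluster searches are restricted to the cluster.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: undirected weighted edges and a node -> cluster map.
EDGES = [(0, 1, 1), (1, 2, 2), (0, 2, 4), (2, 3, 1), (3, 4, 1), (4, 5, 3), (3, 5, 5)]
CLUSTER_OF = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}


def build_graph(edges):
    """Build an adjacency list {node: [(neighbor, weight), ...]}."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def dijkstra(graph, source, target):
    """One-to-one Dijkstra. Returns (distance, path), or (inf, []) if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry
        settled.add(u)
        if u == target:
            break  # one-to-one query: stop once the target is settled
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in settled:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


def classify(request, cluster_of):
    """Label a (start, end) request by the clusters of its two endpoints."""
    start, end = request
    return "intra" if cluster_of[start] == cluster_of[end] else "inter"


def serve(requests, graph, cluster_of, workers=4):
    """Serve intra-cluster tasks one per pool thread; inter-cluster tasks separately."""
    intra = [r for r in requests if classify(r, cluster_of) == "intra"]
    inter = [r for r in requests if classify(r, cluster_of) == "inter"]
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        answers = pool.map(lambda r: dijkstra(graph, r[0], r[1]), intra)
        for request, answer in zip(intra, answers):
            results[request] = answer
    for request in inter:
        # Placeholder: the thesis accelerates these on the GPU with CUDA.
        results[request] = dijkstra(graph, request[0], request[1])
    return results


if __name__ == "__main__":
    toy_graph = build_graph(EDGES)
    for request, (distance, path) in serve([(0, 2), (0, 5)], toy_graph, CLUSTER_OF).items():
        print(request, classify(request, CLUSTER_OF), distance, path)
```

In this toy graph the request (0, 2) stays inside cluster A and is handled by a pool thread, while (0, 5) crosses from cluster A to cluster B and takes the inter-cluster branch.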