Summary: | Master's === National Changhua University of Education === Department of Computer Science and Information Engineering === 96 === Most job scheduling policies for Grid systems focus on minimizing the makespan of the whole system and neglect users' reliability requirements. This work therefore designs genetic-algorithm-based scheduling strategies that incorporate four fault-tolerance techniques for the Grid environment: Retry, Migration, Checkpoint, and Replication. The scheduling algorithm also takes into account the risk relationship between jobs and nodes to improve system reliability.
Besides the four fault-tolerant strategies described above, we also design a new fault-tolerant algorithm called Integration, which combines the four fault-tolerance schemes. Because of this heterogeneity, the algorithm can be applied widely across many kinds of Grid systems.
According to the simulation results, the fault-tolerant algorithms outperform the risky algorithm in makespan, average turnaround time, and job failure rate. The Checkpoint algorithm performs best among all the algorithms because its fault-tolerance scheme avoids a large amount of wasted job execution time. The Retry algorithm, on the other hand, is recommended for systems where jobs are usually small, because of its simplicity. Finally, the Replication algorithm is not suitable for the Grid since it imposes too much overhead.
|
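The claim that Checkpoint saves wasted execution time can be illustrated with a minimal sketch. It is not the thesis's simulation model: the single-failure setting, the function names `retry_time` and `checkpoint_time`, and the parameters `interval` and `overhead` are all assumptions made for illustration, with time measured in integer units.

```python
def retry_time(job_length, failed_at):
    """Total time when one failure forces the job to restart from scratch.

    Assumption: exactly one failure occurs, after `failed_at` units of
    work have been done (0 <= failed_at < job_length).
    """
    # All work done before the failure is lost, then the job runs in full.
    return failed_at + job_length


def checkpoint_time(job_length, failed_at, interval, overhead):
    """Total time when the job rolls back to its latest checkpoint.

    Assumption: a checkpoint is written after every `interval` units of
    work (none at completion), and each one costs `overhead` time units.
    """
    # Only the work done since the latest checkpoint has to be repeated.
    lost_work = failed_at % interval
    # Checkpoints fall at multiples of `interval` strictly below job_length.
    checkpoints_written = (job_length - 1) // interval
    return job_length + lost_work + overhead * checkpoints_written


if __name__ == "__main__":
    # Long job failing late: Retry repeats 95 units, Checkpoint only 5.
    print(retry_time(100, 95))               # 195
    print(checkpoint_time(100, 95, 10, 1))   # 114
    # Short job with frequent checkpoints: the overhead outweighs the gain.
    print(retry_time(4, 2))                  # 6
    print(checkpoint_time(4, 2, 1, 1))       # 7
```

Under these assumptions the gap grows with job length and with how late the failure occurs, while for short jobs the checkpoint overhead can cancel the benefit, which is in line with the recommendation of Retry for small jobs.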