Summary: | Master's === Tatung University === Graduate Institute of Computer Science and Engineering === 89 === With widespread networking, the public WWW environment, and platform-independent Java bytecode, millions of Java-capable computers can now be connected to share computing power. These heterogeneous machines (supercomputers, workstations, personal computers, and laptops) can be merged into a pool of distributed Java virtual machines whose large number of computing cycles can be exploited for CPU-intensive applications. To provide a robust distributed environment, this thesis proposes a Fault-Tolerance Framework for Java-Based Distributed Computing System (FJDCS). The most important advantage of the system is that it provides an enhanced, configurable fault-tolerance mechanism to legacy Java applications. In a highly unreliable networking environment such as a public computing pool, the RMI mechanism by itself lacks a robust fault-tolerance mechanism to ensure that every computation is completed in an iteration. We extended the RMI API and combined it with a replication mechanism, which can be categorized as active replication, to build the FJDCS API. Programmers can extend our API directly and do not need to modify their legacy applications to obtain this fault tolerance. In most cases, an application is carried out by many cooperating tasks. In the proposed system, every task is replicated into two or more instances, which are dispatched concurrently to different computing nodes. As soon as any one of the computing nodes processing instances of the same task completes its operation, the task is complete. In a highly unreliable network, the number of clones per task can be configured to ensure that at least one computing node completes the task.
|
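The replication scheme described in the summary (several clones of one task, dispatched concurrently, with the first successful completion finishing the task) can be illustrated with a minimal sketch. This is not the FJDCS API, which the summary does not show: local threads stand in for remote computing nodes, the standard `ExecutorService.invokeAny` call stands in for the RMI-based dispatch, and the names `ReplicatedTaskSketch`, `runReplicated`, and `replicaForNode` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntFunction;

public class ReplicatedTaskSketch {

    /**
     * Dispatches `clones` replicas of one task concurrently and returns the
     * result of the first replica that completes without throwing.
     * Hypothetical helper: threads in `pool` play the role of computing nodes.
     */
    static <T> T runReplicated(IntFunction<Callable<T>> replicaForNode,
                               int clones,
                               ExecutorService pool)
            throws InterruptedException, ExecutionException {
        List<Callable<T>> replicas = new ArrayList<>();
        for (int node = 0; node < clones; node++) {
            replicas.add(replicaForNode.apply(node));
        }
        // invokeAny returns the result of one replica that completed
        // successfully and cancels the others; it throws ExecutionException
        // only if every replica fails.
        return pool.invokeAny(replicas);
    }

    public static void main(String[] args) throws Exception {
        int clones = 3; // configurable number of clones per task
        ExecutorService pool = Executors.newFixedThreadPool(clones);
        try {
            // Assumption for the demo: node 0 is unreachable, the others work.
            IntFunction<Callable<Integer>> replicaForNode = node -> () -> {
                if (node == 0) {
                    throw new IllegalStateException("node 0 unreachable");
                }
                int sum = 0;
                for (int i = 1; i <= 100; i++) {
                    sum += i;
                }
                return sum;
            };
            int result = ReplicatedTaskSketch.<Integer>runReplicated(replicaForNode, clones, pool);
            System.out.println("task result: " + result);
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch, raising `clones` costs more redundant computation but lets the task tolerate more failed replicas, which mirrors the configurable trade-off the summary describes.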