Dependability and performance analysis of distributed algorithms for managing replicated data


Bibliographic Details
Main Authors: Ding-Chau Wang, 王鼎超
Other Authors: Chih-Ping Chu
Format: Others
Language: en_US
Published: 2003
Online Access: http://ndltd.ncl.edu.tw/handle/21474899552073101005
Description
Summary: PhD === National Cheng Kung University === Department of Computer Science and Information Engineering === 91 === Data replication is a proven technique for improving the data availability of distributed systems. Past research focused mainly on developing replicated data management algorithms that can be proven correct and improve data availability, while the performance issues associated with data maintenance were largely ignored. In this thesis, we analyze both the dependability and the performance characteristics of distributed algorithms for managing replicated data by developing generic modeling techniques based on Petri nets, with the goal of identifying the environmental conditions under which these algorithms can satisfy system dependability and performance requirements.

First, we investigate an effective technique for calculating the access time distribution of requests that access replicated data maintained by the distributed system, using majority voting as a case study. The technique can be used to estimate the reliability of real-time applications that must access replicated data under a deadline requirement. We then extend this technique to analyze user-perceived dependability and performance properties of quorum-based algorithms. User-perceived dependability and performance metrics differ from conventional ones in that the dependability and performance properties must be assessed from the perspective of the users accessing the system. A feature of the extended technique is that it makes no assumption about the interconnection topology, the number of replicas, or the quorum definition used by the replicated system, so it applies to a wide class of quorum-based algorithms. Our analysis shows that when user-perceived metrics are taken into consideration, the effect of increasing network connectivity and the number of replicas on the availability and dependability perceived by users differs markedly from the effect under conventional metrics.
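A minimal sketch of the majority-voting rule underlying this case study (the function and variable names below are illustrative, not taken from the thesis): an operation on the replicated data succeeds only when a strict majority of replicas is reachable.

```python
def majority_quorum_available(replica_up):
    """Return True if a strict majority of replicas is reachable.

    replica_up: list of booleans, one per replica (True = up and reachable).
    """
    n = len(replica_up)
    # A strict majority is required: more than half of all replicas.
    return sum(replica_up) > n // 2

# With 5 replicas, the data stays available as long as at least 3 are up.
print(majority_quorum_available([True, True, True, False, False]))   # True
print(majority_quorum_available([True, False, True, False, False]))  # False
```

Because both reads and writes must assemble a majority, any two quorums intersect, which is what makes the scheme provably consistent; the access time distribution studied in the thesis arises from how long it takes a request to assemble such a quorum.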
Thus, unlike conventional metrics, user-perceived metrics expose a tradeoff between the hardware invested, i.e., higher network connectivity and more replicas, and the performance and dependability properties perceived by users.

Next, we analyze reconfigurable algorithms to determine how often the system should detect and react to failure conditions, so that reorganization operations can be performed at the appropriate time to improve the availability of replicated data without unduly compromising system performance. We use dynamic voting as a case study to reveal the design trade-offs of such reconfigurable algorithms and to show how often failure detection and reconfiguration activities, carried out by means of dummy updates, should be performed to maximize data availability. Dummy updates are system-initiated maintenance updates that only refresh the system's state regarding the availability of replicated data, without actually changing the value of the replicated data. However, because they acquire locks, dummy updates can block normal user-initiated updates during execution of the conventional two-phase commit (2PC) protocol. We develop a modified 2PC protocol for dummy updates and show that it greatly improves the availability of replicated data compared with the conventional 2PC protocol.

Lastly, we examine the availability and performance characteristics of replicated data in wireless cellular environments, in which users access replicated data through the base stations of the network as they roam among cells. We address when, where, and how to place replicas on the base stations by developing a performance model that analyzes periodic maintenance strategies for managing replicated objects in mobile wireless client-server environments.
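The dynamic-voting idea behind dummy updates can be sketched as follows (names and structure are mine, as illustration, not the thesis's protocol): each successful update records the set of replicas that took part, and the next quorum only needs a majority of that set, so a dummy update performed promptly after a failure shrinks the quorum requirement and keeps the data available through further failures.

```python
def dynamic_quorum(last_participants, up):
    """Attempt a (dummy or real) update under dynamic voting.

    last_participants: set of replicas that took part in the last update.
    up: set of replicas currently reachable.
    Returns the new participant set if a strict majority of the previous
    participants is reachable, else None (data unavailable).
    """
    reachable = last_participants & up
    if len(reachable) * 2 > len(last_participants):
        return reachable  # new, possibly smaller, voting set is installed
    return None

# Start with 5 replicas; two fail. A dummy update installs {a, b, c} as the
# new voting set, so the data later survives yet another failure.
participants = {"a", "b", "c", "d", "e"}
participants = dynamic_quorum(participants, {"a", "b", "c"})
print(sorted(participants))                              # ['a', 'b', 'c']
print(sorted(dynamic_quorum(participants, {"a", "b"})))  # ['a', 'b']
```

Under static majority voting the second step would fail (2 of 5 is not a majority); the timeliness of the dummy update is exactly what the thesis's analysis of detection and reconfiguration frequency quantifies.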
Under a periodic maintenance strategy, the system periodically checks local cells to decide whether a replicated object should be allocated to, or deallocated from, a cell to reduce the access cost. Our performance model accounts for the missing-read cost, the write-propagation cost, and the periodic maintenance cost, with the objective of identifying the periodic maintenance interval that minimizes the overall cost. Our analysis results show that the overall cost is high when the user arrival-departure ratio and the read-write ratio work against each other, and low otherwise. Under the fixed periodic maintenance strategy, i.e., with a constant maintenance interval, there exists an optimal periodic maintenance interval that yields the minimum cost; this optimal interval increases as the arrival-departure ratio and the read-write ratio work in harmony. We also find that adjusting the periodic interval dynamically in response to run-time state changes of the system further reduces the overall cost below that achievable by the fixed periodic maintenance strategy at its optimum.
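The trade-off behind the fixed-interval strategy can be illustrated with a toy cost function (all rates and per-event costs below are made-up numbers; the thesis derives these quantities from its performance model, not from this formula): a longer interval T amortizes the maintenance cost but lets replica placement go stale, raising the missing-read cost.

```python
def total_cost(T, read_rate=5.0, write_rate=1.0,
               miss_cost=2.0, propagate_cost=1.0, maintenance_cost=10.0):
    """Approximate cost per unit time for maintenance interval T:
    reads that miss a locally placed replica, write propagation,
    and one maintenance run every T time units."""
    # Assumption: the fraction of reads missing a well-placed replica grows
    # with T, since placement decisions go stale between maintenance runs.
    stale_fraction = min(1.0, 0.05 * T)
    missing_read = read_rate * stale_fraction * miss_cost
    write_prop = write_rate * propagate_cost
    maintenance = maintenance_cost / T
    return missing_read + write_prop + maintenance

# Scan candidate intervals to locate the minimum-cost T, mirroring the
# observation that the fixed strategy has a single optimal interval.
best_T = min((t / 10 for t in range(1, 200)), key=total_cost)
print(round(best_T, 1))  # 4.5
```

The U-shaped cost curve (maintenance cost falling in T, missing-read cost rising in T) is what guarantees an interior optimum; a dynamic strategy improves on this by, in effect, re-solving for T as the arrival-departure and read-write ratios change at run time.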