Design and evaluation of new search paradigms and power management for peer-to-peer file sharing


Bibliographic Details
Main Author: Perera, Graciela
Format: Others
Published: Scholar Commons 2007
Subjects:
P2P
Online Access: http://scholarcommons.usf.edu/etd/2319
http://scholarcommons.usf.edu/cgi/viewcontent.cgi?article=3318&context=etd
Description
Summary: Current estimates are that more than nine million PCs in the U.S. are part of peer-to-peer (P2P) file sharing overlay networks on the Internet. These P2P hosts generate about 20% of the traffic on the Internet and consume about 7.8 TWh/yr of electricity, equal to about $630 million per year. File search in a P2P network is based on a wasteful paradigm of broadcasting query messages. Reducing P2P overhead traffic to cut bandwidth waste, and enabling power management to cut electricity usage, are therefore of great interest. In this dissertation, two new search paradigms with reduced overhead traffic are investigated. The new Targeted Search method uses statistics from previous searches to target future searches. Targeted Search is shown to reduce query overhead traffic compared to the broadcast-based search used by Gnutella. The new Broadcast Updates with Local Look-up Search (BULLS) protocol reduces overhead traffic by enabling a local look-up of shared files, and it enables new capabilities including power management. BULLS hosts periodically broadcast changes in their list of shared files and build a table of the files shared by all other hosts. Power management in P2P networks is studied as an application of the minimum set cover problem. A reduction in overall energy consumption is achieved by powering down hosts whose shared files are all also shared (covered) by other hosts. A new set cover heuristic, called the Random Map Out (RMO) algorithm, is introduced and compared to the well-known Greedy heuristic. The algorithms are evaluated for minimum set cover size and computational complexity (number of comparisons). The RMO algorithm requires significantly fewer comparisons than Greedy and still achieves a set cover size within a few percent of that of Greedy. Additionally, the RMO algorithm can be distributed and independently executed by each host with reduced complexity per host, whereas distributing the Greedy heuristic does not reduce its complexity. With RMO there is a non-zero probability of a given file being "lost" (not in the set cover). The probability of this event is modeled, and numerical results show that the probability of a file being lost is practically insignificant.
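
To illustrate how power management maps onto set cover, here is a minimal Python sketch of the well-known Greedy heuristic that the summary uses as its baseline. It is not taken from the dissertation and is not the RMO algorithm, whose details the summary does not give. The function name `greedy_set_cover`, the host labels, and the file names are all hypothetical. Hosts in the returned cover stay powered on. Every other host has all of its files covered and could be powered down.

```python
def greedy_set_cover(shared):
    """Textbook greedy set cover (illustrative sketch only).

    shared: dict mapping a host name to the set of files it shares.
    Returns the list of hosts chosen for the cover, in selection order.
    """
    # Every file shared anywhere in the network must stay available.
    uncovered = set().union(*shared.values())
    cover = []
    while uncovered:
        # Pick the host covering the most still-uncovered files.
        # Iterating in sorted order makes ties resolve deterministically.
        best = max(sorted(shared), key=lambda h: len(shared[h] & uncovered))
        cover.append(best)
        uncovered -= shared[best]
    return cover


# Hypothetical example: four hosts sharing four files.
hosts = {
    "A": {"f1", "f2", "f3"},
    "B": {"f2"},
    "C": {"f3", "f4"},
    "D": {"f4"},
}
keep_on = greedy_set_cover(hosts)
power_down = sorted(set(hosts) - set(keep_on))
print(keep_on, power_down)
```

In this example hosts A and C together cover every file, so B and D are candidates for powering down. Each greedy round scans all hosts against the uncovered files. That scanning is the comparison cost that, per the summary, RMO reduces.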