Kai-Hsiang YANG, Chi-Jen WU, Jan-Ming HO
The most prevalent peer-to-peer (P2P) application to date is file sharing. Unstructured P2P networks support the inherent heterogeneity of peers, are highly resilient to peer failures, and incur low overhead when peers arrive and depart. Dynamic querying (DQ) is a new flooding technique that estimates a proper time-to-live (TTL) value for a query flood from the estimated popularity of the searched files, and retrieves sufficient results within a controlled flooding range to reduce network traffic. Recent research shows that a large number of peers in P2P file-sharing systems are free-riders, and that those peers seldom return hits for queries. The free-riding problem causes a large number of redundant messages in DQ-like search algorithms. In this paper, we propose a new search algorithm, called "AntSearch," to solve this problem. In AntSearch, each peer maintains its hit rate for previous queries and records a list of pheromone values for its immediate neighbors. Based on the pheromone values, a query is flooded only to those peers that are not likely to be free-riders. Our simulation results show that, compared with DQ and its enhanced algorithm DQ+, AntSearch reduces network traffic by 50% on average at almost the same search latency as DQ+, while retrieving sufficient results for a query with a given required number of results.
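The neighbor-filtering idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not specify how pheromone values are computed or compared, so the sketch assumes that a neighbor's pheromone value is simply its last reported hit rate and that a fixed threshold separates likely free-riders from other peers. The names `Peer`, `record_query`, `select_neighbors`, and `threshold` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """A peer that tracks its own hit rate and its neighbors' pheromone values."""
    name: str
    queries_seen: int = 0
    queries_hit: int = 0
    # Assumption: pheromone of a neighbor = that neighbor's last reported hit rate.
    pheromone: dict = field(default_factory=dict)

    def record_query(self, hit: bool) -> None:
        """Update local statistics after handling one query."""
        self.queries_seen += 1
        if hit:
            self.queries_hit += 1

    def hit_rate(self) -> float:
        """Fraction of previously seen queries that this peer answered."""
        if self.queries_seen == 0:
            return 0.0
        return self.queries_hit / self.queries_seen


def select_neighbors(peer: Peer, threshold: float) -> list:
    """Return neighbors whose pheromone meets the threshold, i.e. the
    neighbors treated as unlikely to be free-riders (hypothetical rule)."""
    return sorted(n for n, value in peer.pheromone.items() if value >= threshold)


# Usage: only neighbor "b" has a pheromone value at or above the threshold.
p = Peer("p", pheromone={"a": 0.0, "b": 0.4, "c": 0.05})
targets = select_neighbors(p, threshold=0.1)
```

In this sketch a query would be forwarded only to the peers in `targets`; the choice of threshold trades reduced traffic against the risk of skipping a peer that could have answered.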
In this paper, we focus on building a large-scale keyword search service over structured peer-to-peer (P2P) networks. Current state-of-the-art keyword search approaches for structured P2P systems are based on inverted list intersection. However, the biggest challenge in those approaches is that, when the indices are distributed over peers, a simple query may cause a large amount of data to be transmitted over the network. We propose a new P2P keyword search scheme, called "Proof," which aims to reduce the network traffic generated during the intersection process. Proof applies three main ideas to reduce network traffic: (1) using a sorted query flow, (2) storing content summaries in the inverted lists, and (3) setting a stop condition for the checking of content summaries. We also discuss the advantages and limitations of Proof, and conduct extensive experiments to evaluate the search performance and the quality of search results. Our simulation results show that, compared with previous solutions, Proof can dramatically reduce network traffic while providing 100% precision and high recall of search results, at the cost of some additional storage overhead.
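As a rough illustration of why the order of inverted list intersection affects traffic, the following Python sketch visits the inverted lists from shortest to longest and counts how many document identifiers would be shipped between peers. This is only one plausible reading of a "sorted query flow"; the abstract does not define it, and the sketch does not model content summaries or the stop condition. The function name `sorted_query_flow`, the in-memory `index` dictionary, and the traffic count are all assumptions made for illustration.

```python
def sorted_query_flow(index: dict, keywords: list) -> tuple:
    """Intersect inverted lists, shortest list first.

    `index` maps each keyword to the set of document ids containing it;
    in a real structured P2P system each list would live on a different peer.
    Returns (matching ids, number of ids transmitted between peers).
    """
    postings = [index.get(k, set()) for k in keywords]
    if not postings:
        return set(), 0

    # Start from the shortest list so the forwarded candidate set stays small.
    ordered = sorted(postings, key=len)
    candidates = set(ordered[0])
    transmitted = 0

    for posting in ordered[1:]:
        transmitted += len(candidates)  # candidates shipped to the next peer
        candidates &= posting           # the next peer keeps only its matches
        if not candidates:
            break                       # nothing left to forward
    return candidates, transmitted


# Usage: the one-element list for "ant" is visited first.
index = {"p2p": {1, 2, 3, 4, 5, 6}, "search": {2, 3, 9}, "ant": {3}}
result, cost = sorted_query_flow(index, ["p2p", "search", "ant"])
```

Starting from the longest list instead would forward six identifiers on the first hop alone, which is the kind of traffic the intersection order is meant to avoid.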