Revisiting Dynamic Query Protocols in Unstructured Peer-to-Peer Networks – projects 2012
Parallel Distributed System, projects 2012 Jan
In unstructured peer-to-peer networks, the average response latency and traffic cost of a query are two main performance metrics. Controlled-flooding resource query algorithms are widely used in unstructured networks such as peer-to-peer networks. In this paper, we propose a novel algorithm named Selective Dynamic Query (SDQ). Based on mathematical programming, SDQ calculates the optimal combination of an integer TTL value and a set of neighbors to control the scope of the next query. Our results demonstrate that SDQ provides finer grained control than other algorithms: its response latency is close to the well-known minimum one via Expanding Ring; in the mean time, its traffic cost is also close to the minimum. To our best knowledge, this is the first work capable of achieving a best trade-off between response latency and traffic cost.
Peer – To – Peer overlay networks, running at the application layer, perform scheduling and routing without any knowledge of the underlying physical networks. Various peer-to-peer systems have became the most popular Internet applications and a major portion of the Internet traffic is attributed to them. Topologies of peer-to-peer systems can be divided into three different categories: centralized, decentralized but structured, and decentralized and unstructured. Napsterlike centralized systems have their resource directories hosted at some central servers. A centralized topology scales poorly and suffers from the single-point-of-failure problem. Chord and Tapestry are decentralized, but their network topologies are highly structured and their resources are placed by distributed-hash-table algorithms. It is not surprising that these topologies are sensitive to the extremely transient join/leave/failure behaviors of peers, which is, unfortunately, an intrinsic characteristic of Internet peer-to-peer applications.
Nowadays, the unstructured topology is a popular model in some peer-to-peer systems since: 1) the unstructured peerto- peer systems are highly resilient to peers’ failures and incur a very low overhead at peer arrivals and departures; 2) they are simple to be implemented and have little overhead in topology maintenance. Gnutella  and Limewire  are examples of such file-sharing systems. In decentralized and unstructured systems, neither central servers nor any precise managements over network topology/resources placement are required.
We first state some definitions used in this paper. In each query packet, the TTL value indicates the hops from the farthest reached node to the inquiry node. For simplicity, we also use nTTL value ðnTTL ¼ TTL
1Þ to denote the hops from the farthest reached node to one of the inquiry node’s direct neighbor which is passed by the packet. Consistent with the state of the art, we assume that the inquiry node could (only) know its direct neighbors’ degree information (number of direct neighbors), which is likely to be the case in practice. The average degree of the network is D, which can be estimated. As the degrees of intermediate nodes are unknown, the inquiry node can adopt the average value D as their estimation. Horizon H refers to the expected number of queried peers in a query round.
Two main categories of query protocols are developed.
Controlled-flooding-based algorithms, as the name suggests, control the iterative flooding process: instead of blind flooding, an integer TTL value is carried in each packet of an individual query round; the scope of the flooding can then be controlled. Controlled-flooding-based algorithms are widely used in unstructured networks such as wireless ad hoc networks. Expanding Ring (ER) is the first such protocol.
The second category of query protocols is random-walk based. The query node sends out a query packet, which is then forwarded in a random fashion until it finally hits the target. In biased random walks, a node has statistical preference to forward the walker toward the target, so as to reduce the excepted number of steps before the target is reached. In general, random-walk-based algorithms can reduce network traffic and enhance the system scalability.
The problem of a resource query in unstructured peer-to-peer networks is proposed. That is, given an overlay network G =(V,E), m copies of a data item published in V , and a predefined parameter (m>=N), the goal of the study is to enable any inquiry peer in V to discover at least N identical data copies published in G at the expense of the minimum traffic cost and response latency. Here, the traffic cost is defined as the number of messages required to complete a query, and the response latency is the time spent to collect at least N identical data copies. Since unstructured networks do not offer any clue to facilitate a resource query, researchers are facing considerable challenges when designing the query algorithms in those networks. A novel algorithm named Selective Dynamic Query (SDQ) is proposed in this paper. By solving a Knapsack programming problem, SDQ calculates the optimal combination of a group of neighbors with a proper integer TTL value for the next query round.