Balakrishnan, H., Kaashoek, M. F., Karger, D., Morris, R., and Stoica, I. 2003. Looking up data in P2P systems. Commun. ACM 46, 2 (Feb. 2003), 43-48. DOI=http://doi.acm.org/10.1145/606272.606299
Stoica, I., Morris, R., Karger, D., Kaashoek, M. F., and Balakrishnan, H. 2001. Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (San Diego, California, United States). SIGCOMM '01. ACM, New York, NY, 149-160. DOI=http://doi.acm.org/10.1145/383059.383071
These two papers deal with the problem of storing and locating data in P2P systems, the first surveying a number of different systems, and the second describing the Chord system in detail.
The principal question that the first paper addresses is that of lookup: how can data be located efficiently in a P2P system that lacks any centralised index or hierarchy? The authors take us through a number of different approaches to lookup, mainly structured and symmetric lookup.
In a structured lookup system, data may be stored on a central server, or a hierarchy may be imposed to provide scalability. These systems can provide guarantees for locating data, but are vulnerable to the failure of the central database or of higher-level nodes. In addition, a disproportionate share of the load on the system may have to be borne by higher-level nodes.
In a symmetric lookup system, no node is more important than any other, and each node handles a fraction of the total load on the system. These systems vary from those in which lookup requests are broadcast to all neighbours (introducing significant overhead and scaling problems) to those in which “superpeers” are used to introduce some hierarchy so that the system scales more efficiently. However, superpeers then become points of failure, and such systems provide no guarantees for lookup.
Modern P2P algorithms, including CAN, Chord, Kademlia, Pastry, Tapestry and Viceroy, all implement the Distributed Hash Table (DHT) abstraction, with implementations driven by considerations of scaling well to large numbers of nodes, low-latency lookup, efficient handling of dynamic node arrival and departure, and an even distribution of keys amongst nodes.
A DHT identifies both keys and nodes using a common hash function. Keys are then stored on nodes “closer” to them; “closeness” is evaluated using a function that compares the key and node identifiers. When a node receives a lookup request, it needs sufficient information to forward the request to a node that is closer to the requested key. The routing tables that maintain this information may be one-dimensional (implemented as a skip list in Chord, and as a tree in Pastry, Tapestry and Kademlia) or multi-dimensional (as in CAN). The multidimensional approach is interesting: I could imagine it being used for a data retrieval system, where queries may be run to locate items based on arbitrary combinations of dimensions.
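To make the greedy forwarding step concrete, here is a minimal sketch of lookup on a Chord-style ring; the toy identifier size, the use of SHA-1, and the helper names are my own illustrative choices rather than anything specified in the survey.

```python
import hashlib

M = 16                          # bits in a toy identifier space
SPACE = 2 ** M

def ident(name: str) -> int:
    """Hash a key or node name into the shared identifier space."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % SPACE

def distance(a: int, b: int) -> int:
    """Clockwise ring distance from a to b: the 'closeness' metric."""
    return (b - a) % SPACE

def next_hop(known_nodes, key_id: int) -> int:
    """Greedy forwarding: pick the known node with the least clockwise
    distance from itself to the requested key."""
    return min(known_nodes, key=lambda n: distance(n, key_id))
```

Each node applies next_hop to its own routing table and forwards; the choice of routing structure only changes which nodes are known and, hence, how quickly the request converges.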
The discussion of the unidirectionality and symmetry properties of the different algorithms was interesting, but also a little confusing: why, for instance, do symmetric protocols not require stabilization routines? I particularly liked the way in which malicious activity can be dealt with: since all these algorithms can easily perform cross-checks to verify correct routing and so identify malicious nodes, the worst that can happen is that a malicious node denies the existence of available data.
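As a sketch of the kind of cross-check the survey alludes to (my own illustration, not an algorithm from either paper), a querier that records the lookup path can verify that every hop moved strictly closer to the key, flagging any node that routed away from it:

```python
SPACE = 2 ** 16                 # same toy identifier space as above

def distance(a: int, b: int) -> int:
    """Clockwise ring distance from a to b."""
    return (b - a) % SPACE

def hops_make_progress(path: list, key_id: int) -> bool:
    """Verify a recorded lookup path: every hop should move strictly
    closer to the key; a hop that routes 'away' marks a suspect node."""
    dists = [distance(n, key_id) for n in path]
    return all(later < earlier for earlier, later in zip(dists, dists[1:]))
```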
The second paper describes the design and implementation of Chord, which the authors suggest has advantages of simplicity, provable correctness, and provable performance over other P2P lookup protocols. Chord uses SHA-1 hashing to generate identifiers for both keys and nodes, proceeding on the assumption that, given a sufficient number of nodes, keys will be distributed roughly evenly over nodes. Each key is assigned to the first node whose identifier is equal to or follows it on the identifier circle. Nodes form their routing (“finger”) tables by maintaining pointers to successor nodes at exponentially increasing distances from themselves. This creates the interesting property that a node “knows” more of its closer neighbours than its distant ones. By maintaining a relatively small amount of information (O(log N) entries), nodes can efficiently route lookup requests to the node holding the key.
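A short sketch of the finger-table layout, assuming for simplicity a global view of an 8-bit toy ring (real nodes build the same table via lookups; the helper names here are mine):

```python
M = 8                           # toy 8-bit identifier circle
SPACE = 2 ** M

def successor(nodes, key_id: int) -> int:
    """First node whose identifier is equal to or follows key_id."""
    return min(nodes, key=lambda n: (n - key_id) % SPACE)

def build_fingers(n: int, nodes) -> list:
    """Node n's i-th finger is the successor of n + 2^i, so entries sit
    at exponentially increasing distances: dense nearby, sparse far away."""
    return [successor(nodes, (n + 2 ** i) % SPACE) for i in range(M)]

nodes = [1, 32, 72, 123, 200]
print(build_fingers(32, nodes))  # [72, 72, 72, 72, 72, 72, 123, 200]
```

The early fingers all collapse onto the immediate successor, while the later ones jump across the ring, which is exactly the dense-nearby, sparse-far-away property described above.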
When a node joins or leaves Chord, two properties must be preserved: each node’s successor pointer must be correctly maintained, and keys must be distributed properly amongst nodes. Chord cannot transfer keys itself; it only notifies the application software that a transfer is necessary, so that the application layer can take care of also transferring the associated data. Chord periodically runs a stabilization protocol on every node to keep successor pointers and finger tables up to date.
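The successor-repair half of stabilization is compact enough to sketch; this loosely follows the pseudocode in the paper, with the interval helper and the toy ring size being my own:

```python
SPACE = 2 ** 8                  # toy identifier circle

def in_open_interval(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b, going clockwise."""
    d_ab = (b - a) % SPACE
    d_ax = (x - a) % SPACE
    if d_ab == 0:               # a == b: the interval wraps the whole ring
        return d_ax != 0
    return 0 < d_ax < d_ab

class Node:
    def __init__(self, ident: int):
        self.id = ident
        self.successor = self   # a lone node is its own successor
        self.predecessor = None

    def stabilize(self):
        # Ask our successor who its predecessor is; if someone has
        # slipped in between us, adopt it as our new successor.
        x = self.successor.predecessor
        if x is not None and in_open_interval(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate: "Node"):
        # candidate thinks it might be our predecessor; accept it if it
        # is closer behind us than whoever we currently record.
        if self.predecessor is None or in_open_interval(
                candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate
```

Run periodically on every node, this pairwise repair is what lets joins and failures resolve without any global coordination.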
Since keys are uniquely associated with nodes, applications are also responsible for maintaining replicas of data, by associating them with different keys. I’m not sure if this is a reasonable expectation, since this means that the application layer would have to be aware of the key distribution across nodes, to ensure that replicas are actually stored on different nodes.
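For illustration, one plausible derived-key scheme (my own sketch; the suffix convention is not from the paper) also shows why this concern holds: the derived keys are distinct, but nothing forces their successors to be distinct nodes.

```python
import hashlib

def replica_keys(app_key: str, n_replicas: int = 3) -> list:
    """Derive distinct Chord keys for one application-level key by
    hashing it with a replica index. Distinct keys are all but
    guaranteed; distinct *nodes* are not, since placement depends on
    how node identifiers happen to partition the ring."""
    return [
        int.from_bytes(hashlib.sha1(f"{app_key}#{i}".encode()).digest(), "big")
        for i in range(n_replicas)
    ]
```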
In simulations, it was found that mean path length increases logarithmically with the number of nodes, as expected. Under simultaneous node failure conditions, the lookup failure rate is proportional to the fraction of failed nodes, again as expected, since this maps directly to the fraction of lost keys. Lookup failures during stabilization depend on the number of failures, the average path length, and the number of nodes. In experimental testing on the RON testbed, lookup latency was found to grow slowly with the total number of nodes, confirming the simulation results and demonstrating Chord’s scalability.