Distributed hash table
A distributed hash table (DHT) is a class of a decentralized distributed system that provides a lookup service similar to a hash table: (key, value) pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. Responsibility for maintaining the mapping from keys to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.
DHTs form an infrastructure that can be used to build more complex services, such as anycast, cooperative Web caching, distributed file systems, domain name services, instant messaging, multicast, and also peertopeer file sharing and content distribution systems. Notable distributed networks that use DHTs include BitTorrent’s distributed tracker, the Coral Content Distribution Network, the Kad network, the Storm botnet, the Tox instant messenger, Freenet, the YaCy search engine, and the InterPlanetary File System.
Contents
Localitypreserving hashing
Localitypreserving hashing ensures that similar keys are assigned to similar objects. This can enable a more efficient execution of range queries. SelfChord decouples object keys from peer IDs and sorts keys along the ring with a statistical approach based on the swarm intelligence paradigm. Sorting ensures that similar keys are stored by neighbour nodes and that discovery procedures, including range queries, can be performed in logarithmic time.
Overlay network
Each node maintains a set of to other nodes (its neighbors or ) . Together, these links form the . A node picks its neighbors according to a certain structure, called the .
All DHT topologies share some variant of the most essential property: for any key , each node either has a node ID that owns or has a link to a node whose node ID is closer to , in terms of the keyspace distance defined above. It is then easy to route a message to the owner of any key using the following (that is not necessarily globally optimal): at each step, forward the message to the neighbor whose ID is closest to . When there is no such neighbor, then we must have arrived at the closest node, which is the owner of as defined above. This style of routing is sometimes called keybased routing.
Beyond basic routing correctness, two important constraints on the topology are to guarantee that the maximum number of in any route (route length) is low, so that requests complete quickly; and that the maximum number of neighbors of any node (maximum node ) is low, so that maintenance overhead is not excessive. Of course, having shorter routes requires higher . Some common choices for maximum degree and route length are as follows, where is the number of nodes in the DHT, using :
Max. degree  Route length  Used in  Note 

<math>O(1)</math>  <math>O(n)</math>  
<math>O(log n)</math>  <math>O(log n/log (log n))</math>  Koorde  
<math>O(log n)</math>  <math>O(log n)</math>  Chord  most common, but not optimal (degree/route length) 
<math>O(1)</math>  <math>O(log n)</math>  
<math>O(sqrt{n})</math>  <math>O(1)</math> 
The most common choice, <math>O(log n)</math> degree/route length, is not optimal in terms of degree/route length tradeoff, but such topologies typically allow more flexibility in choice of neighbors. Many DHTs use that flexibility to pick neighbors that are close in terms of latency in the physical underlying network.
Maximum route length is closely related to : the maximum number of hops in any shortest path between nodes. Clearly, the network’s worst case route length is at least as large as its diameter, so DHTs are limited by the degree/diameter tradeoff that is fundamental in . Route length can be greater than diameter, since the greedy routing algorithm may not find shortest paths.
Algorithms for overlay networks
Aside from routing, there exist many algorithms that exploit the structure of the overlay network for sending a message to all nodes, or a subset of nodes, in a DHT. These algorithms are used by applications to do , range queries, or to collect statistics. Two systems that are based on this approach are Structella, which implements flooding and random walks on a Pastry overlay, and DQDHT, which implements a dynamic querying search algorithm over a Chord network.
Security
Because of the decentralization, fault tolerance, and scalability of DHTs, they are inherently more resilient against a hostile attacker than a typical centralized system.
Open systems for that are robust against massive hostile attackers are feasible.
A DHT system that is carefully designed to have can defend against a .
DHT implementations
Most notable differences encountered in practical instances of DHT implementations include at least the following:
 The address space is a parameter of DHT. Several real world DHTs use 128bit or 160bit key space
 Some realworld DHTs use hash functions other than SHA1.
 In the real world the key <math>k</math> could be a hash of a file’s content rather than a hash of a file’s name to provide , so that renaming of the file does not prevent users from finding it.
 Some DHTs may also publish objects of different types. For example, key <math>k</math> could be the node <math>ID</math> and associated data could describe how to contact this node. This allows publicationofpresence information and often used in IM applications, etc. In the simplest case, <math>ID</math> is just a random number that is directly used as key <math>k</math> (so in a 160bit DHT <math>ID</math> will be a 160bit number, usually randomly chosen). In some DHTs, publishing of nodes’ IDs is also used to optimize DHT operations.
 Redundancy can be added to improve reliability. The <math>(k, data)</math> key pair can be stored in more than one node corresponding to the key. Usually, rather than selecting just one node, real world DHT algorithms select <math>i</math> suitable nodes, with <math>i</math> being an implementationspecific parameter of the DHT. In some DHT designs, nodes agree to handle a certain keyspace range, the size of which may be chosen dynamically, rather than hardcoded.
 Some advanced DHTs like Kademlia perform iterative lookups through the DHT first in order to select a set of suitable nodes and send <math>put(k, data)</math> messages only to those nodes, thus drastically reducing useless traffic, since published messages are only sent to nodes that seem suitable for storing the key <math>k</math>; and iterative lookups cover just a small set of nodes rather than the entire DHT, reducing useless forwarding. In such DHTs, forwarding of <math>put(k, data)</math> messages may only occur as part of a selfhealing algorithm: if a target node receives a <math>put(k, data)</math> message, but believes that <math>k</math> is out of its handled range and a closer node (in terms of DHT keyspace) is known, the message is forwarded to that node. Otherwise, data are indexed locally. This leads to a somewhat selfbalancing DHT behavior. Of course, such an algorithm requires nodes to publish their presence data in the DHT so the iterative lookups can be performed.
Examples
 DHT protocols and implementations
 Applications employing DHTs
<!– NOTE ABOUT ADDING ITEMS TO THIS SECTION: Please provide a reference so the editors know that (1) the application actually uses a DHT, and (2) the project is notable as opposed to just someone’s project from networking class. You can do this by providing a wikilink here or putting a note or external link on the Talk page. PLEASE DO NOT ADD BITTORRENT CLIENTS as these are already covered via the link to “Comparison of BitTorrent clients”; there’s no reason to duplicate that list here. –>

 : DHT search engine
 : routing engine for meshbased networks
 : a decentralized web application deployment platform
 : web caching
 Coral Content Distribution Network
 FAROO: peertopeer Web search engine
 Freenet: a censorshipresistant anonymous network
 : a distributed file system used for storage virtualization
 GNUnet: Freenetlike distribution network including a DHT implementation
 I2P: An opensource anonymous peertopeer network.
 I2PBote: serverless secure anonymous email.
 : A contentaddressable, peertopeer hypermedia distribution protocol
 JXTA: opensource P2P platform
 : an inmemory data grid built on top of a Java DHT implementation
 : a peertopeer application from Japan
 : a Friendtofriend network
 : a privacypreserving voice, video and chat communication platform, based on a Kademlialike DHT
 : an system intended to function as a replacement
 Twister: a peertopeer platform
 YaCy: a distributed
See Also on BitcoinWiki
 : a persistent, replicated, clustered distributed object storage system compatible with memcached protocol.
 : a highperformance, distributed memory object caching system.
 Prefix hash tree: sophisticated querying over DHTs.
 Merkle tree: tree having every nonleaf node labelled with the hash of the labels of its children nodes.
 Most distributed data stores employ some form of DHT for lookup.