Information about Distributed Hash Table

Distributed hash tables (DHTs) are a class of decentralized distributed systems that provide a lookup service similar to a hash table: (name, value) pairs are stored in the DHT, and any participating node can efficiently retrieve the value associated with a given name. Responsibility for maintaining the mapping from names to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows DHTs to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.

DHTs form an infrastructure that can be used to build more complex services, such as distributed file systems, peer-to-peer file sharing and content distribution systems, cooperative web caching, multicast, anycast, domain name services, and instant messaging. Applications that use DHTs include BitTorrent, eMule, YaCy, and the Coral Content Distribution Network.

History

DHT research was originally motivated, in part, by peer-to-peer systems such as Napster, Gnutella, and Freenet, which took advantage of resources distributed across the Internet to provide a single useful application. In particular, they took advantage of increased bandwidth and hard disk capacity to provide a file sharing service.

These systems differed in how they found the data their peers contained. Napster had a central index server: each node, upon joining, would send a list of locally held files to the server, which would perform searches and refer the querier to the nodes that held the results. This central component left the system vulnerable to attacks and lawsuits. Gnutella and similar networks moved to a flooding query model—in essence, each search would result in a message being broadcast to every other machine in the network. While avoiding a single point of failure, this method was significantly less efficient than Napster. Finally, Freenet was also fully distributed, but employed a heuristic key based routing in which each file was associated with a key, and files with similar keys tended to cluster on a similar set of nodes. Queries were likely to be routed through the network to such a cluster without needing to visit many peers. However, Freenet did not guarantee that data would be found.

Distributed hash tables use a more structured key based routing in order to attain both the decentralization of Gnutella and Freenet, and the efficiency and guaranteed results of Napster. One drawback is that, like Freenet, DHTs only directly support exact-match search, rather than keyword search, although that functionality can be layered on top of a DHT.

The first four DHTs—CAN, Chord,[1] Pastry, and Tapestry—were introduced about the same time in 2001. Since then this area of research has been quite active. Outside academia, DHT technology has been adopted as a component of BitTorrent and in the Coral Content Distribution Network.

Properties

DHTs characteristically emphasize the following properties:
  • Decentralisation: the nodes collectively form the system without any central coordination.
  • Scalability: the system should function efficiently even with thousands or millions of nodes.
  • Fault tolerance: the system should be reliable (in some sense) even with nodes continuously joining, leaving, and failing.
A key technique used to achieve these goals is that any one node needs to coordinate with only a few other nodes in the system – most commonly, of the participants (see below) – so that only a limited amount of work needs to be done for each change in membership.

Some DHT designs seek to be secure against malicious participants and to allow participants to remain anonymous, though this is less common than in many other peer-to-peer (especially file sharing) systems; see anonymous P2P.

Finally, DHTs must deal with more traditional distributed systems issues such as load balance, data integrity, and performance (in particular, ensuring that operations such as routing and data storage or retrieval complete quickly).

Structure

The structure of a DHT can be decomposed into several main components.[2][3] The foundation is an abstract keyspace, such as the set of 160-bit strings. A keyspace partitioning scheme splits ownership of this keyspace among the participating nodes. An overlay network then connects the nodes, allowing them to find the owner of any given key in the keyspace.

Once these components are in place, a typical use of the DHT for storage and retrieval might proceed as follows. Suppose the keyspace is the set of 160-bit strings. To store a file with given and in the DHT, the SHA1 hash of is found, producing a 160-bit key , and a message is sent to any node participating in the DHT. The message is forwarded from node to node through the overlay network until it reaches the single node responsible for key as specified by the keyspace partitioning, where the pair is stored. Any other client can then retrieve the contents of the file by again hashing to produce and asking any DHT node to find the data associated with with a message . The message will again be routed through the overlay to the node responsible for , which will reply with the stored .

The keyspace partitioning and overlay network components are described below with the goal of capturing the principal ideas common to most DHTs; many designs differ in the details.

Keyspace partitioning

Most DHTs use some variant of consistent hashing to map keys to nodes. This technique employs a function which defines an abstract notion of the distance from key to key , which is unrelated to geographical distance or network latency. Each node is assigned a single key called its identifier (ID). A node with ID owns all the keys for which is the closest ID, measured according to .

Example. The Chord DHT treats keys as points on a circle, and is the distance traveling clockwise around the circle from to . Thus, the circular keyspace is split into contiguous segments whose endpoints are the node identifiers. If and are two adjacent IDs, then the node with ID owns all the keys that fall between and .


Consistent hashing has the essential property that removal or addition of one node changes only the set of keys owned by the nodes with adjacent IDs, and leaves all other nodes unaffected. Contrast this with a traditional hash table in which addition or removal of one bucket causes nearly the entire keyspace to be remapped. Since any change in ownership typically corresponds to bandwidth-intensive movement of objects stored in the DHT from one node to another, minimizing such reorganization is required to efficiently support high rates of churn (node arrival and failure).

Overlay network

Each node maintains a set of links to other nodes (its neighbors or routing table). Together these links form the overlay network. A node picks its neighbors according to a certain structure, called the network's topology.

All DHT topologies share some variant of the most essential property: for any key , the node either owns or has a link to a node that is closer to in terms of the keyspace distance defined above. It is then easy to route a message to the owner of any key using the following greedy algorithm: at each step, forward the message to the neighbor whose ID is closest to . When there is no such neighbor, then we must have arrived at the closest node, which is the owner of as defined above. This style of routing is sometimes called key based routing.

Beyond basic routing correctness, two key constraints on the topology are to guarantee that the maximum number of hops in any route (route length) is low, so that requests complete quickly; and that the maximum number of neighbors of any node (maximum node degree) is low, so that maintenance overhead is not excessive. Of course, having shorter routes requires higher maximum degree. Some common choices for maximum degree and route length are as follows, where is the number of nodes in the DHT, using Big O notation:
  • Degree , route length
  • Degree , route length
  • Degree , route length
  • Degree , route length
The third choice is the most common, even though it is not quite optimal in terms of degree/route length tradeoff, because such topologies typically allow more flexibility in choice of neighbors. Many DHTs use that flexibility to pick neighbors which are close in terms of latency in the physical underlying network.

Maximum route length is closely related to diameter: the maximum number of hops in any shortest path between nodes. Clearly the network's route length is at least as large as its diameter, so DHTs are limited by the degree/diameter tradeoff[4] which is fundamental in graph theory. Route length can be greater than diameter since the greedy routing algorithm may not find shortest paths.[5]

Algorithms for Overlay Networks

Aside from routing, there exist many algorithms which exploit the structure of the overlay network for sending a message to all nodes, or a subset of nodes, in a DHT.[6] These algorithms are used by applications to do overlay multicast, range queries, or to collect statistics.

Examples

DHT protocols and implementations

Applications employing DHTs

See also

References

1. ^ Hari Balakrishnan, M. Frans Kaashoek, David Karger, Robert Morris, and Ion Stoica. Looking up data in P2P systems. In Communications of the ACM, February 2003.
2. ^ Moni Naor and Udi Wieder. Novel Architectures for P2P Applications: the Continuous-Discrete Approach. Proc. SPAA, 2003.
3. ^ Gurmeet Singh Manku. Dipsea: A Modular Distributed Hash Table. Ph. D. Thesis (Stanford University), August 2004.
4. ^ [1]
5. ^ Gurmeet Singh Manku, Moni Naor, and Udi Wieder. Know thy Neighbor's Neighbor: the Power of Lookahead in Randomized P2P Networks. Proc. STOC, 2004.
6. ^ Ali Ghodsi. Distributed k-ary System: Algorithms for Distributed Hash Tables. KTH-Royal Institute of Technology, 2006.

External links

Distributed computing is a method of computer processing in which different parts of a program run simultaneously on two or more computers that are communicating with each other over a network.
..... Click the link for more information.
In computer science, a hash table, or a hash map, is a data structure that associates keys with values. The primary operation it supports efficiently is a lookup: given a key (e.g. a person's name), find the corresponding value (e.g. that person's telephone number).
..... Click the link for more information.
A node is a device that is connected as part of a computer network. For example, a node may be a computer, personal digital assistant, cell phone, router, switch, or hub.
..... Click the link for more information.
In telecommunications and software engineering, scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged.
..... Click the link for more information.
network file system is any computer file system that supports sharing of files, printers and other resources as persistent storage over a computer network. The first file servers were developed in the 1970s, and in 1985 Sun Microsystems created the Network File System (NFS) which
..... Click the link for more information.
peer-to-peer (or "P2P") computer network exploits diverse connectivity between participants in a network and the cumulative bandwidth of network participants rather than conventional centralized resources where a relatively low number of servers provide the core value to a
..... Click the link for more information.
See Shared resource for the conventional meaning of file sharing
File sharing is the practice of making files available for other users to download over the Internet and smaller networks.
..... Click the link for more information.
Content delivery describes the delivery of digital media "content" such as digital audio or digital video or computer software and games over a delivery medium such as broadcasting or the Internet.
..... Click the link for more information.
Web caching is the caching of web documents (e.g., HTML pages, images) in order to reduce bandwidth usage, server load, and perceived lag. A web cache stores copies of documents passing through it; subsequent requests may be satisfied from the cache if certain conditions are met.
..... Click the link for more information.
Multicast is the delivery of information to a group of destinations simultaneously using the most efficient strategy to deliver the messages over each link of the network only once, creating copies only when the links to the destinations split.
..... Click the link for more information.
Anycast is a network addressing and routing scheme whereby data is routed to the "nearest" or "best" destination as viewed by the routing topology.

The term is intended to echo the terms unicast, broadcast and multicast.
..... Click the link for more information.
On the Internet, the Domain Name System (DNS) associates various sorts of information with so-called domain names; most importantly, it serves as the "phone book" for the Internet by translating human-readable computer hostnames, e.g. en.wikipedia.
..... Click the link for more information.
Instant messaging (IM) is a form of real-time communication between two or more people based on typed text. The text is conveyed via computers connected over a network such as the Internet.
..... Click the link for more information.
BitTorrent is a peer-to-peer file sharing (P2P) communications protocol. BitTorrent is a method of distributing large amounts of data widely without the original distributor incurring the entire costs of hardware, hosting and bandwidth resources.
..... Click the link for more information.
eMule is a peer-to-peer file sharing application for Microsoft Windows. Started in May 2002 as an alternative to eDonkey2000, eMule now connects to both the eDonkey network and the Kad network.
..... Click the link for more information.
YaCy (read "ya see") is a free distributed search engine, built on principles of peer-to-peer (P2P) networks. Its core is a computer program written in Java distributed on several hundred of computers, as of September 2006, so-called YaCy-peers.
..... Click the link for more information.
The Coral Content Distribution Network, sometimes called Coral Cache or Coral for short, is a free peer-to-peer content distribution network designed to mirror web content.
..... Click the link for more information.
peer-to-peer (or "P2P") computer network exploits diverse connectivity between participants in a network and the cumulative bandwidth of network participants rather than conventional centralized resources where a relatively low number of servers provide the core value to a
..... Click the link for more information.
Napster was a file sharing service that paved the way for decentralized P2P file-sharing programs such as Kazaa, Limewire, iMesh, Morpheus, and BearShare, which are now used for many of the same reasons and can download music, pictures, and other files.
..... Click the link for more information.
Gnutella (pronounced: /nʊˈtɛlə/ with a silent g, or alternatively /gnʊˈtɛlə/) is a file sharing network.
..... Click the link for more information.
Maintainer: Ian Clarke

OS: Cross-platform

Use: Anonymity, Peer to peer, Friend to friend
License: GNU General Public License
Website: [1]

In computer science, Freenet
..... Click the link for more information.
Bandwidth is the difference between the upper and lower cutoff frequencies of, for example, a filter, a communication channel, or a signal spectrum, and is typically measured in hertz.
..... Click the link for more information.
Hard disk drive

An IBM hard disk drive with the metal cover removed. The platters are highly reflective.
Date Invented: September 13 1956
Invented By: An IBM team led by Reynold Johnson
Connects to:
..... Click the link for more information.
Reliability engineering is an engineering field, that deals with the study reliability: the ability of a system or component to perform its required functions under stated conditions for a specified period of time.[1] It is often reported in terms of a probability.
..... Click the link for more information.
Key based routing (KBR) is a lookup method used in conjunction with distributed hash tables (DHTs) and certain other overlay networks. While DHTs provide a method to find a host responsible for a certain piece of data, KBR provides a method to find the closest
..... Click the link for more information.
The Content Addressable Network (CAN) was one of the original four distributed hash table proposals (Ratnasamy 2001), introduced concurrently with Chord, Pastry, and Tapestry.
..... Click the link for more information.
Chord is one of the original distributed hash table protocols. Chord is being developed at MIT and the current Chord source code can be downloaded and used under the MIT License.

Overview

Using the Chord lookup protocol, node keys are arranged in a circle.
..... Click the link for more information.
Pastry is an overlay and routing network for the implementation of a distributed hash table similar to Chord. The key-value pairs are stored in a redundant peer-to-peer network of connected Internet hosts.
..... Click the link for more information.
Tapestry is a distributed hash table which provides a decentralized object location, routing, and multicasting infrastructure for distributed applications. It is composed of a peer-to-peer overlay network offering efficient, scalable, self-repairing, location-aware routing to
..... Click the link for more information.
BitTorrent is a peer-to-peer file sharing (P2P) communications protocol. BitTorrent is a method of distributing large amounts of data widely without the original distributor incurring the entire costs of hardware, hosting and bandwidth resources.
..... Click the link for more information.


This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.
Herod_Archelaus


page counter