
The Chord Protocol

Haimonti Dutta
MGS 655
Some slides are obtained from
www.cs.berkeley.edu/~kubitron/courses/cs294-4-F03/slides/lec03-chord.ppt
What is Chord?
A peer-to-peer lookup service

Solves problem of locating a data item in a collection of distributed nodes, considering frequent node arrivals and departures

Core operation in most P2P systems is efficient location of data items

Supports just one operation: given a key, it maps the key onto a node

The lookup problem

[Figure: nodes N1 to N6 connected through the Internet. A publisher holds (Key=title, Value=MP3 data); a client issues Lookup(title) and must find out which node stores the key.]
Characteristics
Simplicity, provable correctness, and provable performance

Each Chord node needs routing information about only a few other nodes

Resolves lookups via messages to other nodes (iteratively or recursively)

Maintains routing information as nodes join and leave the system

Addressed Difficult Problems

Load balance: distributed hash function, spreading keys evenly over nodes

Decentralization: Chord is fully distributed; no node is more important than any other, which improves robustness

Scalability: lookup cost grows logarithmically with the number of nodes in the network, so even very large systems are feasible

Availability: Chord automatically adjusts its internal tables to ensure that the node responsible for a key can always be found

Flexible naming: no constraints on the structure of the keys; the key space is flat, which gives flexibility in how to map names to Chord keys

Mapping onto Nodes vs. Values
Traditional name and location services provide a direct mapping between
keys and values

What are examples of values? A value can be an address, a document, or an arbitrary data item

Chord can easily implement a mapping onto values by storing each key/value pair at the node to which that key maps
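
The key/value idea above can be sketched in a few lines. This is a minimal illustration, not Chord's actual implementation: the names `chord_id`, `lookup`, and `KeyValueStore` are hypothetical, the use of SHA-1 truncated to `M` bits is an assumption made for the example, and `lookup` uses global knowledge of all node identifiers as a stand-in for Chord's distributed lookup.

```python
import hashlib
from bisect import bisect_left

M = 8  # number of identifier bits (illustrative assumption)


def chord_id(name, m=M):
    """Hash a name onto the identifier space [0, 2**m)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def lookup(key_id, node_ids):
    """Return the node responsible for key_id: the first node at or after it."""
    ring = sorted(node_ids)
    idx = bisect_left(ring, key_id)
    return ring[idx % len(ring)]  # wrap around past the largest identifier


class KeyValueStore:
    """Stores each key/value pair at the node that the key maps to."""

    def __init__(self, node_ids):
        self.storage = {n: {} for n in node_ids}

    def put(self, name, value):
        node = lookup(chord_id(name), self.storage)
        self.storage[node][name] = value
        return node

    def get(self, name):
        node = lookup(chord_id(name), self.storage)
        return self.storage[node].get(name)


store = KeyValueStore([10, 80, 150, 220])
holder = store.put("title", "MP3 data")
print(holder, store.get("title"))
```

The only Chord-specific step is `lookup`; everything else is an ordinary per-node dictionary, which is why the slides describe Chord itself as supporting just one operation.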

The Base Chord Protocol
Specifies how to find the locations of keys

How new nodes join the system

How to recover from the failure or planned departure of existing nodes
Construction of the Chord Ring
Identifiers are arranged on an identifier circle modulo $2^m$: the Chord ring
A key k is assigned to the first node whose identifier is equal to or follows k's identifier on the circle
This node is called successor(k); it is the first node clockwise from k
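
As a small sketch of the rule above, successor(k) can be computed from a sorted list of node identifiers. The function name and the use of a global node list are assumptions for illustration; a real Chord node never sees the whole ring.

```python
from bisect import bisect_left


def successor(key, nodes, m):
    """Return the first node clockwise from `key` on a ring of size 2**m."""
    size = 2 ** m
    ring = sorted(n % size for n in nodes)
    idx = bisect_left(ring, key % size)
    # Past the largest node identifier, wrap around to the smallest one.
    return ring[idx % len(ring)]


nodes = [0, 1, 3]  # a 3-bit ring (identifiers 0 to 7) with three nodes
for k in (1, 2, 6):
    print(f"successor({k}) = {successor(k, nodes, m=3)}")
```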
Successor Nodes
[Figure: identifier circle with identifiers 0 to 7, nodes at 0, 1, and 3, and keys 1, 2, and 6. successor(1) = 1, successor(2) = 3, successor(6) = 0.]
Scalable Key Location
m: number of bits in the key/node identifiers
Finger table: each node n has a routing table with up to m entries
The i-th entry of the finger table at node n contains the identity of the first node s that succeeds n by at least $2^{i-1}$ on the identifier circle:
$s = \mathrm{successor}(n + 2^{i-1})$, where $1 \le i \le m$ and arithmetic is modulo $2^m$
[Table: definition of variables for node n, using m-bit identifiers]
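
The definition above can be turned into a short sketch that builds one node's finger table. The helper names are hypothetical, and the table is computed from a global list of nodes purely for illustration; the interval column follows the convention that entry i covers [start of entry i, start of entry i+1), with the last interval ending at n.

```python
from bisect import bisect_left


def successor(key, nodes, size):
    """First node at or after `key` on a ring with `size` identifiers."""
    ring = sorted(nodes)
    idx = bisect_left(ring, key % size)
    return ring[idx % len(ring)]


def finger_table(n, nodes, m):
    """Return a list of (start, (interval_start, interval_end), node) rows."""
    size = 2 ** m
    rows = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % size
        end = (n + 2 ** i) % size  # equals n itself when i == m
        rows.append((start, (start, end), successor(start, nodes, size)))
    return rows


for start, (lo, hi), node in finger_table(0, [0, 1, 3], m=3):
    print(f"start={start}  interval=[{lo},{hi})  succ={node}")
```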
Finger Table: Characteristics
Each node stores information about only a small number of other nodes
A node's finger table generally does not contain enough information to directly determine the successor of an arbitrary key k
Repeated queries to nodes that immediately precede the given key eventually lead to the key's successor
Average lookup cost: $O(\log N)$ hops, where N is the number of nodes
Finger Tables - characteristics
Each node stores information about only a small number of other nodes,
and knows more about nodes closely following it than about nodes farther
away

A node's finger table generally does not contain enough information to determine the successor of an arbitrary key k

Repeated queries to nodes that immediately precede the given key eventually lead to the key's successor
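
The repeated-query idea can be sketched as a small simulation. This is an illustrative model under stated assumptions, not the protocol itself: the `Ring` class and its method names are hypothetical, finger tables are derived from a global node list, and each loop iteration stands in for one message to another node. Keys are assumed to lie in [0, 2^m).

```python
from bisect import bisect_left


def in_open(x, a, b):
    """True if x lies strictly between a and b going clockwise around the ring."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero; a == b means "all but a"


class Ring:
    def __init__(self, nodes, m):
        self.m = m
        self.size = 2 ** m
        self.nodes = sorted(nodes)

    def successor(self, key):
        idx = bisect_left(self.nodes, key % self.size)
        return self.nodes[idx % len(self.nodes)]

    def fingers(self, n):
        """Finger nodes of n; entry 0 is n's immediate successor."""
        return [self.successor(n + 2 ** (i - 1)) for i in range(1, self.m + 1)]

    def closest_preceding_finger(self, n, key):
        """Farthest finger of n that still precedes key."""
        for f in reversed(self.fingers(n)):
            if in_open(f, n, key):
                return f
        return n

    def find_successor(self, start, key):
        """Return (node responsible for key, number of hops taken)."""
        n, hops = start, 0
        while True:
            succ = self.fingers(n)[0]
            if in_open(key, n, succ) or key == succ:
                return succ, hops
            n = self.closest_preceding_finger(n, key)
            hops += 1


ring = Ring([0, 1, 3], m=3)
print(ring.find_successor(start=3, key=1))
```

In this three-node ring, node 3 cannot resolve key 1 from its own table, so it forwards the query to node 0, whose immediate successor (node 1) is the answer.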

Finger Tables: An Example
Identifier circle with identifiers 0 to 7 and nodes 0, 1, and 3.

Node 0 (keys: 6)
start  int.   succ.
1      [1,2)  1
2      [2,4)  3
4      [4,0)  0

Node 1 (keys: 1)
start  int.   succ.
2      [2,3)  3
3      [3,5)  3
5      [5,1)  0

Node 3 (keys: 2)
start  int.   succ.
4      [4,5)  0
5      [5,7)  0
7      [7,3)  0
Node Joins: How does it happen
When node n joins, it calls n.join(n'), where n' is a known Chord node
To start a new ring instead, it can call n.create()
Join: asks n' to find the immediate successor of n
Stabilize: runs periodically so that nodes learn of newly joined nodes
The Stabilize function for node n
Asks n's successor for its predecessor p
Decides whether p should be n's successor instead
Notifies n's successor of n's existence, giving the successor the chance to change its predecessor to n
Fix-fingers: updates finger tables
Check-predecessor: clears n's predecessor pointer if n.predecessor has failed
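
The join, stabilize, and notify steps above can be sketched as follows. This is a simplified model under stated assumptions: nodes are in-process Python objects rather than networked peers, `find_successor` walks successor pointers only (no finger tables), and fix-fingers and check-predecessor are omitted. The variable names `n_p`, `n`, and `n_s` are chosen for the example.

```python
def in_open(x, a, b):
    """True if x lies strictly between a and b going clockwise around the ring."""
    if a < b:
        return a < x < b
    return x > a or x < b  # wraps past zero; a == b means "all but a"


class Node:
    def __init__(self, ident):
        # A fresh node starts as a one-node ring (the effect of create()).
        self.id = ident
        self.successor = self
        self.predecessor = None

    def find_successor(self, key):
        """Follow successor pointers until key falls in (node, node.successor]."""
        node = self
        while not (in_open(key, node.id, node.successor.id)
                   or key == node.successor.id):
            node = node.successor
        return node.successor

    def join(self, known):
        """Join the ring that `known` belongs to."""
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        """Adopt the successor's predecessor if it sits between us, then notify."""
        p = self.successor.predecessor
        if p is not None and in_open(p.id, self.id, self.successor.id):
            self.successor = p
        self.successor.notify(self)

    def notify(self, candidate):
        """`candidate` believes it might be our predecessor."""
        if (self.predecessor is None
                or in_open(candidate.id, self.predecessor.id, self.id)):
            self.predecessor = candidate


# Build a two-node ring (0 and 3), then let node 1 join between them.
n_p, n_s = Node(0), Node(3)
n_s.join(n_p)
for _ in range(2):
    n_s.stabilize()
    n_p.stabilize()

n = Node(1)
n.join(n_p)      # n learns its successor (node 3); its predecessor is still None
n.stabilize()    # node 3 adopts n as its predecessor
n_p.stabilize()  # node 0 adopts n as its successor and notifies n
print(n_p.successor.id, n.successor.id, n_s.predecessor.id, n.predecessor.id)
```

Note that the join itself only sets one pointer; the remaining pointers are repaired by the periodic stabilize calls, which is what lets joins proceed concurrently.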
Node Joins and Departures

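[Figure: identifier circle with identifiers 0 to 7, illustrating how keys are reassigned when nodes join and depart; labeled successor(6) = 7 and successor(1) = 3.]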
Node Joins with Finger Tables
Node 6 joins the ring of nodes 0, 1, and 3. Changed entries are shown as old -> new.

Node 0 (keys: 6 -> none; key 6 moves to node 6)
start  int.   succ.
1      [1,2)  1
2      [2,4)  3
4      [4,0)  0 -> 6

Node 1 (keys: 1)
start  int.   succ.
2      [2,3)  3
3      [3,5)  3
5      [5,1)  0 -> 6

Node 6 (keys: 6)
start  int.   succ.
7      [7,0)  0
0      [0,2)  0
2      [2,6)  3

Node 3 (keys: 2)
start  int.   succ.
4      [4,5)  0 -> 6
5      [5,7)  0 -> 6
7      [7,3)  0
Node Departures with Finger Tables

Node 1 departs from the ring of nodes 0, 1, 3, and 6. Changed entries are shown as old -> new.

Node 0 (keys: none)
start  int.   succ.
1      [1,2)  1 -> 3
2      [2,4)  3
4      [4,0)  6

Node 1 (departing; keys: 1)
start  int.   succ.
2      [2,3)  3
3      [3,5)  3
5      [5,1)  6

Node 6 (keys: 6)
start  int.   succ.
7      [7,0)  0
0      [0,2)  0
2      [2,6)  3

Node 3 (keys: 2)
start  int.   succ.
4      [4,5)  6
5      [5,7)  6
7      [7,3)  0
Source of Inconsistencies:
Concurrent Operations and Failures

Basic stabilization protocol is used to keep nodes' successor pointers up to date, which is sufficient to guarantee correctness of lookups

Those successor pointers can then be used to verify the finger table
entries

Every node runs stabilize periodically to find newly joined nodes

Stabilization after Join
n joins
  predecessor = nil
  n acquires n_s as successor via some n'
  n notifies n_s that it is the new predecessor
  n_s acquires n as its predecessor

n_p runs stabilize
  n_p asks n_s for its predecessor (now n)
  n_p acquires n as its successor
  n_p notifies n
  n will acquire n_p as its predecessor

All predecessor and successor pointers are now correct

Fingers still need to be fixed, but old fingers will still work

[Figure: nodes n_p, n, and n_s on the ring. Before the join: succ(n_p) = n_s, pred(n_s) = n_p, and n's predecessor is nil. After stabilization: succ(n_p) = n and pred(n_s) = n.]
When does the protocol not work?
All r successors of a node n (the entries of its successor list) fail simultaneously!
Failure Recovery
Key step in failure recovery is maintaining correct successor pointers

To help achieve this, each node maintains a successor-list of its r nearest successors on the ring

If node n notices that its successor has failed, it replaces it with the first live entry
in the list

stabilize will correct finger table entries and successor-list entries pointing to failed
node

Performance is sensitive to the frequency of node joins and leaves versus the
frequency at which the stabilization protocol is invoked
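
The successor-list fallback can be sketched in a few lines. The function name and the `is_alive` callback are hypothetical; in a real deployment liveness would be determined by timeouts on remote calls rather than a local predicate.

```python
def next_live_successor(successor_list, is_alive):
    """Return the first live entry of the successor list, or None if all failed."""
    for candidate in successor_list:
        if is_alive(candidate):
            return candidate
    return None  # all r successors failed at once: the ring is broken at this node


failed = {1, 3}
successor_list = [1, 3, 6]  # r = 3 nearest successors of node 0
print(next_live_successor(successor_list, lambda node: node not in failed))  # 6
```

A larger r makes the "all entries failed" case less likely, at the cost of more state to keep current through stabilization.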

Chord: The Math
Every node is responsible for about K/N keys (N nodes, K keys)

When a node joins or leaves an N-node network, only O(K/N) keys change
hands (and only to and from joining or leaving node)

Lookups need O(log N) messages

To reestablish routing invariants and finger tables after a node joins or leaves, only $O(\log^2 N)$ messages are required
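
The claim that keys only move to or from the joining node can be checked with a small simulation. This is an illustrative sketch: node and key identifiers are drawn uniformly at random as a stand-in for hashing, and the function names, parameter values, and seed are arbitrary choices for the example.

```python
import random
from bisect import bisect_left


def owner(key, ring, size):
    """Node responsible for key: the first node at or after it on the ring."""
    idx = bisect_left(ring, key % size)
    return ring[idx % len(ring)]


def keys_moved_on_join(num_nodes=50, num_keys=5000, m=16, seed=0):
    """Add one node to a random ring; return (new node, list of (old, new) owners)."""
    rng = random.Random(seed)
    size = 2 ** m
    nodes = sorted(rng.sample(range(size), num_nodes))
    keys = [rng.randrange(size) for _ in range(num_keys)]

    new_node = rng.randrange(size)
    while new_node in nodes:
        new_node = rng.randrange(size)
    ring_after = sorted(nodes + [new_node])

    before = [owner(k, nodes, size) for k in keys]
    after = [owner(k, ring_after, size) for k in keys]
    moved = [(b, a) for b, a in zip(before, after) if b != a]
    return new_node, moved


new_node, moved = keys_moved_on_join()
print(f"{len(moved)} of 5000 keys moved; K/N is about {5000 / 51:.0f}")
print("all moved keys went to the new node:",
      all(new_owner == new_node for _, new_owner in moved))
```

The number of keys moved in a single run varies with where the new identifier lands; K/N is the expected load per node, not a per-run guarantee.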

Experimental Results

Latency grows slowly with the total number of nodes

Path length for lookups is about $\log_2 N$

Chord is robust in the face of multiple node failures
