
UNIVERSITY OF INFORMATION TECHNOLOGY, FACULTY OF COMPUTER ENGINEERING

PEER TO PEER SYSTEM


COURSE: PARALLEL PROCESSING AND DISTRIBUTED SYSTEMS

Students: HNG QUC MINH (08520230), TRN HONG LUN (08520221)
Instructor: THIU XUN KHNH

Contents
1. Introduction
2. Napster
3. Peer-to-peer middleware
4. Routing overlays
5. Overlay case studies: Pastry, Tapestry
6. Application case studies: Squirrel, OceanStore, Ivy

1. Introduction

What is a peer-to-peer (P2P) system?

- The term refers to distributed systems that have no central controlling computer.
- All participating computers (nodes, peers) have the same functions.
- Each node acts as both a client and a server to the other nodes.

Applications

Classification

Unstructured P2P
- Where a file is stored is unrelated to the overlay topology (the geometric structure of the network).
- Search techniques:
  - Simple (the most common): flooding, using breadth-first or depth-first algorithms.
  - More complex: random walk, routing indices.
- Well suited to systems in which nodes join and leave frequently and at will.
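
The breadth-first flooding search mentioned above can be sketched as follows. This is only an illustration: the overlay layout, the peer names and the hop limit (`ttl`) are assumptions made for the example, not part of any particular protocol.

```python
from collections import deque

def flood_search(graph, start, wanted, ttl):
    """Breadth-first flooding: return the peers holding `wanted` within `ttl` hops."""
    hits = set()
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if wanted in graph[node]["files"]:
            hits.add(node)
        if hops == ttl:
            continue  # hop limit reached: do not forward the query any further
        for neighbour in graph[node]["links"]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, hops + 1))
    return hits

# Hypothetical five-peer overlay used only for illustration.
overlay = {
    "A": {"links": ["B", "C"], "files": set()},
    "B": {"links": ["A", "D"], "files": {"song.mp3"}},
    "C": {"links": ["A", "E"], "files": set()},
    "D": {"links": ["B"], "files": set()},
    "E": {"links": ["C"], "files": {"song.mp3"}},
}

print(flood_search(overlay, "A", "song.mp3", ttl=2))
```

Every peer within the hop limit receives the query, which is why flooding is simple but expensive in messages.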

Unstructured P2P

First generation

Unstructured P2P

Second generation. Example networks: Gnutella 0.6, Kazaa, Skype

Structured P2P
- Provides a mapping between content (the file's ID) and node location (the node's address).
- Overcomes the search weakness of unstructured networks by using a distributed hash table (DHT).

Structured P2P
- Links between nodes in the overlay network follow a specific algorithm.
- Each node is responsible for a portion of the data shared in the network.
- Example networks: Pastry, Tapestry, CAN, Chord, Kademlia.

How a DHT works

Characteristics of P2P systems
- Every user can contribute resources to the system.
- All nodes in the system have the same functions and responsibilities.
- Operation does not depend on any centrally administered system.
- A limited degree of anonymity can be offered to the providers and users of resources.
- A key design choice is the algorithm that places data across many hosts and later provides access to it in a way that balances load and ensures availability without adding undue overhead.

Characteristics of P2P systems

Advantages:
- No dedicated server is needed; clients share resources, and capacity improves as the network grows.
- Cheap.
- Easy to install and maintain.
- Convenient for sharing files, printers and CD-ROM drives.

Disadvantages:
- Slow.
- Poorly suited to database applications.
- Less reliable.

NAPSTER
- The first large-scale P2P network in the world.
- Founded in 1999.
- Shared music over the Internet.
- Music files were created and shared by individuals, usually copied from CDs.

Napster: peer-to-peer file sharing with a centralized, replicated index

P2P middleware
- Provides a mechanism that lets clients access resources quickly, independently of where the resources are located.
- Nodes need to locate and retrieve any available resource, even though resources are widely distributed and are continually added or removed.

P2P middleware

Functional requirements:
- Simplify the construction of services that are implemented across many widely distributed hosts.
- Support adding and removing resources, as well as adding hosts to the service and removing them.
- Like other middleware, offer programmers an interface that is independent of the type of distributed resource the application manipulates.

P2P middleware

Non-functional requirements:
- Global scalability
- Load balancing
- Local optimization
- Accommodating highly dynamic host availability
- Security of data
- Anonymity, deniability and resistance to censorship

Routing overlay
- The distributed algorithm in P2P middleware that is responsible for locating nodes and objects.
- A node can access a resource by routing its request through a sequence of nodes.
- A request is routed to the nearest node that holds a replica of the requested resource.

Routing overlay
Globally unique identifiers (GUIDs) identify nodes and objects. A GUID is usually stored as a 128-bit number and displayed as 32 hexadecimal digits.

Example: 21EC2020-3AEA-1069-A2DD-08002B30309D

GUIDs are computed with a secure hash function such as SHA-1.
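
As a rough illustration of deriving a 128-bit GUID from a secure hash, the sketch below uses Python's standard `hashlib`. SHA-1 produces 160 bits, so keeping only the first 128 bits is an assumption made here to match the 32-hex-digit form; the function name `guid_from_bytes` is hypothetical.

```python
import hashlib

def guid_from_bytes(data: bytes) -> str:
    """Return a 128-bit GUID as 32 uppercase hexadecimal digits.

    Assumption: the 160-bit SHA-1 digest is truncated to its first
    128 bits (16 bytes); real systems may derive the GUID differently.
    """
    digest = hashlib.sha1(data).digest()
    return digest[:16].hex().upper()

# The same input always yields the same GUID, so any node can recompute it.
print(guid_from_bytes(b"example-object-name"))
```

Because the hash spreads inputs evenly, GUIDs computed this way are scattered across the identifier space regardless of how similar the original names are.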

Figure 10.1: Distinctions between IP and overlay routing for peer-to-peer applications

Scale
- IP: IPv4 is limited to 2^32 addressable nodes. The IPv6 name space is much more generous (2^128), but addresses in both versions are hierarchically structured and much of the space is pre-allocated according to administrative requirements.
- Application-level routing overlay: Peer-to-peer systems can address more objects. The GUID name space is very large and flat (>2^128), allowing it to be much more fully occupied.

Load balancing
- IP: Loads on routers are determined by network topology and associated traffic patterns.
- Application-level routing overlay: Object locations can be randomized and hence traffic patterns are divorced from the network topology.

Network dynamics (addition/deletion of objects/nodes)
- IP: IP routing tables are updated asynchronously on a best-efforts basis with time constants on the order of 1 hour.
- Application-level routing overlay: Routing tables can be updated synchronously or asynchronously with fractions of a second delays.

Fault tolerance
- IP: Redundancy is designed into the IP network by its managers, ensuring tolerance of a single router or network connectivity failure. n-fold replication is costly.
- Application-level routing overlay: Routes and object references can be replicated n-fold, ensuring tolerance of n failures of nodes or connections.

Target identification
- IP: Each IP address maps to exactly one target node.
- Application-level routing overlay: Messages can be routed to the nearest replica of a target object.

Security and anonymity
- IP: Addressing is only secure when all nodes are trusted. Anonymity for the owners of addresses is not achievable.
- Application-level routing overlay: Security can be achieved even in environments with limited trust. A limited degree of anonymity can be provided.

Instructor's Guide for Coulouris, Dollimore, Kindberg and Blair, Distributed Systems: Concepts and Design Edn. 5 Pearson Education 2012

Figure 10.3: Distribution of information in a routing overlay

[Figure: objects and nodes in an overlay, showing the routing knowledge held by each of the nodes A, B, C and D.]

Basic programming interface for a distributed hash table (DHT) as implemented by the PAST API over Pastry:

put(GUID, data)
Publishes an object with the given GUID. The data is stored at all the nodes responsible for holding a replica.

remove(GUID)
Deletes all references to GUID and the associated data.

value = get(GUID)
Retrieves the data associated with GUID from one of the nodes responsible for it.

The DHT layer takes responsibility for choosing a location for each data item, storing it (with replicas to ensure availability) and providing access to it via the get() operation.

ROUTING OVERLAY
How it works:

When a client needs to publish a resource, it has to:
1. compute the GUID
2. ask the routing overlay to publish it

When the routing overlay is asked to publish a resource, it:
1. stores the resource in the node whose GUID is closest to that of the resource
2. stores r replicas of the resource in the r nodes whose GUIDs are closest to that of the resource, where r is the replication factor
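
The publishing steps above, combined with a put/get/remove interface, might be sketched as below. This is a toy, single-process model under several assumptions: GUIDs are small integers on a ring of 2^8 values, "closest" means circular numeric distance, and the r copies include the one on the closest node. `ToyDHT` and its method names are hypothetical, not the PAST API.

```python
class ToyDHT:
    """In-memory stand-in for a DHT layer; node GUIDs are small integers."""

    def __init__(self, node_guids, replication_factor=2, bits=8):
        self.space = 1 << bits          # size of the (toy) GUID space
        self.r = replication_factor
        self.stores = {g: {} for g in node_guids}  # node GUID -> local store

    def _distance(self, a, b):
        d = abs(a - b)
        return min(d, self.space - d)   # the GUID space wraps around

    def responsible_nodes(self, guid):
        """The r nodes whose GUIDs are numerically closest to `guid`."""
        ranked = sorted(self.stores, key=lambda n: (self._distance(n, guid), n))
        return ranked[: self.r]

    def put(self, guid, data):
        for node in self.responsible_nodes(guid):
            self.stores[node][guid] = data

    def get(self, guid):
        for node in self.responsible_nodes(guid):
            if guid in self.stores[node]:
                return self.stores[node][guid]
        return None

    def remove(self, guid):
        for store in self.stores.values():
            store.pop(guid, None)

dht = ToyDHT([10, 80, 150, 220], replication_factor=2)
dht.put(90, "resource bytes")
print(dht.responsible_nodes(90), dht.get(90))
```

A real overlay would also move replicas when nodes join or leave; that bookkeeping is omitted here.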

Basic programming interface for distributed object location and routing (DOLR) as implemented by Tapestry:

publish(GUID)
GUID can be computed from the object. This function makes the node performing the publish operation the host for the object corresponding to GUID.

unpublish(GUID)
Makes the object corresponding to GUID inaccessible.

sendToObj(msg, GUID, [n])
Sends the message msg to the object whose GUID is GUID; the optional parameter n requests delivery to n replicas of the object.

Objects can be stored anywhere; the DOLR layer is responsible for maintaining a mapping between GUIDs and the addresses of the nodes at which replicas of the objects are located.
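
To contrast DOLR with the DHT model, the sketch below keeps only a GUID-to-host mapping and leaves objects on whichever hosts published them. It is a simplified, centralized illustration: `ToyDOLR`, the explicit `host` argument and the alphabetical choice of replicas are assumptions made here (a real DOLR layer would route to the nearest replicas).

```python
class ToyDOLR:
    """Maps each GUID to the set of hosts holding a replica of the object."""

    def __init__(self):
        self.locations = {}  # GUID -> set of host addresses
        self.inboxes = {}    # host address -> list of (GUID, msg) received

    def publish(self, host, guid):
        # The publishing node becomes a host for the object.
        self.locations.setdefault(guid, set()).add(host)
        self.inboxes.setdefault(host, [])

    def unpublish(self, host, guid):
        hosts = self.locations.get(guid, set())
        hosts.discard(host)
        if not hosts:
            self.locations.pop(guid, None)  # no replica left: inaccessible

    def send_to_obj(self, msg, guid, n=1):
        # Deliver msg to up to n replicas; sorted() only makes the toy deterministic.
        targets = sorted(self.locations.get(guid, ()))[:n]
        for host in targets:
            self.inboxes[host].append((guid, msg))
        return targets

dolr = ToyDOLR()
dolr.publish("10.0.0.1", "g1")
dolr.publish("10.0.0.2", "g1")
print(dolr.send_to_obj("ping", "g1", n=2))
```

The design point is that the overlay never decides where an object lives; it only remembers where replicas have been published.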

Overlay case studies: Pastry and Tapestry

Both are routing layers for structured peer-to-peer networks, and both use the prefix routing approach.

Pastry
Pastry is a routing overlay introduced in [Rowstron and Druschel 2001, Castro et al. 2002a, freepastry.org]. All nodes and objects in Pastry are assigned 128-bit GUIDs. A GUID is computed by applying a secure hash function to:
- the public key, for a node;
- the object's name or the object's stored state, for an object (such as a file).

Pastry
In a network of N nodes, the Pastry routing algorithm correctly delivers a message addressed to any GUID in O(log N) steps.
- If the GUID identifies a currently active node, the message is delivered to that node.
- Otherwise, the message is delivered to the active node whose GUID is numerically closest to it.

Active nodes are responsible for handling requests addressed to objects in their numerical neighbourhood.

Pastry
The full routing algorithm uses a routing table at each node to forward messages toward their destination efficiently. To explain the algorithm, it can be split into two stages:
- Stage 1 describes a simplified form of the algorithm (for explanation only).
- Stage 2 describes the complete algorithm (the one used in practice).

Pastry

Stage 1:

Each node stores a leaf set: a vector L (of size 2l) containing the GUIDs and IP addresses of the nodes whose GUIDs are numerically closest on either side of its own.

Pastry maintains the leaf sets whenever nodes join or leave the network.

Even after a node fails, the leaf sets can be repaired very quickly. (Repair is discussed later.)
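
Routing with leaf sets alone can be illustrated with the sketch below: each node forwards the message to the member of its leaf set that is numerically closest to the destination. The ring of eight nodes in an 8-bit GUID space, and all function names, are assumptions chosen to keep the example small.

```python
def circular_distance(a, b, space):
    d = abs(a - b) % space
    return min(d, space - d)

def leaf_set(nodes, node, l):
    """The l nodes on either side of `node` on the sorted GUID ring."""
    ring = sorted(nodes)
    i = ring.index(node)
    return {ring[(i + k) % len(ring)] for k in range(-l, l + 1) if k != 0}

def route_with_leaf_sets(nodes, start, target, l, space):
    """Forward hop by hop to the known node closest to `target`; return the path."""
    path = [start]
    current = start
    while True:
        candidates = leaf_set(nodes, current, l) | {current}
        best = min(candidates, key=lambda n: (circular_distance(n, target, space), n))
        if best == current:
            return path  # no known node is closer: the message has arrived
        current = best
        path.append(current)

nodes = [0, 32, 64, 96, 128, 160, 192, 224]
print(route_with_leaf_sets(nodes, 0, 130, l=1, space=256))  # [0, 224, 192, 160, 128]
print(route_with_leaf_sets(nodes, 0, 130, l=2, space=256))  # [0, 192, 128]
```

The message always arrives, but the number of hops grows in proportion to N/(2l), which is why the full algorithm adds a routing table.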

Figure 10.6: Circular routing alone is correct but inefficient


Based on Rowstron and Druschel [2001]


Pastry

Stage 2:

Using a routing table: each node maintains a tree-structured routing table holding the GUIDs and IP addresses of a set of nodes spread across the range of 2^128 possible GUID values.
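
A minimal sketch of prefix routing with such a table is shown below. It assumes GUIDs are written as equal-length hexadecimal strings (shortened to four digits here) and indexes the table by (number of shared leading digits, next digit); the function names and the sample GUIDs are illustrative only.

```python
def shared_prefix_len(a, b):
    """Number of leading hexadecimal digits that two GUIDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def build_routing_table(own, known):
    """table[(p, digit)] = a known GUID that shares p leading digits with
    `own` and has `digit` at position p."""
    table = {}
    for guid in known:
        if guid == own:
            continue
        p = shared_prefix_len(own, guid)
        table.setdefault((p, guid[p]), guid)
    return table

def next_hop(own, dest, table):
    """Next node on the route to `dest`, or None if the table has no suitable
    entry (a complete implementation would then fall back on the leaf set)."""
    if own == dest:
        return None
    p = shared_prefix_len(own, dest)
    return table.get((p, dest[p]))

table = build_routing_table("65A1", ["D13D", "D46B", "D462", "6FF0", "653C"])
print(next_hop("65A1", "D46A", table))  # D13D
print(next_hop("65A1", "6534", table))  # 653C
```

Each hop reaches a node that matches the destination in at least one more digit, so the number of hops is bounded by the number of digits, which is the source of the O(log N) behaviour.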

Figure 10.7: First four rows of a Pastry routing table

Figure 10.8: Pastry routing example

Figure 10.9: Pastry's routing algorithm

Pastry

Joining a Pastry network

Tapestry
Similar to Pastry, but differs in:
- the method used to map keys onto nodes;
- the way replication is managed.

From structured to unstructured peer-to-peer

Advantages
- Structured peer-to-peer: Guaranteed to locate objects, and can offer time and complexity bounds on the routing operation.
- Unstructured peer-to-peer: Self-organizing and naturally resilient to node failure.

Disadvantages
- Structured peer-to-peer: The overlay structure, which is often complex, must be maintained continuously; this is difficult and costly, especially in environments where nodes join and leave dynamically.
- Unstructured peer-to-peer: Probabilistic, so it cannot guarantee that objects will be located. Excessive messaging overhead can limit scalability.

Gnutella

Node types:
- Leaf node: maintains only a single connection to one ultrapeer.
- Ultrapeer: can maintain many connections to leaf nodes (10-100) and a small number of connections to other ultrapeers (<10).

Gnutella

Routing with ultrapeers can be done in two ways:
- Reflector indexing: the ultrapeer periodically sends indexing queries down to its leaf nodes, keeps an up-to-date record of the files they share, and answers query traffic on their behalf. This reduces the number of requests that reach the leaf nodes.
- QRP (Query Routing Protocol) routing, which uses a QRT (Query Routing Table).
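
The idea behind a query routing table can be sketched as follows: each leaf summarizes the keywords of its shared file names as a set of hashed slots, and the ultrapeer forwards a query only to leaves whose table covers every query keyword. This is a loose illustration, not the actual QRP wire format: the table size, the use of CRC-32 as the hash and all names are assumptions.

```python
import zlib

TABLE_SIZE = 1024  # assumed number of slots in a query routing table

def slot(keyword):
    # CRC-32 is a stand-in here; the real protocol defines its own hash.
    return zlib.crc32(keyword.lower().encode("utf-8")) % TABLE_SIZE

def build_qrt(file_names):
    """Set of slots covering every keyword in a leaf's shared file names."""
    return {slot(word) for name in file_names for word in name.replace(".", " ").split()}

def leaves_to_query(query, leaf_tables):
    """Leaves whose table contains every keyword of the query.
    Hash collisions can cause false positives, never false negatives."""
    wanted = {slot(word) for word in query.split()}
    return sorted(leaf for leaf, table in leaf_tables.items() if wanted <= table)

tables = {
    "leaf1": build_qrt(["blue moon.mp3"]),
    "leaf2": build_qrt(["red sun.mp3"]),
}
print(leaves_to_query("blue moon", tables))
```

Because only hashed slots are exchanged, the ultrapeer can filter queries without holding the leaves' full file lists.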


Application case studies: Squirrel, OceanStore, Ivy

The routing overlays described above have been tried out in a number of applications, and the results have been evaluated extensively. Three representative applications are covered next:
- the Squirrel web caching service, based on Pastry;
- the OceanStore file store;
- the Ivy file store.

Squirrel web cache

The authors of Pastry developed the Squirrel web caching service for use in local networks of personal computers. In medium and large local networks, web caching is usually provided by a dedicated server or cluster of servers.

Squirrel performs the same task, but does so by exploiting the storage and computing resources already available on the personal computers.


Squirrel
In Squirrel, every node in the network lets other nodes access its web cache. Each node therefore plays two roles: web browser and web cache.


Squirrel

Squirrel was evaluated by simulation, using modelled loads from two real working environments within Microsoft (105 active clients in Cambridge and 36,000 in Redmond), against three criteria:

Squirrel

Reduction in external bandwidth:
- Web cache server: 29% (Redmond), 38% (Cambridge).
- Squirrel: 28% (Redmond), 37% (Cambridge), with each client contributing 100 MBytes of storage to the web cache.

Access latency:
- Web cache service: a single message is needed to access the cache.
- Squirrel: on average 4.11 message transfers (Redmond) and 1.8 (Cambridge).
- However, on Ethernet hardware a local transfer takes only a few milliseconds, whereas wide-area access latencies are in the range 10-100 ms. The Squirrel authors therefore argue that the latency of cache hits is swamped by the much greater latency of accesses to objects not found in the cache.

Squirrel

Computational and storage load placed on client nodes:
- On average, each node received only 0.31 requests per minute (Redmond), so the share of its resources consumed is very low.

OceanStore file store

The developers of Tapestry built a prototype peer-to-peer file store. It supports the storage of mutable files. The OceanStore design [Kubiatowicz et al. 2000; Kubiatowicz 2003; Rhea et al. 2001, 2003] aims to provide a very large-scale, incrementally scalable, persistent storage facility in an environment where network connectivity and computing resources change constantly.


Ivy file system

Like OceanStore, Ivy is a multi-user file system built on a routing overlay and a distributed hash-addressed data store. Unlike OceanStore, however, the Ivy file system emulates a Sun NFS server.

Ivy stores the state of files as logs of the update requests issued by Ivy clients.


Ivy file system

An Ivy file system consists solely of a set of logs, one log per participant. The logs are stored in DHash. Each participant reads data from all the logs but writes only to its own log. The aim of this arrangement is to let Ivy maintain file system metadata without locking.
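
The one-log-per-participant arrangement can be illustrated with the toy model below. It is a sketch under strong assumptions: the logs live in an in-process dictionary rather than in DHash, and a single shared counter stands in for whatever ordering scheme the real system uses to decide which update is newest. `ToyIvy` and its methods are hypothetical names.

```python
class ToyIvy:
    """One append-only log per participant; readers scan every log."""

    def __init__(self, participants):
        self.logs = {p: [] for p in participants}
        self.clock = 0  # assumed global counter used only to order updates

    def write(self, participant, path, content):
        # A participant appends to its own log and never touches the others.
        self.clock += 1
        self.logs[participant].append((self.clock, path, content))

    def read(self, path):
        # Scan all logs and return the most recent content written to `path`.
        latest = None
        for log in self.logs.values():
            for entry in log:
                if entry[1] == path and (latest is None or entry[0] > latest[0]):
                    latest = entry
        return None if latest is None else latest[2]

fs = ToyIvy(["alice", "bob"])
fs.write("alice", "/notes.txt", "v1")
fs.write("bob", "/notes.txt", "v2")
print(fs.read("/notes.txt"))  # v2
```

Since each log has exactly one writer, no two participants ever update the same structure, which is what removes the need for locks.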