
Distributed Systems

Important Questions Given By Sir


Solutions
Question 1 - Goals of distributed system. Define distributed system.
Question 2 - Different forms of transparency
Question 3 - Multiprocessor and Multicomputer
Question 4 - Threads (Client + Server)
Question 5 - Software Agents in distributed systems
Question 7 - RPC semantics
Question 8 - Extended RPC – DOORS
Doors are RPCs implemented for processes on the same machine
A single mechanism for communication: procedure calls (but with doors, it is not
transparent)
Question 9 - RMI – static Vs Dynamic
Question 10 - Parameter passing in RMI
Question 11. Name Resolution: a. Recursive b. Iterative
Recursive Name Resolution
Comparison
Question 12 - DNS or X-500 Name resolution or issues
DNS -
Question 13. Clock synchronization. Also Berkeley algorithm, Averaging
A few points on clock synchronization
Why need to synchronize clocks
Problem with Physical clock
Relative Clock Synchronization (Berkeley Algorithm)
Averaging algorithm
Question 14 - Logical clocks with example.
Lamport’s Algorithm
Question 15 - Vector Clocks
Vector clocks
Fidge’s Algorithm
Example
Question 16 - Lamport logical clock [problems with this approach]
Question 17. Election of a coordinator - a. Bully algorithm && b. Ring algorithm
Bully algorithm
Ring algorithm
Question 18 - Distributed Mutual Exclusion Algorithm a. Ricart-Agrawala algorithm b.
Token ring algorithm
Ricart-Agrawala algorithm
Terminology
Algorithm
Token ring algorithm
Question 19. Explain Transaction Models
QP-Sep2010. Distinguish between Strict and sequential consistency with an example for each
Data-Centric Consistency Models
Strict consistency (related to absolute global time)
Sequential consistency (what we are used to - serializability)
Sequential consistency is a slightly weaker consistency model than strict consistency.
A data store is said to be sequentially consistent when it satisfies the following condition:
The result of any execution is the same as if the (read and write) operations by all
processes on the data store were executed in some sequential order, and the operations
of each individual process appear in this sequence in the order specified by its program.
Question 20 - Client centric consistency/ Monotonic read and write
Client centric consistency model
Monotonic-Read Consistency
Question 21 - Replica placement – 3 types of replicas a. Client driven, b. Server driven, c. Permanent
Question 31 - NFS – Architecture (basic), file system operations supported OR Explain
the basic NFS architecture for unix systems. Also list any eight file system operations
supported by NFS
Question 32 - Naming scheme in NFS with different types of mounting
Appendix
Questions from old question papers (Test 1)
Introduction
RPC
DNS
Model
RMI
Mobile
Communication
Test 3 Syllabus Questions
DSM(distributed shared memory) Algorithms (different types of algorithms)
Discuss granularity and replacement algorithms in DSM.
Different distribution algorithm components (Transfer policy/Location policy) etc –
Load distribution algorithms - Sender initiated distributed algorithms
Load distribution algorithms -Discuss Adaptive load distributive algorithms.

Important Questions Given By Sir


Distributed Systems
Red - Not attempted

1. Goals of distributed system. Define distributed system.


2. Different forms of transparency
3. Multicomputer / Multiprocessor
4. Threads (Client + Server)
5. Software Agents in distributed systems
6. X- windows system
7. RPC semantics
8. Extended RPC – DOORS
9. RMI – static Vs Dynamic
10. Parameter passing in RMI
11. Name Resolution:
a. Recursive
b. Iterative
12. DNS or X-500
Name resolution or issues
QP-Sep2010 Write short notes on DNS and X-500
13. Clock synchronization. Also Berkeley algorithm, Averaging
14. Logical clocks with example.
15. Vector Clocks
16. Lamport logical clock [problems with this approach]
17. Election of a coordinator
a. Bully algorithm
b. Ring algorithm
18. Distributed Mutual Exclusion Algorithm
a. Ricart-Agrawala algorithm
b. Token ring algorithm
19. Explain Transaction Models
QP-Sep2010. Distinguish between Strict and sequential consistency with an example for each
20. Client centric consistency/ Monotonic read and write
21. Replica placement – 3 types of replicas
a. Client driven
b. Server driven
c. Permanent
22. Primary based consistency protocols
23. Fault tolerance – Different types of failures
24. Design issues – Failure masks (Process Resilience)
25. Five different types of classes of failure in RPC system - Solution along with the listing
a. Scalable reliable multicasting
b. Hierarchical – Non hierarchical scalable
26. Explain Virtual synchrony
27. General CORBA architecture and CORBA services
28. Messaging – Interoperability
29. DCOM – its client-server process architecture.
QP-Sep2010.With the supporting diagram, explain in detail D-COM
30. Globe Object Model – Architecture and Services
31. NFS – Architecture (basic), file system operations supported
32. Naming scheme in NFS with different types of mounting
33. Caching and replication scheme in NFS.
34. Organization of CODA file system – Fault tolerance and security
QP-Sep2010 - With reference to CODA file system, explain communication, process and server
replication
35. DFS System – Different algorithms, Granularity & page replacement
36. List and explain load distribution algorithms in distributed systems and 4 components.
37. Sender/Receiver distributed algorithm
38. Adaptive load distribution algorithm
39. Authentication distribution technique using:
a. Key distribution centre
b. Public key cryptography
40. Access control – General issues
41. Explain key establishment or key distribution techniques
42. Kerberos authentication issues

43. Explain advantages of distributed systems


QP-Mar2010 - Explain about the architectural model of Distributed system
a. 3 different distributed systems architecture
i. Minicomputers model
ii. Workstation model
iii. Processor pool model

b. Issues in designing distributed systems (Any 3 issues)

44. Access protocol – Security for bus and ring topology – CDMA
45. Message passing models used to develop communication primitives
46. Compatibility/Resource management for specific issues

Exam Questions
QP-Sep2010. Briefly explain reliable client server communication
QP-Sep2010. Write short notes on memory coherence
QP-Sep2010. Explain in detail the block cipher DES

Solutions
Question 1 - Goals of distributed system. Define distributed system.
Question 2 - Different forms of transparency
Question 3 - Multiprocessor and Multicomputer

Flynn’s Classification of multiple-processor machines:

{SI, MI} x {SD, MD} = {SISD, SIMD, MISD, MIMD}

· SISD = Single Instruction Single Data
E.g. Classical Von Neumann machines.
· SIMD = Single Instruction Multiple Data
E.g. Array Processors or Data Parallel machines.
· MISD = Multiple Instruction Single Data - does not exist in practice
· MIMD = Multiple Instruction Multiple Data - control parallelism
Question 4 - Threads (Client + Server)
Question 5 - Software Agents in distributed systems
Question 7 - RPC semantics
During an RPC, one of the following problems may occur:

1. The request msg may be lost
2. The reply msg may be lost
3. The server or the client may crash

Some strategies for different RPC msg delivery guarantees:

1. Retry request message - retransmit the request msg until either a reply is received or
the server is assumed to have failed
2. Duplicate filtering - filter out duplicate requests at the server when retransmissions are
used
3. Retransmission of replies - keep a history of reply messages to enable lost replies to be
retransmitted without re-executing the operations at the server


Question 8 - Extended RPC – DOORS
Essence - Try to use the RPC mechanism as the only mechanism for interprocess
communication (IPC).

Doors are RPCs implemented for processes on the same machine

A single mechanism for communication: procedure calls (but with doors, it is not
transparent)
Question 9 - RMI – static Vs Dynamic
Reference : Page 3 of http://www.cs.gmu.edu/~setia/cs707/slides/rmi-imp.pdf

Question 10 - Parameter passing in RMI


Question 11. Name Resolution: a. Recursive b. Iterative

This question is related to ‘Implementation of Name Resolution’. Let us take the example of
ftp://ftp.cs.vu.nl/pub/globe/index.txt. There are two ways to implement name
resolution.

Iterative Name Resolution


In iterative name resolution, a name resolver hands over the complete name to the root name
server. It is assumed that the address where the root server can be contacted, is well known.
The root server will resolve the path name as far as it can, and return the result to the client. In
our example, the root server can resolve only the label nl, for which it will return the address of
the associated name server.

Recursive Name Resolution

An alternative to iterative name resolution is to use recursion during name resolution. Instead of
returning each intermediate result back to the client’s name resolver, with recursive name
resolution, a name server passes the result to the next name server it finds. So, for example,
when the root name server finds the address of the name server implementing the node named
nl, it requests that name server to resolve the path name nl:<vu, cs, ftp, pub, globe, index.txt>.
Using recursive name resolution as well, this next server will resolve the complete path and
eventually return the file index.txt to the root server, which, in turn, will pass that file to the
client’s name resolver.
Comparison
1. The main drawback of recursive name resolution is that it puts a higher performance
demand on each name server
2. There are two important advantages to recursive name resolution.
a. caching results is more effective compared to iterative name resolution.
b. communication costs may be reduced
3. With iterative name resolution, caching is necessarily restricted to the client’s name
resolver.

Reference: http://www.cs.vu.nl/~ast/books/ds1/04.pdf
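The difference between the two schemes can be sketched in a few lines of code. The toy namespace below (server names `root`, `ns-nl`, `ns-vu` and the returned address string) is invented for illustration; real resolvers add caching, timeouts, and error handling:

```python
# Each server maps one label to either a referral to another server
# or a final value. This is a didactic sketch, not real DNS.
SERVERS = {
    "root":  {"nl": ("server", "ns-nl")},
    "ns-nl": {"vu": ("server", "ns-vu")},
    "ns-vu": {"cs": ("value", "address-of-cs-ftp-server")},
}

def resolve_iterative(server, labels):
    """The client contacts each server in turn; servers return referrals."""
    while labels:
        kind, result = SERVERS[server][labels[0]]
        labels = labels[1:]
        if kind == "value":
            return result
        server = result          # the client itself contacts the next server
    raise KeyError("name not fully resolved")

def resolve_recursive(server, labels):
    """Each server passes the remaining name on to the next server."""
    kind, result = SERVERS[server][labels[0]]
    if kind == "value":
        return result
    return resolve_recursive(result, labels[1:])   # server-to-server call

print(resolve_iterative("root", ["nl", "vu", "cs"]))
print(resolve_recursive("root", ["nl", "vu", "cs"]))
```

Both calls return the same final result; what differs is who does the chasing (the client's resolver versus the servers themselves), which is exactly the caching and communication-cost trade-off listed above.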

Question 12 - DNS or X-500 Name resolution or issues
DNS -

Question 13. Clock synchronization. Also Berkeley algorithm, Averaging

A few points on clock synchronization

1. Distributed Synchronization
a. Communication between processes in a distributed system can have
unpredictable delays
b. No common clock or other precise global time source exists for distributed
algorithms
c. Requirement: We have to establish causality, i.e., each observer must see
event 1 before event 2

Why need to synchronize clocks

Problem with Physical clock


● Clock skew (offset): the difference between the times on two clocks
● Clock drift: clocks count time at different rates

Relative Clock Synchronization (Berkeley Algorithm)

Purpose
If you need a uniform time (without a UTC receiver per computer), but you cannot establish a
central time server:

1. Peers elect a master

2. The master polls all nodes for their clock times

3. The master estimates the local times of all nodes, taking the message
transfer times involved into account

4. The master uses the estimated local times to build the arithmetic mean
a. Add fault tolerance: ignore clocks that deviate too far from the others

5. The deviations from the mean are sent to the nodes
a. This is better than sending the actual time, since the adjustment is less
sensitive to the delay of the reply message

Averaging algorithm
a) The time daemon asks all the other machines for their clock values.
b) The machines answer.
c) The time daemon tells everyone how to adjust their clock.

Ref:http://www.cis.upenn.edu/~lee/07cis505/Lec/lec-ch6-synch1-PhysicalClock-v2.pdf
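The averaging step of the Berkeley algorithm can be sketched as follows. The clock readings are invented, and the estimation of message transfer times (step 3) is assumed to have already happened:

```python
def berkeley_adjustments(clocks):
    """clocks: {node: estimated local time}. Returns the adjustment the
    master sends to each node so all nodes end up at the arithmetic mean."""
    mean = sum(clocks.values()) / len(clocks)
    # Step 5: send each node its deviation from the mean,
    # not the absolute time.
    return {node: mean - t for node, t in clocks.items()}

# Invented clock readings (already corrected for message transfer times):
clocks = {"master": 3.00, "A": 3.25, "B": 2.75}
adjust = berkeley_adjustments(clocks)            # A must slow down by 0.25
synced = {n: clocks[n] + adjust[n] for n in clocks}
```

After applying the adjustments, every node reads the mean (3.00 here); fault tolerance would be added by excluding outlier readings before computing the mean.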

Question 14 - Logical clocks with example.


A logical clock is a mechanism for capturing chronological and causal
relationships in a distributed system

Logical clock algorithms of note are:


● Lamport timestamps, which are monotonically increasing software
counters.
● Vector clocks, which allow the causal (partial) ordering of events in a
distributed system to be determined.

Logical clocks solve the following two problems

Problem: How do we maintain a global view on the system’s behavior that is consistent with the
happened-before relation?
Solution: attach a timestamp to each event

Problem: How do we attach a timestamp to an event when there’s no global clock?
Solution: maintain a consistent set of logical clocks, one per process.

Lamport’s Algorithm
Each process Pi maintains a local counter Ci and adjusts this counter according to the following
rules:

1. For any two successive events that take place within Pi, Ci is incremented by 1.
2. Each time a message m is sent by process Pi, the message receives a timestamp Tm =
Ci.
3. Whenever a message m is received by a process Pj, Pj adjusts its local counter Cj to
max(Cj, Tm) + 1.
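These rules can be sketched as a small class. The two-process exchange at the bottom is an invented example; the rule numbers in the comments refer to the list above:

```python
class LamportClock:
    """One counter per process, adjusted by Lamport's rules."""
    def __init__(self):
        self.c = 0

    def tick(self):               # rule 1: a local event increments the counter
        self.c += 1
        return self.c

    def send(self):               # rule 2: sending is an event; Tm = Ci
        self.c += 1
        return self.c

    def recv(self, tm):           # rule 3: Cj := max(Cj, Tm) + 1
        self.c = max(self.c, tm) + 1
        return self.c

p1, p2 = LamportClock(), LamportClock()
tm = p1.send()        # p1's clock: 1; the message carries Tm = 1
p2.tick()             # p2's clock: 1 (a concurrent local event)
stamp = p2.recv(tm)   # p2's clock: max(1, 1) + 1 = 2
```

The receive event is thus always stamped later than the send it corresponds to, which is exactly the happened-before guarantee.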

Question 15 - Vector Clocks


Vector clocks
It is an algorithm for generating a partial ordering of events in a distributed system and detecting
causality violations. The vector clocks algorithm was developed by Colin Fidge in 1988.
Causality is the relationship between an event (the cause) and a second event (the effect), where the second event
is understood as a consequence of the first.

Drawback of Lamport’s clocks


With Lamport’s clocks, one cannot directly compare the timestamps of two events to determine their
precedence relationship.

Fidge’s Algorithm

The Fidge’s logical clock is maintained as follows:

1. Initially all clock values are set to the smallest value.

2. The local clock value is incremented at least once before each primitive event in
a process.

3. The current value of the entire logical clock vector is delivered to the receiver for
every outgoing message.

4. Values in the timestamp vectors are never decremented.

5. Upon receiving a message, the receiver sets the value of each entry in its local
timestamp vector to the maximum of the two corresponding values in the local
vector and in the remote vector received.

The element corresponding to the sender is a special case; it is set to one
greater than the value received, but only if the local value is not greater
than that received.
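A common, equivalent formulation of these rules is sketched below. The two processes and their event sequence are invented; `happened_before` is the element-wise comparison that lets vector clocks detect causality, which Lamport clocks cannot:

```python
class VectorClock:
    """One vector entry per process; pid indexes this process's own entry."""
    def __init__(self, pid, n):
        self.pid, self.v = pid, [0] * n

    def event(self):                 # rule 2: increment own entry before each event
        self.v[self.pid] += 1
        return list(self.v)

    def send(self):                  # rule 3: ship the whole vector with the message
        return self.event()

    def recv(self, remote):          # rule 5: element-wise maximum...
        self.v = [max(a, b) for a, b in zip(self.v, remote)]
        return self.event()          # ...and the receive is itself an event

def happened_before(u, w):
    """u -> w iff u <= w element-wise and u != w; otherwise concurrent."""
    return all(a <= b for a, b in zip(u, w)) and u != w

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
m = p0.send()            # p0: [1, 0]
e = p1.event()           # p1: [0, 1], concurrent with the send
r = p1.recv(m)           # p1: max([0,1], [1,0]) = [1,1], then own entry -> [1,2]
```

The send is ordered before the receive, while `m` and `e` are incomparable, i.e. concurrent.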

Example

Assign the Fidge’s logical clock values for all the events in the below timing diagram. Assume
that each process’s logical clock is set to 0 initially.
Question 16 - Lamport logical clock [problems with this approach]
Assign the Lamport’s logical clock values for all the events in the below timing diagram. Assume
that each process’s logical clock is set to 0 initially.
Solution
Question 17. Election of a coordinator - a. Bully algorithm && b. Ring algorithm

Bully algorithm
The bully algorithm is a method in distributed computing for dynamically
selecting a coordinator by process ID number.
When a process P determines that the current coordinator is down because of
message timeouts or failure of the coordinator to initiate a handshake, it performs
the following sequence of actions:

1. P broadcasts an election message (inquiry) to all other processes with
higher process IDs.

2. If P hears from no process with a higher process ID than it, it wins the
election and broadcasts victory.

3. If P hears from a process with a higher ID, P waits a certain amount of
time for that process to broadcast itself as the leader. If it does not
receive this message in time, it re-broadcasts the election message.

Note that if P receives a victory message from a process with a lower ID number,
it immediately initiates a new election. This is how the algorithm gets its name - a
process with a higher ID number will bully a lower ID process out of the
coordinator position as soon as it comes online.

http://www.scribd.com/doc/6919757/BULLY-ALGORITHM
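A highly simplified sketch of the election outcome, assuming reliable synchronous messaging: the `alive` set stands in for timeout-based failure detection, and the process IDs are invented:

```python
def bully_election(initiator, alive):
    """alive: set of live process IDs. Returns the elected coordinator,
    found the way the bully algorithm finds it: if any higher process
    answers the initiator's election message, it takes over the election."""
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator             # nobody higher answered: broadcast victory
    # A higher process answers and starts its own election; eventually
    # the highest live ID runs unopposed.
    return bully_election(min(higher), alive)

alive = {1, 2, 3, 5}                 # process 7, the old coordinator, has crashed
print(bully_election(2, alive))      # the highest live ID wins
```

In the real algorithm the takeover happens via timeouts and re-broadcasts; the recursion above only models the outcome: the highest live ID always ends up coordinator.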
Ring algorithm
http://www2.cs.uregina.ca/~hamilton/courses/330/notes/distributed/distributed.html
☞ We assume that the processes are arranged in a logical ring; each process knows the
address of one other process, which is its neighbour in the clockwise direction.

☞ The algorithm elects a single coordinator, which is the process with the highest identifier.

☞ Election is started by a process which has noticed that the current coordinator has failed. The
process places its identifier in an election message that is passed to the following process.

☞ When a process receives an election message it compares the identifier in the message with
its own. If the arrived identifier is greater, it forwards the received election message to its
neighbour; if the arrived identifier is smaller it substitutes its own identifier in the election
message before forwarding it.

☞ If the received identifier is that of the receiver itself ⇒ this will be the coordinator. The new
coordinator sends an elected message through the ring.
Example:
Suppose that we have four processes arranged in a ring: P1 → P2 → P3 → P4 → P1 …
P4 is coordinator
Suppose P1 + P4 crash
Suppose P2 detects that coordinator P4 is not responding
P2 sets active list to [ ]
P2 sends “Elect(2)” message to P3; P2 sets active list to [2]
P3 receives “Elect(2)”
This message is the first message seen, so P3 sets its active list to [2,3]
P3 sends “Elect(3)” towards P4 and then sends “Elect(2)” towards P4
The messages pass P4 + P1 and then reach P2
P2 adds 3 to active list [2,3]
P2 forwards “Elect(3)” to P3
P2 receives the “Elect(2)” message
P2 chooses P3 as the highest process in its list [2, 3] and sends an “Elected(P3)” message
P3 receives the “Elect(3)” message
P3 chooses P3 as the highest process in its list [2, 3] + sends an “Elected(P3)” message
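The identifier-substitution variant described above can be simulated directly. Process IDs and the crashed set are invented, and message passing is modeled as a walk around the ring that skips crashed processes:

```python
def ring_election(ring, alive, starter):
    """ring: IDs in clockwise order. The election message carries one ID;
    each live receiver keeps the larger of its own ID and the carried ID.
    A process that receives its own ID back becomes coordinator."""
    n = len(ring)
    msg = starter                    # the starter puts its ID in the message
    i = ring.index(starter)
    while True:
        i = (i + 1) % n              # pass to the clockwise neighbour...
        nxt = ring[i]
        if nxt not in alive:         # ...skipping crashed processes
            continue
        if nxt == msg:
            return nxt               # own ID came back: this is the coordinator
        msg = max(msg, nxt)          # substitute the larger identifier

ring = [1, 2, 3, 4]
print(ring_election(ring, alive={1, 2, 3}, starter=2))   # P4 crashed, so P3 wins
```

With P4 crashed, the message circulates carrying 3 until P3 sees its own identifier again and announces itself, matching the trace above.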

Question 18 - Distributed Mutual Exclusion Algorithm a. Ricart-Agrawala algorithm b. Token ring algorithm
There are two basic approaches to distributed mutual exclusion:

1. Non-token-based: each process freely and equally competes for the right to use the
shared resource; requests are arbitrated by a central control site or by distributed agreement.

2. Token-based: a logical token representing the access right to the shared resource is passed
in a regulated fashion among the processes; whoever holds the token is allowed to enter the
critical section.

Ricart-Agrawala algorithm
The Ricart-Agrawala Algorithm is an algorithm for mutual exclusion on a
distributed system.

Terminology
● A site is any computing device which is running the Ricart-Agrawala
Algorithm
● The requesting site is the site which is requesting entry into the critical
section.
● The receiving site is every other site which is receiving the request from
the requesting site.

Algorithm
Requesting Site:
● Sends a message to all sites. This message includes the site's name, and
the current timestamp of the system according to its logical clock (which is
assumed to be synchronized with the other sites)
Receiving Site:
● Upon reception of a request message, immediately send a timestamped
reply message if and only if:
● the receiving process is not currently interested in the critical section
OR
● the receiving process has a lower priority (usually this means having
a later timestamp)
● Otherwise, the receiving process will defer the reply message. This means
that a reply will be sent only after the receiving process has finished using
the critical section itself.
Critical Section:
● Requesting site enters its critical section only after receiving all reply
messages.
● Upon exiting the critical section, the site sends all deferred reply
messages.

Problems
The algorithm is expensive in terms of message traffic; it requires 2(n-1) messages for entering
a CS: (n-1) requests and (n-1) replies.

The failure of any process involved makes progress impossible if no special recovery measures
are taken.
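The receiving site's reply-or-defer decision, and the message count just mentioned, can be sketched as follows. The state names "idle"/"wanted"/"held" are invented labels for this sketch, not terminology from the original algorithm:

```python
def on_request(my_state, my_ts, my_id, req_ts, req_id):
    """Decide whether a receiving site replies to an incoming request.
    Lower (timestamp, id) pairs have higher priority; ids break ties."""
    if my_state == "idle":
        return "reply"                       # not interested in the CS
    if my_state == "wanted" and (req_ts, req_id) < (my_ts, my_id):
        return "reply"                       # the requester has priority
    return "defer"                           # in the CS, or we have priority

def messages_per_entry(n):
    """Cost of one CS entry among n sites: (n-1) requests + (n-1) replies."""
    return 2 * (n - 1)

print(on_request("idle", None, 2, 5, 1))     # site not interested: reply
print(on_request("wanted", 3, 2, 5, 1))      # we requested earlier: defer
print(messages_per_entry(5))                 # 8 messages for 5 sites
```

Deferred replies are the queue that gets flushed when the site leaves its critical section, which is how mutual exclusion and fairness are both obtained.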

Token ring algorithm

The Token Ring Algorithm is an algorithm for mutual exclusion on a distributed
system.

☞ A very simple way to solve mutual exclusion ⇒ arrange the n processes P1, P2, .... Pn in a
logical ring.

☞ The logical ring topology is created by giving each process the address of one
other process which is its neighbour in the clockwise direction.

☞ The logical ring topology is unrelated to the physical interconnections between
the computers.

The algorithm

1. The token is initially given to one process.


2. The token is passed from one process to its neighbour round the ring.
3. When a process needs to enter the CS, it waits until it receives the token from its
neighbour and then retains it; it enters the CS, and after leaving the CS it passes
the token to its neighbour in the clockwise direction.
4. When a process receives the token but does not need to enter the critical section, it
immediately passes the token on along the ring.
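One lap of the token around the ring can be simulated in a few lines; the process IDs and the set of processes wanting the CS are invented:

```python
def token_lap(ring, wants_cs, token_at):
    """Pass the token once around the ring, starting at token_at.
    Returns the order in which processes entered the critical section."""
    entered = []
    n = len(ring)
    i = ring.index(token_at)
    for _ in range(n):
        p = ring[i]
        if p in wants_cs:
            entered.append(p)        # hold the token, enter the CS, then release
        i = (i + 1) % n              # pass the token to the clockwise neighbour
    return entered

print(token_lap([1, 2, 3, 4], wants_cs={2, 4}, token_at=3))
```

Starting at P3, the token reaches P4 before P2, so P4 enters first; this also illustrates the 1 to n-1 message cost of obtaining the token, depending on where it currently is.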

☞ It can take from 1 to n-1 messages to obtain a token. Messages are sent around the ring
even when no process requires the token ⇒ additional load on the network.

The algorithm works well in heavily loaded situations, when there is a high probability that the
process which gets the token wants to enter the CS. It works poorly in lightly loaded cases.

☞ If a process fails, no progress can be made until a reconfiguration is applied to extract the
process from the ring.

☞ If the process holding the token fails, a unique process has to be picked, which will
regenerate the token and pass it along the ring; an election algorithm can be used to pick it.

Question 19. Explain Transaction Models


QP-Sep2010. Distinguish between Strict and sequential consistency with an example for each
Data-Centric Consistency Models
A contract between a (distributed) data store and processes, in which the data store specifies
precisely what the results of read and write operations are in the presence of concurrency.
A data store is a distributed collection of storages accessible to clients:

Strong consistency models: Operations on shared data are synchronized (models not using
synchronization operations):

1. Strict consistency (related to absolute global time)


2. Linearizability (atomicity)
3. Sequential consistency (what we are used to -serializability)
4. Causal consistency (maintains only causal relations)
5. FIFO consistency (maintains only individual ordering)

Weak consistency models: Synchronization occurs only when shared data is locked and
unlocked (models with synchronization operations):
1. General weak consistency
2. Release consistency
3. Entry consistency

Observation: The weaker the consistency model, the easier it is to build a scalable solution

Strict consistency (related to absolute global time)

Any read to a shared data item X returns the value stored by the most recent write
operation on X.

Observations
1. Unfortunately, this is impossible to implement in a distributed system
2. If a data item is changed, all subsequent reads performed on that data return the new
value, no matter how soon after the change the reads are done, and no matter which
processes are doing the reading and where they are located

Sequential consistency (what we are used to - serializability)

Sequential consistency is a slightly weaker consistency model than strict consistency.

A data store is said to be sequentially consistent when it satisfies the following condition:

The result of any execution is the same as if the (read and write) operations by all
processes on the data store were executed in some sequential order, and the
operations of each individual process appear in this sequence in the order
specified by its program.
Observations
1. When processes run concurrently on possibly different machines, any valid interleaving
of read and write operations is acceptable behavior
2. All processes see the same interleaving of executions.
3. Nothing is said about time
4. A process “sees” writes from all processes but only its own reads
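The condition can be checked mechanically: a global order of operations is admissible under sequential consistency only if it preserves every process's program order; the interleaving is otherwise free. A minimal sketch with invented operation names:

```python
def respects_program_order(global_order, programs):
    """programs: {pid: [op, ...]}. True iff every process's operations
    appear in global_order in the same relative order as in its program."""
    for ops in programs.values():
        seen = [op for op in global_order if op in ops]
        if seen != ops:
            return False
    return True

# Two invented processes: P1 issues W1a then W1b; P2 issues W2a.
programs = {"P1": ["W1a", "W1b"], "P2": ["W2a"]}

print(respects_program_order(["W1a", "W2a", "W1b"], programs))  # valid interleaving
print(respects_program_order(["W1b", "W2a", "W1a"], programs))  # P1's order violated
```

Strict consistency would additionally pin the order to real time; sequential consistency only demands that some single order exists and that all processes see that same order.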

Question 20 - Client centric consistency / Monotonic read and write
Client centric consistency model
Client-centric consistency models are generally used for applications that lack simultaneous
updates –i.e., most operations involve reading data.

Goal: Show how we can perhaps avoid system-wide consistency, by concentrating on what
specific clients want, instead of what should be maintained by servers.

Most large-scale distributed systems (i.e., databases) apply replication for scalability, but can
support only weak consistency.

Example
1. DNS: Updates are propagated slowly, and inserts may not be immediately visible.
2. News: Articles and reactions are pushed and pulled throughout the Internet, such that
reactions can be seen before postings.
3. Lotus Notes: Geographically dispersed servers replicate documents, but make no
attempt to keep (concurrent) updates mutually consistent.
4. WWW: Caches all over the place, but there need be no guarantee that you are reading
the most recent version of a page.

Important
● Client-centric consistency provides guarantees for a single client concerning the
consistency of access to a data store by that client
● No guarantees are given concerning concurrent accesses by different clients

Monotonic-Read Consistency
Example 1: Automatically reading your personal calendar updates from different servers.
Monotonic Reads guarantees that the user sees all updates, no matter from which server the
automatic reading takes place.

Example 2: Reading (not modifying) incoming mail while you are on the move. Each time you
connect to a different e-mail server, that server fetches (at least) all the updates from the server
you previously visited.

Monotonic-Write Consistency
1. Example 1: Updating a program at server S2, and ensuring that all components on
which compilation and linking depends, are also placed at S2.

2. Example 2: Maintaining versions of replicated files in the correct order everywhere
(propagate the previous version to the server where the newest version is installed).
Question 21 - Replica placement – 3 types of replicas a. Client driven, b. Server driven, c. Permanent
46th Slide to 51st Slide in Lecture06.pdf

Question 31 - NFS – Architecture (basic), file system operations supported OR Explain
the basic NFS architecture for unix systems. Also list any eight file system operations
supported by NFS
NFS
An industry standard for file sharing on local networks since the 1980s
· An open standard with clear and simple interfaces
· Supports many of the design requirements already mentioned:
o transparency
o heterogeneity
o efficiency
o fault tolerance
The basic NFS architecture for UNIX systems

Unix implementation advantages

· Binary code compatible - no need to recompile applications
o Standard system calls that access remote files can be routed through the NFS client module by the
kernel
· Shared cache of recently-used blocks at client
· Kernel-level server can access i-nodes and file blocks directly
o But a privileged (root) application program could do almost the same.
· Security of the encryption key used for authentication.
File system operations supported by NFS
1. Create – Create a regular file
2. Rename – Change the name of a file
3. Mkdir – Create a subdirectory under a given directory
4. Rmdir – Remove an empty subdirectory from a directory
5. Open – Open a file
6. Close – Close a file
7. Read – Read the data contained in a file
8. Write – Write data to a file.

Question 32 - Naming scheme in NFS with different types of mounting
Mount operation: mount(remotehost, remotedirectory, localdirectory)
o Server maintains a table of clients who have mounted file systems at that server
o Each client maintains a table of mounted file systems holding:
o < IP address, port number, file handle>

Hard versus soft mounts

o Soft - If a file request fails, the NFS client will report an error to the
process on the client machine requesting the file access.

o Hard - The program accessing a file on an NFS mounted file system will
hang when the server crashes.
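The mount operation and the hard/soft distinction can be illustrated with hypothetical mount invocations. The hostname, export paths, and mount points below are invented, but `soft` and `hard` are standard NFS mount options:

```shell
# Mount a remote directory at a local mount point (invented names):
mount -t nfs server.example.com:/users/students /usr/students

# Soft mount: a failed file request returns an error to the calling process.
mount -t nfs -o soft server.example.com:/users/staff /usr/staff

# Hard mount (typically the default): the client retries indefinitely, so a
# program accessing the file hangs while the server is down.
mount -t nfs -o hard server.example.com:/users/staff /usr/staff
```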
Appendix

Questions from old question papers (Test 1)

Introduction
1. With Suitable examples explain the fields of application of distributed systems?
2. What is global state? Explain the mechanism of distributed snapshot.
3. What are Goals of DS? Explain briefly.
4. Explain the different system architectures of DS.
5. Enumerate the fundamental characters required for DS.
6. How DS operating system differs from normal OS.
7. Discuss about various characteristics of DS.

RPC
1. Describe Remote Procedure call RPC with example.
2. What is a Distributed object based system? Discuss how an object-based system differs from a
conventional RPC system.
3. Explain the mechanism of RPC with diagram.

DNS
1. Write a note on i) DNS ii) X.500
2. Explain about Directory and Discovery services of Name Services.
3. Discuss the problems raised by the use of aliases in a name service and indicate how, if
at all, these may be overcome.

Model
1. Describe thread synchronization, thread scheduling and thread implementation in
Distributed OS.
2. What is fundamental model? Discuss the features.
3. Explain about the Architecture Models of DS.
4. What is the Architectural Model? Discuss in brief the Client-server model.

RMI
1. Describe RMI. Discuss the design issues of RMI.

Mobile
1. Write a note on i) Mobile code ii) Mobile Agent
2. Differentiate between mobile agents and mobile code.

Communication
1. Discuss in brief the Client –Server communication process.
2. Discuss the architecture of CORBA.
3. With a supporting diagram explain the general organization of an internet search engine
showing three different layers (UI layer, processing layer and data layer).

Test 3 Syllabus Questions

DSM (distributed shared memory) Algorithms (different types of algorithms)

Discuss granularity and replacement algorithms in DSM.

Different distribution algorithm components (Transfer policy/Location policy) etc.

Load distribution algorithms - Sender initiated distributed algorithms

Load distribution algorithms - Discuss Adaptive load distributive algorithms.
