Está en la página 1de 13

Reliable Backup Routing in Fault Tolerant

Real-Time Networks

Rajeev Vadali C. R. Muthukrishnan


Software Engineer Department of Computer Science
Hughes Software Systems of Engineering
Gurgaon, India 122015 I.I.T Madras
Chennai, India 600036
info@hssworld.com crm@iitm.ernet.in
www.hssworld.com
Copyright 1999 IEEE.

Published in the Proceedings of the IEEE International Conference on Networks 2001 (ICON'01), held on October 10-
12, 2001 at Thailand. Personal use of this material is permitted. However, permission to reprint/republish this material
for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or
lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact:
Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-
1331, USA. Telephone: + Intl.732-562-3966.
Contents
1 Abstract..........................................................................................................................................................................................4
2 Previous Studies ...........................................................................................................................................................................4
3 Paper Objectives ...........................................................................................................................................................................5
4 Assigning Link Capacity.............................................................................................................................................................5
5 Computing Transmission Paths .................................................................................................................................................5
6 Defining a Reliable Dispersity Routing Scheme.....................................................................................................................5
6.1 Dispersity Routing Scheme: Implementation .................................................................................................................5
6.2 Dispersity Routing Scheme: Comparison........................................................................................................................6
7 Reliable Link Failure: Alternatives ...........................................................................................................................................7
8 Performance Study.......................................................................................................................................................................8
8.1 Assumptions.........................................................................................................................................................................8
8.2 Study Details .........................................................................................................................................................................8
8.3 Blocking Probability............................................................................................................................................................8
8.3.1 Scenario 1.....................................................................................................................................................................9
8.3.2 Scenario 2.....................................................................................................................................................................9
8.4 Fault Tolerance...................................................................................................................................................................10
8.4.1 Scenario 1...................................................................................................................................................................10
8.4.2 Scenario 2...................................................................................................................................................................11
9 Conclusion...................................................................................................................................................................................12
10 Bibliography ...........................................................................................................................................................................13
Abstract
Broadband integrated services digital networks (B-ISDN) are designed to transport both real-time traffic and non real-
time traffic. Users of these networks require Quality of Service (QoS) guarantees. Previous studies on these networks
did not include extensive research to provide QoS guarantees based on fault tolerance. This paper proposes the
provision of QoS guarantees based on fault tolerance, which is based on the reliability of network links. This provides
network users the flexibility to choose network resources.

This paper also proposes a new terminology for dispersity routing, which will be useful in providing QoS guarantees
based on reliability. Dispersity routing transmits the traffic along multiple paths. In addition, a reliable backup resource
allocation method is presented, which can be used in the context of dispersity routing for fault tolerant real-time
networks.

Keywords: Dispersity routing, Quality of Service (QoS), Fault tolerance, Reliability.

1 Previous Studies

Emerging distributed real-time applications require guaranteed end-to-end quality of service (QoS) and have stringent
constraints on delay and cost. This section traces the QoS guarantees provided and the routing schemes proposed in
different studies conducted over the years.

In [1], dispersity routing at the physical layer is introduced. In this routing procedure, the message is subdivided and
dispersed through multiple paths in the network. This is unlike the conventional routing procedure in which a message
is routed along a single path between the source and the destination.

In [2], the dispersity routing at the application layer is proposed based on the advances in workstation speeds. Moving
dispersity routing to a high level layer provides increased flexibility and lower cost. This is because the network user
can choose the level of reliability required. Dispersity routing is characterized by the triplet, (N,K,S). The variable N
denotes the number of paths between the source and the destination, and K denotes the number of paths over which
real-time traffic is transmitted. The rest of the paths, N-K, are used to send redundant information. This information is
used to reconstruct the message in the event of loss of information during transmission over the network. The variable
S denotes the maximum number of paths amongst which a link can be shared.

In [3], the different dispersity routing schemes are simulated. It is shown that fragmenting the traffic among multiple
paths in the network utilizes the network capacity to a greater degree and provides fault tolerance at the same time.

In [4] and [5], a number of algorithms are discussed for routing and reserving resources amongst multiple paths. It is
shown that when there are multiple paths in a network, reservation of resources in multiple paths is better than
reservation of resources in a single path.

In [6], a study is made in the areas of network architecture and protocols for supporting real-time services in packet-
switched networks. In [7], [8] and [9], it is made possible to support real-time communication in packet-switched
networks. Therefore, packet-switched networks can provide guaranteed QoS to distributed real-time applications.
Recently, in [10] and [11], a study was made about QoS in packet-switched networks and the issues involved.

In [12] and [13], it is shown that large amount of spare resources may seriously degrade the attractiveness of the
backup channel scheme. The idea of backup multiplexing is proposed in [12] and [13]. The idea is to use the
probability of link failure to reserve a small fraction of spare resources at each link.

The dynamics of client requirements and performance in real-time systems are dealt in [14]. This study presents the
Dynamic Connection Management (DCM) scheme, which addresses issues such as network availability and flexibility.
2 Paper Objectives
This paper proposes new methods to:

• assign capacity to links in a network.


• compute paths for transmission.
• define a dispersity routing scheme.

The following sections elaborate these methods.

3 Assigning Link Capacity


Assigning capacity to the links of a network is an important issue. In earlier works, [2], [3], [12] and [13], it is assumed
that a link is assigned a capacity based on the incoming traffic. Therefore, a link that receives higher traffic is assigned
higher capacity. In this paper, it is assumed that a link is assigned a capacity based on the incoming traffic and the
reliability of that link. Therefore, a link with more incoming traffic or with higher reliability is assigned higher
capacity. Categorizing links based on their reliability maximizes the availability of network resources for a longer
period.

4 Computing Transmission Paths


It is essential to provide fault tolerance to network users and also use the available resources efficiently. Efficient use of
resources requires the efficient computation of transmission paths. In earlier works, [2], [4] and [5], paths are computed
based on the path distance and delay constraint metrics. This paper suggests that paths should be computed based on
reliability in addition to the path distance and delay constraint metrics. This ensures that stable paths, which can remain
active most of the time, are selected for transmission. However, it should be noted that when multiple metrics are
considered for computing paths, there is a tradeoff between various metrics. Therefore, in this proposal, the number of
reliable paths used is limited to adhere to the delay constraints.

This paper proposes the computation of paths in two phases. In the first phase, the required reliable paths are computed
based on the reliability of links and delay constraints. Reliable paths are computed using the probability of link failures.
It is assumed that the probabilities of link failures are known a priori from previous studies on the network. The
reliability of a path is calculated as shown in the equation below, where p 1 , p 2 … p n are the probability of link failures of
each link in the path.

Path Reliability = (1-p 1 ) * (1-p 2 ) ... * (1-p n )

In the second phase, the rest of the paths are computed based on shortest path metric and delay constraints. The paths
that are computed in the second phase are link disjoint with respect to the reliable paths that are computed in the first
phase.

5 Defining a Reliable Dispersity Routing Scheme


The resources used for backup paths are randomly chosen in earlier dispersity routing schemes mentioned in [2] and
[3]. In the proposed work, a backup path is selected based on the reliability of the path. This ensures that network
resources are effectively allocated for backup paths. The reliability of a path depends on the probability of link failures
of the links in that path. It is assumed that the probability of each link failure in the network is known.

5.1 Dispersity Routing Scheme: Implementation


In this paper, a dispersity routing request is characterized by the quintuplet, (N,K,S,R,B), instead of the conventional
triplet (N,K,S). As defined earlier, the variables N and K represent the number of paths between the source and the
destination, and the number of paths over which real-time traffic is transmitted, respectively. The variable S denotes the
maximum number of paths amongst which a link can be shared. However, S is relevant only to the paths that are
computed based on the shortest path metric and does not refer to reliable paths. The variable R represents the number
of reliable paths in N available paths. The variable B is an array of R elements. Each value of B represents the number
of backup channels allocated for each reliable path. The variables R and B provide the flexibility to select the QoS
guarantees based on the user requirements.

To implement the proposed routing scheme, all the available paths are used to transmit real-time traffic (N=K). It is
ensured that more than one channel can be established on a path. For backup resources, additional channels from the
source to the destination are established. These channels are established in the reliable path where a channel is already
dedicated for transmitting real-time traffic. QoS guarantees based on fault tolerance are provided by varying the
number of reliable backup paths and also by varying the number of backup channels in each reliable path.

5.2 Dispersity Routing Scheme: Comparison

Based on the fault tolerance required for an application, dispersity routing with reliable backup can be provided. To
tolerate single faults in the network, the (N,N,1,1,{1}) dispersity routing scheme is used. To tolerate double faults when
only one reliable path exists, the (N,N,1,1,{2}) dispersity routing scheme is used.

The proposed dispersity routing scheme for a (4,4,1,2,{1,1}) request to tolerate double faults is shown in Figure 1. In
this scheme, the paths P1 and P2 are the reliable paths, denoted by R1 and R2. The backup channel ch2 is allocated in
reliable paths, R1 and R2, to tolerate faults that may occur in the other two paths, P3 and P4. Channel ch1 of all four
paths, P1, P2, P3, and P4, is used to transmit real-time traffic.

For the same (4,4,1,2,{1,1}) request to tolerate double faults, which is equivalent to a (4,2,1) request according to
earlier works, Figure 2 depicts the dispersity routing scheme. In this scheme, paths P1 and P2 are used for transmitting
real-time traffic. Paths P3 and P4 serve the purpose of backup paths.

Figure 1 (N=4, K=4, S=1, R=2, B={1,1}) Dispersity Routing to Tolerate Double Faults
Figure 2 (N=4, K=2, S=1) Dispersity Routing to Tolerate Double Faults

Notice that the scheme proposed in Figure 1 utilizes the network connectivity for real-time traffic to the maximum.
Compared to this, the scheme depicted in Figure 2 does not utilize two of the available paths for real-time traffic
because these paths are reserved to tolerate double faults. When the network connectivity is low, reserving some paths
separately as backup paths is an overhead, especially if faults in the network occur infrequently. Also, the advantages
of dispersity routing are lost when all the multiple paths are not used for real-time traffic in a network with low
connectivity. In the proposed work, the backup channels are established in a reliable path, which is also used for
transmitting real-time traffic. This ensures that there is no overhead in allocating backup channels.

6 Reliable Link Failure: Alternatives


A reliable path can also develop faults, though the probability of such an occurrence is low. This results in a slow
degradation of the system. To avoid this slow degradation of the system, one of the following three alternatives can be
implemented:

• Use the reserved bandwidth of another application that is not being used at that moment and design a tolerant
dispersity routing scheme. The real-time traffic on the failed path is passed through the underutilized link. This is
feasible because some applications reserve the peak bandwidth and might not use the bandwidth all the time.
Therefore, faults can be tolerated, considering the inherent dynamics of client requirements.

• Block a channel of an application and assign a higher priority to the real-time traffic on the failed link. The
application that has to be blocked is the one with no requirements of time guarantees such a file transfer
application or an e-mail. However, it needs to be ensured that the non real-time application is not blocked for a
long time.

• Route the failed links real-time traffic along the remaining paths, which are active in the same application. This
will degrade the QoS guarantees of an application. Therefore, this option is used only if the above two options fail
to allocate a channel.
7 Performance Study
7.1 Assumptions
• 33% of the links are assumed to be reliable.

Figure 3 Network (with 17 nodes and 59 links) used for Experiments

7.2 Study Details


• A number of experiments were conducted to display the advantages of the proposed work over earlier works [2]
and [3]. The experiments were conducted on a network with 17 nodes and 59 bi-directional links as shown in
Figure 3. The same network is also referred in the earlier work [14].
• The simulation was performed for both earlier works and the proposed work on a wide range of request arrival
rates.
• The network load was generated at each node with independent arrival of real-time requests according to Poisson
distribution and exponentially distributed holding time.
• The simulation was implemented assuming a single fault model. Fault events and restoration events are generated
according to Poisson distribution.
• The request arrival rate was varied from 0.1 requests per millisecond to 4.0 requests per millisecond.
• The number of requests ranged from 1712 to 66650 in the set of experiments.
• The number of multiple paths used was restricted to ensure that all accepted requests satisfy delay constraints.

7.3 Blocking Probability

The advantage of the proposed work over the earlier work is shown using blocking probability. Blocking probability is
calculated as the number of requests rejected divided by the number of requests received. Blocking probability of both
works was compared in two scenarios:
7.3.1 Scenario 1
Consider that a network user requests for a dispersity route with (3,2,1). In earlier works, it is observed that all single
faults are tolerated. In this case, the first two paths are used for transmitting real-time data and the third path is used as
a backup path.

In the proposed work, an application will request for a dispersity route with (2,2,1,1,{1}). In this case, the first path is a
reliable path and second path is a minhop path. In this proposal, it is known which links are reliable and which are not.
Reliable links are assumed to have a higher link capacity. In addition, the routes for different paths are computed based
on the reliability of links and shortest path metric.

It can be noticed from Figure 4 that the blocking probability of the earlier work is higher than the proposed work at all
instants of the request arrival rate. In the earlier work, all resources are assigned randomly and the reliability of links is
not considered when link capacity is assigned. The proposed work uses the reliable links in its favor by assuming that a
higher link capacity is assigned to reliable links.

Figure 4 Blocking Probability of Proposed Work (with higher link capacity in reliable links) in Comparison to
Earlier Work (without higher link capacity in reliable links)

7.3.2 Scenario 2

In this scenario, the network user request and the proposed dispersity routing scheme is the same as the previous
scenario. However, it is assumed that both the earlier work and the proposed work use the same network in which the
reliable links are assigned a higher capacity.
In this case, the blocking probability of the earlier work is higher than the proposed work. This is depicted in Figure 5.
It is observed that the earlier work does not use the resources assigned to reliable links efficiently. Therefore, it can be
concluded that assigning a higher link capacity to reliable links cannot bring down the blocking probability of the
earlier work to the extent of the proposed work. Multiple paths need to be computed based on the reliability of links to
increase the number of requests that are being serviced, and thereby decrease blocking probability.

Figure 5 Blocking Probability of Proposed Work (with higher link capacity in reliable links) in Comparison to
Earlier Work (with higher link capacity in reliable links)

7.4 Fault Tolerance

7.4.1 Scenario 1
Similar to the previous experiment, consider that a network user requests for a dispersity route with (3,2,1). In the
absence of three link disjoint paths or lack of link capacity, the request might be rejected. In the earlier work, when a
(3,2,1) request is rejected, the (3,2,2) dispersity routing scheme might be implemented. This is depicted in Figure 6
below:

P1

P3
Source Destination
Shared Link

P2

Figure 6 (N=3,K=2,S=2) Dispersity Routing Scheme


As depicted in the figure, the shared link is shared by paths P2 and P3.

In the earlier work, the shared link is chosen without considering the reliability of the link. In the proposed work, when
a (3,2,1) request is rejected, the (2,2,1,1,{1}) dispersity routing scheme is implemented. In this case, one among the two
paths is a reliable path. In the reliable path, two logical channels are established so that one channel can be used to
tolerate faults.
As depicted in Figure 7, it has been observed from that the blocking probability for earlier work increases at a faster
rate when compared to the proposed work. In the earlier work, it has also been observed that as the request arrival rate
increases, the number of dispersity applications that fail to tolerate single faults increases at a faster rate. This is
depicted in Figure 8. In the proposed work, all the dispersity applications could tolerate single faults. Also, in the
earlier work, the blocking probability for a (3,2,2) request is higher. This is because the higher capacity of the reliable
link is not used efficiently, which is not the case in the (2,2,1,1,{1}) request of the proposed work. In the earlier work,
the reliability of the shared link is not considered when computing the shared path. In the proposed work, single faults
can be tolerated because all shared links are reliable.

Figure 7 Blocking Probability of Proposed Work (using reliable backup path) in Comparison to Earlier Work
for Rejected (3, 2, 1) Requests

Figure 8 Number of Dispersity Applications Failed for Earlier Work with (3, 2, 2) Dispersity Routing
7.4.2 Scenario 2
In the previous scenario, when the (3,2,1) request is rejected, the proposed work implements (2,2,1,1,{1}) dispersity
routing. In this case, it is assumed that the reliable links are close to each other and they result in a complete reliable
path between the source and the destination. However, when the reliable links in a network are distributed evenly, a
complete reliable path from the source to the destination might not be established always. Therefore, in this scenario,
when the (3,2,1) request is rejected, the proposed work implements (3,2,2) dispersity routing. At the same time, it is
ensured that only one reliable link is shared.
As depicted in Figure 9, the blocking probability of the proposed work is slightly higher than the earlier work. The
slight increase in the blocking probability of the proposed work is due to the non-availability of a reliable link, which
can be shared by two paths. It is also observed that all dispersity applications could tolerate single faults in the
proposed work, unlike the earlier work where there are some failures of dispersity applications. This is shown in Figure
10. Therefore, there is a tradeoff between the faults being tolerated and the number of requests being serviced.

Figure 9 Blocking Probability of Proposed Work (sharing of one reliable link) in Comparison to Earlier Work
for Rejected (3, 2, 1) Requests

Figure 10 Number of Dispersity Applications Failed for Earlier Work (reliable links are distributed evenly) with
(3, 2, 2) Dispersity Routing

8 Conclusion

This paper proposed a method for dispersity routing with reliable backup paths. This approach leads to efficient
utilization of network resources and fault tolerant real-time networks. It is shown that identifying reliable links and
using them efficiently will help in more requests being serviced. The new dispersity routing scheme, (N,K,S,R,B),
presented in this paper, provides increased flexibility to network users. Users can select the required fault tolerance
based on QoS guarantees.
9 Bibliography

1. N. F. Maxemchuk, “Dispersity Routing”, Proceedings of ICC, June 1975, pp. 41.10-41.13.


2. A. Banerjea, “A Taxonomy of Dispersity Routing Schemes for Fault Tolerant Real-Time Channels”, Proceedings
of ECMAST, May 1996, pp. 129-148.
3. A. Banerjea, “Simulation Study of the Capacity Effects of Dispersity Routing for Fault Tolerant Real-Time
Channels”, Proceedings of ACM SIGCOMM, August 1996, pp. 194-205.
4. I. Cidon and R. Rom, “Multi-path Routing Combined with Resource Reservation”, Proceedings of IEEE
INFOCOM, April 1997, pp. 92-100.
5. I. Cidon, R. Rom, and Y. Shavitt, “Analysis of Multi-path Routing”, IEEE/ACM Transactions on Networking, vol.
7, no. 6, December 1999, pp. 885-896.
6. C. M. Aras, J. F. Kurose, D. S. Reeves, and H. Schulzrinne, “Real-Time Communication in Packet-Switched
Networks”, Proceedings of the IEEE, vol. 82, no. 1, January 1994, pp. 122-139.
7. D. Ferrari, A. Banerjea, and H. Zhang, “Network Support for Multimedia: A Discussion of the Tenet Approach”,
Computer Networks and ISDN Systems, Special Issue on Multimedia Networking, July 1994, pp. 1267-1280.
8. A. Banerjea, D. Ferrari, B. A. Mah, D. Verma, and M. Moran, “The Tenet Real-Time Protocol Suite: Design,
Implementation, and Experiences”, IEEE/ACM Transactions on Networking, vol. 4, no. 1, February 1996, pp. 1-
10.
9. A. Banerjea, E. Knightly, F. Templin, and H. Zhang, “Experiments with the Tenet Real-Time Protocol Suite on the
Sequoia 2000 Wide Area Network”, Proceedings of ACM Multimedia, October 1994, pp. 183-191.
10. S. Chen and K. Nahrstedt, “An Overview of Quality-of-Service Routing for the Next Generation High-speed
Networks: Problems and Solutions”, IEEE Network, Special Issue on Transmission and Distribution of Digital
Video, vol. 12, no. 6, November 1998, pp. 64-79.
11. X. Xiao and L. M. Ni, “Internet QoS: A Big Picture”, IEEE Network Magazine, vol. 13, no. 2, March 1999, pp. 8-
18.
12. S. Han and K. G. Shin, “Fast Restoration of Real-Time Communication Service from Component Failures in
Multi-hop Networks”, Proceedings of ACM SIGCOMM, September 1997, pp. 77-88.
13. K. G. Shin and S. Han, “Fast Low-Cost Failure Recovery for Reliable Real-Time Multimedia Communication”,
IEEE Network, Special Issue on Transmission and Distribution of Digital Video, vol. 12, no. 6, November 1998,
pp. 56-63.
14. C. Parris, H. Zhang, and D. Ferrari, “A Dynamic Management Scheme for Real-Time Communications”,
Proceedings of IEEE INFOCOM, June 1994, pp. 698-707.

También podría gustarte