
More Learning Resources: http://learning.huawei.com/en
HCIE-R&S

Huawei Certification

Huawei Certified Internetwork Expert-Enterprise


Huawei Technologies Co.,Ltd



HUAWEI TECHNOLOGIES
HCIE

Copyright Huawei Technologies Co., Ltd. 2010. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice

The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Certification System

Relying on its strong technical and professional training system, and tailored to customers at different levels of ICT expertise, Huawei certification is committed to providing customers with authentic, professional certification.

Based on the characteristics of ICT technologies and customers' needs at different levels, Huawei certification provides customers with a four-level certification system.

HCDA (Huawei Certification Datacom Associate) is primarily for IP network maintenance engineers and anyone else who wants to build an understanding of IP networks. HCDA certification covers TCP/IP basics, routing, switching, and other common foundational knowledge of IP networks, together with Huawei communications products, the characteristics of the Versatile Routing Platform (VRP), and basic maintenance.

HCDP-Enterprise (Huawei Certification Datacom Professional-Enterprise) is aimed at enterprise-class network maintenance engineers, network design engineers, and anyone else who wants an in-depth grasp of routing, switching, network adjustment, and optimization technologies. HCDP-Enterprise consists of IESN (Implement Enterprise Switch Network), IERN (Implement Enterprise Routing Network), and IENP (Improving Enterprise Network performance), which cover advanced IPv4 routing and switching technology principles, network security, high availability, and QoS, as well as the configuration of Huawei products.

HCIE-Enterprise (Huawei Certified Internetwork Expert-Enterprise) is designed to equip engineers with a variety of IP technologies and with proficiency in the maintenance, diagnostics, and troubleshooting of Huawei products, giving them competence in the planning, design, and optimization of large-scale IP networks.
Referenced icon

Device icons: Router, L3 Switch, L2 Switch, Firewall, Net cloud
Line types: Ethernet line, Serial line

CONTENTS

RIP
IS-IS
OSPF
BGP BASICS
BGP ADVANCED AND INTERNET DESIGN
ROUTE IMPORT AND CONTROL
VLAN
LAN LAYER 2 TECHNOLOGIES
WAN LAYER 2 TECHNOLOGIES
STP
MULTICAST
IPv6
MPLS VPN
OTHER TECHNOLOGIES

RIPv1 packet format


A RIP packet consists of two parts: the Header and the Route Entries. The Header includes the Command and Version fields. The Route Entries part carries at most 25 routing entries. Each routing entry contains the Address Family Identifier field, the IP Address of the target network, and the Metric field.

The meaning of each field in a RIP packet is as follows:

Command: indicates whether the packet is a request or a response. The value is 1 or 2. The value 1 indicates a request, and the value 2 indicates a response.

Version: specifies the RIP version in use. The value 1 indicates a RIPv1 packet, and the value 2 indicates a RIPv2 packet.

Address Family Identifier: specifies the address family in use. The value is 2 for IPv4. If the packet is a request for the entire routing table, the value is 0.

IP Address: specifies the destination address of the routing entry. The value can be a network address or a host address.

Metric: indicates the number of hops to the destination. Although the 32-bit field can hold values from 0 to 2^32 - 1, RIP uses only the values 1 to 16.

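The field layout described above can be sketched with Python's struct module. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, and the must-be-zero padding fields follow the RIPv1 entry layout in RFC 1058, which the field list above does not spell out.

```python
import ipaddress
import struct

HEADER = struct.Struct("!BBH")     # Command, Version, must-be-zero
ENTRY = struct.Struct("!HH4sIII")  # AFI, zero, IP Address, zero, zero, Metric


def build_rip_v1_response(routes):
    """Encode (destination, metric) pairs as a RIPv1 response payload."""
    if len(routes) > 25:
        raise ValueError("a RIP packet carries at most 25 routing entries")
    payload = HEADER.pack(2, 1, 0)  # Command=2 (response), Version=1
    for destination, metric in routes:
        address = ipaddress.IPv4Address(destination).packed
        payload += ENTRY.pack(2, 0, address, 0, 0, metric)  # AFI=2 (IPv4)
    return payload


def parse_rip_v1(payload):
    """Decode a payload into (command, version, [(destination, metric)])."""
    command, version, _ = HEADER.unpack_from(payload, 0)
    entries = []
    for offset in range(HEADER.size, len(payload), ENTRY.size):
        _afi, _, address, _, _, metric = ENTRY.unpack_from(payload, offset)
        entries.append((str(ipaddress.IPv4Address(address)), metric))
    return command, version, entries
```

Building a response with two entries and parsing it again returns the same command, version, and route list, which is a quick way to check the 4-byte header and 20-byte entry sizes.
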
RIPv1 characteristics

RIP is a UDP-based routing protocol. A RIP packet, excluding the IP header, is at most 512 bytes: an 8-byte UDP header, a 4-byte RIP header, and up to 25 routing entries of 20 bytes each. The maximum RIP message is therefore 4 + (25 x 20) = 504 bytes. A RIPv1 packet does not carry mask information. RIPv1 sends and receives routes based on the mask of the major (classful) network segment and the mask of the interface address. Therefore, RIPv1 does not support route summarization or discontiguous subnets. RIPv1 packets do not carry an authentication field, so RIPv1 does not support authentication.
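The size arithmetic above can be checked with a few lines of Python. This is only a worked example of the numbers in the text; the constant and function names are illustrative.

```python
RIP_HEADER_BYTES = 4
ROUTE_ENTRY_BYTES = 20
UDP_HEADER_BYTES = 8
MAX_ENTRIES = 25


def rip_message_bytes(entries):
    """Size of the RIP message (RIP header plus route entries)."""
    if not 0 <= entries <= MAX_ENTRIES:
        raise ValueError("a RIP packet carries 0 to 25 routing entries")
    return RIP_HEADER_BYTES + entries * ROUTE_ENTRY_BYTES


def packets_needed(route_count):
    """Number of RIP packets needed to advertise route_count routes."""
    return -(-route_count // MAX_ENTRIES)  # ceiling division
```

A full packet is 504 bytes of RIP data, or 512 bytes once the UDP header is added, and a table of 60 routes needs three packets.
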

RIPv2 packet format


A RIPv2 packet has the same format as a RIPv1 packet, except that RIPv2 uses fields that are unused in RIPv1 to provide extended functions.

The meaning of the new fields is as follows:

Route Tag: marks external routes, that is, routes learned from other protocols and imported into RIPv2.

Subnet Mask: identifies the subnet mask of the IPv4 address.

Next Hop: indicates a next-hop address that is better than the address of the advertising router. The value 0.0.0.0 indicates that the address of the advertising router is the optimal next-hop address.

When authentication is configured in RIPv2, RIPv2 modifies the first route entry as follows:

Changes the Address Family Identifier field to 0xFFFF.

Changes the Route Tag field to the Authentication Type field.

Changes the IP Address, Subnet Mask, Next Hop, and Metric fields to the Password field.

Compared with RIPv1, RIPv2 has the following advantages:

Supports route tags. Route tags are used in routing policies to flexibly control routes. Tags can also be used when RIP processes import routes from each other.

Supports subnet masks, route summarization, and CIDR.

Supports specified next hops to select the optimal next-hop address on a broadcast network.

Multicasts route updates. Only RIPv2-running devices receive the protocol packets, which reduces resource consumption.

Supports packet authentication to enhance security.

On a broadcast network with more than two devices, the Next Hop field can be changed to optimize the path.

In MD5 authentication, the sender computes an MD5 digest over the route entries together with the shared key. The router then sends the digest and the route entries to the neighbor.
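The RIPv2 entry layout and the authentication entry described above can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and authentication type 2 for a plain-text password is taken from RFC 2453 rather than from the text above.

```python
import ipaddress
import struct

# AFI, Route Tag, IP Address, Subnet Mask, Next Hop, Metric
ROUTE_ENTRY = struct.Struct("!HH4s4s4sI")
# 0xFFFF, Authentication Type, 16-byte Password
AUTH_ENTRY = struct.Struct("!HH16s")


def pack_route_entry(prefix, metric, route_tag=0, next_hop="0.0.0.0"):
    """Encode one RIPv2 route entry; 0.0.0.0 means 'use the advertiser'."""
    network = ipaddress.IPv4Network(prefix)
    return ROUTE_ENTRY.pack(
        2,  # AFI=2 (IPv4)
        route_tag,
        network.network_address.packed,
        network.netmask.packed,
        ipaddress.IPv4Address(next_hop).packed,
        metric,
    )


def pack_simple_auth_entry(password):
    """Encode the first entry of a packet that uses plain-text authentication."""
    # struct pads the password with null bytes up to 16 bytes.
    return AUTH_ENTRY.pack(0xFFFF, 2, password.encode("ascii"))
```

Both entry types occupy the same 20 bytes, which is why the authentication data can take the place of the first route entry.
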

RIP mainly uses three timers:

Update timer: defines the interval between two route updates. It periodically triggers the transmission of route updates at a default interval of 30 seconds.

Aging timer: specifies the aging time of routes. If a RIP device does not receive an update for a route from its neighbor within the aging time, the device considers the route unreachable. When the aging timer expires, the device sets the metric of the route to 16.

Garbage-collect timer: specifies the interval between the time a route is marked as unreachable and the time the route is deleted from the routing table. The default interval is four times the update interval, namely, 120 seconds. If the RIP device does not receive an update for an unreachable route from the same neighbor within the garbage-collect time, the device deletes the route from the routing table.

Relationship between the three timers:

RIP route update advertisement is controlled by the update timer. A route update is sent at a default interval of 30 seconds.

Each routing entry has two timers: the aging timer and the garbage-collect timer. When a route is learned and added to the routing table, the aging timer starts. If the RIP device has not received an update for the route from the neighbor when the aging timer expires, the device sets the metric of the route to 16 (indicating an unreachable route) and starts the garbage-collect timer.

If the device still has not received an update for the unreachable route from the neighbor when the garbage-collect timer expires, the device deletes the route from the routing table.

Precautions

If a RIP device does not have the triggered update function, it deletes an unreachable route from the routing table after a maximum of 300 seconds (aging time plus garbage-collect time).

If a RIP device has the triggered update function, it deletes an unreachable route from the routing table after a maximum of 120 seconds (the garbage-collect time).
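The relationship between the aging timer and the garbage-collect timer can be sketched as a small state function. This is an illustrative model only: the 180-second aging time is the default used elsewhere in this chapter, and the function name and state labels are made up for the example.

```python
UPDATE_INTERVAL = 30        # seconds between periodic updates
AGING_TIME = 180            # seconds without an update before metric becomes 16
GARBAGE_COLLECT_TIME = 120  # seconds an unreachable route stays in the table


def route_state(seconds_since_last_update):
    """State of a routing entry, given the time since its last update."""
    if seconds_since_last_update < AGING_TIME:
        return "active"
    if seconds_since_last_update < AGING_TIME + GARBAGE_COLLECT_TIME:
        return "unreachable (metric 16)"
    return "deleted"
```

With the defaults, a route that stops being refreshed is marked unreachable after 180 seconds and removed after 300 seconds, matching the maximum stated in the precautions.
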

Split horizon
RIP uses split horizon to reduce bandwidth consumption and prevent routing loops.

Implementation

R1 sends R2 a route to network 10.0.0.0/8. If split horizon is not configured, R2 sends the route learned from R1 back to R1. In this manner, R1 can learn two routes to network 10.0.0.0/8: one direct route with zero hops, and another route with two hops and R2 as the next hop.

However, only the direct route is active in the RIP routing table of R1. When the route from R1 to network 10.0.0.0/8 becomes unreachable and R2 does not receive route unreachable information, R2 continues sending R1 route information indicating that network 10.0.0.0/8 is reachable. Subsequently, R1 receives incorrect route information and considers that it can reach network 10.0.0.0/8 through R2, while R2 still considers that it can reach network 10.0.0.0/8 through R1. As a result, a routing loop occurs. After split horizon is configured, R2 does not send the route to network 10.0.0.0/8 back to R1, preventing a routing loop.

Precautions

Split horizon is disabled on NBMA networks by default.
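The split horizon rule described above amounts to a filter applied when an update is built for an interface. The following is a minimal sketch under that reading; the Route class, the function name, and the interface label "to_R1" are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    metric: int
    learned_on: str  # interface the route was learned on ("" for a direct route)


def routes_to_advertise(table, out_interface, split_horizon=True):
    """Return the routes that may be advertised out of out_interface."""
    if not split_horizon:
        return list(table)
    # Split horizon: never send a route back out of the interface it came in on.
    return [route for route in table if route.learned_on != out_interface]
```

Applied to R2 in the example, the route to 10.0.0.0/8 learned from R1 is left out of the update that R2 sends toward R1.
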

Poison reverse function


Poison reverse helps delete useless routes from the routing table of the peer end.

Implementation

If poison reverse is configured, after receiving route 10.0.0.0/8 from R1, R2 advertises the route back to R1 with the metric set to 16, indicating that the route is unreachable. R1 then does not use the route 10.0.0.0/8 learned from R2, which prevents a routing loop.

Precautions

Poison reverse is disabled by default. Generally, split horizon is enabled on Huawei devices (except on NBMA networks) and poison reverse is disabled.

Comparisons between split horizon and poison reverse

Both split horizon and poison reverse can prevent routing loops in RIP. The difference between them is as follows: split horizon avoids advertising a route back to neighbors along the same path, while poison reverse marks the route as unreachable and advertises it back to neighbors along the same path.
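Poison reverse can be sketched as a variation on the same advertisement step: instead of omitting the route, the sender rewrites its metric to 16. The function name and the tuple layout below are assumptions made for the example.

```python
INFINITY = 16  # RIP metric that means "unreachable"


def advertise_with_poison_reverse(table, out_interface):
    """Build an update for out_interface.

    table is a list of (prefix, metric, learned_on_interface) tuples.
    Routes learned on out_interface are sent back with metric 16.
    """
    update = []
    for prefix, metric, learned_on in table:
        if learned_on == out_interface:
            metric = INFINITY
        update.append((prefix, metric))
    return update
```

The update is larger than with split horizon because the poisoned route is still carried, which is one plausible reason for leaving poison reverse disabled by default.
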

Triggered update
Triggered update can shorten the network convergence time. When a routing entry changes, a RIP device advertises the change to other devices immediately without waiting for the periodic update. If triggered update is not configured, by default, invalid routes are retained in the routing table for a maximum of 300 seconds (aging time plus garbage-collect time).

An update is not triggered when the next-hop address becomes unreachable.

Implementation

After R1 detects a network fault, it sends a route update to R2 immediately without waiting for the update timer to expire. The routing table of R2 is then updated in a timely manner.

Route summarization
RIPv2 supports route summarization. Because RIPv2 packets carry the mask, RIPv2 supports subnetting. Route summarization can improve the scalability and efficiency of large networks and reduce the routing table size.

RIPv2 process-based classful summarization implements automatic summarization.

Interface-based summarization implements manual summarization.

If the routes to be summarized carry tags, the tags are deleted after these routes are summarized into one summary route.

Case

Two routes, 10.1.0.0/16 (metric=10) and 10.2.0.0/16 (metric=2), are summarized into one natural network segment route, 10.0.0.0/8 (metric=3).
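The natural (classful) network that automatic summarization produces can be computed with Python's ipaddress module. This sketch covers only the prefix calculation, not how the summary metric is chosen; the function name is hypothetical.

```python
import ipaddress


def natural_network(prefix):
    """Return the classful (natural) network that contains prefix."""
    network = ipaddress.IPv4Network(prefix)
    first_octet = int(network.network_address) >> 24
    if first_octet < 128:
        classful_length = 8   # class A
    elif first_octet < 192:
        classful_length = 16  # class B
    elif first_octet < 224:
        classful_length = 24  # class C
    else:
        raise ValueError("class D and E addresses have no natural network")
    if network.prefixlen <= classful_length:
        return network
    return network.supernet(new_prefix=classful_length)
```

Both subnets in the case above map to 10.0.0.0/8, so they collapse into a single summary route.
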

Working process analysis:


Initial state: A router starts a RIP process, associates an interface with the RIP process, and sends and receives RIP packets on the interface.

Build a routing table: The router builds its routing entries according to received RIP packets.

Maintain the routing table: The router sends and receives route updates at an interval of 30 seconds to maintain its routing entries.

Age routing entries: The router starts a 180-second aging timer for each routing entry. If the router receives an update for a route within 180 seconds, it resets the aging timer of that route.

Garbage collect entries: If the router does not receive an update for a route within 180 seconds, it sets the metric of the route to 16 and starts the 120-second garbage-collect timer.

Delete routing entries: If the router still does not receive an update for the route within the following 120 seconds, it deletes the route from the routing table.

Case description
In this case, R1, R2, and R3 reside on network 192.168.1.0/24; R3, R4, and R5 reside on network 192.168.2.0/24. All the routers run RIPv2 and advertise the IP addresses of connected interfaces. To control route selection on R3, modify the metric of routes.

Remarks

In the IP routing table, only some related routing entries are displayed. In the Flags field of a route, R indicates an iterated route, and D indicates that the route is delivered to the FIB table.

Route iteration works as follows: when the next hop of a route to the destination address is not directly reachable through an outbound interface of the device, the device looks up the next-hop address in the routing table again. The lookup is repeated until the correct outbound interface for forwarding is found.

The FIB table is the forwarding table that is generated from the routing table. You can run the display fib command to view the forwarding table.
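Route iteration can be illustrated with a short recursive lookup. This is a simplified sketch, not a description of how VRP implements it: the table layout, the function name, the depth limit, and the next-hop address and interface name in the usage note are assumptions.

```python
import ipaddress


def resolve_out_interface(routing_table, destination, max_depth=8):
    """Find the outbound interface for destination by iterating next hops.

    routing_table is a list of (prefix, next_hop, interface) tuples.
    Direct routes have an interface and next_hop None; other routes
    have a next_hop and interface None.
    """
    address = ipaddress.IPv4Address(destination)
    for _ in range(max_depth):
        matches = [
            (ipaddress.IPv4Network(prefix), next_hop, interface)
            for prefix, next_hop, interface in routing_table
            if address in ipaddress.IPv4Network(prefix)
        ]
        if not matches:
            return None
        # Longest prefix match wins.
        _, next_hop, interface = max(matches, key=lambda m: m[0].prefixlen)
        if interface is not None:
            return interface
        address = ipaddress.IPv4Address(next_hop)  # iterate on the next hop
    return None
```

For example, a route to 172.16.0.0/24 whose next hop is 192.168.2.5 resolves to whichever interface the direct route for 192.168.2.0/24 uses.
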

Command usage
The rip metricin command increases the metric of a received route. After the route is added to the routing table, the metric of the route is changed. Running this command affects route selection of the local device and other devices.

The rip metricout command increases the metric of an advertised route. The metric of the route remains unchanged in the routing table. Running this command does not affect route selection of the local device but affects route selection of other devices.

View

Interface view

Parameters

rip metricout { value | { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } value1 }: sets the additional metric to be added to an advertised route.

value: increases the metric of an advertised route. The value ranges from 1 to 15 and defaults to 1.

acl-number: specifies a basic ACL number. The value ranges from 2000 to 2999.

acl-name acl-name: specifies an ACL name. The value is case-sensitive.

ip-prefix ip-prefix-name: specifies an IP prefix list name, which must be unique.

value1: increases the metric of a route that passes the filtering of an ACL or IP prefix list.

Precautions

You can specify value1 to increase the metric of an advertised RIP route that passes the filtering of an ACL or IP prefix list. If a RIP route does not pass the filtering, its metric is increased by 1.

Running the rip metricin/metricout commands affects route selection of other devices.
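The difference between the two commands can be modelled with two small functions. This is a sketch of the behavior described above, not of the device implementation; the function names are hypothetical, and the default of 0 for the inbound additional metric is an assumption (the text above only states the outbound default of 1).

```python
INFINITY = 16  # RIP metric that means "unreachable"


def metric_after_receive(metric_in_packet, metricin=0):
    """Metric stored in the local routing table for a received route."""
    return min(metric_in_packet + metricin, INFINITY)


def metric_when_advertised(metric_in_table, metricout=1):
    """Metric written into an outgoing update; the table entry is unchanged."""
    return min(metric_in_table + metricout, INFINITY)
```

Raising metricin changes the stored metric, so it affects the local device and everything downstream, while raising metricout changes only what neighbors see.
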

Case description
The topology in this case is the same as that in the previous case. To prevent interfaces from sending or receiving route updates, suppress the interfaces or run the undo rip input/output commands.

Command usage
The silent-interface command suppresses an interface so that it receives but does not send RIP packets. If an interface is suppressed, direct routes of the network segment where the interface resides can still be advertised through other interfaces. This command can be used together with the peer (RIP) command to advertise routes to a specified device.

The undo rip output/input command prohibits an interface from sending/receiving RIP packets.

View

silent-interface: RIP view

undo rip output/input: interface view

Parameters

silent-interface { all | interface-type interface-number }

all: suppresses all the interfaces.

Precautions

After all the interfaces are suppressed, individual interfaces cannot be activated. That is, the silent-interface all command has the highest priority. In this case, all the interfaces of R4 are suppressed, so no interface of R4 can be activated.

Configuration verification

The display ip routing-table command output shows that R3 receives the update for route 172.16.0.0/24 from R5 but not from R4, and receives the update for route 10.0.0.0/24 from R1 but not from R2.

Case description
The topology in this case is the same as that in the previous case. To prevent a device from receiving routes from a specified neighbor, run the filter-policy gateway command.

Command usage
The filter-policy { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } import command filters received routes based on an ACL or an IP prefix list.

The filter-policy gateway ip-prefix-name import command filters routes based on the advertising gateway.

View

filter-policy { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } import: RIP view

filter-policy gateway ip-prefix-name import: RIP view

Parameters

filter-policy { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } import

acl-number: specifies the number of a basic ACL used to filter the destination address of routes.

acl-name acl-name: specifies the name of an ACL. The name is case-sensitive and must start with a letter.

ip-prefix: filters routes based on an IP prefix list.

ip-prefix-name: specifies the name of an IP prefix list used to filter the destination address of routes.

filter-policy gateway ip-prefix-name import

gateway: filters routes based on the advertising gateway.

ip-prefix-name: specifies the name of the IP prefix list that matches the advertising gateway.

Configuration verification

Run the filter-policy gateway command to filter routes from a specified neighbor. In this case, routes from R4 are filtered on R3.
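Filtering by advertising gateway can be pictured as dropping every received route whose sender matches a prefix list. The sketch below models that idea only; the function name is hypothetical, and the neighbor addresses in the usage note are invented for illustration.

```python
import ipaddress


def filter_by_gateway(updates, denied_gateways):
    """Drop routes advertised by gateways that match denied_gateways.

    updates is a list of (prefix, gateway_address) tuples.
    denied_gateways is a list of prefixes, such as "192.168.2.4/32".
    """
    denied = [ipaddress.IPv4Network(prefix) for prefix in denied_gateways]
    return [
        (prefix, gateway)
        for prefix, gateway in updates
        if not any(ipaddress.IPv4Address(gateway) in network for network in denied)
    ]
```

If R4 were 192.168.2.4 and R5 were 192.168.2.5, denying 192.168.2.4/32 on R3 would leave only the copy of 172.16.0.0/24 learned from R5.
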

Case description
To reduce routing entries, Company A decides to summarize routes. RIPv2 summarization includes automatic summarization based on the main class network and manual summarization. You can perform automatic summarization on R1 and manual summarization on R3 and R4.

Command usage
summary [ always ]: enables classful summarization so that summary routes are advertised at the natural network boundary. Classful summarization is enabled for RIPv2 by default. However, if split horizon or poison reverse is configured, classful summarization does not take effect. When the always parameter is configured, RIPv2 automatic summarization takes effect regardless of whether split horizon or poison reverse is configured.

rip summary-address ip-address mask [ avoid-feedback ]: configures a RIP router to advertise a local summary IP address. If the avoid-feedback keyword is configured, the local interface does not learn the summary route to the advertised summary IP address. This configuration prevents routing loops.

View

summary [ always ]: RIP view

rip summary-address ip-address mask [ avoid-feedback ]: interface view

Parameters

summary [ always ]

always: If the always parameter is not configured, classful summarization becomes ineffective when split horizon or poison reverse is configured. Therefore, to advertise summary routes at the natural network boundary without the always parameter, split horizon and poison reverse must be disabled in the corresponding views.

rip summary-address ip-address mask [ avoid-feedback ]

ip-address: specifies a summary IP address.

mask: specifies a network mask.

avoid-feedback: avoids learning the summary route to the advertised summary IP address from the interface.
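The interaction between the summary command, the always parameter, and the two loop-prevention features reduces to a small boolean rule. The following sketch restates the rule given above; the function and parameter names are illustrative.

```python
def classful_summarization_effective(summary, always, split_horizon, poison_reverse):
    """Return True if RIPv2 classful summarization takes effect."""
    if not summary:
        return False
    if always:
        return True  # "always" overrides split horizon and poison reverse
    return not (split_horizon or poison_reverse)
```

Since split horizon is generally enabled on Huawei devices, the rule implies that plain summary usually has no visible effect unless always is added or split horizon is disabled on the interface.
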

Case description
In this case, R1 and R2 connect over network 192.168.1.0/24. R1 connects to network 10.0.0.0/24, and R2 connects to network 172.16.0.0/24. Devices on the network run RIPv2 and import the routes to the networks where the devices reside. Only the display command output of R1 is provided, and only information about this case is displayed.

Command usage
timers rip update age garbage-collect: adjusts the RIP timers.

rip authentication-mode md5 nonstandard password-key key-id: configures the MD5 authentication mode. The nonstandard keyword indicates that MD5 authentication packets use the nonstandard packet format (IETF standards).

rip replay-protect [ window-range ]: enables the replay-protect function. window-range specifies the receive or transmit buffer size for connections. The default value is 50.

View

timers rip update age garbage-collect: RIP view

rip authentication-mode md5 nonstandard password-key key-id: interface view

rip replay-protect [ window-range ]: interface view

Parameters

timers rip update age garbage-collect

update: specifies the interval for transmitting route updates.

age: specifies the route aging time.

garbage-collect: specifies the interval after which an unreachable route is deleted from the routing table, namely, the garbage-collect time defined in standards.

Precautions

If the three timers are configured incorrectly, routes become unstable. The update time must be shorter than the aging time. For example, if the update time is longer than the aging time, a RIP router cannot notify neighbors of route updates before the routes age out.

In practice, the timeout period of the garbage-collect timer is not fixed. When the update timer is set to 30 seconds, the garbage-collect timer may range from 90 to 120 seconds. The reason is as follows: before the RIP router deletes an unreachable route from the routing table, it sends Update packets four times to advertise the route with the metric set to 16, so that all the neighbors learn that the route is unreachable. Because a route may become unreachable at any time within an update period, the garbage-collect timer is 3 to 4 times the update timer.

Assume that the Identification field (a field in the IP header) of the last RIP packet sent before a RIP interface goes Down is X. After the interface goes Up again, the Identification field of the next RIP packet sent becomes 0, and subsequent RIP packets are discarded until a RIP packet with the Identification field set to X+1 is received. This causes RIP routing information to become unsynchronized and lost between the two ends. To address this issue, configure the rip replay-protect command so that the RIP interface obtains the Identification field of the last RIP packet sent before the interface went Down and increases the Identification field in the subsequent RIP packet by 1.
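The timer precautions above can be turned into a small sanity check. This is a sketch of the two rules stated in the text (update shorter than aging, garbage-collect between 3 and 4 update periods); the function names are hypothetical and the check is not a substitute for the device's own validation.

```python
def check_rip_timers(update, age):
    """Return a list of problems with the update and aging timers (seconds)."""
    problems = []
    if update >= age:
        problems.append("update time must be shorter than aging time")
    return problems


def garbage_collect_window(update):
    """Effective garbage-collect time range: 3 to 4 update periods."""
    return (3 * update, 4 * update)
```

With the default 30-second update timer, the window is 90 to 120 seconds, which matches the range quoted above.
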

1. Check whether ARP is working properly.

2. Check whether related interfaces are Up.

3. Check whether RIP is enabled on the interfaces. Run the display current-configuration configuration rip command to view information about the RIP-enabled network segment. Check whether the interfaces reside on the network segment. The network address specified in the network command must be a natural network address.

4. Check whether the version of the RIP packets sent by the peer end matches the version accepted by the local end. By default, an interface sends only RIPv1 packets but can receive RIPv1 and RIPv2 packets. When an inbound interface receives RIP packets of a different version, RIP routes may fail to be correctly received.

5. Check whether a routing policy is configured to filter received RIP routes. If so, modify the routing policy.

6. Check whether UDP port 520 is disabled.

7. Check whether the undo rip input/output commands are configured on the interfaces or whether a high metric is configured using the rip metricin command.

8. Check whether the interfaces are suppressed.

9. Check whether the route metric has reached 16 or more, in which case the route is treated as unreachable.

10. Check whether the interface authentication modes on the two ends match. If packet authentication fails, configure the interface authentication modes correctly.

1. Check whether RIP is enabled on the interfaces. Run the display current-configuration configuration rip command to view information about the RIP-enabled network segment. Check whether the interfaces reside on the network segment. The network address specified in the network command must be a natural network address.

2. Check whether the version of the RIP packets sent by the peer end matches the version accepted by the local end. By default, an interface sends only RIPv1 packets but can receive RIPv1 and RIPv2 packets. When an inbound interface receives RIP packets of a different version, RIP routes may fail to be correctly received.

3. Check whether a routing policy is configured to filter received RIP routes. If so, modify the routing policy.

4. Check whether UDP port 520 is disabled.

5. Check whether the undo rip input/output commands are configured on the interfaces or whether a high metric is configured using the rip metricin command.

6. Check whether the interfaces are suppressed.

7. Check whether the route metric has reached 16 or more, in which case the route is treated as unreachable.

8. Check whether the interface authentication modes on the two ends match. If packet authentication fails, configure the interface authentication modes correctly.

Case description
In this case, R1 connects to R2 through a frame relay network. R1 connects to network 10.X.X.0/24, and R2 connects to network 172.16.X.0/24.

Analysis process
In the pre-configurations of R1 and R2, the frame relay configuration supports multicast.

R1 sends RIPv2 Update packets to R2 in multicast mode.

R1 and R2 can learn routes from each other.

Results
The peer command makes a router send packets in unicast mode, but by default it does not suppress the sending of multicast packets. Therefore, when configuring this command, you are advised to also set the related interfaces to silent mode. Multicast packets are then suppressed, and only unicast packets are sent.

Results
The display rip route command displays the RIP routes
learned from other routers and the values of the timers for the
routes. The Tag field indicates whether a RIP route is an internal or
external route. The default value is 0. The Flags field indicates
whether a RIP route is active or inactive. The value RA
indicates an active RIP route, and the value RG indicates an
inactive RIP route for which the garbage-collect timer has been
started.

Results
After the avoid-feedback keyword is specified, the local
interface does not learn the summary route to the advertised
summary IP address, which prevents routing loops.
The filter-policy export command configures a filtering policy
for the routes to be advertised. Only routes that pass the
filter can be added to the routing table and advertised through
Update packets.

Case description
In this topology, R1, R2, and R3 connect to the same
broadcast domain. R3 connects to network 172.16.X.0/24 and
advertises routes to RIP.

Analysis process
In requirements 1 and 3, R1 is taken as an example. The
command output shows that R1 sends multicast packets and
does not start authentication.
Before meeting requirement 2, R1 can receive all routes to
172.16.X.0/24.

Results
The RIP authentication command can be configured only on an
interface. Huawei devices support standard MD5
authentication and a Huawei proprietary authentication mode.
You can run the display rip process-id interface interface-type
verbose command to view the authentication mode.
Parameters
rip authentication-mode { simple password | md5 { nonstandard { password-key1 key-id | keychain keychain-name } | usual password-key2 } }
simple: indicates plain-text authentication.
password: specifies the plain-text authentication password.
md5: indicates MD5 cipher-text authentication.
nonstandard: indicates that MD5 cipher-text
authentication packets use the nonstandard packet
format (IETF standards).
password-key1: specifies the authentication password in
cipher text.
key-id: specifies the key in MD5 cipher-text authentication.
keychain keychain-name: specifies a keychain name.
usual: indicates that MD5 cipher-text authentication
packets use the universal packet format (namely,
private standards).
password-key2: specifies the cipher-text authentication
key.
Precautions
Only one authentication password is used for each
authentication. If multiple authentication passwords are
configured, only the latest one takes effect. The authentication
password cannot contain spaces.


Results
Only an ACL can be used here; an IP prefix list cannot be used.
When defining the ACL, make sure to specify the wildcard mask. In
this case, set the wildcard-mask bits to 0 for the address bits that
must match and to 1 for all other bits.
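The wildcard-mask rule can be illustrated with a minimal Python sketch. It is not device code: the function name acl_matches and the sample addresses are assumptions chosen for illustration. A bit set to 0 in the wildcard mask must match the rule address, and a bit set to 1 is ignored.

```python
import ipaddress


def acl_matches(address: str, rule_address: str, wildcard: str) -> bool:
    """Return True if address matches rule_address under the wildcard mask.

    Bits that are 0 in the wildcard must be equal; bits that are 1 are ignored.
    """
    addr = int(ipaddress.IPv4Address(address))
    rule = int(ipaddress.IPv4Address(rule_address))
    wild = int(ipaddress.IPv4Address(wildcard))
    care = ~wild & 0xFFFFFFFF  # the bits that must be compared
    return (addr & care) == (rule & care)


# Hypothetical rule: match 172.16.X.0 networks whose third octet is odd.
# Wildcard 0.0.254.0 ignores every bit of the third octet except the lowest.
print(acl_matches("172.16.3.0", "172.16.1.0", "0.0.254.0"))  # odd third octet
print(acl_matches("172.16.2.0", "172.16.1.0", "0.0.254.0"))  # even third octet
```

With this hypothetical rule, a single ACL entry selects every odd-numbered 172.16.X.0 network, which an ordinary contiguous mask cannot express.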

Results
RIPv2 multicasts Update packets by default. You can run the
rip version 2 broadcast command in the interface view to
configure RIPv2 to broadcast Update packets.

IS-IS Overview
IS-IS is a dynamic routing protocol designed by the
International Organization for Standardization (ISO) for its
Connectionless Network Protocol (CLNP).
The Internet Engineering Task Force (IETF) extended and
modified IS-IS so that it can be applied to both TCP/IP and
OSI environments. This version of IS-IS is called Integrated
IS-IS.
IS-IS Terms
Connectionless network service (CLNS)
CLNS consists of the following three protocols:
CLNP: similar to the Internet Protocol (IP) of TCP/IP.
IS-IS: a routing protocol between intermediate systems,
that is, a protocol between routers.
ES-IS: End System to Intermediate System, similar to
the Address Resolution Protocol (ARP) and Internet Control
Message Protocol (ICMP) of IP.
NSAP: Open Systems Interconnection (OSI) uses the
network service access point (NSAP) to locate various
services at the transport layer on OSI networks. An NSAP
is similar to an IP address.
Note for Integrated IS-IS
Integrated IS-IS applies to TCP/IP and OSI environments.
Unless otherwise specified, the IS-IS protocol in this
material refers to Integrated IS-IS.



Overall IS-IS Topology


To support large-scale routing networks, IS-IS adopts a
two-level hierarchy consisting of a backbone area and
non-backbone areas in an autonomous system (AS). Generally,
Level-1 routers are deployed in non-backbone areas,
whereas Level-2 and Level-1-2 routers are deployed in the
backbone area. Each non-backbone area connects to the
backbone area through a Level-1-2 router.
Topology Introduction
The figure shows a network that runs IS-IS. The network
topology is similar to the multi-area topology of an OSPF
network. The backbone area contains all routers in area
47.0001 and Level-1-2 routers in other areas.
In addition, Level-2 routers can be in different areas.
Differences between the IS-IS and OSPF topologies are as follows:
In OSPF, a link can belong to only one area. In IS-IS, a link
can belong to different areas.
In IS-IS, no area is physically defined as the backbone or
non-backbone area. In OSPF, Area 0 is defined as the
backbone area.
In IS-IS, Level-1 and Level-2 routers each use the shortest path
first (SPF) algorithm to generate their own shortest path trees
(SPTs). In OSPF, the SPF algorithm is used only within
an area, and inter-area routes are forwarded by the
backbone area.

Level-1 Router
A Level-1 router manages intra-area routing. It establishes
neighbor relationships only with Level-1 and Level-1-2
routers in the same area. A Level-1 router maintains a
Level-1 link state database (LSDB), which contains routes
in the local area.
A Level-1 router forwards packets destined for other areas
to the nearest Level-1-2 router.
A Level-1 router connects to other areas through a
Level-1-2 router.
Level-2 Router
A Level-2 router manages inter-area routing. It can
establish neighbor relationships with Level-2 routers in the
same area or in other areas, as well as with Level-1-2 routers.
A Level-2 router maintains a Level-2 LSDB, which contains
all routes in the IS-IS network.
All Level-2 routers form the backbone network of the
routing domain. They establish Level-2 neighbor
relationships and are responsible for inter-area
communication. Level-2 routers in the routing domain must
be physically contiguous to ensure the continuity of the
backbone network.
Level-1-2 Router
A router that belongs to both a Level-1 area and a Level-2
area is called a Level-1-2 router. It can establish Level-1
neighbor relationships with Level-1 and Level-1-2 routers in
the same area.
It can also establish Level-2 neighbor relationships with
Level-2 and Level-1-2 routers in the same area or in other
areas.
A Level-1 router connects to other areas through a
Level-1-2 router.
A Level-1-2 router maintains a Level-1 LSDB for intra-area
routing and a Level-2 LSDB for inter-area routing.


Network Types Supported by IS-IS


For a non-broadcast multiple access (NBMA) network such
as a frame relay (FR) network, you need to configure
sub-interfaces and set the sub-interface type to point-to-point
(P2P). IS-IS cannot run on point-to-multipoint (P2MP) links.
DIS
In a broadcast network, IS-IS needs to elect a designated
intermediate system (DIS) from all the routers.
The Level-1 DIS and Level-2 DIS are elected separately.
The router with the highest DIS priority is elected as the
DIS. If multiple routers have the highest DIS
priority, the router with the largest MAC address is elected
as the DIS.
You can set different DIS priorities for electing DISs of
different levels.
A router whose DIS priority is 0 can also participate in a
DIS election. DIS election supports preemption.
All routers (including non-DIS routers) of the same level
and on the same network segment establish adjacencies.
However, LSDB synchronization is ensured by the DIS.
The DIS creates and updates the pseudonode and
generates the link state protocol data units (LSPs) of
the pseudonode. LSPs describe the network devices
on the network.
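The election rule above (highest DIS priority first, then largest MAC address, with priority 0 still eligible and preemption allowed) can be sketched in Python. This is only an illustration under stated assumptions: the Candidate class, the router names, and the MAC addresses are hypothetical, and a real device runs the election per level from received LAN IIHs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    priority: int  # DIS priority; 0 is still a valid candidate
    mac: str       # interface MAC address, for example "00-e0-fc-00-00-01"


def mac_value(mac: str) -> int:
    """Convert a MAC address string to an integer for comparison."""
    return int(mac.replace("-", "").replace(":", ""), 16)


def elect_dis(candidates):
    """Pick the DIS: highest priority wins, largest MAC address breaks ties."""
    return max(candidates, key=lambda c: (c.priority, mac_value(c.mac)))


routers = [
    Candidate("R1", 64, "00-e0-fc-00-00-01"),
    Candidate("R2", 64, "00-e0-fc-00-00-02"),
    Candidate("R3", 0, "00-e0-fc-00-00-ff"),
]
print(elect_dis(routers).name)

# Preemption: a newly connected router with a higher priority takes over
# as soon as the election is re-run.
routers.append(Candidate("R4", 100, "00-e0-fc-00-00-00"))
print(elect_dis(routers).name)
```

Re-running elect_dis whenever the candidate set changes models preemption; no backup DIS is tracked because the text describes none.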
Pseudonode
A pseudonode simulates a virtual node in the
broadcast network. It is not a real router. In IS-IS, a
pseudonode is identified by the system ID of the DIS and
a 1-byte non-zero Circuit ID. The use of
pseudonodes simplifies the network topology.
When the network changes, fewer LSPs are generated,
and the SPF calculation consumes fewer
resources.
Differences Between DIS in IS-IS and designated router (DR)/backup
designated router (BDR) in OSPF
In an IS-IS broadcast network, a router whose priority is 0
also takes part in DIS election. In an OSPF network, a
router whose priority is 0 does not take part in DR election.
In an IS-IS broadcast network, when a new router that
meets the requirements of being a DIS connects to the
network, the router is elected as the new DIS, and the
previous pseudonode is deleted. This causes a new round of
LSP flooding. In an OSPF network, when a new router
connects to the network, it is not immediately elected as the
DR even if it has the highest DR priority.
In an IS-IS broadcast network, all routers (including
non-DIS routers) of the same level and on the same network
segment establish adjacencies.

NSAP
An NSAP consists of the initial domain part (IDP) and the domain
specific part (DSP). The lengths of the IDP and DSP are variable.
The maximum length of an NSAP is 20 bytes and its minimum
length is 8 bytes.
The IDP is similar to the network ID in an IP address. It is defined
by the ISO and consists of the authority and format identifier (AFI)
and the initial domain identifier (IDI). The AFI indicates the address
allocation authority and address format, and the IDI identifies a
domain.
The DSP is similar to the subnet ID and host address in an IP
address. It consists of the High Order DSP (HODSP), system ID,
and NSAP selector (SEL). The HODSP is used to divide areas,
the system ID identifies a host, and the SEL indicates the service
type.
The area address (area ID) consists of the IDP and the HODSP of
the DSP. It identifies a routing domain and the areas in a routing
domain. An area address is similar to an area number in OSPF.
Routers in the same Level-1 area must have the same area
address, while routers that belong to the Level-2 area can have
different area addresses.
A system ID uniquely identifies a host or router in an area. On a
device, the system ID has a fixed length of 48 bits (6 bytes).
Generally, the device's router ID is converted into a system ID.
The SEL provides a function similar to that of the protocol identifier
in IP. Each transport protocol corresponds to an SEL. The SEL is
always 00 in IP.
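One common convention for converting a router ID into a system ID is to pad each octet of the router ID to three decimal digits and regroup the resulting 12 digits into three groups of four. The Python sketch below illustrates that convention; the function name router_id_to_system_id is an assumption for illustration, and the conversion itself is a planning convention rather than a protocol requirement.

```python
def router_id_to_system_id(router_id: str) -> str:
    """Convert a dotted-decimal router ID into a 12-digit IS-IS system ID."""
    octets = router_id.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not a valid router ID: {router_id!r}")
    # Pad every octet to three digits: 192.168.1.1 -> 192168001001
    digits = "".join(f"{int(o):03d}" for o in octets)
    # Regroup the 12 digits into three groups of four.
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


print(router_id_to_system_id("192.168.1.1"))  # 1921.6800.1001
print(router_id_to_system_id("10.1.1.1"))     # 0100.0100.1001
```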
NET
A NET indicates network layer information about a device. A
NET can be regarded as a special NSAP whose SEL is 00. The NET
length is the same as the NSAP length: a maximum of 20
bytes and a minimum of 8 bytes. When configuring IS-IS on a
router, you only need to consider the NET, not the NSAP.
A maximum of three NETs can be configured during IS-IS
configuration. When configuring multiple NETs, ensure that their
system IDs are the same.
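The structure of a NET can be illustrated by splitting one into its parts. The following Python sketch relies on the layout described above: the last byte is the SEL (00), the six bytes before it are the system ID, and everything in front is the area address. The function name parse_net and the sample NET are assumptions for illustration.

```python
import string


def parse_net(net: str) -> dict:
    """Split a NET into area address, system ID, and SEL."""
    digits = net.replace(".", "")
    if not digits or any(c not in string.hexdigits for c in digits):
        raise ValueError("a NET may contain only hexadecimal digits and dots")
    # 8 to 20 bytes means 16 to 40 hexadecimal digits.
    if len(digits) % 2 != 0 or not 16 <= len(digits) <= 40:
        raise ValueError("a NET must be 8 to 20 bytes long")
    area, system_id, sel = digits[:-14], digits[-14:-2], digits[-2:]
    if sel != "00":
        raise ValueError("the SEL of a NET must be 00")
    return {
        "area_address": area,
        "system_id": ".".join(system_id[i:i + 4] for i in range(0, 12, 4)),
        "sel": sel,
    }


# Hypothetical NET: area 49.0001, system ID 1921.6800.1001, SEL 00.
print(parse_net("49.0001.1921.6800.1001.00"))
```

When several NETs are configured on one device, parsing each of them this way and comparing the system_id values is a quick check that they all share the same system ID.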


Hello PDU (IIH)


Level-1 LAN IIHs apply to Level-1 routers on broadcast networks.
Level-2 LAN IIHs apply to Level-2 routers on broadcast networks.
P2P IIHs apply to non-broadcast networks.
Compared to a LAN IIH, a P2P IIH does not have the Priority and
LAN ID fields, but has a Local Circuit ID field. The Priority field
indicates the DIS priority in a broadcast network, the LAN ID field
indicates the system ID of the DIS and the pseudonode ID, and the
Local Circuit ID field indicates the local link ID.
IIHs are padded to the maximum size so that two neighbors can
negotiate the MTU.
LSP
LSPs are similar to link-state advertisements (LSAs) in OSPF.
Level-1 routers transmit Level-1 LSPs.
Level-2 routers transmit Level-2 LSPs.
Level-1-2 routers transmit both Level-1 and Level-2 LSPs.
The ATT, OL, and IS-Type fields are major fields in an LSP. The
ATT field indicates that the LSP was generated by a Level-1-2
router that is attached to other areas. The OL field identifies the
overload state. The IS-Type field
indicates whether the router that generates the LSP is a Level-1
router or Level-2 router (the value 01 indicates Level-1 and the
value 11 indicates Level-2).
The LSP update interval is 15 minutes and the aging time is 20
minutes. However, an expired LSP is kept in the database for
an additional 60 seconds (known as ZeroAgeLifetime) before it is
cleared. The LSP retransmission interval is 5 seconds.


Sequence number PDU (SNP)
An SNP contains summary information of the LSDB and is used
to maintain LSDB integrity and synchronization.
Complete SNPs (CSNPs) carry summaries of all LSPs in the LSDB,
ensuring LSDB synchronization between neighboring routers. In a
broadcast network, the DIS periodically sends CSNPs. The
default interval for sending CSNPs is 10 seconds. On a P2P link,
CSNPs are sent only when the neighbor relationship is
established for the first time.
Partial SNPs (PSNPs) carry summaries of only some LSPs in the LSDB,
and are used to request and acknowledge LSPs.
Initial Packet Structure of an IS-IS PDU
Intra domain routing protocol discriminator: has a fixed value of
0x83 in all IS-IS PDUs.
PDU header length indicator: identifies the length of the fixed
header field.
Version/protocol ID extension: has a fixed value of 1.
System ID length: indicates the system ID length and has a fixed
value of 6 bytes.
PDU type: identifies the PDU type.
Version: has a fixed value of 1.
Reserve: is set to all zeros.
Max areas: indicates the maximum number of areas supported
by the intermediate system (IS). If the value is 3, the
IS supports a maximum of three areas.
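The eight one-byte fields listed above can be decoded with a short Python sketch. It is a sketch under assumptions, not a full decoder: the function name parse_common_header and the sample bytes are hypothetical, and the PDU type codes come from ISO/IEC 10589 rather than from this material. On the wire, a System ID length of 0 conventionally means the default 6 bytes, and a Max areas value of 0 means the default of 3.

```python
import struct

# PDU type codes as defined in ISO/IEC 10589 (low 5 bits of the type byte).
PDU_TYPES = {
    15: "L1 LAN IIH", 16: "L2 LAN IIH", 17: "P2P IIH",
    18: "L1 LSP", 20: "L2 LSP",
    24: "L1 CSNP", 25: "L2 CSNP", 26: "L1 PSNP", 27: "L2 PSNP",
}


def parse_common_header(data: bytes) -> dict:
    """Decode the 8-byte header shared by all IS-IS PDUs."""
    if len(data) < 8:
        raise ValueError("an IS-IS common header is 8 bytes long")
    (discriminator, header_length, version_ext, id_length,
     pdu_type, version, reserved, max_areas) = struct.unpack("!8B", data[:8])
    if discriminator != 0x83:
        raise ValueError("not an IS-IS PDU (discriminator is not 0x83)")
    type_code = pdu_type & 0x1F  # the upper 3 bits are reserved
    return {
        "header_length": header_length,
        "version_ext": version_ext,
        "system_id_length": 6 if id_length == 0 else id_length,
        "pdu_type": PDU_TYPES.get(type_code, f"unknown ({type_code})"),
        "version": version,
        "max_areas": 3 if max_areas == 0 else max_areas,
    }


# Hypothetical header of a Level-2 LAN IIH.
sample = bytes([0x83, 27, 1, 0, 16, 1, 0, 0])
print(parse_common_header(sample))
```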


IIHs on a P2P link
Circuit type: indicates the level of the router that sends the PDU.
If this field is set to 0, the PDU is ignored.
System ID: indicates the system ID of the originating router
that sends the IIH.
Holding time: indicates how long the peer router waits for
the originating router to send the next IIH.
PDU length: indicates the PDU length.
Local circuit ID: is allocated to the local circuit by the originating
router when the router sends IIHs. This ID is unique
on the router interface. On the other end of the P2P
link, the circuit ID contained in IIHs may be the same
as or different from the local circuit ID.
Area address TLV: indicates the area address of the originating router.
IP interface address TLV: indicates the interface address or IP
address of the router that sends the PDU.
Protocol supported TLV: indicates protocol types supported by the
originating router, such as IP, CLNP, and IPv6.
Restart option TLV: is used for graceful restart.
Point-to-point adjacency state TLV: indicates that three-way
handshake is supported.
Multi topology TLV: indicates that multi-topology is supported.
Padding TLV: indicates that IIH padding is supported.
LSP
PDU length: indicates the PDU length.
Remaining lifetime: indicates the time before an LSP expires.
LSP ID: consists of the system ID, pseudonode ID, and LSP
number.
The value 0000.0000.0001.00-00 indicates a
common LSP.
The value 0000.0000.0001.01-00 indicates a
pseudonode LSP.
The value 0000.0000.0001.00-01 indicates a
fragment of a common LSP.
Sequence number: indicates the sequence number of the LSP. The
value starts from 0 and increases by 1. The
maximum value is 2^32-1.
Checksum: indicates the checksum. The checksum covers the PDU
from the field after Remaining lifetime to the end of the PDU.
P bit: is used to repair segmented areas and is similar to
the OSPF virtual link. Most vendors do not support
this feature.
ATT bit: indicates that the originating router is connected to
one or multiple areas.
OL bit: identifies the overload state.
IS type: indicates the router type.
Protocol supported TLV: indicates protocol types supported by the
originating router, such as IP, CLNP, and IPv6.
Area address TLV: indicates the area address of the originating router.
IS reachability TLV: is used to list neighbors of the originating router.
IP interface address TLV: indicates the interface address or IP
address of the router that sends the PDU.
IP internal reachability TLV: indicates that the IP address is
internally reachable. It is used to advertise the IP address and related
mask information of the area that directly connects to
the router that sends the LSP. A pseudonode LSP
does not contain this TLV.
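The three LSP ID examples given for the LSP ID field can be decoded mechanically: the part before the last dot is the system ID, the two hexadecimal digits after it are the pseudonode ID, and the two digits after the hyphen are the LSP (fragment) number. The Python sketch below illustrates this; the function name parse_lsp_id and the returned keys are assumptions for illustration.

```python
def parse_lsp_id(lsp_id: str) -> dict:
    """Split an LSP ID such as 0000.0000.0001.01-00 into its three parts."""
    head, fragment = lsp_id.rsplit("-", 1)
    system_id, pseudonode = head.rsplit(".", 1)
    pseudonode_id = int(pseudonode, 16)
    fragment_number = int(fragment, 16)
    return {
        "system_id": system_id,
        "pseudonode_id": pseudonode_id,
        "fragment": fragment_number,
        # A non-zero pseudonode ID marks an LSP generated for a pseudonode.
        "is_pseudonode": pseudonode_id != 0,
        # A non-zero LSP number marks a fragment other than the first one.
        "is_fragment": fragment_number != 0,
    }


for example in ("0000.0000.0001.00-00",
                "0000.0000.0001.01-00",
                "0000.0000.0001.00-01"):
    print(example, parse_lsp_id(example))
```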

CSNP and PSNP
PDU length: indicates the PDU length.
Source-ID: indicates the system ID of the originating router.
Start LSP-ID: starts from 0000.0000.0000.00-00.
End LSP-ID: ends at ffff.ffff.ffff.ff-ff.
LSP entries: LSP summary information.

Routers of different levels cannot establish neighbor relationships.
Level-2 routers cannot establish neighbor relationships with Level-1
routers. However, Level-1-2 routers can establish Level-1 neighbor
relationships with Level-1 routers in the same area, and establish
Level-2 neighbor relationships with Level-2 routers in the same area or
in different areas.
Level-1 routers can establish Level-1 neighbor relationships only with
Level-1 or Level-1-2 routers in the same area.
IP addresses of IS-IS interfaces on both ends of a link must be on the
same network segment.
According to IS-IS principles, the establishment of IS-IS neighbor
relationships is independent of IP addresses. Therefore, routers that
establish neighbor relationships may be on different network
segments. To solve this problem, Huawei devices check the
network segment of routers to ensure that IS-IS neighbor
relationships are correctly established.
You can configure interfaces not to check IP addresses on a P2P
network if the network does not need to check the IP addresses.
In a broadcast network, you need to simulate Ethernet interfaces
as P2P interfaces before configuring the interfaces not to check
IP addresses.

Two routers running IS-IS need to establish a neighbor relationship
before exchanging protocol packets to implement routing. The mode
for establishing IS-IS neighbor relationships depends on the
network type.
In a broadcast network, routers exchange LAN IIHs to establish
neighbor relationships. LAN IIHs are classified into Level-1 LAN IIHs
(with the multicast MAC address 01-80-C2-00-00-14) and Level-2 LAN
IIHs (with the multicast MAC address 01-80-C2-00-00-15). Level-1
routers exchange Level-1 LAN IIHs to establish neighbor relationships.
Level-2 routers exchange Level-2 LAN IIHs to establish neighbor
relationships. Level-1-2 routers exchange Level-1 LAN IIHs and Level-2
LAN IIHs to establish neighbor relationships.
In this example, two Level-2 routers establish a neighbor relationship
on a broadcast link.
R1 multicasts a Level-2 LAN IIH (with the multicast MAC address
01-80-C2-00-00-15) with no neighbor ID specified.
R2 receives the packet and sets the status of the neighbor
relationship with R1 to Initial. R2 then responds to R1 with a
Level-2 LAN IIH, indicating that R1 is a neighbor of R2.
R1 receives the packet and sets the status of the neighbor
relationship with R2 to Up. R1 then responds to R2 with a Level-2
LAN IIH, indicating that R2 is a neighbor of R1.
R2 receives the packet and sets the status of the neighbor
relationship with R1 to Up. R1 and R2 successfully establish a
neighbor relationship.
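The four steps above amount to a simple rule: a router sets a neighbor to Up only when the received LAN IIH lists the router itself as a neighbor; otherwise the neighbor stays in Initial. The Python sketch below replays the R1/R2 exchange under that rule. The Router class and the dictionary used for the IIH are hypothetical simplifications; real IIHs carry neighbor MAC addresses in a TLV.

```python
class Router:
    """Minimal model of the LAN adjacency state kept by one router."""

    def __init__(self, system_id: str):
        self.system_id = system_id
        self.neighbors = {}  # neighbor system ID -> "Initial" or "Up"

    def build_iih(self) -> dict:
        """Build a LAN IIH that lists every neighbor heard so far."""
        return {"source": self.system_id, "neighbors": list(self.neighbors)}

    def receive_iih(self, iih: dict) -> None:
        """Update the neighbor state based on a received LAN IIH."""
        heard_us = self.system_id in iih["neighbors"]
        self.neighbors[iih["source"]] = "Up" if heard_us else "Initial"


r1, r2 = Router("R1"), Router("R2")

r2.receive_iih(r1.build_iih())  # step 1-2: R1's first IIH lists no neighbor
print("R2 sees R1 as", r2.neighbors["R1"])

r1.receive_iih(r2.build_iih())  # step 3: R2's IIH lists R1
print("R1 sees R2 as", r1.neighbors["R2"])

r2.receive_iih(r1.build_iih())  # step 4: R1's IIH now lists R2
print("R2 sees R1 as", r2.neighbors["R1"])
```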
The network is a broadcast network, so a DIS needs to be elected.
After the neighbor relationship is established, routers wait for two
Hello intervals before sending Hello PDUs to elect the DIS. Hello PDUs
exchanged by the routers contain the Priority field. The router with the
highest priority is elected as the DIS. If the routers have the same
priority, the router with the largest interface MAC address is elected as
the DIS. In an IS-IS network, the DIS sends Hello PDUs at an interval
of 10/3 seconds, and non-DIS routers send Hello PDUs at an interval of
10 seconds.
Differences between IS-IS Adjacencies and OSPF Adjacencies
In IS-IS, two neighbor routers establish an adjacency if they
exchange Hello PDUs. In OSPF, two routers establish a neighbor
relationship if they are in 2-Way state, and establish an adjacency
if they are in Full state.
In IS-IS, a router whose priority is 0 can participate in a DIS
election. In OSPF, a router whose priority is 0 does not take part
in DR election.
In IS-IS, the DIS election is based on preemption. In OSPF, a
router cannot preempt the DR or BDR role once the DR or BDR has
been elected.

Unlike on a broadcast network, a neighbor relationship on a P2P
network can be established in one of two modes: two-way mode and
three-way mode.
Two-Way Mode
Upon receiving a P2P IIH from a peer router, a router
considers the peer router Up and establishes a neighbor
relationship with the peer router.
Unidirectional communication may occur.
Three-Way Mode
A neighbor relationship is established after P2P IIHs are
sent three times. The establishment of a neighbor
relationship on a P2P network is similar to that on a
broadcast network.

The process of synchronizing LSDBs between a newly added router
and the DIS on a broadcast link is as follows:
Assume that the newly added router R3 has established neighbor
relationships with R2 (DIS) and R1.
R3 sends an LSP to a multicast address (01-80-C2-00-00-14 in a
Level-1 area and 01-80-C2-00-00-15 in a Level-2 area). All
neighbors on the network can receive the LSP.
The DIS on the network segment adds the received LSP to its
LSDB. After the CSNP timer expires, the DIS sends CSNPs at an
interval of 10 seconds to synchronize the LSDBs on the network.
R3 receives the CSNPs from the DIS, checks its LSDB, and
sends a PSNP to the DIS to request the LSPs it does not have.
The DIS receives the PSNP and sends the required LSPs to R3
for LSDB synchronization.
The process of updating the LSDB of the DIS is as follows:
The DIS receives an LSP and searches the LSDB for a matching
record. If no matching record exists, the DIS adds the LSP to
the LSDB and multicasts the new LSDB.
If the sequence number of the received LSP is larger than that of
the corresponding LSP in the LSDB, the DIS replaces the local
LSP with the received LSP and multicasts the new LSDB. If the
sequence number of the received LSP is smaller than that of the
LSP in the LSDB, the DIS sends the local LSP through the inbound
interface.
If the sequence number of the received LSP is the same as that of
the corresponding LSP in the LSDB, the DIS compares the remaining
lifetime of the two LSPs. If the remaining lifetime of the received
LSP is smaller than that of the LSP in the LSDB, the DIS replaces
the local LSP with the received LSP and multicasts the new
LSDB. If the remaining lifetime of the received LSP is larger than
that of the LSP in the LSDB, the DIS sends the local LSP through the
inbound interface.
If the sequence number and the remaining lifetime of the received
LSP and those of the corresponding LSP in the LSDB are the
same, the DIS compares the checksums of the two LSPs. If the
checksum of the received LSP is larger than that of the LSP in the
LSDB, the DIS replaces the local LSP with the received LSP and
multicasts the new LSDB. If the checksum of the received LSP is
smaller than that of the LSP in the LSDB, the DIS sends the local
LSP through the inbound interface.
If the sequence number, remaining lifetime, and checksum of the
received LSP and those of the corresponding LSP in the LSDB
are the same, the LSP is not forwarded.
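The comparison order described above (sequence number, then remaining lifetime, then checksum) can be written as one decision function. The Python sketch below mirrors the rules exactly as this material states them; it is not taken from a device implementation, and the names LspSummary and compare_lsp as well as the returned action strings are assumptions for illustration.

```python
from typing import NamedTuple, Optional


class LspSummary(NamedTuple):
    sequence: int
    remaining_lifetime: int
    checksum: int


def compare_lsp(received: LspSummary, local: Optional[LspSummary]) -> str:
    """Decide what to do with a received LSP.

    Returns "install" (store the received LSP and flood it),
    "send_local" (send the local copy back through the inbound interface),
    or "ignore" (both copies are identical).
    """
    if local is None:
        return "install"  # no matching record in the LSDB
    if received.sequence != local.sequence:
        return "install" if received.sequence > local.sequence else "send_local"
    if received.remaining_lifetime != local.remaining_lifetime:
        # As stated in the text, the smaller remaining lifetime wins.
        smaller = received.remaining_lifetime < local.remaining_lifetime
        return "install" if smaller else "send_local"
    if received.checksum != local.checksum:
        return "install" if received.checksum > local.checksum else "send_local"
    return "ignore"


local = LspSummary(sequence=10, remaining_lifetime=900, checksum=0x1A2B)
print(compare_lsp(LspSummary(11, 1200, 0x0001), local))
print(compare_lsp(LspSummary(9, 1200, 0xFFFF), local))
print(compare_lsp(LspSummary(10, 900, 0x1A2B), local))
```

The same ordering applies on P2P links; only the follow-up actions differ, because a P2P neighbor acknowledges each LSP with a PSNP.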

The process of synchronizing LSDBs on a P2P network is as follows:
After establishing a neighbor relationship, R1 and R2 send a
CSNP to each other. If the LSDB of the neighbor and the received
CSNP are not synchronized, the neighbor sends a PSNP to
request the required LSP.
Assume that R2 requests the required LSP from R1. R1 sends
the required LSP to R2, starts the LSP retransmission timer, and
waits for a PSNP from R2 as an acknowledgement for the
received LSP.
If R1 does not receive a PSNP from R2 after the LSP
retransmission timer expires, R1 resends the LSP until it receives
a PSNP from R2.
The process of updating LSDBs on a P2P link is as follows:
If the sequence number of the received LSP is smaller than that of
the corresponding LSP in the LSDB, the router directly sends the
local LSP to the neighbor and waits for a PSNP from the neighbor.
If the sequence number of the received LSP is larger than that of
the corresponding LSP in the LSDB, the router adds the received
LSP to its LSDB, sends a PSNP to acknowledge the received
LSP, and then sends the received LSP to all its neighbors except
the neighbor that sends the LSP.
If the sequence number of the received LSP is the same as that of
the corresponding LSP in the LSDB, the router compares the
remaining lifetime of the two LSPs.
If the remaining lifetime of the received LSP is smaller than that of
the LSP in the LSDB, the router replaces the local LSP with the
received LSP, sends a PSNP to acknowledge the received LSP,
and sends the received LSP to all neighbors except the neighbor
that sends the LSP. If the remaining lifetime of the received LSP
is larger than that of the LSP in the LSDB, the router sends the
local LSP to the neighbor and waits for a PSNP.
If the sequence number and remaining lifetime of the received
LSP are the same as those of the corresponding LSP in the LSDB,
the router compares the checksums of the two LSPs. If the
checksum of the received LSP is larger than that of the LSP in the
LSDB, the router replaces the local LSP with the received LSP,
sends a PSNP to acknowledge the received LSP, and sends the
received LSP to all neighbors except the neighbor that sends the
LSP. If the checksum of the received LSP is smaller than that of
the LSP in the LSDB, the router sends the local LSP to the
neighbor and waits for a PSNP.
If the sequence number, remaining lifetime, and checksum of the
received LSP and those of the corresponding LSP in the LSDB
are the same, the LSP is not forwarded.
On a P2P network, a PSNP has the following functions:
It is used to acknowledge a received LSP.
It is used to request a required LSP.



Assume that R1 sends packets to R6. The default situation is as
follows:
As a Level-1 router, R1 does not know routes outside its area, so
it sends packets to other areas through the default route
generated by the nearest Level-1-2 router (R3). Therefore, R1
selects the route R1->R3->R5->R6, which is not the optimal
route, to forward the packets.
To solve this problem, IS-IS provides route leaking. You can
configure access control lists (ACLs) and routing policies and mark
routes with tags on Level-1-2 routers to select eligible routes. Then a
Level-1-2 router can advertise routing information of other Level-1
areas and the backbone area to its Level-1 area.
If route leaking is enabled on the Level-1-2 routers (R3 and R4), Level-1
routers in area 47.0001 can learn routes outside area 47.0001 and
routes passing through the two Level-1-2 routers. After route calculation,
the forwarding path becomes R1->R2->R4->R5->R6, which is the
optimal route from R1 to R6.



Principles
LSPs with the overload bit set are still flooded on the network,
but they are not used to calculate routes that pass through the
router that set the overload bit. That is,
after the overload bit is set on a router, other routers ignore this
router as a transit node when performing SPF calculation and
calculate only the direct routes of the router.
Topology
R2 forwards the packets from R1 to R3. If the overload bit on R2
is set to 1, R1 considers the LSDB of R2 incomplete and sends
packets to R3 through R4 and R5. This process does not affect
packets sent to the directly connected addresses of R2.
A device enters the overload state in the following situations:
A device automatically enters the overload state due to
exceptions.
You can manually configure a device to enter the overload
state.
Results of entering the overload state
If the system enters the overload state due to exceptions, the
system deletes all the imported or leaked routes.
If the system is configured to enter the overload state, the system
determines whether to delete all the imported or leaked routes
based on the configuration.



Fast Convergence
Incremental SPF (I-SPF): except for the first calculation, in which
all nodes are involved, I-SPF recalculates only the changed nodes
rather than all the nodes when the network topology changes. This
speeds up route calculation. I-SPF improves the SPF algorithm: the
shortest path tree (SPT) it generates is the same as that generated
by the SPF algorithm, but CPU usage decreases and network
convergence speeds up.

Partial route calculation (PRC): calculates only the changed
routes when the network topology changes. Similar to I-SPF, PRC
calculates only the changed routes, but it does not calculate the
shortest path. It updates routes based on the SPT calculated by
I-SPF. In route calculation, a leaf represents a route, and a node
represents a router. If the SPT changes after I-SPF calculation,
PRC processes all the leaves only on the changed node. If the SPT
remains unchanged, PRC processes only the changed leaves. For
example, if IS-IS is enabled on an interface of a node, the SPT
calculated by I-SPF remains unchanged. PRC updates only the routes
of this interface, consuming less CPU resources.


Intelligent Timer

LSP generation intelligent timer: A minimum interval is imposed on
LSP generation to prevent frequent LSP flapping from affecting the
network. The same LSP cannot be generated again within the minimum
interval, which is 5 seconds by default. This restriction
significantly affects route convergence speed.
In IS-IS, if local routing information changes, a router generates
a new LSP to advertise this change. When local routing information
changes frequently, the newly generated LSPs consume a lot of
system resources. If the delay in generating an LSP is too long,
the router cannot advertise changed routing information to
neighbors in time, reducing the network convergence speed. The
delay in generating an LSP for the first time is determined by
init-interval, and the delay in generating an LSP for the second
time is determined by incr-interval. From the third time on, the
delay doubles each time until it reaches the value specified by
max-interval. After the delay remains at the value specified by
max-interval for three times or the IS-IS process is restarted, the
delay decreases to the value specified by init-interval. When only
max-interval is specified, the intelligent timer functions as an
ordinary one-time triggering timer.

SPF calculation intelligent timer: In IS-IS, routes are calculated
when the LSDB changes. However, frequent route calculations consume
a lot of system resources and decrease the system performance.
Delaying SPF calculation can improve route calculation efficiency,
but if the delay is too long, the route convergence speed is
reduced. The delay in SPF calculation for the first time is
determined by init-interval, and the delay for the second time is
determined by incr-interval. From the third time on, the delay
doubles each time until it reaches the value specified by
max-interval. After the delay remains at the value specified by
max-interval for three times or the IS-IS process is restarted, the
delay decreases to the value specified by init-interval.
If incr-interval is not specified, the delay in SPF calculation for
the first time is determined by init-interval. From the second time
on, the delay is determined by max-interval. After the delay
remains at the value specified by max-interval for three times or
the IS-IS process is restarted, the delay decreases to the value
specified by init-interval.
When only max-interval is specified, the intelligent timer
functions as an ordinary one-time triggering timer.
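
The back-off sequence produced when init-interval, incr-interval, and max-interval are all specified can be illustrated with a short Python sketch. It is only a model of the text above: the function name intelligent_delays, the use of one unit (milliseconds) for all three values, and the reading of "remains at max-interval for three times" as three consecutive delays equal to max-interval are assumptions.

```python
def intelligent_delays(init_ms, incr_ms, max_ms, events):
    """Return the delay applied to each of `events` consecutive triggers."""
    delays = []
    step = 0            # position within the current back-off cycle
    holds_at_max = 0    # how many times the delay has stayed at max_ms
    current = init_ms
    for _ in range(events):
        if step == 0:
            current = init_ms                    # first time
        elif step == 1:
            current = incr_ms                    # second time
        else:
            current = min(current * 2, max_ms)   # doubles, capped at max
        delays.append(current)
        step += 1
        if current >= max_ms:
            holds_at_max += 1
            if holds_at_max == 3:
                # After three delays at max, fall back to init-interval.
                step = 0
                holds_at_max = 0
    return delays
```

For example, intelligent_delays(50, 200, 1000, 9) walks through 50, 200, 400, 800, then three delays of 1000, and starts again at 50.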


LSP fast flooding: Because a network can carry a large number of
LSPs, IS-IS periodically floods LSPs in batches to reduce the
impact of LSP flooding on network devices. By default, the minimum
interval for sending LSPs on an interface is 50 milliseconds and
the maximum number of LSPs sent at a time is 10. After the
flash-flood function is enabled, when LSPs change and cause SPF
recalculation, IS-IS immediately floods the LSPs that cause SPF
recalculation instead of sending them periodically. When the
network topology changes, the LSDBs of devices on the network are
temporarily inconsistent. This function effectively reduces the
time during which LSDBs are inconsistent and improves fast
convergence performance. When a network fault occurs, only a small
number of LSPs change although a large number of LSPs exist.
Therefore, IS-IS only needs to flood the changed LSPs and consumes
few system resources.

Priority-based Convergence
You can use an IP prefix list to filter routes and configure
different convergence priorities for different routes so that
important routes converge first, improving network reliability.
The convergence priorities of IS-IS routes are critical, high,
medium, and low, in descending order.
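
As a small illustration of the ordering, the following Python sketch sorts routes by convergence priority. The names PRIORITY_ORDER and convergence_order and the sample prefixes are hypothetical; the sketch only shows the order in which routes would be processed, not how a device schedules them.

```python
PRIORITY_ORDER = ["critical", "high", "medium", "low"]

def convergence_order(routes):
    """Sort (prefix, priority) pairs so higher-priority routes come first.

    The sort is stable, so routes with equal priority keep their order.
    """
    rank = {name: i for i, name in enumerate(PRIORITY_ORDER)}
    return sorted(routes, key=lambda route: rank[route[1]])
```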

In area authentication and routing domain authentication, you can
configure a router to authenticate LSPs and SNPs separately in the
following ways:
The router sends LSPs and SNPs carrying the authentication TLV
and verifies the authentication information of the received LSPs
and SNPs.
The router sends LSPs carrying the authentication TLV and verifies
the authentication information of the received LSPs. The router
sends SNPs carrying the authentication TLV but does not verify the
authentication information of the received SNPs.
The router sends LSPs carrying the authentication TLV and verifies
the authentication information of the received LSPs. The router
sends SNPs without the authentication TLV and does not verify the
authentication information of the received SNPs.
The router sends LSPs and SNPs carrying the authentication TLV
but does not verify the authentication information of the received
LSPs and SNPs.



Concepts
Originating system: a router that runs the IS-IS protocol. After
LSP fragment extension is enabled, you can configure virtual
systems for the router. The originating system refers to the IS-IS
process.
System ID: the system ID of the originating system.
Additional System ID: configured for a virtual system after IS-IS
LSP fragment extension is enabled. A maximum of 256 extended LSP
fragments can be generated for each additional system ID. Like a
normal system ID, an additional system ID must be unique in a
routing domain.
Virtual system: a system identified by an additional system ID. It
is used to generate extended LSP fragments.

Principles
IS-IS floods LSPs to advertise link state information. Because one
LSP carries limited information, IS-IS fragments LSPs. Each LSP
fragment is uniquely identified by its LSP ID, which consists of
the system ID, pseudonode ID (0 for a common LSP and a non-zero
value for a pseudonode LSP), and LSP number (LSP fragment No.) of
the node or pseudonode that generates the LSP. The length of the
LSP number is 1 byte. Therefore, an IS-IS router can generate a
maximum of 256 LSP fragments, restricting the link information that
can be advertised by the router.

The LSP fragment extension feature enables an IS-IS router to
generate more LSP fragments. You can configure up to 50 virtual
systems for the router. Each virtual system can generate a maximum
of 256 LSP fragments. Together with the 256 fragments of the
originating system, an IS-IS router can generate a maximum of
13,056 LSP fragments.
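
The figure of 13,056 follows from (1 + 50) x 256. A minimal Python check of this arithmetic is shown here; the constant and function names are hypothetical.

```python
FRAGMENTS_PER_SYSTEM_ID = 256   # the LSP number field is 1 byte
MAX_VIRTUAL_SYSTEMS = 50        # limit stated in the text

def max_lsp_fragments(virtual_systems):
    """Fragments available to the originating system plus its virtual systems."""
    if not 0 <= virtual_systems <= MAX_VIRTUAL_SYSTEMS:
        raise ValueError("virtual_systems must be between 0 and 50")
    return (1 + virtual_systems) * FRAGMENTS_PER_SYSTEM_ID
```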


An IS-IS router can run the LSP fragment extension feature in two
modes.

Mode-1
It is used when some routers on the network do not support LSP
fragment extension.
Virtual systems participate in SPF calculation. The originating
system advertises LSPs containing information about links to each
virtual system. Similarly, each virtual system advertises LSPs
containing information about links to the originating system. To
other routers, virtual systems look like physical routers that
connect to the originating system.
The LSP sent by a virtual system contains the same area address
and overload bit as those in a common LSP. If the LSPs sent by a
virtual system contain TLVs specified in other features, these
TLVs must be the same as those in common LSPs.
The virtual system carries neighbor information indicating that
the neighbor is the originating system, with the metric equal to
the maximum value (64 for narrow metric) minus 1. The originating
system carries neighbor information indicating that the neighbor
is the virtual system, with the metric 0. This ensures that the
virtual system is the downstream node of the originating system
when other routers calculate routes.
As shown in the topology, R2 does not support LSP fragment
extension, and R1 is configured to support LSP fragment extension
in mode-1. R1-1 and R1-2 are virtual systems of R1 and send LSPs
carrying some routing information of R1. After receiving LSPs from
R1, R1-1, and R1-2, R2 considers that there are three individual
routers at the remote end and calculates routes. Because the cost
of the route from R1 to R1-1 and the cost of the route from R1 to
R1-2 are both 0, the cost of the route from R2 to R1 is the same
as the cost of the route from R2 to R1-1.
The LSPs that are generated by virtual systems contain only the
originating system as the neighbor (the neighbor type is P2P). In
addition, virtual systems are considered only as leaves.

Mode-2
It is used when all the routers on the network support LSP
fragment extension. In this mode, virtual systems do not
participate in SPF calculation.
All the routers on the network know that the LSPs generated by
virtual systems actually belong to the originating system.
R2 supports LSP fragment extension, and R1 is configured to
support LSP fragment extension in mode-2. R1-1 and R1-2 are
virtual systems of R1 and send LSPs carrying some routing
information of R1.
When receiving LSPs from R1-1 and R1-2, R2 obtains the IS Alias ID
TLV and knows that the originating system of R1-1 and R1-2 is R1.
R2 then considers that information advertised by R1-1 and R1-2
belongs to R1.

Precautions
After LSP fragment extension is configured, the system prompts you
to restart the IS-IS process if information is lost because LSPs
overflow. After being restarted, the originating system loads as
much routing information as possible to its own LSPs, and adds the
information that does not fit to the LSPs of the virtual systems
for transmission.
If there are devices of other vendors on the network, LSP fragment
extension must be set to mode-1. Otherwise, devices of other
vendors cannot identify the LSPs.
It is recommended that you configure LSP fragment extension and
virtual systems before establishing IS-IS neighbor relationships
or importing routes. If you establish IS-IS neighbor relationships
or import routes first, and IS-IS then has to carry more
information than 256 fragments can hold, you must configure LSP
fragment extension and virtual systems. The configuration takes
effect only after you restart the IS-IS router. Therefore, exercise
caution when you establish IS-IS neighbor relationships or import
routes.

IS-IS Administrative Tag
Administrative tags control the advertisement of IP prefixes in an
IS-IS routing domain to simplify route management. You can use
administrative tags to control the import of routes of different
levels and different areas and control IS-IS multi-instances (tags)
running on the same router.

Topology
Assume that R1 needs to receive only Level-1 routing information
from R2, R3, and R4. To meet this requirement, configure the same
administrative tag for IS-IS interfaces on R2, R3, and R4. Then
configure the Level-1-2 router in area 47.0003 to leak only the
routes matching the configured administrative tag from Level-2 to
Level-1 areas. This configuration allows R1 to receive only Level-1
routing information from R2, R3, and R4.

Precautions
To use administrative tags, you must enable the IS-IS wide metric
attribute.

Case Description
In this case, the addresses for interconnecting devices are as
follows:
If RX interconnects with RY, their interconnection addresses are
XY.1.1.X and XY.1.1.Y respectively, and the network mask is 24
bits.
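
The addressing rule can be expressed as a short Python helper. This is a sketch under assumptions: the function name interconnect_addresses is hypothetical, and router numbers are assumed to be single digits so that XY forms a valid first octet.

```python
import ipaddress

def interconnect_addresses(x, y):
    """Return the interface addresses of RX and RY on their shared /24 link."""
    first_octet = f"{x}{y}"   # the two router numbers written side by side
    rx = ipaddress.ip_interface(f"{first_octet}.1.1.{x}/24")
    ry = ipaddress.ip_interface(f"{first_octet}.1.1.{y}/24")
    return rx, ry
```

For R4 and R5, the helper yields 45.1.1.4/24 and 45.1.1.5/24, both in network 45.1.1.0/24.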

Remarks
R4 and R5 are Level-1-2 routers. They calculate both Level-1 and
Level-2 routes and maintain both a Level-1 LSDB and a Level-2
LSDB.

Command Usage
The is-level command sets the level of an IS-IS router. By
default, the level of an IS-IS router is Level-1-2.
The isis circuit-level command sets the link type of an interface.

View
is-level: IS-IS view
isis circuit-level: interface view

Parameters
is-level { level-1 | level-1-2 | level-2 }
level-1: sets a router as a Level-1 router, which calculates only
intra-area routes and maintains a Level-1 LSDB.
level-1-2: sets a router as a Level-1-2 router, which calculates
Level-1 and Level-2 routes and maintains a Level-1 LSDB and a
Level-2 LSDB.
level-2: sets a router as a Level-2 router, which exchanges only
Level-2 LSPs, calculates only Level-2 routes, and maintains a
Level-2 LSDB.

isis circuit-level [ level-1 | level-1-2 | level-2 ]
level-1: specifies the Level-1 link type. That is, only Level-1
neighbor relationships can be established on the interface.
level-1-2: specifies the Level-1-2 link type. That is, both
Level-1 and Level-2 neighbor relationships can be established on
the interface.
level-2: specifies the Level-2 link type. That is, only Level-2
neighbor relationships can be established on the interface.

Precautions
If a router is a Level-1-2 router and needs to establish a
neighbor relationship at a specified level (Level-1 or Level-2)
with a peer router, you can run the isis circuit-level command to
allow the local interface to send and receive only Hello packets
of the specified level on the P2P link. This configuration
prevents the router from processing too many Hello packets and
saves bandwidth.
The configuration of the isis circuit-level command takes effect
on the interface only when the IS-IS system type is Level-1-2.
Otherwise, the level configured using the is-level command is used
as the link type.
In a P2P network, the Circuit ID uniquely identifies a local
interface. In a broadcast network, the Circuit ID is the system ID
and pseudonode ID.
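
The interaction between is-level and isis circuit-level described in the precautions can be summarized as a small decision function. The Python below is only a sketch of that rule; the function name effective_circuit_level is hypothetical.

```python
LEVELS = {"level-1", "level-1-2", "level-2"}

def effective_circuit_level(is_level, circuit_level=None):
    """Return the link type actually used on an interface.

    circuit_level is honored only when the router is Level-1-2;
    otherwise the router level decides.
    """
    if is_level not in LEVELS:
        raise ValueError(f"unknown router level: {is_level}")
    if circuit_level is not None and circuit_level not in LEVELS:
        raise ValueError(f"unknown circuit level: {circuit_level}")
    if is_level == "level-1-2" and circuit_level is not None:
        return circuit_level
    return is_level
```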

Case Description
The topology in this case is the same as that in the previous case.
It is required that no DIS can be elected between R4 and R6 or
between R5 and R6. That is, the links between R4 and R6 and
between R5 and R6 cannot be broadcast links.
The smallest priority that still allows a router to participate in
the DIS election is 0.



Command Usage
The isis dis-priority command sets the priority of the interface
that is a candidate for the DIS at a specified level.
The isis circuit-type command simulates a P2P network type on an
interface.

View
isis dis-priority: interface view
isis circuit-type: interface view

Parameters
isis dis-priority priority [ level-1 | level-2 ]
priority: specifies the priority for electing the DIS. The value
ranges from 0 to 127. The default value is 64. A greater value
indicates a higher priority.
level-1: indicates the priority for electing the Level-1 DIS.
level-2: indicates the priority for electing the Level-2 DIS.

isis circuit-type p2p
Sets the interface network type to P2P.

Precautions
The isis dis-priority command takes effect only on a broadcast
link.
The isis circuit-type command takes effect only on a broadcast
interface. The network types of IS-IS interfaces on both ends of a
link must be the same. Otherwise, the two interfaces cannot
establish a neighbor relationship.

Configuration Verification
Run the display isis interface process-id command, and view
the DIS field in the command output.
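
DIS election on a broadcast link can be sketched as follows. The priority range and the rule that a greater value wins come from the text above; the tie-break on the highest MAC address is standard IS-IS behavior that this section does not state, and the function name, tuple layout, and sample MAC addresses are assumptions.

```python
def elect_dis(candidates):
    """Pick the DIS from (name, priority, mac) tuples.

    A priority of 0 is still eligible. The highest priority wins,
    and the highest MAC address breaks ties (assumed tie-break).
    """
    def key(candidate):
        _, priority, mac = candidate
        if not 0 <= priority <= 127:
            raise ValueError("priority must be in the range 0-127")
        return (priority, int(mac.replace("-", ""), 16))
    return max(candidates, key=key)[0]
```

For example, if R4 and R6 both use priority 0, one of them is still elected, which is why the case above changes the circuit type to P2P instead of lowering the priority.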

Case Description
The topology in this case is the same as that in the previous case.
Company A requires route control. When configuring tags, you
should also enable IS-IS wide metric on all devices in the network
so that the tags can be transmitted in the entire network. In
addition, Level-2 routes are not leaked to Level-1 areas by
default, so route leaking must be configured manually.



Command Usage
The import-route command configures IS-IS to import routes from
other routing protocols.
The import-route isis level-2 into level-1 command controls route
leaking from Level-2 areas to Level-1 areas. The command needs to
be configured on Level-1-2 routers that are connected to external
areas.
The cost-style command sets the cost style of routes sent and
received by an IS-IS router.

View
import-route: IS-IS view
import-route isis level-2 into level-1: IS-IS view
cost-style: IS-IS view

Parameters
import-route isis level-2 into level-1 [ filter-policy { acl-number
| acl-name acl-name | ip-prefix ip-prefix-name | route-policy
route-policy-name } | tag tag ]
filter-policy: indicates the route filtering policy.
acl-number: specifies the number of a basic ACL.
acl-name acl-name: specifies the name of a named ACL.
ip-prefix ip-prefix-name: specifies the name of an IP prefix list.
Only the routes that match the IP prefix list can be imported.
route-policy route-policy-name: specifies the name of a routing
policy.
tag tag: assigns administrative tags to the imported routes.

cost-style { narrow | wide | wide-compatible }
narrow: indicates that the device can receive and send routes with
cost style narrow.
wide: indicates that the device can receive and send routes with
cost style wide.
wide-compatible: indicates that the device can receive routes with
cost style narrow or wide but sends only routes with cost style
wide.

Precautions
To transmit tags in the entire network, run the cost-style wide
command on all devices in the network.

Configuration Verification
Run the display isis route command to view tag information.
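
The three cost styles described above can be captured in a small lookup table. The Python sketch below only restates the send and receive rules from the parameter list; the names COST_STYLE and interoperable are hypothetical.

```python
# Which cost styles each setting can receive and send, per the parameter list.
COST_STYLE = {
    "narrow": {"receive": {"narrow"}, "send": {"narrow"}},
    "wide": {"receive": {"wide"}, "send": {"wide"}},
    "wide-compatible": {"receive": {"narrow", "wide"}, "send": {"wide"}},
}

def interoperable(sender_style, receiver_style):
    """True if every style the sender emits is accepted by the receiver."""
    sent = COST_STYLE[sender_style]["send"]
    return sent <= COST_STYLE[receiver_style]["receive"]
```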


Case Description
The topology in this case is the same as that in the previous case.
Company A reconstructs its network. IS-IS uses ACLs, IP prefix
lists, and tags to control routes.

Command Usage
The filter-policy import command allows IS-IS to filter the
received routes to be added to the IP routing table.

View
filter-policy import: IS-IS view

Parameters
filter-policy { acl-number | acl-name acl-name | ip-prefix
ip-prefix-name | route-policy route-policy-name } import
acl-number: specifies the number of a basic ACL.
acl-name acl-name: specifies the name of a named ACL.
ip-prefix ip-prefix-name: specifies the name of an IP prefix list.
route-policy route-policy-name: specifies the name of a routing
policy that filters routes based on tags and other protocol
parameters.

Precautions
IS-IS can control routes and determine whether a route is added to
the routing table. However, LSP transmission is not affected.
The filter-policy export command takes effect only when it is used
together with the import-route command.
+IP-Extended* indicates that wide metric is supported. The symbol *
indicates that the route is learned through route leaking.

Case Description
IS-IS authentication is classified into area authentication,
routing domain authentication, and interface authentication.



Command Usage
The area-authentication-mode command configures an IS-IS area to
authenticate received Level-1 packets (LSPs and SNPs) using the
specified authentication mode and password, and to add
authentication information to Level-1 packets to be sent.
The isis authentication-mode command configures an IS-IS interface
to authenticate Hello packets using the specified mode and
password.

View
area-authentication-mode: IS-IS view
isis authentication-mode: interface view

Parameters
isis authentication-mode { simple password | md5 password-key }
[ level-1 | level-2 ] [ ip | osi ] [ send-only ]
simple password: indicates that the password is transmitted in
plain text.
md5 password-key: indicates that the password to be transmitted is
encrypted using MD5.
keychain keychain-name: specifies a keychain that changes with
time.
level-1: sets Level-1 authentication.
level-2: sets Level-2 authentication.
ip: indicates the IP authentication password. This parameter
cannot be configured in the keychain authentication mode.
osi: indicates the OSI authentication password. This parameter
cannot be configured in the keychain authentication mode.
send-only: indicates that the router encapsulates sent Hello
packets with authentication information but does not authenticate
received Hello packets.

area-authentication-mode { simple password | md5 password-key }
[ ip | osi ] [ snp-packet { authentication-avoid | send-only } |
all-send-only ]
simple password: indicates that the password is transmitted in
plain text.
md5 password-key: indicates that the password to be transmitted is
encrypted using MD5.
keychain keychain-name: specifies a keychain that changes with
time.
ip: indicates the IP authentication password. This parameter
cannot be configured in the keychain authentication mode.
osi: indicates the OSI authentication password. This parameter
cannot be configured in the keychain authentication mode.
snp-packet: specifies how SNPs are authenticated.
authentication-avoid: indicates that the router neither
encapsulates generated SNPs with authentication information nor
authenticates received SNPs. The router encapsulates generated
LSPs with authentication information and authenticates received
LSPs.
send-only: indicates that the router encapsulates generated LSPs
and SNPs with authentication information and authenticates
received LSPs, but does not authenticate received SNPs.
all-send-only: indicates that the router encapsulates generated
LSPs and SNPs with authentication information and does not
authenticate received LSPs and SNPs.

Precautions
The area-authentication-mode command takes effect only on
Level-1 and Level-1-2 routers.
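
The four LSP and SNP handling options can be compared side by side in a small table. The Python sketch below models them as send and verify flags; the Policy tuple, the "default" key (the command without snp-packet or all-send-only), and the accepts helper are hypothetical names used only for illustration.

```python
from collections import namedtuple

Policy = namedtuple("Policy", "lsp_send_auth lsp_verify snp_send_auth snp_verify")

POLICIES = {
    "default": Policy(True, True, True, True),
    "snp-packet send-only": Policy(True, True, True, False),
    "snp-packet authentication-avoid": Policy(True, True, False, False),
    "all-send-only": Policy(True, False, True, False),
}

def accepts(option, pdu, auth_ok):
    """True if a received PDU ("lsp" or "snp") is accepted under the option."""
    if pdu not in ("lsp", "snp"):
        raise ValueError("pdu must be 'lsp' or 'snp'")
    policy = POLICIES[option]
    verify = policy.lsp_verify if pdu == "lsp" else policy.snp_verify
    # A PDU is dropped only when verification is on and the check fails.
    return auth_ok or not verify
```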



Case Description
In this case, the addresses for interconnecting devices are as
follows:
If RX interconnects with RY, their interconnection addresses are
XY.1.1.X and XY.1.1.Y respectively, and the network mask is 24
bits.
R2 connects to R3 and R1 through serial interfaces. R1 and R3
connect through Ethernet interfaces. R1 connects to network
10.0.0.0/24 through G0/0/1.



Results
You can run the display isis peer command to check whether
neighbor relationships are established successfully.



Results
You can run the display isis interface command to view information
about IS-IS interfaces.

Results
You can run the display ip routing-table command to view the
routing table.

Case Description
In this case, the network runs IS-IS.

Requirement analysis
The log prompt function of IS-IS is disabled by default.

Results
The nexthop command sets the preferences of equal-cost routes.
After IS-IS calculates equal-cost routes using the SPF algorithm,
the next hop is chosen from these equal-cost routes based on the
value of weight. A smaller value indicates a higher preference.

Parameters
nexthop ip-address weight value
ip-address: indicates the next hop address.
weight value: indicates the next hop weight. The value is an
integer that ranges from 1 to 254. The default value is 255.
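
The selection rule can be illustrated with a short Python sketch. It assumes that next hops without a configured weight use the default of 255 and that all next hops sharing the smallest weight are kept; the function name and the sample addresses are hypothetical.

```python
DEFAULT_WEIGHT = 255   # used when no weight is configured for a next hop

def preferred_next_hops(equal_cost_next_hops, weights):
    """Return the next hops with the smallest weight among equal-cost routes."""
    effective = {nh: weights.get(nh, DEFAULT_WEIGHT) for nh in equal_cost_next_hops}
    best = min(effective.values())
    return [nh for nh in equal_cost_next_hops if effective[nh] == best]
```

With next hops 12.1.1.2 and 13.1.1.3 and a weight of 1 configured for 13.1.1.3, only 13.1.1.3 is selected. Without any configured weight, both next hops remain.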

Results
The summary ip-address mask command accepts the avoid-feedback and
generate_null0_route keywords. The avoid-feedback keyword prevents
the router from learning the aggregation route again. The
generate_null0_route keyword generates a route to the Null0
interface to prevent loops.
You need to manually enable the logging of neighbor state changes.



OSPF topology:
OSPF divides an Autonomous System (AS) into one or multiple
logical areas. All areas are connected to Area 0. Area 0 is the
backbone area.

Router type:
Internal router: All interfaces on an internal router belong to the
same OSPF area.
Area Border Router (ABR): An ABR belongs to two or more areas, one
of which must be the backbone area. An ABR is used to connect the
backbone area and non-backbone areas. It can be physically or
logically connected to the backbone area.
Backbone router: At least one interface on a backbone router
belongs to the backbone area. Internal routers in Area 0 and all
ABRs are backbone routers.
AS Boundary Router (ASBR): An ASBR exchanges routing information
with other ASs. An ASBR does not necessarily reside on the border
of an AS. It can be an internal router or an ABR. An OSPF device
that has imported external routing information becomes an ASBR.
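
The router types are not mutually exclusive, which a short classification sketch makes visible. The Python below is an illustration only: the function name ospf_roles is hypothetical, areas are given as integers with 0 as the backbone, and a logical (virtual link) connection to the backbone is not modeled.

```python
def ospf_roles(interface_areas, imports_external_routes=False):
    """Return the set of OSPF roles implied by a router's interface areas."""
    areas = set(interface_areas)
    roles = set()
    if len(areas) == 1:
        roles.add("internal")
    if len(areas) >= 2 and 0 in areas:
        roles.add("ABR")        # two or more areas, one of them Area 0
    if 0 in areas:
        roles.add("backbone")   # at least one interface in Area 0
    if imports_external_routes:
        roles.add("ASBR")       # importing external routes makes an ASBR
    return roles
```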

Differences between OSPF and IS-IS in the topology:
In OSPF, a link can belong to only one area. In IS-IS, a link can
belong to different areas.
In IS-IS, no area is physically defined as the backbone or
non-backbone area. In OSPF, Area 0 is defined as the backbone
area.
In IS-IS, Level-1 and Level-2 routers each use the shortest path
first (SPF) algorithm to generate their own shortest path trees
(SPTs). In OSPF, the SPF algorithm is used only in the same area,
and inter-area routes are forwarded by the backbone area.


OSPF supports the following network types:
P2P: A network where the link layer protocol is PPP or HDLC is a
P2P network by default. On a P2P network, protocol packets such as
Hello packets, DD packets, LSR packets, LSU packets, and LSAck
packets are sent in multicast mode using the multicast address
224.0.0.5.
P2MP: No network is a P2MP network by default, no matter what type
of link layer protocol is used on the network. A network can be
changed to a P2MP network. The common practice is to change a
non-fully meshed NBMA network to a P2MP network. On a P2MP
network, Hello packets are sent in multicast mode using the
multicast address 224.0.0.5, and other types of protocol packets,
such as DD packets, LSR packets, LSU packets, and LSAck packets,
are sent in unicast mode.
NBMA: A network where the link layer protocol is ATM or FR is an
NBMA network by default. On an NBMA network, protocol packets such
as Hello packets, DD packets, LSR packets, LSU packets, and LSAck
packets are sent in unicast mode.
Broadcast: A network where the link layer protocol is Ethernet or
FDDI is a broadcast network by default. On a broadcast network,
Hello packets, LSU packets, and LSAck packets are usually sent in
multicast mode. The multicast address 224.0.0.5 is used by OSPF
devices. The multicast address 224.0.0.6 is reserved for an OSPF
designated router (DR). DD and LSR packets are transmitted in
unicast mode.

DR/BDR functions
Reduces the number of adjacencies, which further reduces the
number of times that link-state information and routing
information are updated. A DRother sets up full adjacencies only
with the DR and BDR. The DR and BDR set up a full adjacency with
each other.
The DR generates Network-LSAs to describe information about the
NBMA or broadcast network segment.

DR/BDR election rules
The DR and BDR are elected through Hello packets, based on the
Router Priority of interfaces.
If Router Priority is set to 0, the router cannot be elected as
the DR or BDR.
A larger value of Router Priority indicates a higher priority. If
the value of Router Priority is the same on two interfaces, the
interface with a larger Router ID is elected.
The DR and BDR cannot be preempted.
If the DR is faulty, the BDR automatically becomes the new DR,
and a new BDR is elected on the network. If the BDR is faulty,
the DR does not change, and a new BDR is elected.
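
The election rules can be sketched in Python as shown here. This is a deliberately simplified model, not the full election procedure of the OSPF specification: the function name elect_dr_bdr, the dictionary layout, and the sample routers are assumptions, and neighbor state machines and wait timers are ignored.

```python
import ipaddress

def elect_dr_bdr(routers, current_dr=None, current_bdr=None):
    """routers maps a name to (priority, router_id). Returns (dr, bdr)."""
    def rank(name):
        priority, router_id = routers[name]
        return (priority, int(ipaddress.IPv4Address(router_id)))

    # Priority 0 makes a router ineligible; rank the rest, best first.
    eligible = sorted((n for n in routers if routers[n][0] > 0),
                      key=rank, reverse=True)
    # No preemption: an existing DR or BDR keeps its role while present.
    dr = current_dr if current_dr in eligible else None
    bdr = current_bdr if current_bdr in eligible else None
    if dr is None:
        if bdr is not None:
            dr, bdr = bdr, None      # the BDR takes over as DR
        elif eligible:
            dr = eligible[0]
    if bdr is None:
        rest = [n for n in eligible if n != dr]
        bdr = rest[0] if rest else None
    return dr, bdr
```

A router that joins later with a higher priority does not displace the existing DR, which matches the non-preemption rule above.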

Differences between IS-IS DIS and OSPF DR/BDR
On an IS-IS broadcast network, routers with priority 0 still
participate in DIS election. On an OSPF network, routers with
priority 0 do not participate in DR election.
On an IS-IS broadcast network, when a new router meeting DIS
conditions joins the network, the router is elected as the new
DIS, and the original pseudonode is deleted. This causes LSP
flooding. On an OSPF network, a new router will not immediately
become the DR on the network segment even if the router has the
highest DR priority.
On an IS-IS broadcast network, routers with the same level on the
same network segment form adjacencies with each other, including
all non-DIS routers.

Overview of OSPF packets
OSPF packets are transmitted at the network layer, encapsulated
directly in IP packets. The protocol number is 89. There are five
types of OSPF packets, whose packet headers are in the same format.
OSPF packets except the Hello packet carry LSA information.

OSPF packet header information
All OSPF packets have the same OSPF packet header.
Version: specifies the OSPF version number. This field must be set
to 2.
Type: specifies the OSPF packet type. There are five types of
OSPF packets.
Packet length: specifies the total length of an OSPF packet,
including the packet header. The unit is byte.
Router ID: specifies the router ID of the router generating the
packet.
Area ID: specifies the area to which the packet is to be
advertised.
Checksum: specifies the standard IP checksum of the entire packet
(including the packet header).
AuType: specifies the authentication mode.
Authentication: specifies information for authenticating packets,
such as the password.
re
Mo
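The header fields listed above occupy 24 bytes. The following Python sketch decodes them with the standard struct module. The function and class names are hypothetical, and the type codes 1 to 5 for Hello, DD, LSR, LSU, and LSAck follow RFC 2328 rather than the text above.

```python
import struct
from ipaddress import IPv4Address
from typing import NamedTuple

# Version, Type, Packet length, Router ID, Area ID, Checksum, AuType, Authentication
HEADER = struct.Struct("!BBH4s4sHH8s")
PACKET_TYPES = {1: "Hello", 2: "DD", 3: "LSR", 4: "LSU", 5: "LSAck"}


class OspfHeader(NamedTuple):
    version: int
    packet_type: str
    length: int
    router_id: str
    area_id: str
    checksum: int
    au_type: int
    authentication: bytes


def parse_header(data: bytes) -> OspfHeader:
    if len(data) < HEADER.size:
        raise ValueError("OSPF header needs at least 24 bytes")
    ver, ptype, length, rid, aid, csum, au_type, auth = HEADER.unpack_from(data)
    if ver != 2:
        raise ValueError("not an OSPFv2 packet")
    return OspfHeader(ver, PACKET_TYPES.get(ptype, f"unknown({ptype})"), length,
                      str(IPv4Address(rid)), str(IPv4Address(aid)),
                      csum, au_type, auth)
```

The sketch does not verify the checksum or the authentication data; it only shows how the fixed fields are laid out.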

Hello packet

Network Mask: specifies the network mask of the interface sending Hello
packets.
HelloInterval: specifies the interval for sending Hello packets, in seconds.
Options: specifies optional functions supported by the OSPF router sending the
Hello packet. Detailed functions are not mentioned in this course.
Rtr Pri: specifies the router priority on the interface sending Hello packets.
This field is used for electing the DR and BDR.
RouterDeadInterval: specifies the time, in seconds, after which a neighbor is
declared down if no Hello packet is received from it. In most cases, the value
of this field is four times HelloInterval.
Designated Router: specifies the interface IP address of the DR, as seen by the
router sending the Hello packet. The value 0.0.0.0 indicates that no DR has
been elected.
Backup Designated Router: specifies the interface IP address of the BDR, as
seen by the router sending the Hello packet. The value 0.0.0.0 indicates that
no BDR has been elected.
Neighbor: lists the router IDs of the neighbors from which the router has
received valid Hello packets.
DD packet

Interface MTU: specifies the maximum IP data packet size that an interface on
the originating router can send without fragmentation. The value of this field
is 0x0000 on a virtual link.
Options: is the same as that of the Hello packet.
I-bit: is set to 1 for the first DD packet in a series of sent DD packets. The
I-bit fields of subsequent DD packets are 0.
M-bit: is set to 1 when the sent DD packet is not the last one. The M-bit field
of the last DD packet is set to 0.
MS-bit: is set to 1 when the sending router declares itself the master router,
and to 0 when it acts as the slave.
DD Sequence Number: specifies the sequence number of the DD packet.
LSA Header: specifies LSA header information.
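As an illustration of the DD fields above, the following Python sketch decodes the fixed part of a DD packet body (the part after the 24-byte OSPF header). The bit positions I = 0x04, M = 0x02, MS = 0x01 come from RFC 2328; the function name parse_dd is hypothetical.

```python
import struct

# Interface MTU (2 bytes), Options (1), flags (1), DD Sequence Number (4)
DD_FIXED = struct.Struct("!HBBI")
FLAG_I, FLAG_M, FLAG_MS = 0x04, 0x02, 0x01


def parse_dd(body: bytes) -> dict:
    mtu, options, flags, seq = DD_FIXED.unpack_from(body)
    return {
        "interface_mtu": mtu,
        "options": options,
        "init": bool(flags & FLAG_I),      # first DD packet of the series
        "more": bool(flags & FLAG_M),      # more DD packets follow
        "master": bool(flags & FLAG_MS),   # sender declares itself master
        "sequence": seq,
        # Any remaining bytes are 20-byte LSA headers.
        "lsa_header_count": (len(body) - DD_FIXED.size) // 20,
    }
```

A first DD packet in the ExStart state typically has all three flags set and carries no LSA headers.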

LSR packet

Link State Advertisement Type: specifies the LSA type, which can be router-LSA,
network-LSA, or other LSA types.
Link State ID: varies depending on LSA types.
Advertising Router: specifies the router ID of the originating router that
advertises LSAs.

LSU packet

Number of LSAs: specifies the number of LSAs in an LSU packet.
LSA: specifies detailed LSA information.

LSAck packet

Header of LSA: specifies LSA header information.

LSA header information contained in all OSPF packets excluding Hello packets

LS age: specifies the age of the LSA, in seconds.
Options: specifies the optional capabilities supported by the part of the OSPF
domain that the LSA describes.
LS type: identifies the format and functions of LSAs. There are five types of
commonly used LSAs.
Link State ID: varies with LSAs.
Advertising Router: specifies the router ID of the router that originated the
LSA.
Sequence Number: increases with the generation of LSA instances. This field
allows other routers to identify the latest LSA instance.
Checksum: indicates the checksum of all information in an LSA except the LS age
field. Because LS age is excluded, the checksum does not need to be recalculated
as the aging time increases.
Length: specifies the length of an LSA, including the LSA header.
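The LSA header described above is 20 bytes long. The Python sketch below decodes it; the function name parse_lsa_header and the short type names are assumptions made for the example, while the field order and sizes follow RFC 2328.

```python
import struct
from ipaddress import IPv4Address

# LS age, Options, LS type, Link State ID, Advertising Router,
# Sequence Number (signed 32-bit), Checksum, Length
LSA_HEADER = struct.Struct("!HBB4s4siHH")
LS_TYPES = {1: "router", 2: "network", 3: "network-summary",
            4: "ASBR-summary", 5: "AS-external", 7: "NSSA"}


def parse_lsa_header(data: bytes) -> dict:
    age, options, ls_type, ls_id, adv, seq, csum, length = \
        LSA_HEADER.unpack_from(data)
    return {
        "age": age,
        "options": options,
        "type": LS_TYPES.get(ls_type, f"type-{ls_type}"),
        "link_state_id": str(IPv4Address(ls_id)),
        "advertising_router": str(IPv4Address(adv)),
        "sequence": seq,
        "checksum": csum,
        "length": length,
    }


def lsa_key(header: dict) -> tuple:
    # LS type, Link State ID, and Advertising Router together identify an LSA.
    return (header["type"], header["link_state_id"],
            header["advertising_router"])
```

Two headers with the same lsa_key describe instances of the same LSA; the one with the larger sequence number is the more recent instance.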

Router-LSA (describing all interfaces or links on the originating router)

Link State ID: specifies the router ID of the originating router.
V: is set to 1 when the originating router is an endpoint of one or more
fully adjacent virtual links.
E: is set to 1 when the originating router is an ASBR.
B: is set to 1 when the originating router is an ABR.
Number of links: specifies the number of router links described in an LSA.
Link Type: indicates the link type. The value of this field can be:
    1: point-to-point link to another router
    2: link to a transit network, such as a broadcast or NBMA network
    3: link to a stub network, such as a loopback interface
    4: virtual link
Link ID: specifies the link ID. The value depends on Link Type:
    1: neighbor router ID
    2: IP address of the interface on the DR
    3: IP network or subnet address
    4: neighbor router ID
Link Data: indicates more information about a link. When the value of Link Type
is 1 or 2, this field specifies the IP address of the interface on the
originating router connected to the network. When the value of Link Type is 3,
this field specifies the subnet mask of the network.
ToS: is not supported.
Metric: specifies the metric of a link or interface.
Network-LSA

Link State ID: specifies the IP address of the interface on the DR.
Network Mask: specifies the subnet mask used on the network.
Attached router: lists the router IDs of the DR and of all routers that have
set up adjacency relationships with the DR on the broadcast or NBMA network.

Network-summary-LSA and ASBR-summary-LSA

Link State ID: specifies the IP address of the network or subnet in a Type 3
LSA. In a Type 4 LSA, this field specifies the router ID of the ASBR.
Network Mask: specifies the subnet mask of the network in a Type 3 LSA. In a
Type 4 LSA, this field has no meaning and is set to 0.0.0.0.
Metric: specifies the metric of a route to the destination.

AS-external-LSA

Link State ID: indicates the IP address of the advertised network or subnet.
Network Mask: specifies the subnet mask of the advertised destination.
E: specifies the type of the external route. The value 1 indicates the E2
metric, and the value 0 indicates the E1 metric.
Metric: specifies the metric of a route and is set by an ASBR.
Forwarding Address: specifies the forwarding address (FA) of a packet destined
for a specific destination address. When this field is set to 0.0.0.0, the
packet is forwarded to the originating router.
External Route Tag: identifies an external route.
NSSA LSA

Forwarding Address: If the network between the NSSA ASBR and the neighboring AS
is advertised as an internal route, this field is set to the next-hop address
on that network. If that network is not advertised as an internal route, this
field is set to the interface IP address of a stub network on the ASBR, such as
a loopback interface. If multiple stub networks exist, the largest IP address
is used.


Options field:

DN: prevents loops on an MPLS VPN network. When a Type 3, 5, or 7 LSA is sent
from a PE to a CE, the DN bit MUST be set. When the PE receives, from a CE
router, a Type 3, 5, or 7 LSA with the DN bit set, the information from that
LSA MUST NOT be used during the OSPF route calculation.
O: indicates that the originating router supports Opaque LSAs (Type 9, 10, and
11 LSAs).
DC-bit: indicates that the originating router supports OSPF over on-demand
links.
EA: indicates that the originating router can receive and forward
External-Attributes-LSAs (Type 8 LSAs).
N-bit: exists only in Hello packets. The value 1 indicates that the router
supports Type 7 LSAs. The value 0 indicates that the router does not receive or
send NSSA LSAs.
P-bit: exists only in NSSA LSAs. This field instructs an NSSA ABR to convert
the Type 7 LSA into a Type 5 LSA.
MC-bit: is set when the originating router supports multicast.
E-bit: indicates that the originating router can receive AS external LSAs. This
field is set to 1 in all Type 5 LSAs and LSAs that are sent from the backbone
area and NSSA areas. This field is set to 0 in LSAs that are sent from stub
areas. In a Hello packet, this field indicates that the interface can receive
and send Type 5 LSAs.
MT-bit: indicates that the originating router supports multi-topology OSPF.
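The Options field is a single byte, so the bits above can be decoded with simple masks. The sketch below assumes the bit positions used in the OSPF RFCs (DN 0x80, O 0x40, DC 0x20, EA 0x10, N/P 0x08, MC 0x04, E 0x02, MT 0x01); these positions are not stated in the text above, and the function name decode_options is hypothetical.

```python
OPTION_BITS = {
    "DN": 0x80,   # loop prevention between PE and CE
    "O": 0x40,    # Opaque LSA support
    "DC": 0x20,   # on-demand link support
    "EA": 0x10,   # External-Attributes-LSA support
    "N/P": 0x08,  # N-bit in Hello packets, P-bit in NSSA LSAs
    "MC": 0x04,   # multicast support
    "E": 0x02,    # AS external LSA capability
    "MT": 0x01,   # lowest-order bit
}


def decode_options(value: int) -> list:
    """Return the names of the option bits set in an Options byte."""
    if not 0 <= value <= 0xFF:
        raise ValueError("Options is a single byte")
    return [name for name, mask in OPTION_BITS.items() if value & mask]
```

For example, a Hello packet sent on an interface in an ordinary (non-stub) area would normally decode to a list that contains "E".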

Neighbor status:

Down: This is the initial stage of setting up sessions between neighbors. In
this state, a router receives no message from its neighbor.
Init: A router has received Hello packets from its neighbor but is not in the
neighbor list of the received Hello packets, so bidirectional communication
with the neighbor has not been established. In this state, the router lists the
neighbor in the Hello packets that it sends.
2-Way: In this state, bidirectional communication has been established but the
router has not established the adjacency relationship with the neighbor. This
is the highest state before the adjacency relationship is established. When
routers are located on a broadcast or NBMA network, the routers elect the
DR/BDR.

Before a neighbor relationship is established, routers check the parameters
carried in Hello packets:
If the network type of the interface receiving Hello packets is broadcast,
P2MP, or NBMA, the Network Mask field in Hello packets must be the same as the
network mask of the interface receiving the Hello packets. If the network type
of the interface is P2P or virtual link, the Network Mask field is not checked.
The HelloInterval and RouterDeadInterval fields in a Hello packet must be the
same as those on the interface receiving the Hello packet.
The Authentication field in a Hello packet must be the same as that on the
interface receiving the Hello packet.
The E-bit option in a Hello packet must match the area configuration (stub or
non-stub) of the interface receiving the Hello packet.
The Area ID field in a Hello packet must be the same as that on the interface
receiving the Hello packet.
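The Hello checks listed above can be expressed as a small comparison function. This Python sketch is illustrative only: the HelloParams class, its field names, and the network type strings are hypothetical, and real implementations perform these checks on the decoded packet and interface data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelloParams:
    area_id: str
    network_mask: str
    hello_interval: int
    dead_interval: int
    auth: tuple        # (AuType, key), compared as a whole
    e_bit: bool        # True in non-stub areas


# Network types on which the Network Mask field is not checked.
MASK_EXEMPT = {"p2p", "virtual-link"}


def hello_mismatches(local: HelloParams, received: HelloParams,
                     network_type: str) -> list:
    """Return the names of the parameters that prevent a neighbor relationship."""
    problems = []
    if network_type not in MASK_EXEMPT and \
            local.network_mask != received.network_mask:
        problems.append("network mask")
    if local.hello_interval != received.hello_interval:
        problems.append("HelloInterval")
    if local.dead_interval != received.dead_interval:
        problems.append("RouterDeadInterval")
    if local.auth != received.auth:
        problems.append("authentication")
    if local.e_bit != received.e_bit:
        problems.append("E-bit")
    if local.area_id != received.area_id:
        problems.append("area ID")
    return problems
```

An empty result means the Hello packet is acceptable; any entry in the list is a reason to discard it.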

Neighbor relationship setup:

When the neighbor state machine is ExStart on R1, R1 sends the first DD packet
to R2. Assume that the fields in this DD packet are set as follows:
    DD Sequence Number is set to 552A.
    I-bit is set to 1, indicating that the DD packet is the first DD packet.
    M-bit is set to 1, indicating that more DD packets are to be sent.
    MS-bit is set to 1, indicating that R1 declares itself the master router.
When the neighbor state machine is ExStart on R2, R2 sends its first DD packet
to R1, with DD Sequence Number set to 5528. The router ID of R2 is larger than
that of R1; therefore, R2 functions as the master router. After the comparison
of router IDs is complete, R1 generates a NegotiationDone event and changes its
neighbor state machine from ExStart to Exchange.
When the neighbor state machine is Exchange on R1, R1 sends a new DD packet
describing the local LSDB. In the DD packet, DD Sequence Number is set to the
sequence number of the DD packet sent by R2, M-bit is set to 0 indicating that
no other DD packet is required for describing the local LSDB, and MS-bit is set
to 0 indicating that R1 acts as the slave router. After receiving the DD
packet, R2 generates a NegotiationDone event and changes its neighbor state
machine to Exchange.
When the neighbor state machine is Exchange on R2, R2 sends a new DD packet
describing the local LSDB. In this DD packet, DD Sequence Number is increased
by 1 (5528 + 1 = 5529).
R1, as the slave router, needs to acknowledge each DD packet from R2 even
though R1 has no further LSDB content to describe. R1 sends an empty DD packet
with DD Sequence Number 5529.
When the neighbor state machine is Loading on R1, R1 sends a Link State Request
(LSR) packet to request link state information that was learned from DD packets
in the Exchange state but is not contained in the local LSDB.
After receiving the LSR packet, R2 sends a Link State Update (LSU) packet
containing detailed link state information to R1. When receiving the LSU
packet, R1 changes its neighbor state machine from Loading to Full.
R1 then sends a Link State Acknowledgement (LSAck) packet to R2 to ensure
information transmission reliability. LSAck packets are sent to acknowledge the
receipt of LSAs.
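The master/slave negotiation and the sequence number handling in the walkthrough above can be sketched as follows. The router IDs 1.1.1.1 and 2.2.2.2 used in the usage note are assumed values for R1 and R2 (the text only says that R2 has the larger router ID), and the function names are hypothetical.

```python
from ipaddress import IPv4Address


def negotiate_master(local_id: str, local_seq: int,
                     peer_id: str, peer_seq: int) -> tuple:
    """Return (master router ID, initial DD sequence number).

    The router with the numerically larger router ID becomes the master,
    and the master's sequence number is used for the exchange.
    """
    if int(IPv4Address(local_id)) > int(IPv4Address(peer_id)):
        return local_id, local_seq
    return peer_id, peer_seq


def slave_reply_seq(master_seq: int) -> int:
    # The slave acknowledges by echoing the master's sequence number.
    return master_seq


def master_next_seq(current_seq: int) -> int:
    # The master increments the sequence number for each new DD packet.
    return (current_seq + 1) & 0xFFFFFFFF
```

With R1 as 1.1.1.1 (sequence 0x552A) and R2 as 2.2.2.2 (sequence 0x5528), negotiate_master selects R2 and 0x5528; R2's next DD packet then carries 0x5529, which R1 echoes in its empty DD packet.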


OSPF can define areas as stub and totally stub areas. A stub area is a special
area into which ABRs do not flood the AS external routes they receive. Routers
in a stub area therefore maintain fewer routing entries and transmit less
routing information. The stub area is an optional configuration, but not all
areas can be configured as stub areas. Generally, a stub area is a non-backbone
area with only one ABR and is located at the AS boundary. To ensure the
reachability of AS external routes, the ABR in a stub area generates a Type 3
LSA carrying a default route and advertises it within the entire stub area.

Stub area

The backbone area cannot be configured as a stub area.
If an area needs to be configured as a stub area, all the routers in this area
must be configured with stub attributes.
An ASBR cannot exist in a stub area. That is, AS external routes are not
flooded in the stub area.
A virtual link cannot pass through a stub area.
Type 5 LSAs cannot be advertised within a stub area.
A router in the stub area reaches AS external networks through the ABR. The ABR
automatically generates a Type 3 LSA carrying a default route and advertises it
within the entire stub area. The router can then reach the AS external network
through the ABR.

Totally stub area

Neither Type 3 nor Type 5 LSAs can be advertised within a totally stub area.
A router in the totally stub area reaches AS external and inter-area networks
through an ABR.
The ABR automatically generates a Type 3 LSA carrying a default route and
advertises it within the entire totally stub area.

To prevent a large number of external routes from consuming the bandwidth and
storage resources of routers in a stub area, OSPF defines that stub areas
cannot import external routes. However, stub areas cannot meet the requirements
of scenarios that need to import external routes while still preventing
resources from being consumed by external routes. Therefore, NSSA areas are
introduced.

Type 7 LSA

Type 7 LSAs are defined in an NSSA area to describe AS external routes.
Type 7 LSAs are generated by an ASBR in an NSSA area and advertised only within
the NSSA area of this ASBR.
When receiving Type 7 LSAs, an ABR in an NSSA area selectively translates the
Type 7 LSAs to Type 5 LSAs so that external routes can be advertised in other
areas of the OSPF network.
Type 7 LSAs can be used to carry default route information to guide traffic to
other ASs.

To advertise the external routes imported by an NSSA area to other areas, an
ABR in the NSSA area needs to translate Type 7 LSAs to Type 5 LSAs so that the
external routes can be advertised on the entire OSPF network.
The P-bit informs routers whether Type 7 LSAs need to be translated.
The ABR with the largest router ID in an NSSA area translates Type 7 LSAs to
Type 5 LSAs.
A Type 7 LSA can be translated to a Type 5 LSA only when the P-bit is set and
Forwarding Address is not 0. The Forwarding Address identifies the address
inside the OSPF domain to which traffic for the external route is sent.
The default Type 7 LSAs meeting the preceding conditions can also be
translated.
The Type 7 LSAs generated by ABRs are not set with the P-bit.
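The two translation rules above (which ABR translates, and which Type 7 LSAs qualify) can be sketched in Python. The function names are hypothetical, and the sketch covers only the conditions named in the text.

```python
from ipaddress import IPv4Address


def elect_translator(abr_router_ids: list) -> str:
    """Pick the translating ABR: the one with the largest router ID.

    Router IDs are compared numerically, not as strings.
    """
    return max(abr_router_ids, key=lambda rid: int(IPv4Address(rid)))


def is_translatable(p_bit: bool, forwarding_address: str) -> bool:
    """A Type 7 LSA qualifies only if the P-bit is set and the FA is non-zero."""
    return p_bit and int(IPv4Address(forwarding_address)) != 0
```

Comparing router IDs numerically matters: among 1.1.1.1, 10.0.0.1, and 9.9.9.9, the translator is 10.0.0.1, although a plain string comparison would pick 9.9.9.9.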
Precautions

Multiple ABRs may be deployed in an NSSA area. To prevent routing loops, ABRs
do not calculate the default routes advertised by each other.
NSSA and totally NSSA

A small number of AS external routes learned from the ASBR in an NSSA area can
be imported to the NSSA area. Type 5 LSAs cannot be advertised within the NSSA
area, but routers can learn the AS external routes from the ASBR.
Neither Type 3 nor Type 5 LSAs can be advertised within a totally NSSA.
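The LSA restrictions for the special area types can be summarized as a lookup table. The Python sketch below is an interpretation of the rules in this section; the area type names are hypothetical labels, and the exclusion of Type 4 LSAs from stub and NSSA areas is general OSPF behavior rather than something stated in the text.

```python
# LSA types that may be flooded inside each kind of area.
ALLOWED_LSA_TYPES = {
    "normal": {1, 2, 3, 4, 5},
    "stub": {1, 2, 3},
    "totally-stub": {1, 2},
    "nssa": {1, 2, 3, 7},
    "totally-nssa": {1, 2, 7},
}


def lsa_allowed(area_type: str, lsa_type: int,
                is_default_summary: bool = False) -> bool:
    """Return True if an LSA of the given type may exist in the area.

    The default-route Type 3 LSA generated by the ABR is always allowed,
    even in totally stub and totally NSSA areas.
    """
    if lsa_type == 3 and is_default_summary:
        return True
    return lsa_type in ALLOWED_LSA_TYPES[area_type]
```

The single exception for the default Type 3 LSA is what keeps external and inter-area destinations reachable from the totally stub and totally NSSA areas.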

Fast convergence

Incremental SPF (I-SPF) improves on the SPF algorithm. Except for the first
calculation, only changed nodes, as opposed to all nodes, are involved in the
calculation. The SPT ultimately generated is the same as that generated by the
full SPF algorithm. This decreases the CPU usage and speeds up network
convergence.
Similar to I-SPF, Partial Route Calculation (PRC) calculates only the changed
routes. PRC, however, does not calculate the shortest path. PRC updates routes
based on the SPT calculated by I-SPF. In route calculation, a leaf represents a
route, and a node represents a router. Either an SPT change or a leaf change
causes routing information to change, but the two kinds of change are
independent of each other. PRC processes routing information based on the SPT
or leaf changes:
    When the SPT is changed, PRC processes routing information on all leaves of
    the changed nodes.
    When the SPT is not changed, PRC does not process routing information on
    nodes.
    When a leaf is changed, PRC processes routing information on the changed
    leaf.
    When the leaf is not changed, PRC does not process routing information on
    the leaf.

The OSPF intelligent timer controls route calculation, LSA generation, and LSA
receiving to speed up network convergence. The OSPF intelligent timer speeds up
network convergence in the following modes:
    On a network where routes are frequently calculated, the OSPF intelligent
    timer dynamically adjusts the interval for calculating routes based on the
    user configuration and exponential backoff technology. In this manner, the
    number of route calculations and the CPU resource consumption are
    decreased. Routes are calculated after the network topology becomes stable.
    On an unstable network, if a router generates or receives LSAs due to
    frequent topology changes, the OSPF intelligent timer can dynamically
    adjust the corresponding interval. No LSA is generated or handled within
    that interval, which prevents invalid LSAs from being generated and
    advertised on the entire network.
The OSPF intelligent timer helps calculate routes as follows:
    Based on the local LSDB, a router that runs OSPF calculates the SPT with
    itself as the root using the SPF algorithm, and determines the next hop to
    the destination network according to the SPT. Changing the interval for SPF
    calculation can prevent the bandwidth and resource consumption caused by
    frequent LSDB changes.
    On a network that requires short route convergence time, specify the
    interval for route calculation in milliseconds to increase the route
    calculation frequency and speed up route convergence.
    When the OSPF LSDB changes, the shortest path needs to be recalculated. If
    a network changes frequently and the shortest path is calculated
    continually, a large number of system resources will be consumed, affecting
    router performance. You can configure an intelligent timer and set a proper
    interval for SPF calculation to prevent memory and bandwidth resources from
    being consumed.

After the OSPF intelligent timer is used:
    1. The initial interval for SPF calculation is specified by the parameter
       start-interval.
    2. The interval for SPF calculation for the nth (n is larger than or equal
       to 2) time is equal to hold-interval x 2 x (n - 1).
    3. When the interval specified by hold-interval x 2 x (n - 1) reaches the
       maximum interval specified by max-interval, OSPF performs SPF
       calculation at the maximum interval for three consecutive times. Then
       step 1 is performed again, and SPF calculation restarts at the initial
       interval specified by start-interval.
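The three timer rules above can be turned into a short Python sketch. It takes the growth formula as hold-interval x 2 x (n - 1), which is how the text above appears to read; the surrounding text also mentions exponential backoff, so the exact formula should be checked against the product documentation for your software version. The function names and the sample values are hypothetical.

```python
def spf_interval(n: int, start: int, hold: int, maximum: int) -> int:
    """Interval before the n-th SPF calculation (n starts at 1)."""
    if n < 1:
        raise ValueError("n starts at 1")
    if n == 1:
        return start                          # rule 1: start-interval
    return min(hold * 2 * (n - 1), maximum)   # rule 2, capped by max-interval


def spf_schedule(start: int, hold: int, maximum: int, runs: int) -> list:
    """Successive SPF intervals, including the reset described in rule 3."""
    intervals = []
    n, runs_at_max = 1, 0
    while len(intervals) < runs:
        wait = spf_interval(n, start, hold, maximum)
        intervals.append(wait)
        if n >= 2 and wait == maximum:
            runs_at_max += 1
        if runs_at_max == 3:
            # Three consecutive calculations at max-interval: back to step 1.
            n, runs_at_max = 1, 0
        else:
            n += 1
    return intervals
```

With start-interval 50, hold-interval 200, and max-interval 1000 (all in milliseconds), the sketch yields 50, 400, 800, then 1000 three times, and then starts again from 50.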

Priority-based convergence

Filter routes based on the IP prefix list. Set different priorities for the
routes so that routes with the highest priority are converged first, improving
network reliability.

Setting the maximum number of non-default external routes on a router can
prevent an OSPF database overflow. You must set the same maximum number of
non-default external routes for all routers on an OSPF network. If the number
of external routes on a router reaches the configured maximum number, the
router enters the overflow state and starts the overflow timer. The router
automatically leaves the overflow state after the overflow timer expires. The
default timeout period is 5 seconds.

The OSPF database overflow process is as follows:

When entering the overflow state, a router deletes all non-default external
routes that are generated by itself.
When staying in the overflow state, the router does not generate non-default
external routes, discards newly received non-default external routes, and does
not reply with an LSAck packet. When the overflow timer expires, the router
checks whether the number of external routes still exceeds the maximum value.
If so, the router restarts the timer; if not, the router leaves the overflow
state.
When leaving the overflow state, the router deletes the overflow timer,
generates non-default external routes, receives new non-default external
routes, replies with LSAck packets, and gets ready to enter the overflow state
again.



During OSPF deployment, all non-backbone areas must be connected to the
backbone area to ensure that all areas are reachable.
Two ABRs use a virtual link to directly transmit OSPF packets. The routers
between the two ABRs only forward packets. Because the destination of the OSPF
packets is not these routers, the routers transparently forward the OSPF
packets as common IP packets.
If a virtual link is not properly deployed, a loop may occur.



When both authentication types (area authentication and interface
authentication) are configured, interface authentication takes precedence.

The OSPF default route is generally applied to the following scenarios:

An ABR in an area advertises Type 3 LSAs carrying the default route within the
area. Routers in the area use the received default route to forward inter-area
packets.
An ASBR in an area advertises Type 5 or Type 7 LSAs carrying the default route
within the AS. Routers in the AS use the received default route to forward AS
external packets.

Precautions

When no exactly matched route is discovered, a router can forward packets
through the default route. Due to hierarchical management of OSPF routes, the
priority of default Type 3 routes is higher than the priority of default Type 5
or Type 7 routes.
If an OSPF router has advertised LSAs carrying a default route, the router does
not learn default routes carried in LSAs of the same type advertised by other
routers. That is, the router uses only the LSAs advertised by itself to
calculate routes. The LSAs advertised by others are still saved in the LSDB.
If a router depends on a route in order to advertise LSAs carrying an external
default route, that route cannot be a route learned by the local OSPF process.
This is because routers use external default routes to forward packets out of
the AS, whereas routes learned within the AS have next hops pointing to devices
within the AS.

Principles for advertising default routes in different areas

Common area
    By default, OSPF routers in a common OSPF area do not automatically
    generate default routes, even if the common OSPF area has default routes.
NSSA area
    To advertise some AS external routes using the ASBR in an NSSA area and
    advertise other external routes through other areas, configure a default
    Type 7 LSA on the ABR and advertise this LSA in the entire NSSA area. In
    this way, a small number of AS external routes can be learned from the ASBR
    in the NSSA area, and other routes can be learned from the ABR in the NSSA
    area.
    To advertise all the external routes using the ASBR in the NSSA area,
    configure a default Type 7 LSA on the ASBR and advertise this LSA in the
    entire NSSA area. In this way, all the external routes are advertised using
    the ASBR in the NSSA area.
    The preceding configurations are performed using the same command in
    different views. The two configurations differ as follows:
        An ABR generates a default Type 7 LSA regardless of whether the routing
        table contains the default route 0.0.0.0.
        An ASBR generates a default Type 7 LSA only when the routing table
        contains the default route 0.0.0.0.
    An ABR does not translate Type 7 LSAs carrying a default route into Type 5
    LSAs carrying a default route, and does not flood them to the entire AS.
Totally NSSA area
    All routers in the totally NSSA area must learn AS external routes from the
    ABR.
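The difference between the ABR and the ASBR when generating a default Type 7 LSA in an NSSA area can be captured in one small function. The sketch assumes that default-route advertisement has already been configured on the router; the role strings and the function name are hypothetical.

```python
def originates_default_type7(role: str, has_default_route: bool) -> bool:
    """Decide whether a router in an NSSA area generates a default Type 7 LSA.

    role: "ABR" or "ASBR" (assumes default-route advertisement is configured).
    has_default_route: True if the routing table contains 0.0.0.0.
    """
    if role == "ABR":
        # The ABR generates the LSA regardless of the routing table content.
        return True
    if role == "ASBR":
        # The ASBR generates the LSA only if it has a default route itself.
        return has_default_route
    raise ValueError(f"unexpected role: {role}")
```

The asymmetry reflects where traffic goes next: the ABR can always hand traffic to the backbone, whereas an ASBR without its own default route would have nowhere to forward it.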



Route filtering

LSAs are not filtered during route learning. Route filtering only determines
whether calculated routes are added to the routing table; the learned LSAs
remain complete in the LSDB.

Precautions

Stub areas and database overflow can also implement the LSA filtering function.



This figure shows the process of establishing the neighbor relationship and the
neighbor status changes.

Down: This is the initial stage of setting up sessions between neighbors. In
this state, a router receives no message from its neighbor. On an NBMA network,
the router can still send Hello packets to statically configured neighbors.
PollInterval specifies the interval for sending these Hello packets, and its
value is usually the same as the value of RouterDeadInterval.
Attempt: This state exists only on the NBMA network and indicates that the
router receives no message from the neighbor. In this state, the router
periodically sends packets to the neighbor at an interval of HelloInterval. If
the router receives no Hello packets from the neighbor within
RouterDeadInterval, the state changes to Down.
Init: A router has received Hello packets from its neighbor but is not in the
neighbor list of the received Hello packets, so bidirectional communication
with the neighbor has not been established. In this state, the router lists the
neighbor in the Hello packets that it sends.
2-WayReceived: This event indicates that the router knows that bidirectional
communication with the neighbor has started, that is, the router is in the
neighbor list of Hello packets received from the neighbor. If the router needs
to establish the adjacency relationship with the neighbor, the router enters
the ExStart state and starts database synchronization. If the router does not
establish the adjacency relationship with the neighbor, the router enters the
2-Way state.
2-Way: In this state, bidirectional communication has been established but the
router has not established the adjacency relationship with the neighbor. This
is the highest state before the adjacency relationship is established.
1-WayReceived: This event indicates that the router knows that it is not in the
neighbor list of Hello packets received from the neighbor. This is typically
caused by a restart of the neighbor.


The states in the figure are described as follows:

ExStart: This is the first step for establishing the adjacency relationship. In
this state, the router starts to send DD packets to the neighbor. The two
neighbors negotiate the master/slave status and determine the initial sequence
number of DD packets. DD packets transmitted in this state do not describe the
local LSDB.
Exchange: The router exchanges DD packets describing the local LSDB with its
neighbor.
Loading: The router exchanges LSR packets with the neighbor for requesting LSAs
and exchanges LSU packets for advertising LSAs.
Full: The local LSDBs on the two routers have been synchronized.
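The neighbor states and the events that move between them can be sketched as a small transition function. This is a simplified model: the event names HelloReceived, ExchangeDone, and LoadingDone are taken from RFC 2328 rather than from the text above, timers and error events are omitted, and the function name next_state is hypothetical.

```python
from enum import IntEnum


class State(IntEnum):
    DOWN = 0
    ATTEMPT = 1
    INIT = 2
    TWO_WAY = 3
    EXSTART = 4
    EXCHANGE = 5
    LOADING = 6
    FULL = 7


def next_state(state: State, event: str, adjacency_needed: bool = True,
               requests_pending: bool = True) -> State:
    """Return the neighbor state after an event (unknown events are ignored)."""
    if event == "HelloReceived" and state in (State.DOWN, State.ATTEMPT):
        return State.INIT
    if event == "2-WayReceived" and state is State.INIT:
        # DROthers on a broadcast or NBMA network stay in 2-Way with each other.
        return State.EXSTART if adjacency_needed else State.TWO_WAY
    if event == "1-WayReceived" and state >= State.TWO_WAY:
        return State.INIT
    if event == "NegotiationDone" and state is State.EXSTART:
        return State.EXCHANGE
    if event == "ExchangeDone" and state is State.EXCHANGE:
        # If nothing needs to be requested, the Loading state is skipped.
        return State.LOADING if requests_pending else State.FULL
    if event == "LoadingDone" and state is State.LOADING:
        return State.FULL
    return state
```

Feeding the events HelloReceived, 2-WayReceived, NegotiationDone, ExchangeDone, and LoadingDone in order walks a neighbor from Down to Full.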

OSPF supports P2P, P2MP, NBMA, and broadcast networks. IS-IS supports only P2P
and broadcast networks.
OSPF works only at the network layer, and the protocol number is 89.

When an OSPF neighbor relationship is established, the two routers check the
mask, authentication mode, Hello/dead interval, and area ID in Hello packets.
The conditions for establishing an IS-IS neighbor relationship are relatively
loose.
Establishing a neighbor relationship over an OSPF P2P link requires a three-way
handshake. Establishing an IS-IS neighbor relationship over a P2P link does not
strictly require a three-way handshake. However, Huawei devices enable the
three-way handshake function on an IS-IS P2P network by default, which ensures
reliability when establishing the neighbor relationship.
An IS-IS neighbor relationship can be Level-1 or Level-2.
The election of an OSPF DR/BDR is based on the priority and router ID. The
elected DR/BDR cannot be preempted. On an OSPF network, all DRothers establish
full adjacency relationships with the DR and BDR, and stay in the 2-Way state
with each other. When the priority of a router on the OSPF network is 0, the
router does not participate in the DR/BDR election.
The election of an IS-IS DIS is based on the priority and MAC address. The
elected DIS can be preempted. On an IS-IS network, all routers establish
adjacency relationships with each other. If the priority of a router on the
IS-IS network is 0, the router can still participate in the DIS election; it
simply has a lower priority.

IS-IS supports only a few types of LSPs but provides good extension
capabilities through the TLV fields contained in LSPs.

OSPF costs are calculated based on bandwidth. IS-IS defines the default cost,
delay cost, overhead cost, and error cost. In practice, IS-IS implementations
use the default cost.



Case Description

The NBMA network topology is displayed in this case. Other devices are
connected based on the following rule:
    If RX is interconnected with RY, their interconnection addresses are
    XY.1.1.X and XY.1.1.Y respectively, and the network mask is 24 bits.
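The addressing rule above is easy to compute. The following Python sketch derives the two interconnection addresses for a pair of routers. It assumes single-digit router numbers and that the smaller number comes first in the XY prefix (so R1 and R2 share 12.1.1.0/24); neither assumption is stated in the case, and the function name is hypothetical.

```python
from ipaddress import IPv4Interface


def interconnect_addresses(x: int, y: int) -> tuple:
    """Return the interface addresses of RX and RY on their shared link."""
    if not (1 <= x <= 9 and 1 <= y <= 9) or x == y:
        raise ValueError("router numbers must be distinct digits 1-9")
    low, high = sorted((x, y))
    first_octet = low * 10 + high          # "XY" written as one number
    addr_x = IPv4Interface(f"{first_octet}.1.1.{x}/24")
    addr_y = IPv4Interface(f"{first_octet}.1.1.{y}/24")
    return str(addr_x), str(addr_y)
```

For R1 and R2 this gives 12.1.1.1/24 and 12.1.1.2/24.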



Command Usage

The peer command sets the IP address and DR priority of the neighboring router
on an NBMA network. On an NBMA network, a router cannot discover neighboring
routers by broadcasting Hello packets. You must manually specify the IP
addresses and DR priorities of neighboring routers.

View

OSPF view

Parameters

peer ip-address [ dr-priority priority ]
    ip-address: specifies the IP address of a neighboring router.
    dr-priority priority: specifies the priority of the neighbor in the DR
    election.

Precautions

In the routing table on R3, a routing entry for the IP address 12.1.1.2/32
exists. This is caused by the PPP echo function. When this function is
disabled, the routing entry for this 32-bit IP address does not exist.



Case Description

The network topology in this case is the same as the previous topology. Area 3
is not directly connected to Area 0, and therefore cannot communicate with
other areas.

Command Usage
The vlink-peer command creates and configures a virtual link.

View
OSPF area view

Parameters
vlink-peer router-id
  router-id: specifies the router ID of the virtual link neighbor.

Configuration Verification
Run the display ospf vlink command to view information about the OSPF virtual link.

Remarks
A virtual link needs to be configured for R4.



Case Description
The network topology in this case is the same as the previous topology. Company A requires control over which router becomes the DR. To meet this requirement, change the DR priorities of the routers. Note that the DR/BDR role cannot be preempted.

Command Usage
The ospf dr-priority command sets the priority of an interface that participates in the DR election.

View
Interface view

Parameters
ospf dr-priority priority
  priority: specifies the priority of an interface that participates in the DR/BDR election. A larger value indicates a higher priority.

Precautions
If the DR priority of an interface on a router is 0, the router cannot be elected as a DR or a BDR. In OSPF, the DR priority cannot be configured for null interfaces. Note that an existing DR/BDR is not preempted even if the DR priority is changed.

Configuration Verification
Run the display ospf peer command to view information about neighbors in OSPF areas.
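The priority rules above can be illustrated with a simplified Python sketch. It is not the full OSPF election procedure (which elects the BDR first and tracks neighbor state); it only models three points: priority 0 excludes a router, a larger priority wins with the higher router ID as the standard OSPF tie-breaker, and an existing DR is not preempted. The names `Candidate` and `elect_dr` are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    router_id: str   # dotted-decimal router ID
    priority: int    # DR priority configured on the interface


def elect_dr(candidates: Iterable[Candidate],
             current_dr: Optional[Candidate] = None) -> Optional[Candidate]:
    """Pick a DR from the candidates (simplified model, see lead-in)."""
    eligible = [c for c in candidates if c.priority > 0]
    # No preemption: a working DR keeps its role even if a router
    # with a higher priority appears later.
    if current_dr is not None and current_dr in eligible:
        return current_dr
    if not eligible:
        return None
    # Larger priority wins; the higher router ID breaks ties.
    return max(eligible,
               key=lambda c: (c.priority, int(IPv4Address(c.router_id))))


if __name__ == "__main__":
    r1 = Candidate("1.1.1.1", 1)
    r2 = Candidate("2.2.2.2", 100)
    print(elect_dr([r1, r2]))                 # fresh election
    print(elect_dr([r1, r2], current_dr=r1))  # r1 is already the DR
```

The second call shows why changing a priority alone does not move the DR role: the election has to be triggered again before the new priority takes effect.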

Case Description
The network topology in this case is the same as the previous topology. This case covers a network extension requirement. On an OSPF FR network, the default interval for sending Hello packets is 30 seconds, and the default dead interval is 120 seconds. When the neighbor relationship becomes invalid, Hello packets are sent at the poll interval, which is 120 seconds by default.



Command Usage
The ospf timer hello command sets the interval for sending Hello packets on an interface.
The ospf timer poll command sets the poll interval for sending Hello packets on an NBMA network.

View
ospf timer hello: interface view
ospf timer poll: interface view

Parameters
ospf timer hello interval
  interval: specifies the interval for sending Hello packets on an interface.
ospf timer poll interval
  interval: specifies the poll interval for sending Hello packets.

Precautions
By default, the interval for sending Hello packets is 10 seconds on P2P and broadcast interfaces and 30 seconds on P2MP and NBMA interfaces. Ensure that the timers are set to the same values on the local interface and on the remote interface of the neighboring router.
On an NBMA network, after a neighbor relationship becomes invalid, the router sends Hello packets periodically at the interval specified using the ospf timer poll command. The poll interval must be at least four times the interval for sending Hello packets.

Remarks
Perform the same interface configuration on R4 as that on R2 and R3.
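The relationship between the two timers can be checked with a small Python helper before a configuration is applied. This is a planning aid only; the function name `poll_interval_is_valid` is hypothetical, and the device itself enforces the real rule.

```python
def poll_interval_is_valid(hello: int, poll: int) -> bool:
    """Return True if the poll interval is at least four times the Hello interval."""
    if hello <= 0 or poll <= 0:
        raise ValueError("intervals must be positive numbers of seconds")
    return poll >= 4 * hello


if __name__ == "__main__":
    # NBMA Hello interval of 30 seconds with a 120-second poll interval.
    print(poll_interval_is_valid(hello=30, poll=120))
```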


Case Description
This case is an extension of the original case. Perform the configuration on the basis of the original case. Imported routes are advertised as Type 2 external (E2) routes by default, and the default cost value is 1.

Command Usage
The import-route command imports routes learned by other routing protocols.
The ospf cost command sets the cost of an OSPF-enabled interface.

View
import-route: OSPF view
ospf cost: interface view

Parameters
import-route [ cost cost | type type ]
  cost cost: specifies the cost of the imported routes.
  type type: specifies the external route type (Type 1 or Type 2).
ospf cost cost
  cost: specifies the cost of an OSPF-enabled interface.

Precautions
On a non-PE device, only EBGP routes are imported after the import-route bgp command is configured. IBGP routes are also imported after the import-route bgp permit-ibgp command is configured. If IBGP routes are imported, routing loops may occur. In this case, run the preference (OSPF) and preference (BGP) commands to set the priority of OSPF ASE routes lower than that of IBGP routes.



Case Description
This case is an extension of the original case. Perform the configuration on the basis of the original case. If R6 should not receive routes to network 172.16.X.0/24, filter Type 3 LSAs on R5.

Command Usage
The filter-policy export command configures a filtering policy that filters imported routes when they are advertised in Type 5 LSAs within the AS. This command can be configured only on an ASBR to filter Type 5 LSAs.
The filter-policy import command configures a filtering policy that filters intra-area, inter-area, and AS external routes received by OSPF. On routers within an area, this command can be used to filter only routes; on an ABR, this command can be used to filter Type 3 LSAs.

View
filter-policy export: OSPF view
filter-policy import: OSPF view

Parameters
filter-policy { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } export [ protocol [ process-id ] ]
  acl-number: specifies the basic ACL number.
  acl-name acl-name: specifies the ACL name.
  ip-prefix ip-prefix-name: specifies the name of an IP prefix list.
  protocol: specifies the protocol for advertising routing information.
  process-id: specifies the process ID when RIP, IS-IS, or OSPF is used for advertising routing information.
filter-policy { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } import
  acl-number: specifies the basic ACL number.
  acl-name acl-name: specifies the ACL name.
  ip-prefix ip-prefix-name: specifies the name of an IP prefix list.

Precautions
Type 5 LSAs are generated on an ASBR to describe AS external routes and are advertised to all areas (excluding stub and NSSA areas). The filter-policy export command must be configured on an ASBR. To advertise only routing information that meets specific conditions, run the filter-policy command to set filtering conditions.


Case Description
This case is an extension of the original case. Perform the configuration on the basis of the original case. Configure Area 1 as an NSSA area.

Command Usage
The nssa command configures an OSPF area as an NSSA area.

View
OSPF area view

Parameters
nssa [ default-route-advertise | flush-waiting-timer interval-value | no-import-route | no-summary | set-n-bit | suppress-forwarding-address | translator-always | translator-interval interval-value | zero-address-forwarding ] *
  default-route-advertise: generates default Type 7 LSAs on an ABR or ASBR and then advertises them to the NSSA area.
  flush-waiting-timer interval-value: specifies the interval for an ASBR to send aged Type 5 LSAs. This parameter takes effect only once.
  no-import-route: indicates that no external route is imported to the NSSA area.
  no-summary: indicates that an ABR is prohibited from sending Type 3 LSAs to the NSSA area.
  set-n-bit: sets the N-bit in DD packets.
  suppress-forwarding-address: sets the forwarding address (FA) of the Type 5 LSAs translated from Type 7 LSAs by the NSSA ABR to 0.0.0.0.
  translator-always: specifies an ABR in an NSSA area as an all-the-time translator. Multiple ABRs in an NSSA area can be configured as translators.
  translator-interval interval-value: specifies the timeout period of a translator.
  zero-address-forwarding: sets the FA of the generated NSSA LSAs to 0.0.0.0 when external routes are imported on an ABR in an NSSA area.

Precautions
The default-route-advertise parameter is configured to advertise Type 7 LSAs carrying the default route. On an ABR, Type 7 LSAs carrying the default route are generated regardless of whether the route 0.0.0.0 exists in the routing table. On an ASBR, Type 7 LSAs carrying the default route are generated only when the route 0.0.0.0 exists in the routing table.
When the area to which the ASBR belongs is configured as an NSSA area, invalid Type 5 LSAs are retained on other routers in the areas where the LSAs were flooded. These LSAs are deleted only when their aging time reaches 3600 seconds. Router performance is affected because the large number of LSAs consumes memory resources. The flush-waiting-timer parameter is configured to generate Type 5 LSAs with an aging time of 3600 seconds, so that invalid Type 5 LSAs on other routers are cleared in a timely manner.
The flush-waiting-timer parameter does not take effect when the ASBR also functions as an ABR. This prevents Type 5 LSAs in non-NSSA areas from being deleted.

Case Description
This case is an extension of the original case. Perform the configuration on the basis of the original case. Note that the virtual link belongs to Area 0.

Command Usage
The authentication-mode command sets the authentication mode and password for an OSPF area. After this command is executed, interfaces on all routers in the OSPF area must use the same authentication mode and password.

View
OSPF area view

Parameters
authentication-mode { md5 | hmac-md5 } [ key-id { plain plaintext | [ cipher ] ciphertext } ]
  md5: indicates MD5 authentication using the ciphertext password.
  hmac-md5: indicates HMAC-MD5 authentication using the ciphertext password.
  key-id: specifies an authentication key ID, which must be the same on the two ends.
  keychain: indicates keychain authentication.
  keychain-name: specifies the keychain name.
authentication-mode simple [ [ plain ] plaintext | cipher ciphertext ]
  simple: indicates simple authentication.
  plain: indicates authentication using the plaintext password. If this parameter is specified, the device allows you to set only a plaintext key, and the key is displayed in plaintext in the configuration file.
  plaintext: specifies a plaintext password.
  cipher: indicates authentication using the ciphertext password. If this parameter is specified, the device allows you to set only a ciphertext key, and the key is displayed in ciphertext in the configuration file.
  ciphertext: specifies a ciphertext password.

Precautions
The authentication modes and passwords of all the devices in an area must be the same, but they can differ between areas.
The authentication-mode command used in the interface view takes precedence over the authentication-mode command used in the OSPF area view.


Case Description
If RX is interconnected with RY, their interconnection addresses are XY.1.1.X/24 and XY.1.1.Y/24 respectively.



Configuration Verification
Run the display ospf peer brief command to check whether the neighbor relationship is established.



Configuration Verification
Run the tracert command to trace traffic on R3. The command output shows that traffic from R3 reaches S0/0/0 on R1 through the Ethernet link.

Configuration Verification
Run the display ip routing-table command to view the routing table. During route summarization, the original tags are removed. Therefore, the tags must be added again in the next route summarization.

Case Description
The network runs OSPF.



Analysis
To make R1 select the path through Area 2 to reach the networks in Area 1, the path through Area 2 must behave as if it passes through Area 0. A virtual link meets this need. After the virtual link is established, R1 compares the costs of the two paths and selects the path with the lower cost as the optimal path.

Configuration Verification
Only the external LSA (10.0.0.0) exists in the LSDB on R2.

Configuration Verification
All neighbor relationships on R3 are correct, indicating successful authentication.

BGP is a dynamic routing protocol used between ASs. BGP-1 (defined in RFC 1105), BGP-2 (defined in RFC 1163), and BGP-3 (defined in RFC 1267) are the three earlier BGP versions. BGP exchanges reachable inter-AS routes, establishes inter-AS paths, avoids routing loops, and applies routing policies between ASs. The current BGP version is BGP-4, defined in RFC 4271.
As an exterior routing protocol on the Internet, BGP is widely used among Internet Service Providers (ISPs).


BGP has the following characteristics:
  BGP is an Exterior Gateway Protocol (EGP). Unlike Interior Gateway Protocols (IGPs) such as Open Shortest Path First (OSPF) and Routing Information Protocol (RIP), BGP controls route advertisement and selects optimal routes between ASs rather than discovering or calculating routes.
  BGP uses the Transmission Control Protocol (TCP) with listening port 179 as the transport layer protocol. TCP provides reliability, so BGP does not require a dedicated mechanism to ensure reliable delivery.
    BGP needs to select inter-AS routes, which requires high protocol stability. TCP, with its high reliability, is therefore used to enhance BGP stability.
    BGP peers must be logically connected and establish TCP connections. The destination port number is 179, and the local port number is random.
  When routes are updated, BGP transmits only the updated routes. This greatly reduces the bandwidth occupied by BGP route advertisements. Therefore, BGP is suitable for transmitting a large number of routes on the Internet.
  BGP is designed to avoid loops.
    Inter-AS: BGP routes carry information about the ASs along the path. Routes that carry the local AS number are discarded to avoid inter-AS loops.
    Intra-AS: BGP does not advertise routes learned from an IBGP peer to other IBGP peers in the AS. In this manner, intra-AS loops are avoided.
  BGP provides rich routing policies to flexibly filter and select routes.
  BGP provides a route flapping prevention mechanism, which effectively improves Internet stability.
  BGP is easy to extend and adapts to network development. It is mainly extended using TLVs.


An AS is a group of routers that are managed by a single technical administration and use the same routing policy.
Each AS on a BGP network has a unique AS number, which is assigned by the Internet Assigned Numbers Authority (IANA) and identifies the AS.
Currently, 2-byte and 4-byte AS numbers are available. A 2-byte AS number ranges from 1 to 65535. Values 1 to 64511 are registered Internet numbers, while values 64512 to 65535 are private AS numbers. A 4-byte AS number ranges from 1 to 4294967295. Devices supporting 4-byte AS numbers are compatible with devices supporting 2-byte AS numbers.
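The ranges above can be captured in a short Python sketch. The function name `classify_as_number` and the returned labels are hypothetical; the boundaries are taken directly from the text and ignore any individually reserved values.

```python
def classify_as_number(asn: int) -> str:
    """Classify an AS number using the ranges given in the text."""
    if 1 <= asn <= 64511:
        return "2-byte registered"
    if 64512 <= asn <= 65535:
        return "2-byte private"
    if 65536 <= asn <= 4294967295:
        return "4-byte"
    raise ValueError(f"{asn} is outside the valid AS number range")


if __name__ == "__main__":
    for asn in (100, 64512, 65536):
        print(asn, classify_as_number(asn))
```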

EBGP and IBGP
  IBGP: runs within an AS. To prevent routing loops within an AS, a BGP device does not advertise the routes learned from an IBGP peer to other IBGP peers, and establishes full-mesh connections with all the IBGP peers.
  EBGP: runs between ASs. To prevent routing loops between ASs, a BGP device discards routes containing the local AS number when receiving routes from EBGP peers.
Device roles in BGP message exchange
  Speaker: The device that sends BGP messages is called a BGP speaker. The speaker receives and generates new routes, and advertises the routes to other BGP speakers.
  Peer: The speakers that exchange messages with each other are called BGP peers. A group of peers sharing the same policies can form a peer group.

BGP peers exchange five types of messages: Open, Update, Keepalive, Notification, and Route-Refresh messages.
  Open message: used to establish BGP peer relationships. It is the first message sent after a TCP connection is set up. After a BGP peer receives an Open message and the peer negotiation succeeds, the BGP peer sends a Keepalive message to confirm and maintain the peer relationship. Subsequently, BGP peers can exchange Update, Notification, Keepalive, and Route-Refresh messages.
  Update message: used to exchange routes between BGP peers. An Update message can advertise multiple reachable routes with the same attributes or withdraw multiple unreachable routes.
    An Update message can advertise multiple reachable routes that share a group of route attributes. The route attributes in an Update message apply to all the destination addresses (expressed as IP prefixes) in the Network Layer Reachability Information (NLRI) field of the Update message.
    An Update message can withdraw multiple unreachable routes. Each route is identified by its destination address (expressed as an IP prefix), which identifies a route previously advertised between BGP speakers.
    An Update message can be used only to withdraw routes. In this case, it does not need to carry route attributes or NLRI. Similarly, an Update message can be used only to advertise reachable routes, in which case it does not need to carry information about withdrawn routes.
  Keepalive message: periodically sent to the BGP peer to maintain the peer relationship.
  Notification message: sent to the BGP peer when an error is detected. The BGP connection is then terminated immediately.
  Route-Refresh message: used to request that the BGP peer resend routes when the BGP inbound routing policy changes. If all BGP routers have the Route-Refresh capability, the local BGP router sends a Route-Refresh message to BGP peers when the BGP inbound routing policy changes. After receiving the Route-Refresh message, the BGP peers resend their routing information to the local BGP router. In this manner, the BGP routing table can be dynamically updated, and the new routing policy can be used without terminating BGP connections. A BGP router notifies its peer of its Route-Refresh capability in the Open message.
BGP message applications
  BGP uses TCP port 179 to set up a connection. BGP connection setup requires a series of dialogues and handshakes. During handshake negotiation, BGP peers advertise parameters such as the BGP version, BGP connection holdtime, local router ID, and authorization information in Open messages.
  After a BGP connection is set up, a BGP router sends the BGP peer an Update message that carries the attributes of a route to be advertised. This helps the BGP peer select the optimal route. When local BGP routes change, a BGP router sends an Update message to notify the BGP peer of the changes.
  After two BGP peers have exchanged routes for a period of time and have no new routes to advertise, they periodically send Keepalive messages to maintain the validity of the BGP connection. If the local BGP router does not receive any BGP message from the BGP peer within the holdtime, the local BGP router considers that the BGP connection has been terminated, tears down the BGP connection, and deletes all the BGP routes learned from the peer.
  When the local BGP router detects an error during operation, for example, it does not support the peer BGP version or receives an invalid Update message, it sends the BGP peer a Notification message to report the error. Before terminating a BGP connection with the peer, the local BGP router also sends a Notification message to the peer.
BGP message header
  Marker: A 16-byte field with all bits set to 1.
  Length: A 2-byte unsigned integer that indicates the total length of a message, including the header.
  Type: A 1-byte field that specifies the type of a message:
    1: Open
    2: Update
    3: Notification
    4: Keepalive
    5: Route-Refresh
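The header layout described above (16-byte Marker, 2-byte Length, 1-byte Type, 19 bytes in total) can be illustrated with a minimal Python sketch. The helper names `build_header` and `parse_header` are hypothetical; the type codes follow RFC 4271 (1 to 4) and RFC 2918 (5).

```python
import struct

BGP_HEADER_LEN = 19          # 16-byte Marker + 2-byte Length + 1-byte Type
MARKER = b"\xff" * 16        # all bits set to 1
MESSAGE_TYPES = {
    1: "Open",
    2: "Update",
    3: "Notification",
    4: "Keepalive",
    5: "Route-Refresh",
}


def build_header(msg_type: int, body: bytes = b"") -> bytes:
    """Prepend a BGP header to a message body."""
    length = BGP_HEADER_LEN + len(body)
    return MARKER + struct.pack("!HB", length, msg_type) + body


def parse_header(data: bytes) -> tuple[int, str]:
    """Return (total length, message type name) from raw message bytes."""
    if len(data) < BGP_HEADER_LEN:
        raise ValueError("message shorter than the BGP header")
    if data[:16] != MARKER:
        raise ValueError("Marker field is not all ones")
    length, msg_type = struct.unpack("!HB", data[16:19])
    if msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type {msg_type}")
    return length, MESSAGE_TYPES[msg_type]


if __name__ == "__main__":
    # A Keepalive message consists of the header only.
    print(parse_header(build_header(4)))
```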

Open message format
  Version: Indicates the BGP version number. For BGP-4, the value is 4.
  My Autonomous System: Indicates the local AS number. By comparing the AS numbers on both ends, you can determine whether a BGP connection is an IBGP or EBGP connection.
  Hold Time: Indicates the time during which two BGP peers maintain a BGP connection between them. During peer relationship setup, the two BGP peers negotiate the holdtime and keep it consistent. If the two peers have different holdtime values configured, the shorter holdtime is used. If the local BGP router does not receive a Keepalive or Update message from the peer within the holdtime, it considers that the BGP connection is terminated. If the holdtime is 0, no Keepalive message is sent.
  BGP Identifier: Indicates the router ID of a BGP router. It is expressed as an IP address to identify a BGP router.
  Opt Parm Len (Optional Parameters Length): Indicates the optional parameter length. The value 0 indicates that no optional parameters are available.
  Optional Parameters: Used for BGP authentication or Multiprotocol Extensions. Each parameter is a 3-tuple (Parameter Type, Parameter Length, Parameter Value).
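The holdtime negotiation described for the Hold Time field can be sketched as follows. The function names are hypothetical. The one-third ratio used for the Keepalive interval is the value RFC 4271 suggests as a reasonable maximum and matches 60 seconds for a 180-second holdtime; treat it as an assumption rather than the behavior of a specific device.

```python
def negotiate_hold_time(local: int, remote: int) -> int:
    """Both peers use the shorter of the two configured holdtimes."""
    if local < 0 or remote < 0:
        raise ValueError("holdtime cannot be negative")
    return min(local, remote)


def keepalive_interval(hold_time: int) -> int:
    """Return the Keepalive interval, or 0 if Keepalives are not sent.

    Assumption: one third of the holdtime, as suggested by RFC 4271.
    """
    if hold_time == 0:
        return 0  # a holdtime of 0 means no Keepalive messages are sent
    return hold_time // 3


if __name__ == "__main__":
    hold = negotiate_hold_time(180, 90)
    print(hold, keepalive_interval(hold))
```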


Update message format
  Withdrawn Routes Length: A 2-byte unsigned integer that indicates the total length of the Withdrawn Routes field. The value 0 indicates that the Withdrawn Routes field is not present in this Update message.
  Withdrawn Routes: A variable-length field that contains a list of IP address prefixes for the routes to be withdrawn. Each IP address prefix is in <length, prefix> format. For example, <19,198.18.160.0> indicates a network at 198.18.160.0 255.255.224.0.
  Path Attribute Length: A 2-byte unsigned integer that indicates the total length of the Path Attribute field. The value 0 indicates that the Path Attribute field is not present in an Update message.
  Network Layer Reachability Information: Contains a list of IP address prefixes. This variable-length field is in the same format as the Withdrawn Routes field: <length, prefix>.
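The <length, prefix> encoding can be made concrete with a short Python sketch using the standard ipaddress module. The helper names `encode_prefix` and `decode_prefix` are hypothetical. The sketch relies on the RFC 4271 rule that the prefix occupies only as many bytes as are needed to hold `length` bits.

```python
import ipaddress


def encode_prefix(length: int, prefix: str) -> bytes:
    """Encode an IPv4 prefix as one length byte plus the significant bytes."""
    network = ipaddress.IPv4Network(f"{prefix}/{length}", strict=True)
    nbytes = (length + 7) // 8  # minimum number of bytes for `length` bits
    return bytes([length]) + network.network_address.packed[:nbytes]


def decode_prefix(data: bytes) -> ipaddress.IPv4Network:
    """Decode one <length, prefix> entry back into a network object."""
    length = data[0]
    if length > 32:
        raise ValueError("prefix length exceeds 32 bits")
    nbytes = (length + 7) // 8
    if len(data) < 1 + nbytes:
        raise ValueError("truncated prefix")
    raw = data[1:1 + nbytes] + b"\x00" * (4 - nbytes)  # pad to 4 bytes
    address = ipaddress.IPv4Address(raw)
    return ipaddress.IPv4Network(f"{address}/{length}", strict=False)


if __name__ == "__main__":
    encoded = encode_prefix(19, "198.18.160.0")
    network = decode_prefix(encoded)
    print(list(encoded), network, network.netmask)
```

For the example in the text, the 19-bit prefix needs three prefix bytes, and the decoded network mask is 255.255.224.0.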


Keepalive message format
  A Keepalive message has only the message header.
  By default, the interval for sending Keepalive messages is 60 seconds, and the holdtime is 180 seconds. Each time a BGP router receives a Keepalive message from its peer, it resets the hold timer. If the hold timer expires, it considers the peer to be down.
Notification message format
  Errorcode: A 1-byte field that identifies an error type. Each error code may have one or more error subcodes. If no error subcode is defined for an error code, the Error Subcode field is all 0s.
  Errsubcode: Indicates an error subcode.


A BGP finite state machine (FSM) has six states: Idle, Connect, Active, OpenSent, OpenConfirm, and Established.
  The Idle state is the initial BGP state. In the Idle state, a BGP device refuses all the connection requests from neighbors. The BGP device initiates a TCP connection with its BGP peer and changes its state to Connect only after receiving a start event from the system.
    A start event occurs when an operator configures a BGP process or resets an existing BGP process, or when the router software resets a BGP process.
    If an error occurs in any FSM state, for example, the BGP device receives a Notification message or a TCP connection termination notification, the BGP device returns to the Idle state.
  In the Connect state, the BGP device starts the ConnectRetry timer and waits to establish a TCP connection. The ConnectRetry timer defaults to 32 seconds.
    If a TCP connection is established, the BGP device sends an Open message to the peer and changes to the OpenSent state.
    If a TCP connection fails to be established, the BGP device moves to the Active state.
    If the BGP device does not receive a response from the peer before the ConnectRetry timer expires, the BGP device attempts to establish a TCP connection with another peer and stays in the Connect state.
    If another event (started by the system or operator) occurs, the BGP device returns to the Idle state.
  In the Active state, the BGP device keeps trying to establish a TCP connection with the peer.
    If a TCP connection is established, the BGP device sends an Open message to the peer, stops the ConnectRetry timer, and changes to the OpenSent state.
    If a TCP connection fails to be established, the BGP device stays in the Active state.
    If the BGP device does not receive a response from the peer before the ConnectRetry timer expires, the BGP device returns to the Connect state.
  In the OpenSent state, the BGP device waits for an Open message from the peer and then checks the validity of the received Open message, including the AS number, version, and authentication password.
    If the received Open message is valid, the BGP device sends a Keepalive message and changes to the OpenConfirm state.
    If the received Open message is invalid, the BGP device sends a Notification message to the peer and returns to the Idle state.

  In the OpenConfirm state, the BGP device waits for a Keepalive or Notification message from the peer. If the BGP device receives a Keepalive message, it transitions to the Established state. If it receives a Notification message, it returns to the Idle state.
  In the Established state, the BGP device exchanges Update, Keepalive, Route-Refresh, and Notification messages with the peer.
    If the BGP device receives a valid Update or Keepalive message, it considers that the peer is working properly and maintains the BGP connection with the peer.
    If the BGP device receives an invalid Update or Keepalive message, it sends a Notification message to the peer and returns to the Idle state.
    If the BGP device receives a Route-Refresh message, it does not change its state.
    If the BGP device receives a Notification message, it returns to the Idle state.
    If the BGP device receives a TCP connection termination notification, it terminates the TCP connection with the peer and returns to the Idle state.
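The state transitions described above can be summarized as a lookup table. The following Python sketch is a simplified model: the event names (such as `tcp_established` and `open_valid`) are hypothetical labels chosen for this illustration, timers and message sending are omitted, and any event not listed for a state is treated as an error that returns the FSM to Idle.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state; event names are illustrative only.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_established"): State.OPEN_SENT,
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.CONNECT, "connect_retry_expired"): State.CONNECT,
    (State.ACTIVE, "tcp_established"): State.OPEN_SENT,
    (State.ACTIVE, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "connect_retry_expired"): State.CONNECT,
    (State.OPEN_SENT, "open_valid"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive"): State.ESTABLISHED,
    (State.ESTABLISHED, "update_valid"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive"): State.ESTABLISHED,
    (State.ESTABLISHED, "route_refresh"): State.ESTABLISHED,
}


def next_state(state: State, event: str) -> State:
    """Return the next state; unlisted events fall back to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events: list[str]) -> State:
    """Feed a sequence of events to the FSM, starting from Idle."""
    state = State.IDLE
    for event in events:
        state = next_state(state, event)
    return state


if __name__ == "__main__":
    session = ["start", "tcp_established", "open_valid", "keepalive"]
    print(run(session).value)
```

Falling back to Idle for every unlisted event mirrors the rule that an error in any state, such as a Notification message or a TCP connection termination, returns the device to the Idle state.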

A BGP device adds optimal routes to the BGP routing table to generate BGP routes. After establishing a BGP peer relationship with a neighbor, the BGP device applies the following rules when exchanging routes with the peer:
  Advertises the BGP routes received from IBGP peers only to its EBGP peers.
  Advertises the BGP routes received from EBGP peers to all its EBGP peers and IBGP peers.
  Advertises only the optimal route to its peers when there are multiple valid routes to the same destination.
  Sends only updated BGP routes when BGP routes change.
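The first two advertisement rules can be expressed as a small decision function. This Python sketch is illustrative only: the name `should_advertise` and the string labels are hypothetical, and route reflectors or confederations, which relax the IBGP rule, are not modeled.

```python
PEER_TYPES = {"ibgp", "ebgp"}


def should_advertise(learned_from: str, advertise_to: str) -> bool:
    """Decide whether a route learned from one peer type goes to another."""
    if learned_from not in PEER_TYPES or advertise_to not in PEER_TYPES:
        raise ValueError("peer type must be 'ibgp' or 'ebgp'")
    if learned_from == "ibgp":
        # Routes learned from IBGP peers go only to EBGP peers.
        return advertise_to == "ebgp"
    # Routes learned from EBGP peers go to all EBGP and IBGP peers.
    return True


if __name__ == "__main__":
    for src in ("ibgp", "ebgp"):
        for dst in ("ibgp", "ebgp"):
            print(src, "->", dst, should_advertise(src, dst))
```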

BGP routing information processing
  When receiving Update messages from peers, a BGP router saves the Update messages to the routing information base (RIB), in the Adj-RIB-In of the peer from which the Update messages are received. After these Update messages are filtered by the inbound policy engine, the BGP router determines the optimal route for each prefix according to the route selection algorithm.
  The optimal routes are saved in the local BGP RIB (Loc-RIB) and then submitted to the local IP routing table (IP-RIB).
  In addition to the optimal routes received from peers, the Loc-RIB also contains the BGP prefixes that are injected by the current router (locally originated routes) and selected as optimal routes. Before the routes in the Loc-RIB are advertised to other peers, these routes must be filtered by the outbound policy engine. Only the routes that pass the outbound policy engine are installed in the Adj-RIB-Out.

Synchronization is performed between IBGP and the IGP to prevent misleading routers in other ASs.
Topology description (when synchronization is enabled)
  R4 learns the route to 10.0.0.0/24 advertised by R1 through BGP and checks whether the local IGP routing table contains the route. If so, R4 advertises the route to R5. If not, R4 does not advertise the route to R5.
Precautions: By default, synchronization is disabled on the VRP platform, and this setting cannot be changed. Synchronization can be safely disabled only under either of the following conditions:
  The local AS is not a transit AS.
  All the routers within the local AS set up full-mesh IBGP connections.

BGP route attributes are a set of parameters that further describe BGP routes. Using BGP route attributes, BGP can filter and select routes. Common attributes are as follows:
  Origin: A well-known mandatory attribute.
  AS_Path: A well-known mandatory attribute.
  Next_Hop: A well-known mandatory attribute.
  Local_Pref: A well-known discretionary attribute.
  Community: An optional transitive attribute.
  MED: An optional non-transitive attribute.
  Originator_ID: An optional non-transitive attribute.
  Cluster_List: An optional non-transitive attribute.

The Origin attribute defines the origin of a route, that is, how the
route entered BGP. The Origin attribute is classified into the
following types:
IGP: A route with the Origin attribute IGP is an IGP route and
has the highest priority. For example, the Origin attribute of the
routes injected into the BGP routing table using the network
command is IGP.
EGP: A route with the Origin attribute EGP is an EGP route
and has the second-highest priority.
Incomplete: A route with the Origin attribute Incomplete is
learned by other means and has the lowest priority. For
example, the Origin attribute of the routes imported by BGP
using the import-route command is Incomplete.



The AS_Path attribute records, in order, all the ASs that a route
passes through from the source to the destination. To prevent
inter-AS routing loops, a BGP device does not accept EBGP routes
whose AS_Path list contains the local AS number.

Assume that a BGP speaker advertises a local route:
When advertising the route to other ASs, the BGP speaker
adds the local AS number to the AS_Path list, and then
advertises the route to neighboring routers in Update messages.
When advertising the route within the local AS, the BGP speaker
creates an empty AS_Path list in the Update message.

Assume that a BGP speaker advertises a route learned from the Update
message sent by another BGP speaker:
When advertising the route to other ASs, the BGP speaker
adds the local AS number to the leftmost position of the AS_Path
list. From the AS_Path attribute, the BGP router that receives
the route can determine the ASs through which the route has
passed. The number of the AS nearest to the local AS is placed
leftmost in the list, and the other AS numbers follow in the
sequence in which the route passed through the ASs.
When advertising the route within the local AS, the BGP speaker
does not change the AS_Path attribute of the route.

Topology description
When R4 advertises route 10.0.0.0/24 to AS 400 and AS 100,
it adds the local AS number to the AS_Path list. When R5
advertises the route to AS 100, it also adds the local AS
number to the AS_Path list. When R1 and R3 in AS 100
advertise the route to R2 in the same AS, they keep the
AS_Path attribute of the route unchanged. R2 selects the route
with the shortest AS_Path when the other route selection criteria
are the same. That is, R2 reaches 10.0.0.0/24 through R3.
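
The AS_Path handling above can be sketched in a few lines. The function names and the AS numbers in the example are illustrative assumptions; the AS_Path is modeled as a plain list with the nearest AS on the left.

```python
def advertise(as_path, local_as, to_ebgp):
    """Return the AS_Path carried in the Update message sent to a peer."""
    if to_ebgp:
        # Toward another AS: add the local AS number on the left.
        return [local_as] + list(as_path)
    # Toward a peer in the local AS: leave the AS_Path unchanged.
    return list(as_path)


def accept_from_ebgp(as_path, local_as):
    """Loop prevention: reject an EBGP route whose AS_Path holds the local AS."""
    return local_as not in as_path


# Hypothetical chain: the route originates in AS 200 and crosses AS 300.
path = advertise([], 200, to_ebgp=True)       # locally originated route
path = advertise(path, 300, to_ebgp=True)     # AS 300 passes it on
print(path)
print(advertise(path, 100, to_ebgp=False))    # unchanged inside AS 100
print(accept_from_ebgp(path, 200))            # AS 200 would see a loop
```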


The Next_Hop attribute records the next hop of a route. The
Next_Hop attribute of BGP is different from that of an IGP
because it is not necessarily the IP address of a neighboring
router. A BGP speaker processes the Next_Hop attribute based on
the following rules:
When advertising a locally originated route to an IBGP peer,
the BGP speaker sets the Next_Hop attribute of the route to
the IP address of the local interface through which the BGP
peer relationship is established.
When advertising a route to an EBGP peer, the BGP speaker
sets the Next_Hop attribute of the route to the IP address of
the local interface through which the BGP peer relationship is
established.
When advertising a route learned from an EBGP peer to an
IBGP peer, the BGP speaker does not change the Next_Hop
attribute of the route.
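
A minimal sketch of these three Next_Hop rules follows. The function name, parameter names, and addresses are assumptions made for illustration; the optional next_hop_local flag models the effect of the peer next-hop-local command toward an IBGP peer.

```python
def next_hop_to_advertise(current_next_hop, local_ip, peer_type,
                          locally_originated=False, next_hop_local=False):
    """Return the Next_Hop value placed in the Update sent to a peer."""
    if peer_type not in ("ibgp", "ebgp"):
        raise ValueError("peer_type must be 'ibgp' or 'ebgp'")
    if peer_type == "ebgp" or locally_originated or next_hop_local:
        # The local interface address used for the peer relationship.
        return local_ip
    # Route learned from an EBGP peer and sent to an IBGP peer: unchanged.
    return current_next_hop


# Hypothetical addresses: 12.1.1.2 is the EBGP peer, 1.1.1.1 the local loopback.
print(next_hop_to_advertise("12.1.1.2", "1.1.1.1", "ibgp"))
print(next_hop_to_advertise("12.1.1.2", "1.1.1.1", "ibgp", next_hop_local=True))
print(next_hop_to_advertise("12.1.1.2", "13.1.1.1", "ebgp"))
```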



Local_Pref attribute
This attribute indicates the BGP preference that a router assigns
to a route. It is exchanged only between IBGP peers and is not
advertised to other ASs.
This attribute helps determine the optimal route when traffic
leaves an AS. When a BGP router obtains multiple routes to
the same destination address but with different next hops from
IBGP peers, the router prefers the route with the highest
Local_Pref.

Topology description
R1, R2, and R3 are IBGP peers of one another in AS 100. R2
establishes an EBGP peer relationship with AS 200, and R3
establishes an EBGP peer relationship with AS 300. R2 and R3
therefore learn route 10.0.0.0/24 through EBGP, and R1 learns two
routes to 10.0.0.0/24 from its two IBGP peers (R2 and R3) in the
local AS. To make traffic from AS 100 to 10.0.0.0/24 leave through
R2, configure Local_Pref on R2 and R3 so that the route from R2
carries Local_Pref 300 and the route from R3 carries Local_Pref
200. R1 then prefers the route learned from R2.



The MED attribute helps determine the optimal route when traffic enters
an AS. When a BGP router obtains multiple routes to the same
destination address but with different next hops from EBGP peers, the
router selects the route with the smallest MED value as the optimal
route if the other attributes of the routes are the same.

The MED attribute is exchanged only between two neighboring ASs.
The AS that receives this attribute does not advertise the attribute to
any other AS. This attribute can be manually configured. If the MED
attribute is not configured for a route, the MED attribute of the route
uses the default value 0.

Topology description
R1 and R2 advertise route 10.0.0.0/24 to their respective
EBGP peers R3 and R4. When the other route selection criteria are
the same, R3 and R4 prefer the route with the smaller MED value.
That is, R3 and R4 access network 10.0.0.0/24 through R1.



The Community attribute identifies a set of destination addresses
with the same characteristics. It is expressed as a list of 4-byte
values, each written in the aa:nn format or as a community number.
aa:nn: The value of aa or nn ranges from 0 to 65535. The
administrator can set a specific value as required. Generally,
aa indicates the AS number and nn indicates the community
identifier defined by the administrator. For example, if a route
is from AS 100 and its community identifier defined by the
administrator is 1, the Community attribute is 100:1.
Community number: An integer that ranges from 0 to
4294967295. As defined in RFC 1997, numbers from 0
(0x00000000) to 65535 (0x0000FFFF) and from 4294901760
(0xFFFF0000) to 4294967295 (0xFFFFFFFF) are reserved.

The Community attribute helps simplify application, maintenance, and
management of routing policies. With the community, a group of BGP
routers in multiple ASs can share the same routing policy. This attribute
is a route attribute and is transmitted between BGP peers without being
restricted by ASs. Before advertising a route with the Community
attribute to peers, a BGP router can change the original Community
attribute of this route.

Well-known community attributes
Internet: All routes belong to the Internet community by default.
A route with this attribute can be advertised to all BGP peers.
No_Advertise: A device does not advertise a received route
with the No_Advertise attribute to any peer.
No_Export: A BGP device does not advertise a received route
with the No_Export attribute to devices outside the local AS. If
a confederation is defined, the route with the No_Export
attribute cannot be advertised to ASs outside the
confederation but can be advertised to other sub-ASs in the
confederation.
No_Export_Subconfed: A BGP device does not advertise a
received route with the No_Export_Subconfed attribute to
devices outside the local AS or to devices outside the local
sub-AS in a confederation.
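
The relationship between the aa:nn format and the community number can be sketched as follows. The helper names are assumptions; the three constants are the well-known values assigned in RFC 1997.

```python
# Well-known community values from RFC 1997.
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02
NO_EXPORT_SUBCONFED = 0xFFFFFF03


def encode(aa, nn):
    """Pack aa:nn into the 4-byte community number."""
    if not (0 <= aa <= 65535 and 0 <= nn <= 65535):
        raise ValueError("aa and nn must be in the range 0 to 65535")
    return (aa << 16) | nn


def decode(value):
    """Split a community number back into (aa, nn)."""
    return value >> 16, value & 0xFFFF


def is_reserved(value):
    """True for the two reserved ranges named in RFC 1997."""
    return value <= 0x0000FFFF or value >= 0xFFFF0000


number = encode(100, 1)
print(number, decode(number), is_reserved(number))
print(decode(NO_EXPORT), is_reserved(NO_EXPORT))
```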


BGP routing rules
The next-hop addresses of routes must be reachable.
The PrefVal attribute is a Huawei proprietary attribute and is
valid only on the device where it is configured.
If a route does not have the Local_Pref attribute, the
Local_Pref attribute of the route uses the default value 100.
You can use the default local-preference command to
change the default local preference of BGP routes.
Locally generated routes include the routes imported using the
network or import-route command, manually summarized
routes, and automatically summarized routes.
Summarized routes have a higher priority than
non-summarized routes.
Manually summarized routes generated using the
aggregate command have a higher priority than
automatically summarized routes generated using the
summary automatic command.
Routes imported using the network command have a
higher priority than routes imported using the
import-route command.
BGP prefers the route with the shortest AS_Path.
The AS_Path length does not include
AS_CONFED_SEQUENCE and AS_CONFED_SET.
An AS_SET counts as 1 no matter how many AS
numbers the AS_SET contains.
BGP does not compare the AS_Path attributes of
routes after the bestroute as-path-ignore command is
executed.
BGP prefers the route with the lowest MED.
BGP compares only the MED values of routes sent
from the same AS (excluding a confederation sub-AS).
That is, BGP compares the MED values of two routes
only when the first AS numbers in the AS_SEQUENCE
attributes (excluding the AS_CONFED_SEQUENCE)
of the two routes are the same.
If a route does not have the MED attribute, BGP
considers the MED value of the route as the default
value 0. After the bestroute med-none-as-maximum
command is executed, BGP considers the MED value
of the route as the maximum value 4294967295.
After the compare-different-as-med command is
executed, BGP compares the MEDs in the routes sent
from peers in different ASs. Do not use this command
unless the different ASs use the same IGP and route
selection mode; otherwise, routing loops may occur.
After the bestroute med-confederation command is
executed, BGP compares the MED values of routes
only when the AS_Path does not contain external AS
numbers (ASs that do not belong to the
confederation) and the first AS number in
AS_CONFED_SEQUENCE is the same.
After the deterministic-med command is executed,
route selection no longer depends on the sequence in
which routes are received.
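
Part of this comparison can be sketched as below. It is a deliberately simplified illustration that covers only Local_Pref, AS_Path length, and MED; the other rules (PrefVal, locally generated routes, Origin, and the later tie-breakers) are left out, and the dictionary layout and function names are assumptions.

```python
DEFAULT_LOCAL_PREF = 100
MED_MAXIMUM = 4294967295


def as_path_length(segments):
    """Length used for comparison; segments are (type, [AS numbers]) pairs."""
    length = 0
    for seg_type, asns in segments:
        if seg_type == "AS_SEQUENCE":
            length += len(asns)
        elif seg_type == "AS_SET":
            length += 1  # an AS_SET counts as 1 regardless of its size
        # AS_CONFED_SEQUENCE and AS_CONFED_SET are not counted
    return length


def neighbor_as(segments):
    """First AS number in the AS_SEQUENCE, or None if there is none."""
    for seg_type, asns in segments:
        if seg_type == "AS_SEQUENCE" and asns:
            return asns[0]
    return None


def prefer(a, b, med_none_as_maximum=False):
    """Return the preferred route, or None if these three rules tie."""
    lp_a = a.get("local_pref", DEFAULT_LOCAL_PREF)
    lp_b = b.get("local_pref", DEFAULT_LOCAL_PREF)
    if lp_a != lp_b:
        return a if lp_a > lp_b else b
    len_a, len_b = as_path_length(a["as_path"]), as_path_length(b["as_path"])
    if len_a != len_b:
        return a if len_a < len_b else b
    # MED is compared only for routes sent from the same neighboring AS.
    if neighbor_as(a["as_path"]) == neighbor_as(b["as_path"]):
        missing = MED_MAXIMUM if med_none_as_maximum else 0
        med_a, med_b = a.get("med", missing), b.get("med", missing)
        if med_a != med_b:
            return a if med_a < med_b else b
    return None


r1 = {"name": "r1", "as_path": [("AS_SEQUENCE", [200, 400])], "med": 10}
r2 = {"name": "r2", "as_path": [("AS_SEQUENCE", [200]),
                                ("AS_SET", [400, 500, 600])], "med": 5}
print(prefer(r1, r2)["name"])
```

In this example both routes have a comparison length of 2 and the same neighboring AS, so the lower MED decides.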

Load Balancing
When there are multiple equal-cost routes to the same
destination, you can load balance traffic among these routes.
Equal-cost BGP routes can be generated for traffic load
balancing only when all the attributes compared before the rule
"Prefers the route with the lowest IGP metric" are the same.

BGP security
MD5: BGP uses TCP as the transport layer protocol. To
ensure BGP security, you can perform MD5 authentication
during TCP connection setup. MD5 authentication, however,
does not authenticate BGP messages. Instead, an MD5
authentication password is set for the TCP connection,
and the authentication is performed by TCP. If the
authentication fails, no TCP connection is set up.
After GTSM is enabled for BGP, an interface board checks the
TTL values in all BGP messages. Packets whose TTL values are
not within the specified range are either allowed to pass
through or discarded, depending on the GTSM configuration.
When GTSM is configured to discard such packets, set a correct
TTL value range according to the network topology so that
messages whose TTL values are not within the range are
discarded. This function protects against attacks that use
bogus BGP messages. GTSM is mutually exclusive with
multi-hop EBGP.
The number of routes received from peers is limited to prevent
resource exhaustion attacks.
The AS_Path length is limited in the inbound and outbound
directions. Routes whose AS_Path length exceeds the limit are
discarded.

Route dampening helps solve the problem of route instability. In most
cases, BGP is used on complex networks where route flapping occurs
frequently. To limit the impact of frequent route flapping, BGP uses
route dampening to suppress unstable routes.

Route dampening measures the stability of a route using a penalty
value. A larger penalty value indicates a less stable route. Each time
a route flaps, that is, changes from active to inactive, BGP increases
the penalty of the route by 1000. When the penalty value of the route
exceeds the suppression threshold, BGP suppresses this route: it does
not add the route to the IP routing table or advertise any Update
message for it to BGP peers.

The penalty value of a suppressed route is reduced by half after each
half-life period. When the penalty value decreases to the reuse
threshold, the route becomes reusable and is added to the routing
table. At the same time, BGP advertises an Update message to
peers. The penalty value, suppression threshold, and half life can be
manually configured.

Route dampening applies only to EBGP routes, not to IBGP routes.
IBGP routes often include the routes from the local AS, and the
forwarding tables of devices within an AS must be consistent. In
addition, IGP fast convergence aims to achieve information
synchronization. If IBGP routes were dampened, forwarding tables on
devices would be inconsistent when these devices have different
dampening parameters.
Route dampening therefore does not apply to IBGP routes.
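
The penalty mechanism can be illustrated with a short calculation. This sketch assumes that the penalty decays exponentially between flaps; the half life (15 minutes), suppression threshold (2000), and reuse threshold (750) are illustrative values chosen for the example, and only the per-flap penalty of 1000 comes from the text above.

```python
import math

FLAP_PENALTY = 1000        # added each time the route flaps
HALF_LIFE_MIN = 15.0       # assumed half life, in minutes
SUPPRESS_THRESHOLD = 2000  # assumed suppression threshold
REUSE_THRESHOLD = 750      # assumed reuse threshold


def decayed(penalty, minutes, half_life=HALF_LIFE_MIN):
    """Penalty left after the given time: halved once per half life."""
    return penalty * 0.5 ** (minutes / half_life)


def minutes_until_reuse(penalty, reuse=REUSE_THRESHOLD, half_life=HALF_LIFE_MIN):
    """Time needed for the penalty to decay down to the reuse threshold."""
    if penalty <= reuse:
        return 0.0
    return half_life * math.log2(penalty / reuse)


penalty = 3 * FLAP_PENALTY               # three flaps in quick succession
print(penalty > SUPPRESS_THRESHOLD)      # the route is suppressed
print(decayed(penalty, 15))              # after one half life
print(minutes_until_reuse(penalty))      # time until the route is reusable
```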


Case description
IP addresses used to interconnect devices are designed as
follows:
If RTX connects to RTY, the interconnected addresses are
XY.1.1.X and XY.1.1.Y. The network mask length is 24.
Loopback interface addresses of R1, R2, R3, R6, and
R7 are shown in the figure.

Case analysis
To establish stable IBGP peer relationships, use loopback
interface addresses and static routes within an AS.
To establish EBGP peer relationships, use physical interface
addresses.

Command usage
The peer as-number command sets the AS number of a
specified peer (or peer group).
The peer connect-interface command specifies the source
interface that sends BGP messages and the source address
used to initiate a connection.
The peer next-hop-local command configures a BGP device
to set its own IP address as the next hop of routes when it
advertises the routes to an IBGP peer or peer group.

View
BGP process view

Parameters
peer ipv4-address as-number as-number
ipv4-address: specifies the IPv4 address of a peer.
as-number: specifies the AS number of the peer.
peer ipv4-address connect-interface interface-type interface-number [ ipv4-source-address ]
ipv4-address: specifies the IPv4 address of a peer.
interface-type interface-number: specifies the interface
type and number.
ipv4-source-address: specifies the IPv4 source address
used to set up a connection.
peer ipv4-address next-hop-local
ipv4-address: specifies the IPv4 address of a peer.

Precautions
When using a loopback interface to send BGP messages:
Ensure that the loopback interface address of the BGP
peer is reachable.
In the case of an EBGP connection, you need to run
the peer ebgp-max-hop command to enable EBGP to
establish the peer relationship in indirect mode.
The peer next-hop-local and peer next-hop-invariable
commands are mutually exclusive.
The PrefRcv field in the display bgp peer command output
indicates the number of route prefixes received from the peer.


Case description
The topology in this case is the same as that in the previous
case. Perform the configuration based on the configuration in
the previous case.
R1 prefers routes to 10.0.X.0/24 with next hop R2 because
BGP prefers the route advertised by the router with the
smallest router ID.



Command usage
The peer route-policy command specifies a route-policy to
control routes received from, or to be advertised to, a peer or
peer group.

View
BGP view

Parameters
peer ipv4-address route-policy route-policy-name { import | export }
ipv4-address: specifies an IPv4 address of a peer.
route-policy-name: specifies a route-policy name.
import: applies a route-policy to routes received
from a peer or peer group.
export: applies a route-policy to routes to be advertised
to a peer or peer group.

Configuration verification
Run the display bgp routing-table command to view the BGP
routing table.

Case description
The topology in this case is the same as that in the previous
case. Company A requires that R1 access network 10.0.1.0/24
through R7. To meet this requirement, you can enable R4 to
access network 10.0.1.0/24 through R7 using the MED
attribute.

Command usage
The peer route-policy command specifies a route-policy to
control routes received from, or to be advertised to, a peer or
peer group.

View
BGP view

Parameters
peer ipv4-address route-policy route-policy-name { import | export }
ipv4-address: specifies an IPv4 address of a peer.
route-policy-name: specifies a route-policy name.
import: applies a route-policy to routes received
from a peer or peer group.
export: applies a route-policy to routes to be advertised
to a peer or peer group.

Configuration verification
Run the display bgp routing-table command to view the BGP
routing table.

Case description
The topology in this case is the same as that in the previous
case. To meet the requirement, use the Community attribute.



Command usage
The peer route-policy command specifies a route-policy to
control routes received from, or to be advertised to, a peer or
peer group.

View
BGP view

Parameters
peer ipv4-address route-policy route-policy-name { import | export }
ipv4-address: specifies an IPv4 address of a peer.
route-policy-name: specifies a route-policy name.
import: applies a route-policy to routes received
from a peer or peer group.
export: applies a route-policy to routes to be advertised
to a peer or peer group.

Configuration verification
Run the display bgp routing-table community command to
view the attributes in the BGP routing table.



Case description
This case is an extension to the previous case. Perform the
configuration based on the configuration in the previous case.



Command usage
The peer route-policy command specifies a route-policy to
control routes received from, or to be advertised to, a peer or
peer group.
The peer default-route-advertise command configures a
BGP device to advertise a default route to its peer or peer
group.

View
peer route-policy: BGP view
peer default-route-advertise: BGP view

Parameters
peer ipv4-address route-policy route-policy-name { import | export }
ipv4-address: specifies an IPv4 address of a peer.
route-policy-name: specifies a route-policy name.
import: applies a route-policy to routes received
from a peer or peer group.
export: applies a route-policy to routes to be advertised
to a peer or peer group.
peer { group-name | ipv4-address } default-route-advertise [ route-policy route-policy-name ] [ conditional-route-match-all { ipv4-address1 { mask1 | mask-length1 } } &<1-4> | conditional-route-match-any { ipv4-address2 { mask2 | mask-length2 } } &<1-4> ]
ipv4-address: specifies an IPv4 address of a peer.
route-policy route-policy-name: specifies a route-policy name.
conditional-route-match-all ipv4-address1 { mask1 | mask-length1 }:
specifies the IPv4 address and mask/mask length for conditional
routes. The default routes are sent to the peer or peer group
only when all conditional routes are matched.
conditional-route-match-any ipv4-address2 { mask2 | mask-length2 }:
specifies the IPv4 address and mask/mask length for conditional
routes. The default routes are sent to the peer or peer group
when any conditional route is matched.

Configuration verification
Run the display ip routing-table command to view IP routing
table information.


Case description
This case is an extension to the previous case. Perform the
configuration based on the configuration in the previous case.



Command usage
The maximum load-balancing command configures the
maximum number of equal-cost routes.

View
BGP view

Parameters
maximum load-balancing [ ebgp | ibgp ] number
ebgp: implements load balancing among EBGP routes.
ibgp: implements load balancing among IBGP routes.
number: specifies the maximum number of equal-cost
routes in the BGP routing table.

Precautions
The maximum load-balancing number command cannot be
used together with the maximum load-balancing ebgp
number or maximum load-balancing ibgp number command.
If the maximum load-balancing ebgp number or maximum
load-balancing ibgp number command is executed, the
maximum load-balancing number command does not take
effect.

Configuration verification
Run the display ip routing-table protocol bgp command to
view the load-balanced routes learned by BGP.


Case description
This case is an extension to the previous case. Perform the
configuration based on the configuration in the previous case.
After GTSM is enabled between R6 and R8, the hop count
should be 1.

Command usage
The peer valid-ttl-hops command applies the GTSM function
to a peer or peer group.
The gtsm default-action command configures the default
action to be taken on the packets that do not match the GTSM
policy.
The gtsm log drop-packet command enables the log function
on a board to log information about the packets discarded by
GTSM on the board.

View
peer valid-ttl-hops: BGP view
gtsm default-action: system view
gtsm log drop-packet: system view

Parameters
peer ipv4-address valid-ttl-hops [ hops ]
ipv4-address: specifies the IPv4 address of a peer.
hops: specifies the number of TTL hops to be checked.
The value is an integer that ranges from 1 to 255. The default
value is 255. If the value is configured as hops, the valid TTL
range of the detected packet is [255 - hops + 1, 255].
gtsm default-action { drop | pass }
drop: discards the packets that do not match the GTSM
policy.
pass: allows the packets that do not match the GTSM
policy to pass through.

Precautions
GTSM and EBGP-MAX-HOP affect the TTL values of sent
BGP packets. The two functions are mutually exclusive.
If the default action is configured but the GTSM policy is not
configured, GTSM does not take effect.
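
The valid TTL range [255 - hops + 1, 255] can be checked with a short calculation. The function names below are assumptions used only for this illustration.

```python
def valid_ttl_range(hops):
    """Return the (lowest, highest) TTL accepted for a given hops value."""
    if not 1 <= hops <= 255:
        raise ValueError("hops must be in the range 1 to 255")
    return 255 - hops + 1, 255


def ttl_is_valid(ttl, hops):
    """True if a received TTL falls inside the valid range."""
    low, high = valid_ttl_range(hops)
    return low <= ttl <= high


print(valid_ttl_range(1))     # directly connected peers, as in this case
print(valid_ttl_range(3))
print(ttl_is_valid(254, 1))   # one hop too far for hops = 1
```

With hops set to 1, only packets that arrive with TTL 255 are accepted, which matches a directly connected peer such as R6 and R8 in this case.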


Case description
In the topology, among the IP addresses that are not marked,
Rx and Ry connect using IP addresses XY.1.1.X/24 and
XY.1.1.Y/24.

Results
Run the display vlan command to view the results.

Results
Run the display bgp peer command to view the BGP peer
relationship.

Results
Run the display bgp routing-table command to view the BGP
routing table. The command output shows that 2.2.2.2/32 and
3.3.3.3/32 have been advertised.

Results
The loop is the result of inconsistency between IGP route
selection and BGP route selection.



Case description
In the topology, among the IP addresses that are not marked,
Rx and Ry connect using IP addresses XY.1.1.X/24 and
XY.1.1.Y/24.

Analysis process
Run the display bgp routing-table community command to
view the attributes.



Results
On R2, the Community attribute of route 10.0.0.0/24 is
displayed as <400:1>, no-export.



Results
You can add AS numbers to the AS_Path attribute to change the
route selection of R3.

To ensure connectivity between IBGP peers, you need to establish
full-mesh connections between IBGP peers. If there are n routers in an AS,
you need to establish n(n-1)/2 IBGP connections. When there are a
large number of IBGP peers, many network resources and CPU
resources are consumed. A route reflector (RR) can be used between
IBGP peers to solve this problem.

In an AS, a router functions as an RR, and other routers function as
clients. The RR and its clients establish IBGP connections and form a
cluster. The RR reflects routes to clients, removing the need to
establish BGP connections between clients.
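
The saving can be made concrete with a small calculation. The RR figure assumes a single RR, with every other router acting as a client and no sessions between clients; the function names are illustrative.

```python
def full_mesh_sessions(n):
    """IBGP connections needed for a full mesh of n routers: n(n-1)/2."""
    return n * (n - 1) // 2


def single_rr_sessions(n):
    """Connections when one router is the RR and the other n-1 are clients."""
    return n - 1


for n in (5, 10, 50):
    print(n, full_mesh_sessions(n), single_rr_sessions(n))
```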



RR concepts
RR: a BGP device that can reflect the routes learned from an
IBGP peer to other IBGP peers.
Client: an IBGP device whose routes are reflected by an RR
to other IBGP devices. In an AS, clients only need to directly
connect to the RR.
Non-client: an IBGP device that is neither an RR nor a client.
In an AS, a non-client must establish full-mesh connections
with the RR and all the other non-clients.
Originator: a device that originates routes in an AS. The
Originator_ID attribute helps eliminate routing loops in a
cluster.
Cluster: a set consisting of an RR and its clients. The
Cluster_List attribute helps eliminate routing loops between
clusters.

An RR advertises learned routes to IBGP peers based on the following
rules:
The RR advertises the routes learned from an EBGP peer to
all the clients and non-clients.
The RR advertises the routes learned from a non-client IBGP
peer to all the clients.
The RR advertises the routes learned from a client to all the
other clients and all the non-clients.

An RR is easy to configure because it needs to be configured only on
the device that functions as a reflector, and clients do not need to know
that they are clients.

In some networks, if clients of an RR establish full-mesh connections
among themselves, they can directly exchange routing information. In
this case, route reflection between clients is unnecessary and wastes
bandwidth. You can run the undo reflect between-clients command
on the VRP platform to prohibit an RR from reflecting the routes
received from a client to other clients.
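
The three advertisement rules, together with the effect of undo reflect between-clients, can be sketched as a single function. The names, the role strings, and the peer labels are assumptions made for this illustration.

```python
def reflect_targets(source_role, source, clients, non_clients,
                    reflect_between_clients=True):
    """Return the set of IBGP peers to which the RR advertises a route.

    source_role is 'ebgp', 'non_client', or 'client'; source is the peer
    from which the route was learned.
    """
    clients, non_clients = set(clients), set(non_clients)
    if source_role == "ebgp":
        return clients | non_clients
    if source_role == "non_client":
        return clients
    if source_role == "client":
        targets = set(non_clients)
        if reflect_between_clients:
            targets |= clients - {source}
        return targets
    raise ValueError("unknown source_role: " + source_role)


clients = {"Client1", "Client2", "Client3"}   # hypothetical peers
non_clients = {"NonClient1"}
print(sorted(reflect_targets("client", "Client1", clients, non_clients)))
print(sorted(reflect_targets("non_client", "NonClient1", clients, non_clients)))
print(sorted(reflect_targets("client", "Client1", clients, non_clients,
                             reflect_between_clients=False)))
```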



The originator ID identifies the originator of a route and is generated by
an RR to prevent routing loops in a cluster.
When an RR reflects a route for the first time, the RR adds the
Originator_ID attribute to this route. The Originator_ID attribute
identifies the originator of the route. If the route already
contains the Originator_ID attribute, the RR retains this
Originator_ID attribute.
When a device receives a route, the device compares the
originator ID of the route with the local router ID. If they are the
same, the device discards the route.

An RR and its clients form a cluster, which is identified by a unique
cluster ID in an AS.
To prevent routing loops between clusters, an RR uses the Cluster_List
attribute to record the cluster IDs of all the clusters that a route
passes through.
When an RR reflects a route between clients, or between
clients and non-clients, the RR adds the local cluster ID to the
top of the cluster list. If there is no cluster list, the RR creates a
Cluster_List attribute.
When receiving an updated route, the RR checks the cluster
list of the route. If the cluster list contains the local cluster ID,
the RR discards the route. If the cluster list does not contain
the local cluster ID, the RR adds the local cluster ID to the
cluster list and then reflects the route.
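
Both loop-prevention checks can be sketched together. Routes are modeled as dictionaries, and the function names, field names, and IDs are assumptions for illustration; advertiser_router_id stands for the router ID of the IBGP peer from which the RR learned the route.

```python
def reflect(route, local_cluster_id, advertiser_router_id):
    """Return the reflected copy of a route, or None if the RR discards it."""
    cluster_list = route.get("cluster_list", [])
    if local_cluster_id in cluster_list:
        return None  # the route has already passed through this cluster
    reflected = dict(route)
    # Add Originator_ID only on the first reflection; keep an existing one.
    reflected.setdefault("originator_id", advertiser_router_id)
    # Add the local cluster ID to the top of the cluster list.
    reflected["cluster_list"] = [local_cluster_id] + list(cluster_list)
    return reflected


def accept(route, local_router_id):
    """A device discards a route whose Originator_ID is its own router ID."""
    return route.get("originator_id") != local_router_id


r = reflect({"prefix": "10.0.0.0/24"}, "1.1.1.1", "9.9.9.9")
print(r)
print(reflect(r, "1.1.1.1", "9.9.9.9"))   # same cluster again: discarded
print(reflect(r, "2.2.2.2", "1.1.1.1"))   # another cluster: ID is prepended
print(accept(r, "9.9.9.9"))               # back at the originator: discarded
```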



A backup RR prevents single-point failures.

Backup RR
On the VRP, you need to run the reflector cluster-id
command to set the same cluster ID for all the RRs in the
same cluster.
When redundant RRs exist, a client receives multiple routes to
the same destination from different RRs and then selects the
optimal route according to BGP route selection policies.
The Cluster_List attribute prevents routing loops between
different RRs in the same AS.

Topology description
When Client1 receives an updated route 10.0.0.0/24 from an
external peer, it advertises the route to RR1 and RR2 through
IBGP.
After RR1 receives the updated route, it reflects the route to
other clients (Client2 and Client3) and adds the local cluster ID
to the top of the cluster list.
After RR2 receives the route reflected by RR1, it checks the
cluster list and finds that its own cluster ID is already in the
cluster list. It therefore discards the route without reflecting
the route to its clients.

A backbone network is divided into multiple clusters. The RRs of the
clusters are non-clients of one another and establish full-mesh
connections with one another. Although each client only establishes
an IBGP connection with its RR, all the BGP routers in the AS can
receive reflected routing information.
s:
ce
ur
so
Re
ng
ni
ar
Le
re
Mo
en
m/
. co
w ei
ua
g .h
in
rn
ea
/l
:/
tp

A level-1 RR (RR1) is deployed in Cluster1, while the RRs (RR2 and
RR3) in Cluster2 and Cluster3 function as clients of RR1.

Confederation
A confederation divides an AS into sub-ASs. Full-mesh IBGP
connections are established in each sub-AS, while EBGP connections
are established between sub-ASs. ASs outside a confederation still
consider the confederation as a single AS.
After a confederation divides an AS into sub-ASs, it assigns a
confederation ID (the AS number) to each router within the AS.
The original IBGP attributes are retained, including the Local_Pref
attribute, MED attribute, and Next_Hop attribute.
Confederation-related attributes are automatically deleted when
routes are advertised outside a confederation. The administrator
therefore does not need to configure rules for filtering information
such as sub-AS numbers at the egress of a confederation.

The AS_Path attribute is a well-known mandatory attribute. It consists
of AS numbers and has the following segment types:
AS_SET: an unordered set of AS numbers carried in an Update message.
When routes are summarized, you can use policies to have AS_SET
preserve path information that would otherwise be lost.
AS_SEQUENCE: an ordered sequence of AS numbers carried in an Update
message. Generally, the AS_Path type is AS_SEQUENCE.
AS_CONFED_SEQUENCE: an ordered sequence of member AS numbers in a
confederation, carried in an Update message. It is similar to
AS_SEQUENCE but can only be transmitted within the local
confederation.
AS_CONFED_SET: an unordered set of member AS numbers in a
confederation, carried in an Update message. It is similar to AS_SET
but can only be transmitted within the local confederation.
Member AS numbers within a confederation are invisible to ASs outside
the confederation. Therefore, when routes are advertised to ASs
outside the confederation, member AS numbers are removed.
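
The removal of member AS numbers at the confederation egress can be sketched as follows. This is a simplified model under stated assumptions: an AS_Path is represented as a list of (segment type, AS numbers) tuples, and the function names and the AS numbers 65001, 65002, 100, and 300 are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[str, List[int]]  # (segment type, AS numbers)

CONFED_TYPES = {"AS_CONFED_SEQUENCE", "AS_CONFED_SET"}


def strip_confed_segments(as_path: List[Segment]) -> List[Segment]:
    """Drop the segments that may only exist inside the confederation."""
    return [seg for seg in as_path if seg[0] not in CONFED_TYPES]


def advertise_outside(as_path: List[Segment], confed_id: int) -> List[Segment]:
    """Build the AS_Path sent to a peer outside the confederation."""
    public = strip_confed_segments(as_path)
    # Prepend the confederation ID, reusing a leading AS_SEQUENCE if present.
    if public and public[0][0] == "AS_SEQUENCE":
        return [("AS_SEQUENCE", [confed_id] + public[0][1])] + public[1:]
    return [("AS_SEQUENCE", [confed_id])] + public


inside = [("AS_CONFED_SEQUENCE", [65002, 65001]), ("AS_SEQUENCE", [300])]
outside = advertise_outside(inside, confed_id=100)
```

Here outside contains only the confederation ID and the external AS, so peers outside the confederation never see 65001 or 65002.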

Comparison between a route reflector and a confederation


A confederation requires an AS to be divided into sub-ASs, which
changes the network topology significantly.
With route reflection, only the RR needs to be configured, and
clients do not need to be configured. A confederation needs to be
configured on all the devices.
RRs must establish full-mesh IBGP connections with one another.
Route reflectors are widely used, while confederations are seldom
used.

The BGP routing table of each device on a large network is large. This
burdens devices, increases the probability of route flapping, and
affects network stability.
Route summarization is a mechanism that combines multiple routes into
one route. It allows a BGP device to advertise only the summarized
route, instead of all the specific routes, to peers, which reduces the
BGP routing table size. If the specific routes flap, peers that
receive only the summarized route are not affected, which improves
network stability.
Route summarization uses the Aggregator attribute. This optional
transitive attribute identifies the node where route summarization
occurs and carries the router ID and AS number of that node.

Precautions
The summary automatic command summarizes the routes imported by BGP,
including direct routes, static routes, RIP routes, OSPF routes, and
IS-IS routes. After summarization is configured, BGP summarizes
routes according to the natural network segment and suppresses
specific routes in the BGP routing table. This command is not valid
for the routes imported using the network command.
BGP advertises only summarized routes to peers.
BGP does not start automatic summarization by default.
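
Summarizing "according to the natural network segment" means falling back to the classful (class A, B, or C) boundary of the address. The following sketch shows that mapping with the Python standard library; the function name natural_network is an assumption for illustration, and class D and E addresses are not handled.

```python
import ipaddress


def natural_network(prefix: str) -> ipaddress.IPv4Network:
    """Return the classful (natural) network that covers the given prefix."""
    net = ipaddress.ip_network(prefix, strict=False)
    first_octet = int(net.network_address) >> 24
    if first_octet < 128:
        natural_len = 8    # class A
    elif first_octet < 192:
        natural_len = 16   # class B
    else:
        natural_len = 24   # class C (class D/E not handled in this sketch)
    if net.prefixlen <= natural_len:
        return net         # already at or above the natural boundary
    return net.supernet(new_prefix=natural_len)


summary_a = natural_network("10.1.1.0/24")
summary_b = natural_network("172.16.1.0/24")
summary_c = natural_network("192.168.1.128/25")
```

For example, 10.1.1.0/24 and 10.2.0.0/16 would both be summarized into 10.0.0.0/8 under this rule.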

Manual summarization
Summarized routes do not carry the AS_Path attribute of the specific
routes.
Using the AS_SET attribute to carry the AS numbers can prevent
routing loops. AS_SET and AS_SEQUENCE differ as follows: an AS_SET is
an unordered list of AS numbers and is typically used for summarized
routes. In an AS_SEQUENCE, AS numbers are added to the list in the
order in which the route passes through the ASs.
Adding the AS_SET attribute to summarized routes may cause route
flapping.

RFC 5291 and RFC 5292 define the prefix-based BGP outbound route
filtering (ORF) capability, which lets a device receive only the BGP
routes it requires. With BGP ORF, a device sends its prefix-based
inbound policies in a Route-Refresh message to BGP peers. The peers
then construct outbound policies based on these inbound policies and
filter routes before sending them. This capability has the following
advantages:
Prevents the local device from receiving a large number of
unnecessary routes.
Reduces CPU usage of the local device.
Simplifies the configuration of BGP peers.
Improves link bandwidth efficiency.


Case description
Among directly connected EBGP peers, after negotiating the
prefix-based ORF capability with R1, Client1 adds its local
prefix-based inbound policies to a Route-Refresh message and sends
the message to R1. R1 then constructs outbound policies based on the
received Route-Refresh message and sends only the required routes to
Client1. Client1 receives only the required routes, and R1 does not
need to maintain routing policies. In this manner, the configuration
workload is reduced.
Client1 and Client2 are clients of the RR. Client1, Client2, and the
RR negotiate the prefix-based ORF capability. Client1 and Client2
then add local prefix-based inbound policies to Route-Refresh
messages and send the messages to the RR.
The RR constructs outbound policies based on the received inbound
policies and reflects only the required routes to Client1 and
Client2. Client1 and Client2 receive only the required routes, and
the RR does not need to maintain routing policies. The configuration
workload is thereby reduced.
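
The effect of ORF on the sending side can be sketched as a prefix filter applied before routes are advertised. The entry layout (action, prefix, minimum length, maximum length) and the function names are assumptions for illustration; they do not reproduce the ORF encoding on the wire.

```python
import ipaddress
from typing import List, Tuple

# Hypothetical ORF entry: (action, covering prefix, min length, max length)
OrfEntry = Tuple[str, str, int, int]


def orf_permits(entries: List[OrfEntry], route: str) -> bool:
    """Apply the peer's prefix-based entries in order; first match decides."""
    net = ipaddress.ip_network(route)
    for action, prefix, ge, le in entries:
        covering = ipaddress.ip_network(prefix)
        if net.subnet_of(covering) and ge <= net.prefixlen <= le:
            return action == "permit"
    return False  # nothing matched: the route is not sent


def routes_to_send(candidates: List[str], entries: List[OrfEntry]) -> List[str]:
    """Routes the sender (R1 or the RR) still advertises after ORF filtering."""
    return [r for r in candidates if orf_permits(entries, r)]


client_entries = [("permit", "10.1.0.0/16", 16, 24)]
sent = routes_to_send(["10.1.1.0/24", "10.2.0.0/16", "10.1.1.128/25"], client_entries)
```

Only 10.1.1.0/24 is sent in this example, so the client never has to receive and then discard the other two routes.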


Active-Route-Advertise
Once a route is preferred by BGP, the route can be advertised to
peers by default. When Active-Route-Advertise is configured, only
routes that are preferred by BGP and also active at the routing
management layer are advertised to peers.
Active-Route-Advertise and the bgp-rib-only command are mutually
exclusive. The bgp-rib-only command prohibits BGP routes from being
delivered to the IP routing table.

BGP dynamic update peer-groups


By default, BGP groups routes separately for each peer, even if the
peers have the same outbound policies.
After this feature is enabled, BGP groups each route only once and
then sends the route to all the peers in the update-group, which
greatly improves grouping efficiency.

Topology description
RR1 has three clients and needs to reflect 100,000 routes to these
clients. If RR1 groups the routes per peer, the routes are grouped
300,000 times in total (100,000 x 3). After the dynamic update
peer-groups feature is used, the total drops to 100,000
(100,000 x 1), improving grouping performance by a factor of 3.
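
The arithmetic in the topology description can be reproduced with a small sketch that places peers with identical outbound policies into one update-group. The policy label export-A and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def build_update_groups(peer_policies: Dict[str, str]) -> Dict[str, List[str]]:
    """Group peers that share the same outbound policy."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for peer, policy in peer_policies.items():
        groups[policy].append(peer)
    return dict(groups)


def grouping_operations(route_count: int, peer_policies: Dict[str, str],
                        use_update_groups: bool) -> int:
    """Count how many times routes are grouped in total."""
    if use_update_groups:
        units = len(build_update_groups(peer_policies))
    else:
        units = len(peer_policies)
    return route_count * units


clients = {"Client1": "export-A", "Client2": "export-A", "Client3": "export-A"}
per_peer = grouping_operations(100_000, clients, use_update_groups=False)
grouped = grouping_operations(100_000, clients, use_update_groups=True)
```

The gain equals the number of peers that share one update-group, so it grows linearly with the group size.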

Roles defined in 4-byte AS number


New speaker: a peer that supports 4-byte AS numbers
Old speaker: a peer that does not support 4-byte AS numbers
New session: a BGP connection between new speakers
Old session: a BGP connection between a new speaker and an old
speaker, or between old speakers

Protocol extension
Two new optional transitive attributes, AS4_Path (attribute code
0x11) and AS4_Aggregator (attribute code 0x12), are defined to
transmit 4-byte AS numbers in old sessions.
If a BGP connection is set up between a new speaker and an old
speaker, the newly reserved AS number AS_TRANS (value 23456) is used
for interoperability between 4-byte and 2-byte AS numbers.
New AS numbers have three formats:
asplain: represents an AS number as a decimal integer.
asdot+: represents an AS number as two integer values joined by a
period: <high order 16-bit value in decimal>.<low order 16-bit value
in decimal>. For example, the 2-byte ASN 123 is represented as 0.123,
and ASN 65536 is represented as 1.0. The largest value is
65535.65535.
asdot: represents a 2-byte AS number in the asplain format and a
4-byte AS number in the asdot+ format (1 to 65535; 1.0 to
65535.65535).
Huawei supports the asdot format.
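
The three notations differ only in how the same 32-bit value is written. A minimal conversion sketch, with function names chosen for this example, is:

```python
def to_asdot_plus(asn: int) -> str:
    """asdot+: always <high 16 bits>.<low 16 bits>."""
    if not 0 <= asn <= 0xFFFFFFFF:
        raise ValueError("AS number out of range")
    return f"{asn >> 16}.{asn & 0xFFFF}"


def to_asdot(asn: int) -> str:
    """asdot: asplain for 2-byte values, asdot+ for 4-byte values."""
    if not 0 <= asn <= 0xFFFFFFFF:
        raise ValueError("AS number out of range")
    return str(asn) if asn <= 0xFFFF else to_asdot_plus(asn)


def to_asplain(text: str) -> int:
    """Parse asplain, asdot, or asdot+ text into the integer AS number."""
    if "." in text:
        high, low = (int(part) for part in text.split("."))
        if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
            raise ValueError("each part must fit in 16 bits")
        return (high << 16) | low
    return int(text)
```

For instance, the AS number written as 10.1 in the topology below corresponds to the asplain value 10 x 65536 + 1 = 655361.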

Topology description
R2 receives a route with the 4-byte AS number 10.1 from R1.
R2 establishes a peer relationship with R3, which must treat the AS
number of R2 as AS_TRANS.
When advertising the route to R3, R2 records AS_TRANS in the AS_Path
attribute of the route and records 10.1 and its own AS number 20.1
in the AS4_Path attribute in the sequence required by BGP.
R3 retains the unrecognized AS4_Path attribute, advertises the route
to R4 according to BGP rules, and still regards the AS number of R2
as AS_TRANS.
When receiving the route from R3, R4 replaces AS_TRANS with the AS
numbers recorded in the AS4_Path attribute and restores the AS_Path
as 30 20.1 10.1.

Next-hop iteration based on routing policy


BGP needs to iterate indirect next hops. If indirect next hops are
not iterated according to a routing policy, routes may be iterated
to incorrect forwarding paths. Next hops should therefore be
iterated only under certain conditions so that the iterated routes
can be controlled. If a route cannot pass the routing policy, the
route is ignored and route iteration fails.

Topology description
IBGP peer relationships are established between R1 and R2, and
between R1 and R3 through loopback interfaces. R1 receives a BGP
route with prefix 10.0.0.0/24 from R2 and R3. The original next hop
of the BGP route received from R2 is 2.2.2.2. The IP address of
Ethernet0/0/0 of R1 is 2.2.2.100/24.
When R2 is running normally, the BGP route with prefix 10.0.0.0/24
is iterated to the IGP route 2.2.2.2/32. When the IGP on R2 becomes
faulty, the IGP route 2.2.2.2/32 is withdrawn, which triggers route
iteration again. R1 searches the IP routing table for the original
next hop 2.2.2.2, and the route is consequently iterated to
2.2.2.0/24.
The user expects the route with the next hop 3.3.3.3 to be preferred
as soon as the next hop 2.2.2.2 becomes unreachable. In practice,
until BGP converges, the route is still iterated to 2.2.2.0/24,
which results in a transient routing black hole.
With the next-hop iteration policy, you can control the mask length
of the route through which the original next hop can be iterated.
After the next-hop iteration policy is configured, the route with
the original next hop 2.2.2.2 depends on only the IGP route
2.2.2.2/32.
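
The scenario above can be modeled as a longest-prefix lookup with an optional minimum mask length. This is a sketch of the idea only: the function resolve_next_hop and its min_mask_length parameter are assumptions for the example and do not correspond to a specific VRP command.

```python
import ipaddress
from typing import List, Optional


def resolve_next_hop(next_hop: str, rib: List[str],
                     min_mask_length: int = 0) -> Optional[str]:
    """Return the longest matching route allowed by the policy, or None."""
    addr = ipaddress.ip_address(next_hop)
    matches = [
        net for net in (ipaddress.ip_network(p) for p in rib)
        if addr in net and net.prefixlen >= min_mask_length
    ]
    if not matches:
        return None  # iteration fails, so the BGP route becomes inactive
    return str(max(matches, key=lambda net: net.prefixlen))


rib_normal = ["2.2.2.2/32", "2.2.2.0/24"]
rib_after_fault = ["2.2.2.0/24"]  # the IGP route 2.2.2.2/32 was withdrawn

normal = resolve_next_hop("2.2.2.2", rib_normal)
without_policy = resolve_next_hop("2.2.2.2", rib_after_fault)
with_policy = resolve_next_hop("2.2.2.2", rib_after_fault, min_mask_length=32)
```

Without the policy the lookup falls back to 2.2.2.0/24, which is the black-hole case. With a required mask length of 32 the iteration fails immediately, so the route with next hop 3.3.3.3 can be selected.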


Session setup between peers


A session can be set up between BGP speakers through directly
connected or loopback interfaces. Generally, IBGP neighbors
establish peer relationships through loopback interfaces, while EBGP
neighbors establish peer relationships through directly connected
physical interfaces.
You can configure authentication to ensure security for sessions
between peers.
Logical full-mesh connections must be set up between IBGP peers (if
no RR or confederation is used).
You can disable synchronization to reduce the IGP load.

Route update origin
Routes can be imported into BGP using the import-route or network
command.

Routing policy optimization
You can optimize BGP routes using inbound policies, outbound
policies, and ORF.

Route filtering and attribute control
You can filter the routes to be advertised or received.
You can control BGP route attributes to affect BGP route
propagation.

Route summarization
Route summarization can optimize BGP routing entries and reduce the
routing table size.

Redundancy
Path redundancy ensures that a backup path is available when a
network fault occurs.

Traffic symmetry
Sound network design and policy application can ensure consistent
paths for incoming and outgoing traffic.

Load balancing
When multiple paths to the same destination exist, traffic can be
load balanced through policies to fully utilize bandwidth.

Interaction between non-BGP routes and BGP routes


Generally, non-BGP routes can be imported into the BGP routing table
using the import-route or network command.

Control of default routes
Default routes can be advertised or received according to conditions
of routing policies.

Policy-based routing
Traffic paths can be optimized through PBR.

Dynamic update peer-groups: greatly improves router performance.
Route reflector and confederation: reduces the number of IBGP
sessions and optimizes large BGP networks.

Reduce unstable routes


Use stable IGPs.
Improve router performance.
Reduce manual errors.
Expand link bandwidth.

Improve BGP stability
Use BGP soft reset when applying new BGP policies.
Penalize unstable routes appropriately to reduce their impact on
BGP.

Case description
IP addresses used to interconnect devices are as follows:
If RTX connects to RTY, the interconnection addresses are XY.1.1.X
and XY.1.1.Y, with a 24-bit mask.
OSPF runs normally, and the interconnection addresses and loopback
interface addresses have been advertised into OSPF. However,
10.0.X.0/24, 172.15.X.0/24, and 172.16.X.0/24 are not advertised
into OSPF.

Case analysis
EBGP peer relationships are established using loopback interfaces.

Command usage
The peer as-number command sets an AS number for a specified peer or
peer group.
The peer connect-interface command specifies a source interface that
sends BGP messages and a source address used to initiate a
connection.
The peer next-hop-local command configures a BGP device to set its
IP address as the next hop of routes when it advertises routes to an
IBGP peer or peer group.
The group command creates a peer group.

View
BGP process view

Parameters
peer ipv4-address as-number as-number
ipv4-address: specifies the IPv4 address of a peer.
as-number: specifies the AS number of the peer.
peer ipv4-address connect-interface interface-type interface-number [ ipv4-source-address ]
ipv4-address: specifies the IPv4 address of a peer.
interface-type interface-number: specifies the interface type and
number.
ipv4-source-address: specifies the IPv4 source address used to set
up a connection.
peer ipv4-address next-hop-local
ipv4-address: specifies the IPv4 address of a peer.
group group-name [ external | internal ]
group-name: specifies the name of a peer group.
external: creates an EBGP peer group.
internal: creates an IBGP peer group.

Precautions
When configuring a device to use a loopback interface as the source
interface of BGP messages, note the following points:
The loopback interface of the device's BGP peer must be reachable.
In the case of an EBGP connection, the peer ebgp-max-hop command
must be executed to enable the two devices to establish an indirect
peer relationship.
The peer next-hop-local and peer next-hop-invariable commands are
mutually exclusive.
The Rec field in the display bgp peer command output indicates the
number of route prefixes received from the peer.

Case description
The topology in this case is the same as that in the previous case.
Perform the configurations based on the configuration in the
previous case.
If all the clients of the RR have established logically full-mesh
connections, the clients can transmit routes to each other without
requiring the RR to reflect routes to them. In this situation,
prohibit the RR from reflecting routes between clients so as to
reduce the RR load.

Command usage
The undo reflect between-clients command prohibits an RR from
reflecting routes between clients. This command is executed on an
RR. After it is executed, the clients exchange BGP routes directly,
and R2 no longer needs to reflect routes between them. However, R2
still reflects the routes that are advertised by non-clients.

View
BGP view

Configuration verification
Run the display bgp peer command to view detailed BGP peer
information.
Prohibiting BGP routes from being added to the IP routing table, so
that the RR does not forward packets, also reduces the RR load. In a
full-mesh scenario, however, disabling route reflection between
clients better meets the requirement.

Case description
The topology in this case is the same as that in the previous case.
To meet the first requirement, use a route-policy to advertise
interface routing information.
To meet the second requirement, use an IP prefix list to filter
routes.

Command usage
The peer ip-prefix command configures a route filtering policy based
on an IP prefix list for a peer or peer group.

View
BGP view

Parameters
peer { group-name | ipv4-address } ip-prefix ip-prefix-name { import | export }
group-name: specifies the name of a peer group.
ipv4-address: specifies the IPv4 address of a peer.
ip-prefix-name: specifies the name of an IP prefix list.
import: applies a filtering policy to the routes received from a
peer or peer group.
export: applies a filtering policy to the routes sent to a peer or
peer group.

Configuration verification
Run the display bgp routing-table command to view the BGP routing
table.
For the same node in a route-policy, the relationship between
if-match clauses is AND. A route needs to meet all the matching
rules before the actions defined by apply clauses are performed.
The relationship between the if-match clauses in the if-match
route-type and if-match interface commands is "OR", but the
relationship between the if-match clauses in the two commands and
other commands is "AND".
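
The AND relationship within one route-policy node can be sketched as follows. This models only the general rule; the OR exception for if-match route-type and if-match interface is not modeled, and the route fields and the MED value 100 are hypothetical.

```python
from typing import Callable, Dict, List

Route = Dict[str, object]
IfMatch = Callable[[Route], bool]


def node_matches(route: Route, if_match_clauses: List[IfMatch]) -> bool:
    """A node matches only if every if-match clause matches (AND)."""
    return all(clause(route) for clause in if_match_clauses)


def apply_node(route: Route, if_match_clauses: List[IfMatch],
               apply_actions: Dict[str, object]) -> Route:
    """Run the apply actions only when all if-match clauses are met."""
    if not node_matches(route, if_match_clauses):
        return route
    updated = dict(route)
    updated.update(apply_actions)
    return updated


clauses = [
    lambda r: str(r["prefix"]).startswith("172.16."),
    lambda r: r["protocol"] == "direct",
]
matched = apply_node({"prefix": "172.16.1.0/24", "protocol": "direct", "med": 0},
                     clauses, {"med": 100})
unmatched = apply_node({"prefix": "172.16.1.0/24", "protocol": "static", "med": 0},
                       clauses, {"med": 100})
```

The second route meets only one of the two clauses, so its attributes are left unchanged.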


Case description
This case is an extension to the previous case. Perform the
configuration based on the configuration in the previous case.
In requirement 2, the delivery of a default route depends on route
172.16.0.0/16. If route 172.16.0.0/16 disappears, the default route
also disappears.

Command usage
The peer route-policy command specifies a route-policy to control
routes received from or to be advertised to a peer or peer group.
The peer default-route-advertise command configures a BGP device to
advertise a default route to its peer or peer group.

View
peer route-policy: BGP view
peer default-route-advertise: BGP view

Parameters
peer ipv4-address route-policy route-policy-name { import | export }
ipv4-address: specifies an IPv4 address of a peer.
route-policy-name: specifies a route-policy name.
import: applies a route-policy to routes to be imported from a peer
or peer group.
export: applies a route-policy to routes to be advertised to a peer
or peer group.
peer { group-name | ipv4-address } default-route-advertise [ route-policy route-policy-name ] [ conditional-route-match-all { ipv4-address1 { mask1 | mask-length1 } } &<1-4> | conditional-route-match-any { ipv4-address2 { mask2 | mask-length2 } } &<1-4> ]
ipv4-address: specifies an IPv4 address of a peer.
route-policy route-policy-name: specifies a route-policy name.
conditional-route-match-all ipv4-address1 { mask1 | mask-length1 }:
specifies the IPv4 address and mask/mask length for conditional
routes. The default routes are sent to the peer or peer group only
when all conditional routes are matched.
conditional-route-match-any ipv4-address2 { mask2 | mask-length2 }:
specifies the IPv4 address and mask/mask length for conditional
routes. The default routes are sent to the peer or peer group
only when any conditional route is matched.
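
The difference between the match-all and match-any conditions can be sketched as below. The function name is an assumption, and the sketch compares prefixes as exact strings, which simplifies how a device actually looks up conditional routes.

```python
from typing import Iterable, Sequence


def should_advertise_default(routing_table: Iterable[str],
                             match_all: Sequence[str] = (),
                             match_any: Sequence[str] = ()) -> bool:
    """Decide whether the default route is sent to the peer."""
    table = set(routing_table)
    if match_all:
        return all(prefix in table for prefix in match_all)
    if match_any:
        return any(prefix in table for prefix in match_any)
    return True  # no condition configured: advertise unconditionally


table_up = ["172.16.0.0/16", "10.0.0.0/24"]
table_down = ["10.0.0.0/24"]  # route 172.16.0.0/16 has disappeared

sent_when_up = should_advertise_default(table_up, match_all=["172.16.0.0/16"])
sent_when_down = should_advertise_default(table_down, match_all=["172.16.0.0/16"])
sent_any = should_advertise_default(table_down,
                                    match_any=["172.16.0.0/16", "10.0.0.0/24"])
```

This mirrors the case requirement: when 172.16.0.0/16 disappears, the condition fails and the default route is no longer delivered.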

Configuration verification
Run the display ip routing-table command to view information about
the IP routing table.

Case description
This case is an extension to the previous case. Perform the
configuration based on the configuration in the previous case.

Command usage
The aggregate command creates an aggregated route in the BGP routing
table.

View
BGP view

Parameters
aggregate ipv4-address { mask | mask-length } [ as-set | attribute-policy route-policy-name1 | detail-suppressed | origin-policy route-policy-name2 | suppress-policy route-policy-name3 ] *
ipv4-address: specifies the IPv4 address of an aggregated route.
mask: specifies the network mask of an aggregated route.
mask-length: specifies the network mask length of an aggregated
route.
as-set: generates a route with the AS_SET attribute.
attribute-policy route-policy-name1: specifies the name of an
attribute policy for aggregated routes.
detail-suppressed: advertises only the aggregated route.
origin-policy route-policy-name2: specifies the name of a policy
that allows route aggregation.
suppress-policy route-policy-name3: specifies the name of a policy
for suppressing the advertisement of specified routes.

Precautions
During manual or automatic summarization, routes pointing to NULL0
are generated locally.

Configuration verification
Run the display ip routing-table protocol bgp command to view the
routes learned by BGP.

Case description
This case is an extension to the previous case. Perform the
configuration based on the configuration in the previous case.
BGP on-demand route advertisement requires ORF to be enabled on R4,
R5, and R6.

Command usage
The peer capability-advertise orf command enables prefix-based ORF
for a peer or peer group.

View
BGP view

Parameters
peer { group-name | ipv4-address } capability-advertise orf [ cisco-compatible ] ip-prefix { both | receive | send }
group-name: specifies the name of a peer group.
ipv4-address: specifies the IPv4 address of a peer.
cisco-compatible: makes the capability compatible with Cisco
devices.
both: allows the device to send and receive ORF packets.
receive: allows the device only to receive ORF packets.
send: allows the device only to send ORF packets.

Precautions
BGP ORF has three modes: send, receive, and both. In send mode, a
BGP device can send ORF information. In receive mode, a BGP device
can receive ORF information. In both mode, a BGP device can send and
receive ORF information.
To enable a BGP device that advertises routes to receive ORF
IP-prefix information, configure this device to work in receive or
both mode and the peer device to work in send or both mode.

Configuration verification
Run the display bgp peer 1.1.1.1 orf ip-prefix command to view
prefix-based BGP ORF information received from a specified peer.

Case description
IP addresses used to interconnect devices are as follows:
If RTX connects to RTY, the interconnection addresses are XY.1.1.X
and XY.1.1.Y, with a 24-bit mask.

Results
The configuration is the basic OSPF configuration.

Results
Run the display bgp peer command to view the BGP peer status.
Run the display bfd session all command to view the BFD session. In
the command output, D_IP_IF indicates that a BFD session is
dynamically created and bound to an interface.

Results
Run the display bgp routing-table command to view BGP routing
entries. The command output shows that R3 learns two routes to
10.0.0.0/24, one from R2 and one from R4. According to BGP route
selection rules, R3 prefers the route 10.0.0.0/24 learned from R2.

Case description
This case is an extension to the previous case. Perform the
configuration based on the configuration in the previous case.

Analysis process
You can use peer groups to reduce the RR load.

Results
Run the display bgp routing-table community command to view the
Community attribute.

Results
Run the display bgp routing-table community command to view the
Community attribute. The Community attribute is no-export. That is,
the route is not advertised to EBGP peers.

ACL
An access control list (ACL) is a series of sequential rules composed
of permit and deny clauses. These rules match packet information to
classify packets. With the rules applied to a device, the device
permits or denies packets according to the rules.

IP prefix list
An IP prefix list filters routes according to a defined matching
mode.
An IP prefix list filters only routing information but not packets.

AS_Path filter
Each BGP route contains an AS_Path attribute. AS_Path filters
specify matching rules for the AS_Path attribute.
AS_Path filters are exclusively used in BGP.

Community filter
Community filters are exclusively used in BGP. A BGP route can carry
a Community attribute that identifies a community. Community filters
specify matching rules for the Community attribute.

ACL management rule


An ACL can contain multiple rules.
A rule is identified by a rule ID, which can be set by a user or
automatically generated based on the ACL step. All the rules in an
ACL are arranged in ascending order of rule IDs.
There is a step between rule IDs. If no rule ID is specified, the
system generates the ID according to the ACL step. Because the step
leaves gaps between IDs, you can insert new rules between existing
rules by specifying a rule ID.

ACL rule management
When a packet reaches a device, the search engine retrieves
information from the packet to constitute the key value and matches
the key value against the rules in an ACL. When a matching rule is
found, the system stops matching, and the packet is considered to
match that rule.
If no matching rule is found, the packet does not match any rule.

The action defined in the last rule of a Huawei ACL is permit by default.
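
The rule ordering and first-match behavior described above can be sketched as follows. The Rule type, the function names, and the step value of 5 are assumptions made for this example. The sketch returns None when nothing matches and leaves the default action to the caller.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    rule_id: int
    action: str   # "permit" or "deny"
    source: str   # source network in CIDR notation


def first_match(rules: List[Rule], src_ip: str) -> Optional[Rule]:
    """Check rules in ascending rule ID order and stop at the first match."""
    addr = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.rule_id):
        if addr in ipaddress.ip_network(rule.source):
            return rule
    return None  # the packet matches no rule


def next_rule_id(rules: List[Rule], step: int = 5) -> int:
    """Generate the next rule ID as the next multiple of the step."""
    if not rules:
        return step
    highest = max(r.rule_id for r in rules)
    return (highest // step + 1) * step


acl = [Rule(10, "deny", "10.1.1.0/24"), Rule(5, "permit", "10.1.0.0/16")]
hit = first_match(acl, "10.1.1.1")
miss = first_match(acl, "192.168.0.1")
```

Rule 5 is checked first here, so 10.1.1.1 is permitted even though rule 10 would deny it. A more specific rule therefore needs a smaller ID, which the gaps left by the step make possible.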

Interface-based ACL
Matches packets based on the rules defined on the inbound interface
of packets. You can run the traffic-filter command to reference an
interface-based ACL.

Basic ACL
Defines rules based on the source IP address, VPN instance, fragment
flag, and time range of packets.

Advanced ACL
Defines rules based on the source IP address, destination IP
address, IP precedence, ToS, DSCP, IP protocol type, ICMP type, TCP
source/destination port, and UDP source/destination port of packets.
An advanced ACL can define more accurate, diverse, and flexible
rules than a basic ACL.

Layer 2 ACL
Defines rules based on Ethernet frame header information in a
packet, including the source MAC address, destination MAC address,
and Ethernet frame protocol type.

ACL matching order


An ACL is composed of a list of rules. Each rule contains a deny or
permit clause. These rules may overlap or conflict. One rule can
contain another rule, but the two rules must be different.
Devices support two types of matching order: configuration order and
automatic order. The matching order determines the priorities of the
rules in an ACL. Rule priorities resolve conflicts between
overlapping rules.

Automatic order
The automatic order follows the depth-first principle.
ACL rules are arranged in sequence based on rule precision. For an
ACL rule (where a protocol type, a source IP address range, or a
destination IP address range is specified), the stricter the rule,
the more precise it is considered. For example, an ACL rule can be
configured based on the wildcard of an IP address. The smaller the
wildcard, the smaller the specified address range and the stricter
the ACL rule.
If rules have the same depth-first order, rules are matched in
ascending order of rule IDs.


re
Mo
en
m/
. co
w ei
ua
g .h
in
rn
ea
/l
:/
tp
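
The depth-first idea can be sketched in Python. This is a simplified illustration, not the device's actual algorithm: it assumes each rule carries only a rule ID and a source wildcard, and it ranks rules by the number of wildcard bits that are set, so fewer "don't care" bits means a more precise rule. The names `Rule`, `wildcard_bits`, and `auto_order` are hypothetical.

```python
from dataclasses import dataclass
import ipaddress


@dataclass(frozen=True)
class Rule:
    rule_id: int
    source: str      # source IP address
    wildcard: str    # wildcard mask; a 1 bit means "don't care"


def wildcard_bits(wildcard: str) -> int:
    """Count the 'don't care' bits in a dotted-decimal wildcard mask."""
    return bin(int(ipaddress.IPv4Address(wildcard))).count("1")


def auto_order(rules):
    """Sort rules most-precise first; break ties by ascending rule ID."""
    return sorted(rules, key=lambda r: (wildcard_bits(r.wildcard), r.rule_id))


rules = [
    Rule(5, "10.1.0.0", "0.0.255.255"),
    Rule(10, "10.1.1.0", "0.0.0.255"),
    Rule(15, "10.1.1.1", "0.0.0.0"),
    Rule(20, "10.1.2.0", "0.0.0.255"),
]
ordered_ids = [r.rule_id for r in auto_order(rules)]
print(ordered_ids)  # [15, 10, 20, 5]
```

The host rule (wildcard 0.0.0.0) is matched first, the two /24-sized rules follow in rule-ID order, and the broadest rule is matched last.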

Packet fragmentation supported by ACLs
In traditional packet filtering, only the first fragment of a packet needs to match rules, while the other fragments are allowed to pass through if the first fragment matches rules. In this situation, network attackers may construct subsequent fragments to launch attacks.
In an ACL rule, the fragment parameter indicates that the rule is valid for all fragmented packets. The none-first-fragment parameter indicates that the rule is valid only for non-first fragmented packets but not for non-fragmented packets or the first fragmented packet. Rules that contain neither the fragment nor the none-first-fragment parameter are valid for all packets (including fragmented packets).

ACL time range
You can make ACL rules valid only at a specified time or within a specified time range.

IP prefix list
An IP prefix list can contain multiple indexes. Each index corresponds to a node. The system matches a route against the nodes in ascending order of index. If the route matches a node, the system does not match the route against the other nodes. If the route does not match any node, the system filters out the route.
Depending on how the prefix is matched, an IP prefix list can implement accurate matching or matching within a specified mask length range. You can configure greater-equal and less-equal to specify the prefix mask length range. If neither keyword is configured, the IP prefix list implements accurate matching. That is, only routes with the same mask length as that specified in the IP prefix list are matched. If only greater-equal is configured, the mask length range is [greater-equal-value, 32]. If only less-equal is configured, the mask length range is [specified mask length, less-equal-value].
The mask length range must satisfy mask-length <= greater-equal-value <= less-equal-value <= 32.

Characteristics of an IP prefix list
When a route matches none of the nodes in an IP prefix list, the default matching mode is deny.
When the referenced IP prefix list does not exist, the default matching mode is permit.
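
The mask length rules above can be sketched in Python with the standard `ipaddress` module. This is a minimal model under stated assumptions, not device code: `Entry`, `mask_range`, and `match` are hypothetical names, only IPv4 is handled, and the sample prefixes are made up for illustration.

```python
import ipaddress
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    index: int
    permit: bool
    prefix: str                          # for example "10.0.0.0/16"
    greater_equal: Optional[int] = None
    less_equal: Optional[int] = None


def mask_range(entry):
    """Return the (lowest, highest) mask length that the entry accepts."""
    base_len = ipaddress.ip_network(entry.prefix).prefixlen
    ge, le = entry.greater_equal, entry.less_equal
    if ge is None and le is None:
        return base_len, base_len        # accurate matching
    if le is None:
        return ge, 32                    # only greater-equal configured
    if ge is None:
        return base_len, le              # only less-equal configured
    return ge, le


def match(entries, route):
    """Return True (permit) or False (deny) for a route such as '10.0.1.0/24'."""
    net = ipaddress.ip_network(route)
    for entry in sorted(entries, key=lambda e: e.index):
        base = ipaddress.ip_network(entry.prefix)
        low, high = mask_range(entry)
        # The route must lie inside the configured prefix and its mask
        # length must fall inside the allowed range.
        if low <= net.prefixlen <= high and net.subnet_of(base):
            return entry.permit
    return False                         # no node matched: default deny


entries = [
    Entry(10, True, "10.0.0.0/16"),
    Entry(20, True, "10.0.0.0/16", greater_equal=24, less_equal=28),
]
print(match(entries, "10.0.0.0/16"))   # True, accurate match on index 10
print(match(entries, "10.0.1.0/24"))   # True, mask length within [24, 28]
print(match(entries, "10.0.1.0/30"))   # False, mask length out of range
```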

An AS_Path filter is used only to filter BGP routes to be advertised or received, based on the AS_Path attributes contained in the BGP routes.
Because the number of the last AS that a route passes through is added to the leftmost position of the AS_Path list, configure an AS_Path filter with caution:
If a route originating from AS 100 passes through AS 300, AS 200, and AS 500, and then reaches AS 600, the AS_Path attribute of the route is (500 200 300 100).
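
The left-prepending behaviour can be illustrated with a few lines of Python. This is only a sketch of the ordering rule described above; the function name `as_path_at_destination` is hypothetical, and the AS numbers are the ones used in the example.

```python
def as_path_at_destination(origin_as, transit_ases):
    """Build the AS_Path seen by the AS that finally receives the route."""
    path = []
    for asn in [origin_as, *transit_ases]:
        # Each AS adds its own number at the leftmost position
        # when it advertises the route to the next AS.
        path.insert(0, asn)
    return path


path = as_path_at_destination(100, [300, 200, 500])
print(path)  # [500, 200, 300, 100]
```

The rightmost entry is therefore the originating AS, and the leftmost entry is the neighboring AS from which the route was received.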

A Community filter is used only to filter BGP routes to be advertised or received, based on the Community attributes contained in the BGP routes.
The Community attribute includes basic and extended community attributes.
Self-defined community attributes and well-known communities are basic community attributes.
RT and SoO in MPLS VPN are extended community attributes.

A route policy is used to filter routes and set attributes for routes. By changing route attributes (including reachability), a route policy changes the path that network traffic passes through.
A route policy is often used in the following scenarios:
Control route importing.
    Using a route policy, you can prevent sub-optimal routes and routing loops during the import of routes.
Control route receiving and advertising.
    Using a route policy, you can receive or advertise specified routes according to network requirements.
Set attributes for routes.
    Using a route policy, you can modify the attributes of routes to optimize a network.

Route policy principles
A route policy consists of multiple nodes. The system checks routes against the nodes of a route policy in ascending order of the node IDs. A node contains multiple if-match and apply clauses. The if-match clauses define matching conditions of a node, while apply clauses define the actions to be performed on the routes that match if-match clauses.
The relationship between the if-match clauses of a node is AND. That is, a route matches a node only when the route matches all the if-match clauses of the node, and the actions defined by apply clauses are performed on a route only when the route meets all these matching conditions.
The relationship between the nodes of a route policy is OR. That is, a route matches a route policy as long as the route matches one node of the route policy. If a route does not match any node, the route fails to match the route policy.
The relationship between the if-match clauses in the if-match route-type and if-match interface commands is OR, but the relationship between the if-match clauses in the two commands and other commands is AND.
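
The AND/OR relationships can be made concrete with a small Python model. It is a sketch under stated assumptions rather than the device's implementation: routes are plain dictionaries, if-match clauses are predicates, apply clauses are functions that modify a copy of the route, and the names `Node` and `run_policy` as well as the tag and cost values are hypothetical. It also relies on the behaviour that a node without if-match clauses matches every route.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Route = Dict[str, object]


@dataclass
class Node:
    node_id: int
    permit: bool
    if_match: List[Callable[[Route], bool]] = field(default_factory=list)
    apply: List[Callable[[Route], None]] = field(default_factory=list)


def run_policy(nodes: List[Node], route: Route) -> Optional[Route]:
    """Return the (possibly modified) route if it passes the policy, else None."""
    for node in sorted(nodes, key=lambda n: n.node_id):
        # AND between the if-match clauses of one node.
        if all(cond(route) for cond in node.if_match):
            if not node.permit:
                return None          # deny node: no further node is tried
            result = dict(route)
            for action in node.apply:
                action(result)
            return result
        # OR between nodes: a failed node hands the route to the next node.
    return None                      # no node matched: the route fails the policy


nodes = [
    Node(10, False, [lambda r: r["tag"] == 100]),
    Node(20, True,
         [lambda r: r["protocol"] == "ospf",
          lambda r: r["prefix"].startswith("10.")],
         [lambda r: r.update(cost=50)]),
]

passed = run_policy(nodes, {"prefix": "10.0.0.0/24", "protocol": "ospf", "tag": 0})
denied = run_policy(nodes, {"prefix": "10.0.0.0/24", "protocol": "ospf", "tag": 100})
unmatched = run_policy(nodes, {"prefix": "10.0.0.0/24", "protocol": "rip", "tag": 0})
print(passed, denied, unmatched)
```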

In the topology, dual-node bidirectional route advertisement is implemented.
In the topology, R1 imports route 10.0.0.0/24 into OSPF. R3 imports OSPF routes into IS-IS, and R2 learns route 10.0.0.0/24 through IS-IS. In this case, R2 learns route 10.0.0.0/24 through both OSPF and IS-IS. R2 prefers the route learned through IS-IS because this route has a higher priority than the external route learned through OSPF. Therefore, R2 reaches 10.0.0.0/24 along the path R4 -> R3 -> R1. To optimize the path, use a route policy to make the OSPF ASE priority higher than the IS-IS priority. This modification prevents R2 from using a sub-optimal route.
When the interface that connects R1 to network 10.0.0.0/24 goes Down, R2 still imports route 10.0.0.0/24 into OSPF because it has learned the route through IS-IS, even though the external LSA has aged out in the OSPF area. R1 and R3 then learn route 10.0.0.0/24. When R2 accesses network 10.0.0.0/24, traffic passes through R4 -> R3 -> R1 -> R2, causing a routing loop. In this scenario, use a tag to prevent routing loops.

Control route receiving and advertising
Only necessary and valid routes are received, which limits the routing table size and improves network security.

Topology description
R4 imports routes 10.0.X.0/24 into OSPF. According to service requirements, R1 can receive only routes 10.0.0.0/24 and 10.0.1.0/24, while R2 can receive only routes 10.0.2.0/24 and 10.0.3.0/24. You can use a filter policy to meet this requirement.

Generally, only routing information is filtered, but link state information is not filtered.
In OSPF, incoming and outgoing Type 3, Type 5, and Type 7 LSAs can be filtered.
Link-state routing protocols, such as OSPF and IS-IS, can filter only incoming routes but not the LSAs that carry these routes. That is, OSPF and IS-IS do not add the filtered routes to the local routing tables, but LSAs of these routes are still transmitted in the OSPF or IS-IS area.
The routes imported from other protocols can also be filtered. For example, you can use the filter-policy export command to filter the routes imported from RIP before they are advertised. Only the external routes that pass the filtering can be converted into AS-external LSAs and advertised. In this situation, neighbors do not learn the filtered routes imported from RIP. This configuration can be performed only in the outbound direction.

Topology description
You can modify the Local_Pref attribute contained in a route using a route policy to change the path of traffic. R2 learns route 10.0.0.0/24 through EBGP and sets its Local_Pref value to 300. R3 learns route 10.0.0.0/24 through EBGP and sets its Local_Pref value to 200. R1, R2, and R3 exchange routes through IBGP, so AS 100 ultimately prefers R2 to reach 10.0.0.0/24.

PBR is a mechanism that selects routes based on user-defined policies. It includes local PBR, interface PBR, and SPR. This course discusses only local PBR.
IP unicast PBR has the following advantages:
Allows you to define policies for route selection according to service requirements, which improves route selection flexibility and controllability.
Sends different data flows through different links, which improves link efficiency.
Uses low-cost links to transmit service data without affecting service quality, which reduces the cost of enterprise data services.

Matching process
If a device finds a matching local PBR node, the device processes packets as follows:
Step 1 Checks whether the priority of packets has been set.
    If so, the device applies the configured priority to the packets and performs step 2.
    If not, the device performs step 2.
Step 2 Checks whether an outbound interface has been configured for the local PBR.
    If so, the device sends packets from the outbound interface.
    If not, the device performs step 3.
Step 3 Checks whether next hops have been configured for the local PBR. You can configure two next hops to implement load balancing.
    If so, the device sends packets to the next hops.
    If not, the device searches the routing table for a route based on the destination addresses of packets. If no route is available, the device performs step 4.
Step 4 Checks whether the default outbound interface has been configured for the local PBR.
    If so, the device sends the packets from the default outbound interface.
    If not, the device performs step 5.
Step 5 Checks whether the default next hop has been configured for the local PBR.
    If so, the device sends the packets to the default next hop.
    If not, the device performs step 6.
Step 6 Discards the packets and generates ICMP_UNREACH messages.
If the device does not find a matching local PBR node, it searches the routing table for a route based on the destination addresses of the packets and then sends the packets.
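
Steps 2 to 6 of the matching process can be written as a short decision chain. The sketch below is illustrative only: `PbrNode` and `forward` are hypothetical names, step 1 (setting the packet priority) is left out because it does not affect the forwarding decision, and the routing-table lookup is passed in as a ready-made result.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PbrNode:
    out_interface: Optional[str] = None
    next_hops: List[str] = field(default_factory=list)
    default_out_interface: Optional[str] = None
    default_next_hop: Optional[str] = None


def forward(node: PbrNode, route_lookup: Optional[str]) -> str:
    """Describe how a packet that matches `node` is forwarded.

    `route_lookup` is the result of the normal routing-table lookup for the
    packet's destination, or None when no route is available.
    """
    if node.out_interface:                      # step 2
        return f"interface {node.out_interface}"
    if node.next_hops:                          # step 3
        return f"next hop {', '.join(node.next_hops)}"
    if route_lookup:                            # step 3, routing-table fallback
        return f"routing table via {route_lookup}"
    if node.default_out_interface:              # step 4
        return f"default interface {node.default_out_interface}"
    if node.default_next_hop:                   # step 5
        return f"default next hop {node.default_next_hop}"
    return "discard (ICMP unreachable)"         # step 6


print(forward(PbrNode(next_hops=["12.1.1.2", "21.1.1.1"]), "12.1.1.2"))
print(forward(PbrNode(default_next_hop="12.1.1.2"), None))
print(forward(PbrNode(), None))
```

Note that the default outbound interface and default next hop are consulted only after the routing-table lookup has failed, which is what distinguishes them from the settings checked in steps 2 and 3.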

Case description
IP addresses used to interconnect devices are as follows:
If RTX connects to RTY, the interconnected addresses are XY.1.1.X and XY.1.1.Y. The network mask is 24 bits.
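
The addressing rule can be expressed as a small helper. This is a sketch that assumes single-digit router numbers, so that X and Y can simply be written next to each other to form the first octet; the function name `link_addresses` is hypothetical.

```python
def link_addresses(x: int, y: int, mask: int = 24):
    """Return the addresses of RTX and RTY on the link that connects them."""
    network = f"{x}{y}.1.1"          # first octet is the digits X and Y side by side
    return f"{network}.{x}/{mask}", f"{network}.{y}/{mask}"


print(link_addresses(1, 2))  # ('12.1.1.1/24', '12.1.1.2/24')
print(link_addresses(3, 4))  # ('34.1.1.3/24', '34.1.1.4/24')
```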

Command usage
The route-policy command creates a route policy and displays the route-policy view.

View
System view

Parameters
route-policy route-policy-name { permit | deny } node node
    route-policy-name: specifies the name of a route policy.
    permit: specifies the matching mode of the route policy as permit. In permit mode, if a route matches all the if-match clauses of a node, the route matches the route policy, and the actions defined by the apply clauses of the node are performed on the route. Otherwise, the route continues to match the next node.
    deny: specifies the matching mode of the route policy as deny. In deny mode, if a route matches all the if-match clauses of a node, the route does not match the route policy and cannot match the next node.
    node node: specifies the index of a node in the route policy.

Precautions
A route policy is used to filter routes and set attributes for the routes that match the route policy. A route policy consists of multiple nodes. One node contains multiple if-match and apply clauses.
The if-match clauses define matching conditions for this node, and the apply clauses define the actions to be performed on the routes that meet the matching conditions. The relationship between if-match clauses is AND. That is, a route must match all the if-match clauses of a node. The relationship between the nodes of a route policy is OR. That is, if a route matches a node, the route matches the route policy. If a route does not match any node, the route does not match the route policy.

Case description
The topology in this case is the same as that in the previous case. Perform the configuration based on the configuration in the previous case.
In requirement 2, use the least number of commands to implement the optimal configuration.

Command usage
The filter-policy export command filters imported routes to be advertised according to the policy.

View
OSPF view

Parameters
filter-policy { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } export [ protocol [ process-id ] ]
    acl-number: specifies the number of a basic ACL.
    acl-name acl-name: specifies the name of an ACL.
    ip-prefix ip-prefix-name: specifies the name of an IP prefix list.
    protocol: specifies the protocol that advertises routing information.
    process-id: specifies a process ID when the protocol that advertises routing information is RIP, IS-IS, or OSPF.

Precautions
After external routes are imported into OSPF using the import-route command, you can run the filter-policy export command to filter the imported routes to be advertised.
This configuration allows only the external routes that meet the matching conditions to be translated into Type 5 LSAs (AS-external-LSAs) and advertised. In this case, routing loops are prevented.
You can specify protocol or process-id to filter the routes of a specified protocol or process. If no protocol or process-id is specified, OSPF filters all of the imported routes.

Case description
The topology in this case is the same as that in the previous case. After meeting the requirements, check whether sub-optimal routes and routing loops exist.

Results
After routing protocols import routes from each other, R4 reaches 172.16.X.0/24 through a sub-optimal route (RIP route 172.16.X.0/24). This is because R4 first learns OSPF route 172.16.X.0/24 and then learns RIP route 172.16.X.0/24. In fact, the optimal route is OSPF route 172.16.X.0/24. However, the preference of OSPF external routes is 150 and the preference of RIP is 100, so R4 selects the RIP route and reaches 172.16.X.0/24 through a sub-optimal route.
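
The selection between routes of different protocols comes down to comparing preference values, where the smaller value wins. The sketch below uses the preference values quoted in this course (150 for OSPF external routes, 100 for RIP, 15 for IS-IS); the function name `best_protocol` and the adjusted value 90 are illustrative assumptions only.

```python
PREFERENCE = {"IS-IS": 15, "RIP": 100, "OSPF ASE": 150}


def best_protocol(candidates, preference=PREFERENCE):
    """Pick the protocol whose route is installed: the smallest preference value wins."""
    return min(candidates, key=lambda proto: preference[proto])


# With default values, the RIP route beats the OSPF external route.
default_choice = best_protocol(["OSPF ASE", "RIP"])

# Lowering the OSPF external preference below RIP (90 is an arbitrary
# example value) makes the OSPF external route the preferred one.
adjusted = {**PREFERENCE, "OSPF ASE": 90}
adjusted_choice = best_protocol(["OSPF ASE", "RIP"], adjusted)

print(default_choice, adjusted_choice)  # RIP OSPF ASE
```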

Case description
This case is an extension to the previous case. Perform the configuration based on the configuration in the previous case.
To meet requirement 1, ensure that R4 accesses 172.16.X.0/24 through OSPF, to avoid reaching 172.16.X.0/24 through a sub-optimal route.
To meet requirement 2, use tags to control dual-node bidirectional route importing so as to prevent routing loops.

Results
If routes are not filtered during bidirectional route importing, routing loops occur when the network environment changes. To avoid loops, ensure that each routing protocol imports only the routes that originate in the other routing domain.
Compared with the configuration in the previous case, the advantage of using tags is that specific routing entries do not need to be listed. When the routes in a routing domain change, the filtering adapts without manual intervention, which gives good scalability.
Although this configuration prevents routing loops, the sub-optimal route still exists.
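
The tag technique can be sketched as two import functions. This is a simplified model, not device configuration: routes are dictionaries, the tag value 100 and the function names are hypothetical, and only one direction of the bidirectional import is shown (the other direction mirrors it with a second tag).

```python
RIP_ORIGIN_TAG = 100    # hypothetical tag value that marks "came from RIP"


def import_rip_into_ospf(rip_prefixes, tag=RIP_ORIGIN_TAG):
    """On the first border router: tag every route that enters OSPF from RIP."""
    return [{"prefix": p, "tag": tag} for p in rip_prefixes]


def import_ospf_into_rip(ospf_routes, blocked_tag=RIP_ORIGIN_TAG):
    """On the second border router: skip routes that carry the RIP-origin tag."""
    return [r["prefix"] for r in ospf_routes if r.get("tag") != blocked_tag]


ospf_table = import_rip_into_ospf(["172.16.1.0/24"])
ospf_table.append({"prefix": "10.0.0.0/24", "tag": 0})   # a native OSPF route

back_into_rip = import_ospf_into_rip(ospf_table)
print(back_into_rip)  # ['10.0.0.0/24']
```

Because the filter keys on the tag rather than on prefixes, new RIP routes are blocked from re-entering RIP without any change to the filter.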

The sub-optimal route arises because, with dual-node bidirectional route importing, one of R3 and R4 learns network 172.16.X.0/24 from both OSPF and RIP. Because the preference value of OSPF external routes is greater than that of RIP, that router (R3 or R4) reaches 172.16.X.0/24 through a sub-optimal route. To solve this problem, modify the preference value of OSPF external routes to be smaller than that of RIP. Note that setting the preference value of OSPF external routes smaller than that of OSPF internal routes is unreasonable.

Case description
This case is an extension to the previous case. Perform the configuration based on the configuration in the previous case.

Results
When only route summarization is performed, two problems exist: R5 learns the summary route, and a routing loop occurs between R3 and R4 when R2 pings a nonexistent IP address.
The first problem occurs because, after R3 and R4 learn the summary routes generated by each other, they import the summary routes into the RIP area again.
The second problem occurs because, after R3 and R4 learn the summary routes generated by each other, they add the summary routes to their routing tables.
To address the two problems, prevent R3 and R4 from learning the summary routes generated by each other and from importing these routes into the RIP area. That is, filter the summary route learned from each other on R3 and R4.

Configure a filter policy on R3 and R4 so that they do not receive the specified OSPF summary routes. This ensures that the summary routes are not imported into the RIP domain, which prevents routing loops.

Case description
This case is an extension to the previous case. Perform the configuration based on the configuration in the previous case.

Command usage
The policy-based-route command creates or modifies a PBR.
The ip local policy-based-route command enables local PBR.

View
policy-based-route: system view
ip local policy-based-route: system view

Parameters
policy-based-route policy-name { permit | deny } node node-id
    policy-name: specifies the PBR name.
    permit: performs PBR on the packets that meet matching conditions.
    deny: does not perform PBR on the packets that meet matching conditions.
    node-id: specifies the ID of a node.
ip local policy-based-route policy-name
    policy-name: specifies a PBR name.

Precautions
When deploying PBR, do not configure a broadcast interface such as an Ethernet interface as the outbound interface of packets.

Configuration verification
Run the display bgp peer 1.1.1.1 orf ip-prefix command to view prefix-based BGP ORF information received from a specified peer.

Case description
IP addresses used to interconnect devices are designed as follows:
If RTX connects to RTY, the interconnected addresses are XY.1.1.X and XY.1.1.Y. The network mask is 24 bits.

Results
When R5 imports routes, accurate matching must be performed.

Results
When you tracert a nonexistent IP address that belongs to 10.0.0.0/16, a routing loop occurs. This is because no route pointing to Null0 is generated when OSPF generates a summary route.

Results
You can configure static routes pointing to Null0 on R5 using a command to prevent routing loops.

Case description
This case is an extension to the previous case. Perform the configuration based on the configuration in the previous case.
IP addresses used to interconnect devices are designed as follows:
If RTX connects to RTY, the interconnected addresses are XY.1.1.X and XY.1.1.Y. The network mask is 24 bits.
The IP address of R1 S0/0/0 is 12.1.1.1/24, and the IP address of R2 S0/0/0 is 12.1.1.2/24. The IP address of R1 S0/0/1 is 21.1.1.2/24, and the IP address of R2 S0/0/1 is 21.1.1.1/24.

Results
Use an ACL and the route-policy command to import the two network segments into IS-IS. The filter-policy XXX export command is also commonly used to control which imported routes are advertised.

Results
Use tags to prevent routing loops. For IS-IS to support tags, the cost style must be wide. Otherwise, IS-IS routes cannot be tagged.
To prevent the sub-optimal route, modify the preference of OSPF external route 10.0.0.0/16 to be smaller than that of IS-IS routes.

Results
The configuration in this case avoids sub-optimal routes on R3 and R4. Because R3 and R4 do not import routes at exactly the same time, one of them learns 10.0.0.0/16 from both IS-IS and OSPF. For example, if R3 imports routes earlier, R4 learns 10.0.0.0/16 from both IS-IS and OSPF and compares their preferences. The preference of OSPF external routes is 150 and the preference of IS-IS is 15, so R4 prefers IS-IS to reach network 10.0.0.0/16, but this route is sub-optimal. Modifying the preference of 10.0.0.0/16 on R4 to be smaller than the preference value of IS-IS eliminates the sub-optimal route. Note that setting the preference value of OSPF external routes smaller than that of OSPF internal routes is unreasonable.

Results
Use local PBR to meet this requirement.

VLAN technology brings the following benefits:
Limits broadcast domains. A broadcast domain is limited in a VLAN. This saves bandwidth and improves network processing capabilities.
Enhances network security. Packets from different VLANs are separately transmitted. Hosts in a VLAN cannot directly communicate with hosts in another VLAN.
Improves network robustness. A fault in a VLAN does not affect hosts in other VLANs.
Flexibly sets up virtual groups. With VLAN technology, hosts in different geographical areas can be grouped together. This facilitates network construction and maintenance.

Topology description
S1 and S2 are located in different positions. Each switch connects to two computers, and the computers belong to two different VLANs. The dashed box indicates a VLAN.
By default, PCs in VLAN 2 cannot communicate with PCs in VLAN 3. That is, broadcast packets are limited in a VLAN.

IEEE 802.1Q
IEEE 802.1Q is an Ethernet networking standard that specifies an Ethernet frame format. It adds the 4-byte 802.1Q Tag field between the Source address and the Length/Type fields of the original frame.

Subfields in the 802.1Q Tag field:
TPID: is short for Tag Protocol Identifier and indicates the frame type. It has 2 bytes. The value 0x8100 indicates an 802.1Q-tagged frame. An 802.1Q-incapable device discards the received 802.1Q frame.
PRI: is short for priority and indicates the frame priority. It has 3 bits. The value ranges from 0 to 7. The greater the value, the higher the priority. When QoS is deployed on a switch, the switch first sends data frames with higher priority.
CFI: is short for Canonical Format Indicator and indicates whether the MAC address is in canonical format. The value 0 indicates the MAC address in canonical format and the value 1 indicates the MAC address in non-canonical format. CFI is used to differentiate Ethernet frames, Fiber Distributed Data Interface (FDDI) frames, and token ring network frames. The value is 0 on the Ethernet.
VID: is short for VLAN ID and indicates the VLAN to which a frame belongs. It has 12 bits.

Each frame sent by an 802.1Q-capable switch can carry a VLAN ID.
In a VLAN, Ethernet frames are classified into the following types:
Tagged frame: frame with the 4-byte 802.1Q tag
Untagged frame: frame without the 4-byte 802.1Q tag
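
The layout of the 4-byte tag can be shown by packing and unpacking it with Python's `struct` module. The sketch assumes the field order given above (TPID, then PRI, CFI, and VID sharing the remaining 16 bits, with CFI taking the single bit between PRI and VID); the helper names `build_tag`, `parse_tag`, and `insert_tag` are hypothetical.

```python
import struct

TPID_8021Q = 0x8100


def build_tag(pri: int, cfi: int, vid: int) -> bytes:
    """Pack PRI (3 bits), CFI (1 bit), and VID (12 bits) behind the 2-byte TPID."""
    if not (0 <= pri <= 7 and cfi in (0, 1) and 0 <= vid <= 0xFFF):
        raise ValueError("field out of range")
    tci = (pri << 13) | (cfi << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)   # network byte order


def parse_tag(tag: bytes) -> dict:
    """Split a 4-byte 802.1Q tag back into its subfields."""
    tpid, tci = struct.unpack("!HH", tag)
    return {"tpid": tpid, "pri": tci >> 13, "cfi": (tci >> 12) & 1, "vid": tci & 0xFFF}


def insert_tag(frame: bytes, tag: bytes) -> bytes:
    """Insert the tag after the destination and source MAC addresses (12 bytes)."""
    return frame[:12] + tag + frame[12:]


tag = build_tag(pri=5, cfi=0, vid=2)
print(tag.hex())        # 8100a002
print(parse_tag(tag))
```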

The following link types are available:
Access link: Usually connects a host to a switch. Generally, a host does not need to know which VLAN it belongs to, and host hardware cannot distinguish frames with VLAN tags. Hosts therefore send and receive only untagged frames along access links.
Trunk link: Usually connects a switch to another switch or a router. Data of different VLANs is transmitted along a trunk link. The two ends of a trunk link must be able to distinguish frames using VLAN tags, and so only tagged frames are transmitted along trunk links.

Topology description
A host does not need to know the VLAN to which it belongs. It sends only untagged frames.
After receiving an untagged frame from a host, a switching device determines the VLAN to which the frame belongs based on the configured VLAN assignment method such as interface information. The switching device then processes the frame accordingly.
If a frame needs to be forwarded to another switching device, the frame must be transparently transmitted along a trunk link. Frames transmitted along trunk links must carry VLAN tags to allow other switching devices to properly forward the frame based on the VLAN information.
After a switching device determines the outbound interface of a frame and before the frame is sent to the destination host, the switching device connected to the destination host removes the VLAN tag from the frame to ensure that the host receives an untagged frame.

Interface types
An access interface on a switch connects to an interface on a host. It can only connect to access links.
    The access interface allows only the VLAN whose ID is the same as the Port Default VLAN ID (PVID).
    If the access interface receives untagged frames from the remote device, the switch adds the PVID to the untagged frames.
    Ethernet frames sent by the access interface are always untagged frames.
A trunk interface on a switch connects to another switch. It can only connect to trunk links.
    The trunk interface allows frames from multiple VLANs to pass through.
    If the tag in the frame sent by the trunk interface is the same as the PVID, the switch removes the tag from the frame. The trunk interface sends untagged frames in this situation only.
    If the tag in the frame sent by the trunk interface is different from the PVID, the switch directly sends the frame.
A hybrid interface on a switch can connect to either a host or another switch. It can connect to either access or trunk links.
    The hybrid interface allows frames from multiple VLANs to pass through and can remove tags from frames on the outbound interface.
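
The access and trunk rules can be captured in two small functions. This is a minimal sketch of the behaviour described above, not switch firmware: `classify` and `egress` are hypothetical names, and hybrid interfaces are left out because their per-VLAN tagging settings are not detailed here.

```python
from typing import Optional, Set


def classify(tag_vid: Optional[int], pvid: int) -> int:
    """Ingress: an untagged frame (None) takes the PVID; a tagged frame keeps its VID."""
    return pvid if tag_vid is None else tag_vid


def egress(port_type: str, vid: int, pvid: int, allowed: Set[int]) -> Optional[str]:
    """Egress: return 'untagged', 'tagged', or None when the frame is not sent."""
    if port_type == "access":
        # An access interface carries only its PVID VLAN and always strips the tag.
        return "untagged" if vid == pvid else None
    if port_type == "trunk":
        if vid not in allowed:
            return None
        # Only frames of the PVID VLAN leave a trunk interface untagged.
        return "untagged" if vid == pvid else "tagged"
    raise ValueError(f"unsupported port type: {port_type}")


vid = classify(None, pvid=2)                           # host frame arrives on an access port
print(vid)                                             # 2
print(egress("trunk", vid, pvid=1, allowed={2, 3}))    # tagged
print(egress("access", vid, pvid=2, allowed={2}))      # untagged
```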

Interface-based VLAN assignment


VLANs are assigned based on interface numbers.
ht

The network administrator configures a PVID for each switch


interface, that is, an interface belongs to a VLAN by default.
When an untagged data frame reaches a switch
s:

interface that has the PVID configured, the PVID is


ce

added to the frame.


When a data frame carries a VLAN tag, the switch
ur

does not add a VLAN tag to the data frame even if the
interface is configured with a PVID.
so

Different types of interfaces process VLAN frames in different


manners.
Re

MAC address-based VLAN assignment


VLANs are assigned based on MAC addresses.
ng

The network administrator needs to configure the mappings


ni

between MAC addresses and VLAN IDs. When the switch


receives an untagged frame, it searches for the VLAN entry
ar

matching the source MAC address of the frame and adds the
VLAN ID to the frame.
Le

IP subnet-based VLAN assignment


When receiving an untagged frame, the switch adds a VLAN
re

tag to the packet based on the source IP address of the packet.


Mo
en
Protocol-based VLAN assignment

m/
VLAN IDs are allocated to packets received on an interface
according to the protocol (suite) type and encapsulation format

co
of the packets. The network administrator needs to configure
the mappings between protocol types and VLAN IDs. When

.
the switch receives an untagged frame, it searches the

ei
protocol-VLAN mapping table for a VLAN tag mapping the

w
protocol of the frame and adds it to the frame.

ua
The protocol support vlan assignment contains
IPV4\IPV6\IPX\AppleTalk(AT), encapsulation type is Ethernet

.h
II802.3 raw802.2 LLC802.2 SNAP.

g
Policy-based VLAN assignment

in
Terminals MAC addresses and IP addresses need to be

rn
configured and associated with VLANs on the switch. Only
terminals matching conditions can be added to a specified

ea
VLAN. After terminals matching conditions are added to the
VLAN, changes of the IP addresses or MAC addresses may
/l
cause the terminals to be removed from the VLAN.

Topology description
To implement communication within VLAN 2 and within VLAN 3 across the trunk link between S1 and S2, add Port 2 on S1 and Port 1 on S2 to VLAN 2 and VLAN 3.
PC1 sends a frame to PC2 as follows:
The frame is first sent to Port 4 on S1.
Port 4 adds a tag to the frame. The VID field of the tag is 2, that is, the ID of the VLAN to which Port 4 belongs.
S1 sends the frame to all interfaces in VLAN 2 except Port 4 (assuming that the MAC address table is empty).
Port 2 forwards the frame to S2.
After receiving the frame, S2 determines from the tag that the frame belongs to VLAN 2. S2 sends the frame to all interfaces in VLAN 2 except Port 1.
Port 3 sends the frame to PC2.
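The flooding steps above can be sketched in Python as follows. This is a minimal model that assumes an empty MAC address table. The function name and the VLAN membership dictionaries are hypothetical; only the port names come from the topology description.

```python
def flood_ports(vlan_members, vlan_id, ingress_port):
    """Return the ports that receive a flooded frame.

    A frame is flooded to every member port of its VLAN except the
    port on which it was received.
    """
    members = vlan_members.get(vlan_id, set())
    return sorted(port for port in members if port != ingress_port)


# Hypothetical VLAN membership tables for S1 and S2.
s1_vlans = {2: {"Port2", "Port4"}, 3: {"Port2"}}
s2_vlans = {2: {"Port1", "Port3"}, 3: {"Port1"}}

# PC1's untagged frame enters S1 on Port 4 and is tagged with VLAN 2.
s1_out = flood_ports(s1_vlans, 2, "Port4")
# The tagged frame arrives on Port 1 of S2 and stays in VLAN 2.
s2_out = flood_ports(s2_vlans, 2, "Port1")
```

On S1 the frame leaves only through the trunk port, and on S2 it leaves only through the port toward PC2.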

Topology description
R1 is a Layer 3 switch supporting sub-interfaces, and S1 is a Layer 2 switching device. LANs are connected using the switched Ethernet interface on S1 and the routed Ethernet interface on R1. To implement inter-VLAN communication, perform the following operations:
Create two sub-interfaces on the Ethernet interface of R1 that connects to S1, and configure 802.1Q encapsulation on the sub-interfaces corresponding to VLAN 2 and VLAN 3.
Configure IP addresses for the sub-interfaces to ensure that the two sub-interfaces have reachable routes.
Configure the Ethernet interface on S1 that connects to R1 as a trunk or hybrid interface, and configure it to allow frames from VLAN 2 and VLAN 3 to pass through.
Configure the default gateway address of each host as the IP address of the sub-interface corresponding to the VLAN to which the host belongs.
PC1 communicates with PC2 as follows:
PC1 checks the IP address of PC2 and determines that PC2 is on another network segment (that is, in another VLAN).
PC1 sends an ARP Request packet to R1 to request R1's MAC address.
After receiving the ARP Request packet, R1 returns an ARP Reply packet in which the source MAC address is the MAC address of the sub-interface corresponding to VLAN 2.
PC1 obtains R1's MAC address.
PC1 sends R1 a packet in which the destination MAC address is the MAC address of the sub-interface and the destination IP address is PC2's IP address.
After receiving the packet, R1 looks up its routing table and finds that the route to PC2 is a direct route. The packet is forwarded by the sub-interface corresponding to VLAN 3.
As the gateway of VLAN 3, R1 broadcasts an ARP Request packet requesting PC2's MAC address.
After receiving the ARP Request packet, PC2 returns an ARP Reply packet.
After receiving the ARP Reply packet, R1 sends the packet from PC1 to PC2. All packets sent from PC1 to PC2 are sent to R1 first for Layer 3 forwarding.
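The first step of the process, in which PC1 decides whether to resolve PC2 directly or to resolve its gateway, can be sketched in Python as follows. The function name and all IP addresses are hypothetical, because the slide does not give the addressing of this topology.

```python
import ipaddress


def arp_target(src_ip_with_mask, dst_ip, default_gateway):
    """Return the IP address the host must resolve with ARP to reach dst_ip."""
    local_net = ipaddress.ip_interface(src_ip_with_mask).network
    if ipaddress.ip_address(dst_ip) in local_net:
        # Same network segment: resolve the destination itself.
        return dst_ip
    # Different network segment: resolve the default gateway instead.
    return default_gateway


# Hypothetical addressing: PC1 in VLAN 2, PC2 in VLAN 3.
pc1 = "10.1.2.10/24"
gateway_vlan2 = "10.1.2.1"

target_for_pc2 = arp_target(pc1, "10.1.3.10", gateway_vlan2)
target_for_neighbor = arp_target(pc1, "10.1.2.20", gateway_vlan2)
```

Because PC2 is on another segment, PC1 resolves the gateway address, which is why every packet to PC2 passes through R1.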

A routing table must have correct routing entries so that new data flows can be correctly forwarded. You can deploy VLANIF interfaces and routing protocols on Layer 3 switches to implement Layer 3 connectivity.

Topology description
VLAN 2 and VLAN 3 have been assigned. To implement inter-VLAN communication, perform the following operations:
Create two VLANIF interfaces on S1 and configure IP addresses for them to ensure that the two VLANIF interfaces have reachable routes.
Configure the default gateway address of each user host as the IP address of the VLANIF interface corresponding to the VLAN to which the host belongs.
PC1 communicates with PC2 as follows:
PC1 checks the IP address of PC2 and determines that PC2 is on another network segment (that is, in another VLAN).
PC1 sends an ARP Request packet to S1 to request S1's MAC address.
After receiving the ARP Request packet, S1 returns an ARP Reply packet in which the source MAC address is the MAC address of VLANIF 2.
PC1 obtains S1's MAC address.
PC1 sends S1 a packet in which the destination MAC address is the MAC address of the VLANIF interface and the destination IP address is PC2's IP address.
After receiving the packet, S1 looks up its routing table and finds that the route to PC2 is a direct route. The packet is forwarded by VLANIF 3.
As the gateway of VLAN 3, S1 broadcasts an ARP Request packet requesting PC2's MAC address.
After receiving the ARP Request packet, PC2 returns an ARP Reply packet.
After receiving the ARP Reply packet, S1 sends the packet from PC1 to PC2. All packets sent from PC1 to PC2 are sent to S1 first for Layer 3 forwarding.


VLAN aggregation, also known as super-VLAN, partitions a broadcast domain using multiple VLANs on a physical network so that different VLANs can belong to the same subnet.
Super-VLAN: a set of multiple sub-VLANs. In a super-VLAN, only Layer 3 interfaces are created, and no physical interface exists.
Sub-VLAN: used to isolate broadcast domains. In a sub-VLAN, only physical interfaces exist and Layer 3 VLAN interfaces cannot be created. Layer 3 switching for sub-VLANs is implemented through the super-VLAN.
A super-VLAN can contain one or more sub-VLANs. IP addresses of hosts in sub-VLANs of the super-VLAN belong to the subnet of the super-VLAN.



Topology description
The super-VLAN (VLAN 10) contains the sub-VLANs (VLAN 2 and VLAN 3).
Proxy ARP between sub-VLANs is enabled on S1. The communication process is as follows:
After comparing PC2's IP address (1.1.1.20) with its own IP address, PC1 finds that both IP addresses are on the same network segment. The ARP table of PC1, however, has no entry for PC2.
PC1 broadcasts an ARP Request packet to request PC2's MAC address.
PC2 is not in VLAN 2, so PC2 cannot receive the ARP Request packet.
Because proxy ARP between sub-VLANs is enabled on the gateway, after receiving the ARP Request packet from PC1, the gateway finds that PC2's IP address (1.1.1.20) belongs to a directly connected network segment. The gateway then broadcasts an ARP Request packet in all the other sub-VLANs to request PC2's MAC address.
After receiving the ARP Request packet, PC2 sends an ARP Reply packet.
After receiving the ARP Reply packet from PC2, the gateway replies to PC1 with its own MAC address.
The ARP tables of both S1 and PC1 now have entries for PC2.
To send packets to PC2, PC1 first sends packets to the gateway, and then the gateway performs Layer 3 forwarding.
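The gateway's decision in the process above can be sketched in Python as follows. This is a simplified model, not device behavior. The function name, the /24 mask, the gateway MAC address, and the host_vlan dictionary (which stands in for what the gateway learns from its own ARP request) are assumptions for illustration.

```python
import ipaddress


def gateway_arp_reply(target_ip, requester_vlan, super_vlan_subnet,
                      host_vlan, gateway_mac, proxy_arp_enabled=True):
    """Return the MAC address the gateway answers with, or None if it stays silent."""
    if not proxy_arp_enabled:
        return None
    # The target must belong to the subnet of the super-VLAN.
    if ipaddress.ip_address(target_ip) not in ipaddress.ip_network(super_vlan_subnet):
        return None
    target_vlan = host_vlan.get(target_ip)
    if target_vlan is None:
        return None  # the target did not answer the gateway's own ARP request
    if target_vlan == requester_vlan:
        return None  # same sub-VLAN: the target answers the request itself
    # Different sub-VLAN: the gateway answers on behalf of the target.
    return gateway_mac


# PC2 (1.1.1.20) sits in sub-VLAN 3; PC1 asks from sub-VLAN 2.
host_vlan = {"1.1.1.20": 3}
gateway_mac = "00e0-fc00-00aa"  # hypothetical MAC address of the gateway

reply = gateway_arp_reply("1.1.1.20", 2, "1.1.1.0/24", host_vlan, gateway_mac)
no_proxy = gateway_arp_reply("1.1.1.20", 2, "1.1.1.0/24", host_vlan,
                             gateway_mac, proxy_arp_enabled=False)
same_vlan = gateway_arp_reply("1.1.1.20", 3, "1.1.1.0/24", host_vlan, gateway_mac)
```

PC1 receives the gateway's MAC address for PC2's IP address, so its traffic to PC2 is handed to the gateway for Layer 3 forwarding.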


Topology description
The frame from PC1 that enters S1 through Port 1 is tagged with the ID of VLAN 2. The VLAN ID, however, is not changed to the ID of VLAN 10 on S1 even though VLAN 2 is a sub-VLAN of VLAN 10. After passing through Port 3, which is a trunk interface, this frame still carries the ID of VLAN 2. S1 discards the frames of VLAN 10 that are sent to S1 by other devices because S1 has no physical interface corresponding to VLAN 10.
A super-VLAN has no physical interface:
If you configure a super-VLAN and then a trunk interface, the frames of the super-VLAN are filtered automatically according to the VLAN range configured on the trunk interface.
If you first configure a trunk interface and configure it to allow all VLANs to pass through, you cannot configure a super-VLAN on the device. The root cause is that a VLAN with physical interfaces cannot be configured as a super-VLAN. The trunk interface allows frames from all VLANs to pass through, so no VLAN can be configured as a super-VLAN.
On S1, only VLAN 2 and VLAN 3 are valid, and all frames are forwarded in these VLANs.

Topology description
S2 is configured with super-VLAN 4, sub-VLAN 2, sub-VLAN 3, and common VLAN 10. S1 is configured with two common VLANs, namely, VLAN 10 and VLAN 20. S2 is configured with the route to the network segment 1.1.3.0/24, and S1 is configured with the route to the network segment 1.1.1.0/24. PC1 in sub-VLAN 2 of super-VLAN 4 then needs to communicate with PC3, which is connected to S1.
After comparing PC3's IP address (1.1.3.2) with its own IP address, PC1 finds that the two IP addresses are on different network segments.
PC1 broadcasts an ARP Request packet to request the MAC address of its gateway (S2).
After receiving the ARP Request packet, S2 checks the mapping between the sub-VLAN and the super-VLAN, and sends an ARP Reply packet to PC1 through sub-VLAN 2. The source MAC address in the ARP Reply packet is the MAC address of VLANIF 4 corresponding to super-VLAN 4.
PC1 learns S2's MAC address.
PC1 sends the data packet to S2. The packet carries the MAC address of VLANIF 4 (corresponding to super-VLAN 4) as the destination MAC address and 1.1.3.2 as the destination IP address.
After receiving the packet, S2 performs Layer 3 forwarding and sends the packet to S1, with 1.1.2.2 as the next hop address and VLANIF 10 as the outbound interface.
After receiving the packet, S1 performs Layer 3 forwarding and sends the packet to PC3 through the directly connected interface VLANIF 20.
The reply packet from PC3 reaches S2 after Layer 3 forwarding on S1.
After receiving the reply packet, S2 performs Layer 3 forwarding and sends the packet to PC1 through the super-VLAN.


A MUX VLAN consists of a principal VLAN and subordinate VLANs. Subordinate VLANs are classified into separate VLANs and group VLANs.
Principal VLAN: A principal interface can communicate with all interfaces in a MUX VLAN.
Subordinate VLAN
Separate VLAN: A separate interface can communicate only with a principal interface and is isolated from other types of interfaces. A separate VLAN must be bound to a principal VLAN.
Group VLAN: A group interface can communicate with a principal interface and other interfaces in the same group VLAN, but cannot communicate with interfaces in other group VLANs or with separate interfaces. A group VLAN must be bound to a principal VLAN.

Topology description
The principal interface connects to the enterprise server; separate interfaces connect to enterprise customers; group interfaces connect to enterprise employees. In this manner, enterprise customers and enterprise employees can access the enterprise server, enterprise employees can communicate with each other, enterprise customers cannot communicate with each other, and enterprise customers and enterprise employees cannot communicate with each other.
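The MUX VLAN reachability rules above can be expressed as a short Python sketch. The function name, the tuple representation of an interface, and the VLAN IDs are assumptions for illustration.

```python
def can_communicate(a, b):
    """Decide whether two MUX VLAN interfaces may exchange frames.

    Each interface is a (role, vlan_id) tuple, where role is
    'principal', 'group', or 'separate'.
    """
    role_a, vlan_a = a
    role_b, vlan_b = b
    # A principal interface can communicate with every interface.
    if role_a == "principal" or role_b == "principal":
        return True
    # Group interfaces can communicate only within the same group VLAN.
    if role_a == "group" and role_b == "group":
        return vlan_a == vlan_b
    # Separate interfaces are isolated from each other and from group interfaces.
    return False


# Hypothetical VLAN IDs for the enterprise example.
server = ("principal", 10)
employee_1 = ("group", 20)
employee_2 = ("group", 20)
customer_1 = ("separate", 30)
customer_2 = ("separate", 30)
```

For example, can_communicate(employee_1, employee_2) is True, while can_communicate(customer_1, customer_2) and can_communicate(customer_1, employee_1) are both False.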



Case description
To meet requirement 2, configure the trunk link to permit VLAN 2 and VLAN 3.



Command usage
The port link-type command sets the link type of an interface.
The port trunk allow-pass vlan command adds a trunk interface to VLANs.
The port hybrid untagged vlan command adds a hybrid interface to VLANs. Frames of these VLANs then pass through the hybrid interface in untagged mode.

View
Interface view

Parameters
port link-type { access | dot1q-tunnel | hybrid | trunk }
access: configures the link type of an interface as access.
dot1q-tunnel: configures the link type of an interface as QinQ.
hybrid: configures the link type of an interface as hybrid.
trunk: configures the link type of an interface as trunk.

Precautions
Before changing the link type of an interface, you need to delete the VLAN configuration of the interface, that is, restore the default state in which the interface joins only VLAN 1.
If a specified VLAN does not exist, the port trunk allow-pass vlan command does not take effect. The port trunk allow-pass vlan command cannot be used on a member interface of an Eth-Trunk.
A hybrid interface can connect to either a user host or a switch. When a hybrid interface is connected to a user host, it must be added to VLANs in untagged mode because user hosts cannot process tagged frames. The port hybrid untagged vlan command is invalid on a member interface of an Eth-Trunk. A super-VLAN cannot be specified in the port hybrid untagged vlan command.


Case description
The topology is similar to that in slide 22. The difference is that MAC addresses are identified. Assign VLANs based on MAC addresses to meet the requirement.
Before configuring MAC address-based VLAN assignment, ensure that the link type of the Layer 2 interface is hybrid.



Command usage
The mac-vlan mac-address command associates a MAC address with a VLAN.
The mac-vlan enable command enables MAC address-based VLAN assignment on an interface.

Precautions
After a MAC address is associated with a VLAN, it cannot be associated with other VLANs.
If MAC address-based assignment is enabled on an interface:
When receiving an untagged packet, the interface searches for the VLAN entry matching the source MAC address of the packet. If a matching entry is found, the interface forwards the packet based on the VLAN ID in the entry. If no matching entry is found, the interface uses other matching rules to forward the packet.
When receiving a tagged packet, the interface forwards the packet based on the interface-based VLAN assignment configuration.
MAC address-based assignment can be configured only on hybrid interfaces.

Case description
The topology is similar to that in slide 22.
Before configuring IP subnet-based VLAN assignment, ensure that the link type of the Layer 2 interface is hybrid.

Command usage
The ip-subnet-vlan command associates an IP subnet with a VLAN.
The ip-subnet-vlan enable command enables IP subnet-based VLAN assignment on an interface.

Precautions
The IP subnet or IP address that the ip-subnet-vlan command associates with a VLAN cannot be a multicast network segment or multicast address.
IP subnet-based assignment can be configured only on hybrid interfaces.

Case description
Protocol-based assignment can be configured only on hybrid interfaces.

Command usage
The protocol-vlan command associates a protocol with a VLAN.
The protocol-vlan vlan command associates an interface with a protocol-based VLAN.

Precautions
Protocol-based assignment can be configured only on hybrid interfaces.
When protocol-based assignment is used on an interface, the switch needs to parse the protocol type in the received packet and convert it.

Case description
You can use a VLANIF interface or a sub-interface to implement communication between VLANs.



Command usage
The interface vlanif command creates a VLANIF interface and displays the VLANIF interface view.
The dot1q termination vid command configures the single VLAN ID of dot1q encapsulation on a sub-interface.
The arp broadcast enable command enables ARP broadcast on a sub-interface.

Precautions
Before running the interface vlanif command, you must run the vlan command to create the VLAN specified by vlan-id.



Case description
Configure VLAN aggregation to meet the requirements.

Command usage
The aggregate-vlan command configures a VLAN as a super-VLAN.
The access-vlan command adds one or more sub-VLANs to a super-VLAN.

Precautions
VLAN 1 cannot be configured as a super-VLAN.
The super-VLAN must be different from all its sub-VLANs.
A VLAN can be added to only one super-VLAN.



Case description
Configure the MUX VLAN to meet the requirements.

Command usage
The mux-vlan command configures a VLAN as a principal VLAN.
The subordinate group command configures subordinate group VLANs for a principal VLAN.
The subordinate separate command configures a subordinate separate VLAN for a principal VLAN.

Precautions for the principal VLAN
A super-VLAN, sub-VLAN, or subordinate VLAN cannot be configured as a principal VLAN.
A VLAN in which a VLANIF interface has been created cannot be configured as a principal VLAN.

Precautions for the subordinate group VLAN
Before configuring a subordinate group VLAN, you must configure a principal VLAN and enter the principal VLAN view.
The VLAN to be configured as a subordinate group VLAN must have been created.
The VLAN to be configured as a subordinate group VLAN cannot have a VLANIF interface configured or be configured as a super-VLAN.
Before running the undo subordinate group command to delete a subordinate group VLAN to which interfaces have been added, delete the interfaces from the subordinate group VLAN.
A subordinate group VLAN must be different from the principal VLAN.
A subordinate group VLAN must be different from a subordinate separate VLAN.

Precautions for the subordinate separate VLAN
Before configuring a subordinate separate VLAN, you must configure a principal VLAN and enter the principal VLAN view.
The VLAN to be configured as a subordinate separate VLAN must have been created.
The VLAN to be configured as a subordinate separate VLAN cannot have a VLANIF interface configured or be configured as a super-VLAN.
Before running the undo subordinate separate command to delete a subordinate separate VLAN to which interfaces have been added, delete the interfaces from the subordinate separate VLAN.
A subordinate separate VLAN must be different from the principal VLAN.
A subordinate separate VLAN must be different from a subordinate group VLAN.

Check whether MAC address entries on the switch are correct.
Run the display mac-address command on the switch to check whether the MAC addresses, interfaces, and VLANs in the learned MAC address entries are correct. If the learned MAC address entries are incorrect, run the undo mac-address mac-address vlan vlan-id command on the interface to delete the existing entries so that the switch can learn MAC address entries again.

Case description
To implement communication between VLANs through RIPv2, configure at least two VLANIF interfaces on the switch.



Result
Perform the ping operation. The PCs in VLAN 2 and VLAN 3 can communicate with each other.



Result
To implement communication between VLANs through RIPv2, configure at least two VLANIF interfaces on the switch.



Proxy ARP
Routed proxy ARP: enables network devices on the same network segment but on different physical networks to communicate.
Intra-VLAN proxy ARP: If two hosts belong to the same VLAN where user isolation is configured, enable intra-VLAN proxy ARP on an interface associated with the VLAN to allow the hosts to communicate.
Inter-VLAN proxy ARP: If two hosts belong to different VLANs, enable inter-VLAN proxy ARP on interfaces associated with the VLANs to implement Layer 3 communication between the two hosts.

Topology description
Routed proxy ARP
The IP addresses of PC1 and PC2 are on the same network segment. When PC1 needs to communicate with PC2, PC1 broadcasts an ARP Request packet requesting the MAC address of PC2. However, PC1 and PC2 are on different physical networks (in different broadcast domains). PC2 therefore cannot receive the ARP Request packet sent from PC1 and does not respond with an ARP Reply packet. To solve this problem, enable proxy ARP on S1.
After receiving the ARP Request packet, S1 searches for a routing entry corresponding to PC2. If the routing entry exists, S1 responds to the ARP Request packet with its own MAC address. PC1 forwards data based on the MAC address of S1. S1 functions as the proxy of PC2.
Intra-VLAN proxy ARP
PC1 cannot communicate with PC2 in the same VLAN because interface isolation is configured on the interfaces of S1 connected to PC1 and PC2. To solve this problem, enable intra-VLAN proxy ARP on the interfaces of S1. After S1's interface connected to PC1 receives an ARP Request packet destined for another address, S1 does not discard the packet but searches for the ARP entry corresponding to PC2. If the ARP entry exists, S1 sends its own MAC address to PC1 and forwards packets sent from PC1 to PC2. S1 functions as the proxy of PC2.
Inter-VLAN proxy ARP
This function is used in VLAN aggregation. Refer to the VLAN documentation.

Gratuitous ARP provides the following functions:
Checks for duplicate IP addresses: Normally, a host does not receive an ARP Reply packet after sending an ARP Request packet whose destination IP address is its own IP address. If the host receives an ARP Reply packet, another host has the same IP address.
Advertises a new MAC address: If the MAC address of a host changes because its network adapter is replaced, the host sends a gratuitous ARP packet to notify all hosts of the change before the ARP entry is aged out.
Notifies of an active/standby switchover in a VRRP group: After an active/standby switchover is performed, the new master switch sends a gratuitous ARP packet in the VRRP group to notify other devices of the switchover.

After the system is reset or the interface card is hot swapped or reset, dynamic entries are lost, but static entries and blackhole entries are retained.



Secure MAC addresses are classified into the following types:
Secure dynamic MAC address: learned on an interface where port security is enabled but the sticky MAC function is disabled. After port security is enabled on an interface, dynamic MAC address entries that have been learned on the interface are deleted, and MAC address entries learned subsequently become secure dynamic MAC address entries. Secure dynamic MAC addresses are not aged out by default. After the switch restarts, secure dynamic MAC addresses are lost and need to be learned again.
Sticky MAC address: learned on an interface where both port security and the sticky MAC function are enabled. Sticky MAC addresses are not aged out. After you save the configuration and restart the switch, sticky MAC addresses still exist.

MAC address anti-flapping


Increasing the MAC address learning priority of an interface: When the same MAC address entry is learned by interfaces with different priorities, the MAC address entry learned by the interface with the highest priority overwrites the one learned by other interfaces.
Preventing MAC address overwriting on interfaces with the same priority: If the priority of the interface connected to a bogus device is the same as that of the interface connected to the authorized device, the MAC address of the bogus device learned later does not overwrite the correct MAC address. If the authorized device powers off, the MAC address of the bogus device is learned. After the authorized device powers on again, the correct MAC address cannot be learned.

Topology description
You can set a high MAC address learning priority on Port1 to prevent PC3 from using the MAC address of PC1 to attack the switch.
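The two anti-flapping rules can be sketched in Python as follows. This is a minimal model rather than switch behavior. The function name, the table layout, the priority numbers, the convention that a larger number means a higher priority, and the port to which PC3 connects are all assumptions for illustration.

```python
def learn(mac_table, mac, port, priority, allow_same_priority_overwrite=False):
    """Try to learn `mac` on `port`. Return True if the table now points to `port`.

    Assumption for this sketch: a larger number means a higher learning priority.
    """
    entry = mac_table.get(mac)
    if entry is None:
        mac_table[mac] = (port, priority)
        return True
    current_port, current_priority = entry
    if port == current_port:
        return True  # same interface: nothing to overwrite
    if priority > current_priority:
        mac_table[mac] = (port, priority)  # higher priority overwrites
        return True
    if priority == current_priority and allow_same_priority_overwrite:
        mac_table[mac] = (port, priority)
        return True
    return False  # entry is protected, the new source is ignored


table = {}
pc1_mac = "00e0-fc00-0001"  # hypothetical MAC address of PC1

# PC1 is learned on Port1, which has a high learning priority.
learned_pc1 = learn(table, pc1_mac, "Port1", priority=3)
# PC3 sends frames with PC1's MAC address from a default-priority port.
spoof_accepted = learn(table, pc1_mac, "Port3", priority=0)
```

After both calls, the entry for PC1's MAC address still points to Port1, so traffic for PC1 is not redirected to PC3.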

Topology description
No loop prevention protocol is used on the switching network. If S2 and S4 are incorrectly connected with a network cable, a loop occurs between S2, S3, and S4. When a broadcast packet is sent, the packet is forwarded to S3 and received by Port1 on S1. When MAC address flapping detection is configured on Port1, S1 detects that the source MAC address of the broadcast packet flaps between interfaces. If the MAC address flaps between interfaces frequently, S1 considers that MAC address flapping has occurred. The affected interface on S1 can then enter the error-down state or be removed from the VLAN.

MAC address flapping detection
Removing an interface from the VLAN where MAC address flapping occurs cannot be used together with other dynamic VLAN technologies.

Link aggregation has the following advantages:
Increased bandwidth: The bandwidth of the link aggregation interface is the sum of the bandwidth of its member interfaces.
Higher reliability: When the physical link of a member interface fails, the traffic can be switched to another available member link, improving the reliability of the link aggregation interface.
Load balancing: In a Link Aggregation Group (LAG), traffic is load balanced among active member interfaces.

Basic concepts of Ethernet link aggregation
Eth-Trunk: A LAG is a logical link formed by bundling multiple Ethernet links, and is called an Eth-Trunk for short.
Member interfaces and member links: The interfaces that constitute an Eth-Trunk are member interfaces. The link corresponding to a member interface is a member link.
Active and inactive interfaces and links:
Member interfaces are classified into active interfaces that forward data and inactive interfaces that do not forward data.
Links connected to active interfaces are called active links, and links connected to inactive interfaces are called inactive links.
Upper threshold for the number of active interfaces: This setting guarantees higher network reliability. When the number of active member interfaces reaches the upper threshold, additional member interfaces are set to Down and used as backup links.
Lower threshold for the number of active interfaces: This setting ensures the minimum bandwidth of an Eth-Trunk. When the number of active interfaces falls below this threshold, the Eth-Trunk goes Down.


Forwarding principle
An Eth-Trunk interface is treated as a physical interface at the MAC sub-layer. Therefore, frames transmitted at the MAC sub-layer only need to be delivered to the Eth-Trunk module.

Eth-Trunk forwarding entries:
HASH-KEY value: calculated by applying the hash algorithm to the MAC address or IP address in the packet.
Interface number: Eth-Trunk forwarding entries depend on the number of member interfaces in an Eth-Trunk. Different HASH-KEY values are mapped to different outbound interfaces.

Figure description
For example, if three physical interfaces, 1, 2, and 3, are bundled into an Eth-Trunk, the Eth-Trunk forwarding table maps HASH-KEY values to these three interfaces, as shown in the preceding figure. In the Eth-Trunk forwarding table, the HASH-KEY values are 0, 1, 2, 3, 4, 5, 6, 7, and the corresponding interface numbers are 1, 2, 3, 1, 2, 3, 1, 2.

Forwarding process
The Eth-Trunk module receives a frame from the MAC sub-layer, and then extracts its source MAC address/IP address or destination MAC address/IP address according to the load balancing mode.
The Eth-Trunk module calculates the HASH-KEY value using the hash algorithm.
Based on the HASH-KEY value, the Eth-Trunk module searches the Eth-Trunk forwarding table for the interface number, and then sends the frame from the corresponding interface.
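The forwarding table and lookup described above can be sketched in Python as follows. The round-robin mapping reproduces the table given in the example (HASH-KEY values 0 to 7 mapped to interfaces 1, 2, 3, 1, 2, 3, 1, 2). The hash function is not specified in the text, so CRC32 is used here purely as a stand-in, and all function names and MAC addresses are hypothetical.

```python
import zlib

HASH_KEYS = 8  # the example lists HASH-KEY values 0 through 7


def build_forwarding_table(member_interfaces):
    """Map each HASH-KEY value to a member interface in round-robin order."""
    return {key: member_interfaces[key % len(member_interfaces)]
            for key in range(HASH_KEYS)}


def hash_key(src_mac, dst_mac):
    """Derive a HASH-KEY from the addresses of a frame.

    CRC32 is an assumption for this sketch, not the device's algorithm.
    """
    return zlib.crc32(f"{src_mac}>{dst_mac}".encode()) % HASH_KEYS


def select_interface(table, src_mac, dst_mac):
    """Look up the outbound member interface for a frame."""
    return table[hash_key(src_mac, dst_mac)]


table = build_forwarding_table([1, 2, 3])
mapping = [table[key] for key in range(HASH_KEYS)]

# Frames of the same flow always hash to the same member interface.
first = select_interface(table, "00e0-fc00-0001", "00e0-fc00-0002")
second = select_interface(table, "00e0-fc00-0001", "00e0-fc00-0002")
```

Because the HASH-KEY depends only on the addresses in the frame, every frame of one flow leaves through the same member interface.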


Mis-sequencing in common load balancing mode


Because there are multiple physical links between the devices at both ends of an Eth-Trunk, the first data frame of a data flow may be transmitted on one physical link, and the second data frame may be transmitted on another physical link. In this case, the second data frame may arrive at the peer device earlier than the first data frame. As a result, packet mis-sequencing occurs.

Eth-Trunk load balancing
The Eth-Trunk uses a load balancing mechanism. This mechanism applies the hash algorithm to the addresses in a data frame and generates a hash key value. The system then searches for the outbound interface in the Eth-Trunk forwarding table based on the generated hash key value. Each MAC or IP address corresponds to a hash key value, so the system uses different outbound interfaces to forward the data of different flows.
This mechanism ensures that frames of the same data flow are forwarded on the same physical link and implements flow-based load balancing. Flow-based load balancing ensures the sequence of data transmission, but cannot guarantee efficient bandwidth use.



Manual load balancing mode

If an active link fails, the remaining active links share the traffic evenly. If two directly connected devices require high link bandwidth but one of the devices does not support LACP, you can use the manual load balancing mode.

LACP mode

LACP provides a standard negotiation mechanism for switching devices. It enables switching devices to automatically create and enable aggregated links based on their configurations. After aggregated links are created, LACP maintains the link status. If the status of an aggregated link changes, LACP automatically adjusts or disables the link aggregation.

LACP concepts

LACP system priority: The LACP system priority (default value 32768) differentiates the priorities of the devices at both ends of an Eth-Trunk. In LACP mode, the active interfaces selected by both devices must be consistent; otherwise, the LAG cannot be established. To keep the active interfaces consistent at both ends, set a higher priority for one end. The other end then selects its active member interfaces based on the selection of that end. The smaller the LACP system priority value, the higher the LACP system priority. When the LACP system priorities are the same, the device with the smaller MAC address functions as the Actor.

LACP interface priority: The LACP interface priority (default value 32768) determines whether a member interface can be selected as an active interface. The smaller the LACP interface priority value, the higher the LACP interface priority.

In LACP mode, LACP determines the active and inactive links in a LAG. This mode is also called M:N mode, where M is the number of active links and N is the number of backup links. This mode provides high reliability and allows load balancing across the M active links.


LACP implementation

After member interfaces are added to an Eth-Trunk in LACP mode, each end sends LACPDUs to inform its peer of its system priority, MAC address, interface priority, interface number, and keys. The peer compares this information with its own and selects the interfaces to be aggregated. In this way, both ends determine the active interfaces and links.

Negotiation process

1. Devices at both ends send LACPDUs to each other.
   Create an Eth-Trunk in LACP mode on S1 and S2 and add member interfaces to the Eth-Trunk. LACP is then enabled on the member interfaces, and the devices at both ends send LACPDUs to each other.
2. Determine the Actor and active links.
   When S2 receives LACPDUs from S1, S2 checks and records information about S1 and compares system priorities. If the system priority of S1 is higher than that of S2, S1 acts as the Actor.
   After the devices at both ends select the Actor, they select active interfaces according to the priorities of the Actor's interfaces.
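
The selection rules can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the LACP state machine: the `Device` class and function names are hypothetical, MAC addresses are compared as equally formatted strings, and ties between interfaces with the same priority are broken by interface name, which the text above does not specify.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    system_priority: int   # smaller value means higher priority (default 32768)
    mac: str               # tie-breaker: the smaller MAC address wins
    port_priorities: dict  # interface name -> LACP interface priority


def elect_actor(a, b):
    """Return the device with the higher system priority (smaller value).

    If the priorities are equal, the device with the smaller MAC wins.
    """
    return min((a, b), key=lambda d: (d.system_priority, d.mac.lower()))


def select_active(actor, max_active):
    """Pick the M active interfaces from the Actor's interface priorities.

    Assumption: equal priorities are ordered by interface name.
    """
    ranked = sorted(actor.port_priorities.items(), key=lambda kv: (kv[1], kv[0]))
    return [name for name, _ in ranked[:max_active]]


if __name__ == "__main__":
    s1 = Device("S1", 100, "00e0-fc00-0001", {"E1": 100, "E2": 100, "E3": 32768})
    s2 = Device("S2", 32768, "00e0-fc00-0002", {"E1": 32768, "E2": 32768, "E3": 32768})
    actor = elect_actor(s1, s2)
    print(actor.name, select_active(actor, max_active=2))
```

With two active links allowed, the remaining interface of the Actor acts as the backup link in the M:N model.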
LACP preemption

Assume that E1 becomes faulty and then recovers. When E1 fails, E3 replaces E1 to transmit services. After E1 recovers, if LACP preemption is not enabled on the Eth-Trunk, E1 remains in the backup state. If LACP preemption is enabled on the Eth-Trunk, E1 becomes the active interface again and E3 becomes the backup interface because E1 has a higher priority than E3.

LACP preemption delay

When LACP preemption occurs, the backup link waits for a given period of time before switching to the active state.
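
The effect of the preemption switch can be shown with a small Python sketch. It models only the outcome described above and ignores the preemption delay; the function name, the priority values, and the tie-break by interface name are assumptions made for illustration.

```python
def active_after_recovery(priorities, current_active, recovered, preempt):
    """Return the active interfaces once `recovered` is usable again.

    priorities: interface name -> LACP interface priority (smaller is higher).
    Without preemption the current active set is kept unchanged.
    """
    if not preempt:
        return sorted(current_active)
    candidates = set(current_active) | {recovered}
    ranked = sorted(candidates, key=lambda p: (priorities[p], p))
    # Keep the same number of active links, chosen by priority.
    return sorted(ranked[:len(current_active)])


if __name__ == "__main__":
    prio = {"E1": 100, "E2": 100, "E3": 32768}
    # E1 failed earlier, so E3 took over; E1 has now recovered.
    print(active_after_recovery(prio, {"E2", "E3"}, "E1", preempt=False))
    print(active_after_recovery(prio, {"E2", "E3"}, "E1", preempt=True))
```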


GVRP

GVRP is based on GARP and is used to maintain VLAN attributes dynamically on devices. Through GVRP, VLAN attributes of one device can be propagated throughout the entire switching network. GVRP enables network devices to dynamically deliver, register, and propagate VLAN attributes, reducing the workload of network administrators and ensuring correct configuration.

GVRP applies only to trunk links.

GVRP uses the multicast MAC address 01-80-C2-00-00-21.

Participant

On a device running GVRP, each GVRP-enabled port is considered a GVRP participant.

VLAN registration and deregistration

GVRP implements automatic registration and deregistration of VLAN attributes.
VLAN registration: adds an interface to a VLAN.
VLAN deregistration: removes an interface from a VLAN.

GVRP registers and deregisters VLAN attributes through attribute declarations and reclaim declarations:
When an interface receives a VLAN attribute declaration, it registers the VLAN specified in the declaration. That is, the interface is added to the VLAN.
When an interface receives a VLAN attribute reclaim declaration, it deregisters the VLAN specified in the declaration. That is, the interface is removed from the VLAN.

GARP participants exchange attribute information by sending messages. GARP messages fall into three types: Join, Leave, and LeaveAll messages.

Join message: A GARP participant sends Join messages when it requires other devices to register its attributes, when it receives Join messages from other GARP participants, or when it has attributes configured statically.

Leave message: A GARP participant sends Leave messages to have its attributes deregistered from other devices. The GARP participant also sends Leave messages when it receives Leave messages from other GARP participants or when attributes are manually deregistered.

LeaveAll message: A GARP participant sends LeaveAll messages to deregister all its attributes from all the other GARP participants. LeaveAll messages are used to periodically delete garbage attributes. For example, a garbage attribute may remain when a device suddenly loses power and therefore cannot send the Leave message that would notify other devices to deregister an attribute it has removed.



Join timer

To ensure that a Join message is reliably transmitted to other GARP participants, a GARP participant may send the Join message twice. When sending the first Join message, the GARP participant starts the Join timer. If a Join message is received before the Join timer expires, the GARP participant does not send the second Join message. If not, the GARP participant re-sends the Join message.

The Join timer is configured on a per-port basis.

Hold timer

When you configure an attribute on a participant or when the participant receives a request message, the participant does not propagate the message to the other devices immediately. Instead, it collects the request messages received within a period of time and sends them in one GARP PDU. This period of time is specified by the Hold timer. By making full use of the data portion of GARP PDUs to send multiple messages in one packet, the mechanism reduces the number of transmitted packets and contributes to network stability.

The Hold timer value must be no greater than half of the Join timer value.

Leave timer

Upon receiving a Leave or LeaveAll message, a GARP participant starts its Leave timer. If it has received no Join message containing the attribute carried in the Leave or LeaveAll message when the Leave timer expires, it deregisters the attribute.

The Leave timer value is twice the Join timer value.

LeaveAll timer

Upon startup, a GARP participant starts the LeaveAll timer. When the LeaveAll timer expires, the GARP participant sends out a LeaveAll message, and then restarts the LeaveAll timer to start another cycle.

When receiving a LeaveAll message, a GARP participant restarts all timers, including the LeaveAll timer.

If the LeaveAll timers of multiple devices expire at the same time, multiple LeaveAll messages are sent at the same time, creating unnecessary traffic. To avoid this problem, the actual LeaveAll timer value of a participant is a random value between the LeaveAll timer value and the LeaveAll timer value multiplied by 1.5. A LeaveAll event is equivalent to deregistering all attributes network-wide by sending Leave messages.

The LeaveAll timer value must be larger than the Leave timer value.
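
The relationships between the four timers can be checked with a short Python sketch. It is only an illustration: the function names are hypothetical, the unit of the values is left open, and the Leave timer rule is interpreted here as a lower bound of twice the Join timer, which is an assumption about the statement above.

```python
import random


def check_garp_timers(hold, join, leave, leave_all):
    """Return a list of violated timer constraints (empty if all hold)."""
    errors = []
    if hold > join / 2:
        errors.append("Hold must be no greater than half of Join")
    if leave < 2 * join:
        # Assumption: "twice the Join timer" is treated as a minimum.
        errors.append("Leave must be at least twice Join")
    if leave_all <= leave:
        errors.append("LeaveAll must be larger than Leave")
    return errors


def effective_leave_all(leave_all, rng=random):
    """Pick the actual LeaveAll value between 1.0 and 1.5 times the setting."""
    return rng.uniform(leave_all, 1.5 * leave_all)


if __name__ == "__main__":
    print(check_garp_timers(hold=10, join=20, leave=60, leave_all=1000))
    print(effective_leave_all(1000))
```

Randomizing the LeaveAll value spreads the expiry times of different participants so that their LeaveAll messages are unlikely to be sent simultaneously.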



One-way registration of VLAN attributes

Manually create static VLAN 2 on S1. In response to this action, GVRP automatically assigns the GVRP-enabled ports on S2 and S3 to VLAN 2 through one-way registration. The process is as follows:

1. After VLAN 2 is created on S1, E1 on S1 starts the Join timer and Hold timer. When the Hold timer expires, E1 sends the first JoinEmpty message to S2. When the Join timer expires, E1 restarts the Hold timer. When the Hold timer expires again, E1 sends the second JoinEmpty message.
2. After E2 on S2 receives the first JoinEmpty message, S2 creates dynamic VLAN 2 and adds E2 to VLAN 2. In addition, S2 requests E3 to start the Join timer and Hold timer. When the Hold timer expires, E3 sends the first JoinEmpty message to S3. When the Join timer expires, E3 restarts the Hold timer. When the Hold timer expires again, E3 sends the second JoinEmpty message. After E2 receives the second JoinEmpty message, S2 does not take any action because E2 has been added to VLAN 2.
3. After E4 on S3 receives the first JoinEmpty message, S3 creates dynamic VLAN 2 and adds E4 to VLAN 2. After E4 receives the second JoinEmpty message, S3 does not take any action because E4 has been added to VLAN 2.
4. Every time the LeaveAll timer expires or a LeaveAll message is received, each device restarts the LeaveAll timer, Join timer, Hold timer, and Leave timer. E1 then repeats step 1 to send JoinEmpty messages. E3 on S2 sends JoinEmpty messages to S3 in the same way.

Two-way registration of VLAN attributes

After one-way registration is complete, E1, E2, and E4 are added to VLAN 2 but E3 is not, because only interfaces that receive a JoinEmpty or JoinIn message can be added to dynamic VLANs. To transmit traffic of VLAN 2 in both directions, VLAN registration from S3 to S1 is required. The process is as follows:

1. After one-way registration is complete, static VLAN 2 is created on S3 (the dynamic VLAN is replaced by the static VLAN). E4 on S3 starts the Join timer and Hold timer. When the Hold timer expires, E4 sends the first JoinIn message (because it has registered VLAN 2) to S2. When the Join timer expires, E4 restarts the Hold timer. When the Hold timer expires again, E4 sends the second JoinIn message.
2. After E3 on S2 receives the first JoinIn message, S2 adds E3 to VLAN 2 and requests E2 to start the Join timer and Hold timer. When the Hold timer expires, E2 sends the first JoinIn message to S1. When the Join timer expires, E2 restarts the Hold timer. When the Hold timer expires again, E2 sends the second JoinIn message. After E3 receives the second JoinIn message, S2 does not take any action because E3 has been added to VLAN 2.
3. When S1 receives the JoinIn message, it stops sending JoinEmpty messages to S2. Every time the LeaveAll timer expires or a LeaveAll message is received, each device restarts the LeaveAll timer, Join timer, Hold timer, and Leave timer. E1 on S1 sends a JoinIn message to S2 when the Hold timer expires.
4. S2 sends a JoinIn message to S3.
5. After receiving the JoinIn message, S3 does not create dynamic VLAN 2 because static VLAN 2 has been created.
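
The direction-dependent behaviour of registration can be modelled with a small Python sketch. It is a deliberately simplified model of the three-switch chain in this example: timers and the JoinEmpty/JoinIn distinction are ignored, the data structures and function name are hypothetical, and the only rule modelled is that a port registers the VLAN when it receives a Join message.

```python
# Topology from the example: S1(E1) --- (E2)S2(E3) --- (E4)S3
LINKS = {"E1": "E2", "E2": "E1", "E3": "E4", "E4": "E3"}
SWITCH_PORTS = {"S1": ["E1"], "S2": ["E2", "E3"], "S3": ["E4"]}
PORT_SWITCH = {p: s for s, ports in SWITCH_PORTS.items() for p in ports}


def propagate_join(origin_switch):
    """Return the ports that register the VLAN after a Join from origin_switch.

    Only valid for loop-free topologies such as the chain used here.
    """
    registered = []
    pending = list(SWITCH_PORTS[origin_switch])  # ports that declare the VLAN
    while pending:
        sender = pending.pop(0)
        receiver = LINKS[sender]
        registered.append(receiver)              # the receiving port joins the VLAN
        switch = PORT_SWITCH[receiver]
        # The receiving switch propagates the declaration on its other ports.
        pending.extend(p for p in SWITCH_PORTS[switch] if p != receiver)
    return registered


if __name__ == "__main__":
    print(propagate_join("S1"))  # registration from S1 towards S3
    print(propagate_join("S3"))  # registration from S3 towards S1
```

A declaration that starts on S1 reaches only E2 and E4, which is why E3 stays outside VLAN 2 until S3 declares the VLAN in the opposite direction.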

One-way deregistration of VLAN attributes

When VLAN 2 is not required on devices, the devices can deregister VLAN 2. The process is as follows:

1. After static VLAN 2 is manually deleted from S1, E1 on S1 starts the Hold timer. When the Hold timer expires, E1 sends a LeaveEmpty message to S2. E1 needs to send only one LeaveEmpty message.
2. After E2 on S2 receives the LeaveEmpty message, it starts the Leave timer. When the Leave timer expires, E2 deregisters VLAN 2. E2 is then deleted from VLAN 2, but VLAN 2 is not deleted from S2 because E3 is still in VLAN 2. At this time, S2 requests E3 to start the Hold timer and Leave timer. When the Hold timer expires, E3 sends a LeaveIn message to S3. Static VLAN 2 is not deleted from S3, so E3 can receive the JoinIn message sent from E4 after the Leave timer expires. In this case, S1 and S2 can still learn dynamic VLAN 2.
3. After S3 receives the LeaveIn message, E4 is not deleted from VLAN 2 because VLAN 2 is a static VLAN on S3.

Two-way deregistration of VLAN attributes

To delete VLAN 2 from all devices, two-way deregistration is required. The process is as follows:

1. After static VLAN 2 is manually deleted from S3, E4 on S3 starts the Hold timer. When the Hold timer expires, E4 sends a LeaveEmpty message to S2.
2. After E3 on S2 receives the LeaveEmpty message, it starts the Leave timer. When the Leave timer expires, E3 deregisters VLAN 2. E3 is then deleted from dynamic VLAN 2, and dynamic VLAN 2 is deleted from S2. At this time, S2 requests E2 to start the Hold timer. When the Hold timer expires, E2 sends a LeaveEmpty message to S1.
3. After E1 on S1 receives the LeaveEmpty message, it starts the Leave timer. When the Leave timer expires, E1 deregisters VLAN 2. E1 is then deleted from dynamic VLAN 2, and dynamic VLAN 2 is deleted from S1.


Manually configured VLANs are called static VLANs, and VLANs created using GVRP are called dynamic VLANs.

Case description
To enable PC1 and PC2, whose interfaces are isolated in VLAN 2, to communicate with each other, enable intra-VLAN proxy ARP on S1.

Command usage
The port-isolate enable command enables port isolation.
The arp-proxy inner-sub-vlan-proxy enable command enables intra-VLAN proxy ARP.

View
Interface view

Parameters
port-isolate enable [ group group-id ]
group-id: specifies the ID of a port isolation group. The default value is 1.

Precautions
You can use the display port-isolate command to view the port isolation group configuration.



Case description
Preemption needs to be enabled to meet requirement 3.

Command usage
The mode command configures the working mode of an Eth-Trunk.
The eth-trunk command adds an interface to an Eth-Trunk.
The load-balance command sets the load balancing mode of an Eth-Trunk.
The max active-linknumber command sets the upper threshold for the number of active member links on an Eth-Trunk.
The lacp priority command sets the LACP system or interface priority.
The lacp preempt enable command enables priority preemption in static LACP mode.

Precautions
When adding an interface to an Eth-Trunk, pay attention to the following points:
An Eth-Trunk contains a maximum of 8 member interfaces.
A member interface cannot be configured with any service or static MAC address.
The link type of the member interface added to the Eth-Trunk must be hybrid.
An Eth-Trunk cannot be nested, that is, its member interface cannot be an Eth-Trunk.
An Ethernet interface can be added to only one Eth-Trunk. To add the Ethernet interface to another Eth-Trunk, delete it from the original Eth-Trunk first.
Member interfaces of an Eth-Trunk must be of the same type. That is, FE and GE interfaces cannot join the same Eth-Trunk.
Ethernet interfaces on different LPUs can join the same Eth-Trunk.
The remote interface directly connected to the local Eth-Trunk member interface must also be bundled into an Eth-Trunk; otherwise, the two ends cannot communicate.
When member interfaces use different rates, congestion may occur on the low-rate interface, causing packet loss.
After interfaces are added to an Eth-Trunk, MAC addresses are learned on the Eth-Trunk but not on the member interfaces.
When all member interfaces of an Eth-Trunk work in half-duplex mode, the Eth-Trunk cannot negotiate an Up state.

Case description
Deploy GVRP to meet requirement 2.

Command usage
The gvrp command enables GVRP globally or on an interface.

Precautions
Before enabling GVRP on an interface, you must set the link type of the interface to trunk.
The display gvrp vlan-operation command displays the dynamic VLANs to which an interface is added.



PPP consists of three components:

Link Control Protocol (LCP): is used to establish, monitor, and tear down PPP data links. LCP can automatically detect the link environment, for example, check whether there are loops. It also negotiates link parameters such as the maximum packet length and the authentication protocol to be used. Compared with other data link layer protocols, PPP has an important feature: it provides authentication. The two ends of a link can negotiate the authentication protocol to be used and implement authentication. The two ends can be connected only when the authentication succeeds. Due to this feature, PPP is appropriate for carriers to provide access to distributed users.

Network Control Protocol (NCP): is used to negotiate the format and type of packets transmitted on data links. For example, IP Control Protocol (IPCP) and Internetwork Packet Exchange Control Protocol (IPXCP) are used to negotiate the parameters of IP and IPX packets respectively.

PPP extensions: provide additional functions for PPP. For example, PPP extensions provide the Password Authentication Protocol (PAP) and Challenge Handshake Authentication Protocol (CHAP) to ensure network security.

PPP packet format

Flag field
The Flag field identifies the start and end of a physical frame and is always 0x7E.

Address field
The Address field identifies a peer. Two communicating devices connected by using PPP do not need to know the data link layer address of each other because PPP is used on P2P links. This field must be filled with a broadcast address of all 1s (0xFF) and is of no significance to PPP.

Control field
The Control field value defaults to 0x03, indicating an unsequenced frame. By default, PPP does not use sequence numbers or acknowledgement mechanisms to ensure transmission reliability.
The Address and Control fields identify a PPP packet, so the PPP packet header value is FF03.

Protocol field
The Protocol field identifies the datagram encapsulated in the Information field of a PPP data packet.

LCP packet format

Code field
The Code field is 1 byte in length and identifies the LCP packet type.

Identifier field
The Identifier field is 1 byte long. It is used to match request and response packets. If a device receives a packet with an invalid Identifier field, the device discards the packet.
The Identifier of a Configure-Request packet usually begins with 0x01 and increases by 1 each time a Configure-Request packet is sent. After a receiver receives a Configure-Request packet, it must send a response packet with the same Identifier as that of the received Configure-Request packet.

Length field
The Length field specifies the total number of bytes in the LCP packet, including the Code, Identifier, Length, and Data fields.
The Length field value cannot exceed the maximum receive unit (MRU) of the link. Bytes outside the range of the Length field are treated as padding and are ignored after they are received.

Data field
The Data field carries negotiation options, each of which consists of Type, Length, and Data:
The Type field specifies the negotiation option type.
The Length field specifies the total length of the option, including Type, Length, and Data.
The Data field contains the contents of the negotiation option.
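
The field layout can be made concrete with a short Python parser. This is a sketch under stated assumptions: it handles only the fields described above, it interprets the Data field as a list of Type/Length/Data options (which applies to the Configure packets), and the function name is hypothetical. The sample bytes use Code 1 (Configure-Request) with an MRU option (Type 1) and a Magic-Number option (Type 5).

```python
import struct


def parse_lcp(packet: bytes):
    """Split an LCP packet into code, identifier and its option list."""
    code, identifier, length = struct.unpack("!BBH", packet[:4])
    if length < 4 or length > len(packet):
        raise ValueError("invalid Length field")
    data = packet[4:length]          # bytes beyond Length are padding
    options = []
    offset = 0
    while offset + 2 <= len(data):
        opt_type, opt_len = data[offset], data[offset + 1]
        # The option Length covers the Type and Length bytes as well.
        if opt_len < 2 or offset + opt_len > len(data):
            raise ValueError("malformed option")
        options.append((opt_type, data[offset + 2:offset + opt_len]))
        offset += opt_len
    return code, identifier, options


if __name__ == "__main__":
    sample = (
        bytes([1, 1]) + struct.pack("!H", 14)   # Code, Identifier, Length
        + b"\x01\x04\x05\xdc"                   # MRU option, value 1500
        + b"\x05\x06\x12\x34\x56\x78"           # Magic-Number option
        + b"\x00\x00"                           # padding, ignored
    )
    print(parse_lcp(sample))
```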

The PPP link establishment process is as follows:

Dead: PPP starts and ends with the Dead phase. After the physical status of the two communicating devices becomes Up (marked as UP in the figure), PPP enters the Establish phase.

Establish: The two devices negotiate link layer parameters in the Establish phase. If negotiation of link layer parameters fails (marked as FAIL in the figure), a PPP connection cannot be established and PPP returns to the Dead phase. If negotiation of link layer parameters succeeds (marked as OPENED in the figure), PPP enters the Authenticate phase.

Authenticate: In the Authenticate phase, the authenticating party authenticates the authenticated party. If authentication fails (marked as FAIL in the figure), PPP enters the Terminate phase. If authentication succeeds (marked as SUCCESS in the figure) or no authentication is configured, PPP enters the Network phase.

Network: In the Network phase, the two devices use NCP to negotiate network-layer parameters. If negotiation succeeds, a PPP connection can be established and data packets can be transmitted over the PPP connection. When the upper-layer protocol determines that the PPP connection (for example, an on-demand circuit) should be disconnected or an administrator manually disconnects the PPP connection, PPP enters the Terminate phase.

Terminate: In the Terminate phase, the two devices use LCP to disconnect the PPP connection. After the PPP connection is disconnected (marked as Down in the figure), PPP enters the Dead phase.

Note: The working phases of PPP listed in this slide are not states of the PPP protocol, because PPP is a protocol suite that does not have a protocol state. Only specific protocols such as LCP and NCP have a protocol state that can change from one state to another.
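
The phase transitions can be written down as a small lookup table in Python. This is only a sketch of the phase diagram, not of the LCP or NCP state machines. The event names UP, FAIL, OPENED, SUCCESS, and DOWN follow the labels in the figure, while NONE (no authentication configured) and CLOSING (the connection is to be disconnected) are labels assumed here for illustration.

```python
# (current phase, event) -> next phase
TRANSITIONS = {
    ("Dead", "UP"): "Establish",
    ("Establish", "FAIL"): "Dead",
    ("Establish", "OPENED"): "Authenticate",
    ("Authenticate", "FAIL"): "Terminate",
    ("Authenticate", "SUCCESS"): "Network",
    ("Authenticate", "NONE"): "Network",     # assumed label: no authentication
    ("Network", "CLOSING"): "Terminate",     # assumed label: disconnect request
    ("Terminate", "DOWN"): "Dead",
}


def run(events, phase="Dead"):
    """Apply events in order and return the list of phases visited.

    An event that is not valid in the current phase raises KeyError.
    """
    history = [phase]
    for event in events:
        phase = TRANSITIONS[(phase, event)]
        history.append(phase)
    return history


if __name__ == "__main__":
    print(run(["UP", "OPENED", "SUCCESS", "CLOSING", "DOWN"]))
```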


LCP uses three types of packets:

1. Link configuration packets, used to establish and configure links: Configure-Request, Configure-Ack, Configure-Nak, Configure-Reject.
2. Link termination packets, used to end links: Terminate-Request, Terminate-Ack.
3. Link maintenance packets, used to manage and debug links: Code-Reject, Protocol-Reject, Echo-Request, Echo-Reply, Discard-Request.

LCP is used to negotiate the following parameters:

MRU: On the Versatile Routing Platform (VRP), the MRU is represented by the maximum transmission unit (MTU) configured on an interface.

Authentication protocol: The PPP authentication protocols include PAP and CHAP. The two ends of a PPP link can use different protocols to authenticate the peer. However, the authenticated party must support the authentication protocol used by the authenticating party and have authentication information such as the user name and password correctly configured.

Magic number: LCP uses the magic number to detect link loops and other exceptions. A magic number is a randomly generated number. The two ends must not generate the same magic number.

After a device receives a Configure-Request packet, it compares the magic number in the received Configure-Request packet with the locally generated magic number. If they are different, no link loop exists and the device sends a Configure-Ack packet (if the other parameters are successfully negotiated) to indicate that negotiation of the magic number succeeds. If subsequent packets contain the Magic-Number field, the value of the field is set to the successfully negotiated magic number and LCP does not generate a new magic number.

If the magic number in the received Configure-Request packet is the same as the locally generated magic number, the receiver sends a Configure-Nak packet to the sender, carrying a new magic number. The sender then sends a new Configure-Request packet carrying a new magic number, regardless of whether the received Configure-Nak packet carries the same magic number. If a link loop exists, the process persists. If no link loop exists, packet exchange will soon be restored.
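
The comparison rule can be sketched in Python as follows. The sketch covers only the magic number decision and assumes that all other parameters are acceptable; the function names are hypothetical, and a 32-bit random value is used for the magic number.

```python
import secrets


def new_magic(exclude=None):
    """Generate a random non-zero 32-bit magic number different from exclude."""
    while True:
        value = secrets.randbits(32)
        if value != 0 and value != exclude:
            return value


def handle_configure_request(local_magic, received_magic):
    """Return the reply type and the magic number carried in the reply."""
    if received_magic != local_magic:
        # Different numbers: no loop suspected, acknowledge the peer's value.
        return "Configure-Ack", received_magic
    # Same number: possibly our own packet looped back, suggest a new value.
    return "Configure-Nak", new_magic(exclude=received_magic)


if __name__ == "__main__":
    local = new_magic()
    print(handle_configure_request(local, new_magic(exclude=local)))
    print(handle_configure_request(local, local))
```

On a looped link every Configure-Request returns to its sender with the sender's own magic number, so the Configure-Nak exchange repeats for as long as the loop exists.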

Link negotiation success:

As shown in the figure, R1 and R2 are connected through a serial link and run PPP. When the physical status of the link becomes Up, R1 and R2 use LCP to negotiate link layer parameters. In this example, R1 sends the first LCP packet.

R1 sends a Configure-Request packet to R2, carrying the link-layer parameters configured on the sender (R1). The link-layer parameters use the Type, Length, Value structure.

After receiving the Configure-Request packet, R2 sends a Configure-Ack packet to R1 if it can identify all the link-layer parameters in the packet and determines that the value of each parameter is acceptable.

If R1 does not receive a Configure-Ack packet, it retransmits the Configure-Request packet once every 3 seconds. If R1 still has not received a Configure-Ack packet after retransmitting the Configure-Request packet 10 consecutive times, it determines that the peer is unavailable and stops sending Configure-Request packets.

Note: After the process is complete, R2 has determined that the link-layer parameters configured on R1 are acceptable. R2 also needs to send Configure-Request packets to R1, so that R1 can determine whether the link-layer parameters configured on R2 are acceptable.



Link negotiation failure:

After R2 receives a Configure-Request packet from R1, R2 sends a Configure-Nak packet to R1 if R2 can identify all the link-layer parameters in the packet but determines that all or some of the parameter values are unacceptable, indicating that parameter negotiation fails.

The Configure-Nak packet contains only the parameters whose values are unacceptable, and the value of each parameter is changed to a value or value range that is acceptable on R2.

After receiving the Configure-Nak packet, R1 changes the parameter values used locally based on the values in the Configure-Nak packet, and then sends a new Configure-Request packet.

If negotiation still fails after the Configure-Request packet has been sent five consecutive times, the parameters are disabled and parameter negotiation stops.

The link negotiation parameters cannot be identified:

After receiving a Configure-Request packet from R1, R2 sends a Configure-Reject packet to R1 if R2 cannot identify all or some of the link-layer parameters in the packet.

The Configure-Reject packet contains only the parameters that cannot be identified.

After receiving the Configure-Reject packet, R1 sends a Configure-Request packet to R2, carrying only parameters that can be identified by R2.
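
The three possible answers to a Configure-Request (Ack, Nak, Reject) can be summarized in a short Python sketch. It is a simplified decision function, not an LCP implementation: options are modelled as a plain dictionary, the `policy` argument and its callables are hypothetical, and answering with Configure-Reject before considering Configure-Nak is an assumption about ordering that the text above does not state.

```python
def respond(request, policy):
    """Decide the reply to a Configure-Request.

    request: option name -> requested value.
    policy:  option name -> function returning an acceptable value for the
             request (the same value if the request is already acceptable).
    """
    # Options the receiver cannot identify are returned in a Configure-Reject.
    unknown = {k: v for k, v in request.items() if k not in policy}
    if unknown:
        return "Configure-Reject", unknown
    # Options with unacceptable values are returned with acceptable values.
    suggested = {k: policy[k](v) for k, v in request.items()}
    nak = {k: s for k, s in suggested.items() if s != request[k]}
    if nak:
        return "Configure-Nak", nak
    return "Configure-Ack", dict(request)


if __name__ == "__main__":
    r2_policy = {
        "mru": lambda v: min(v, 1500),
        "auth": lambda v: v if v in ("pap", "chap") else "chap",
    }
    print(respond({"mru": 1500, "auth": "chap"}, r2_policy))
    print(respond({"mru": 9000, "auth": "chap"}, r2_policy))
    print(respond({"mru": 1500, "compression": 1}, r2_policy))
```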

The link state detection process is as follows:

After a connection is set up using LCP, Echo-Request and Echo-Reply packets can be used to detect the link status. If a device replies with an Echo-Reply packet each time it receives an Echo-Request packet, the link status is normal.

By default, the VRP platform sends an Echo-Request packet once every 10 seconds.

The process of tearing down a connection is as follows:

LCP can tear down an existing connection if the authentication fails or an administrator manually shuts down the connection.

LCP uses Terminate-Request and Terminate-Ack packets to tear down a connection. The Terminate-Request packet requests the peer to tear down the connection. After receiving a Terminate-Request packet, the device replies with a Terminate-Ack packet to confirm that the connection is to be torn down.

If a device fails to receive a Terminate-Ack packet, it retransmits the Terminate-Request packet once every 3 seconds. If the device still does not receive a Terminate-Ack packet after sending the Terminate-Request packet twice consecutively, it determines that the peer is unavailable and tears down the connection.



A PAP packet is encapsulated in the PPP packet directly.



The PAP authentication process is as follows:

The authenticated party sends an Authenticate-Request packet carrying the user name and password in plaintext to the authenticating party. In this example, the user name and password are huawei and hello.

After receiving the user name and password from the authenticated party, the authenticating party compares them with those configured locally to check whether they are correct. If the user name and password are correct, the authenticating party returns an Authenticate-Ack packet, indicating that the authentication succeeds. If the user name and password are incorrect, the authenticating party returns an Authenticate-Nak packet, indicating that the authentication fails.

The Message Digest 5 (MD5) hash algorithm is applied to the concatenation Identifier + password + challenge to calculate a 16-byte value. The authenticated party adds the calculated 16-byte value to the Data field of the Response packet and sends the packet to the authenticating party.
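
The calculation can be reproduced with Python's standard `hashlib` module. This is a minimal sketch of the digest computation only; the function names are hypothetical, the Identifier is encoded as a single byte, and packet encapsulation is not shown.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, password: bytes, challenge: bytes) -> bytes:
    """Return the 16-byte MD5 digest of Identifier + password + challenge."""
    return hashlib.md5(bytes([identifier]) + password + challenge).digest()


def verify(identifier, password, challenge, response):
    """Authenticating party: recompute the digest and compare it."""
    expected = chap_response(identifier, password, challenge)
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    challenge = os.urandom(16)                 # random challenge value
    response = chap_response(1, b"hello", challenge)
    print(len(response), verify(1, b"hello", challenge, response))
```

Because only the digest crosses the link, the password itself is never transmitted, unlike in PAP.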

CHAP is a three-way handshake authentication protocol. The Challenge packet and Response packet exchanged between two communicating devices during one CHAP process contain the same Identifier.

Unidirectional CHAP authentication applies to two scenarios: the authenticating party is configured with a user name, or the authenticating party is not configured with a user name. It is recommended that the authenticating party be configured with a user name.

When the authenticating party is configured with a user name (that is, the ppp chap user username command is configured on the interface):

The authenticating party initiates an authentication request by sending a Challenge packet that carries the local user name to the authenticated party.

After receiving the Challenge packet on an interface, the authenticated party checks whether the ppp chap password command is configured on the interface. If this command is configured, the authenticated party uses MD5 to hash the concatenation of the Identifier, the password configured by the ppp chap password command, and the random number. The authenticated party then sends a Response packet carrying the calculated ciphertext password and the local user name to the authenticating party. If the ppp chap password command is not configured, the authenticated party searches the local user table for the password matching the user name of the authenticating party in the received Challenge packet, and hashes the matching password with MD5 in the same way. The authenticated party then sends a Response packet carrying the calculated ciphertext password and the local user name to the authenticating party.

The authenticating party uses MD5 to hash its locally saved password for the authenticated party in the same way. The authenticating party then compares the generated ciphertext password with that carried in the received Response packet, and returns a response based on the check result.

When the authenticating party is not configured with a user name (that is, the ppp chap user username command is not configured on the interface):

The authenticating party initiates an authentication request by sending a Challenge packet.

After receiving the Challenge packet, the authenticated party uses MD5 to hash the concatenation of the Identifier, the password configured by the ppp chap password command, and the random number. It then sends a Response packet carrying the ciphertext password and the local user name to the authenticating party.

The authenticating party uses MD5 to hash its locally saved password for the authenticated party in the same way. The authenticating party then compares the generated ciphertext password with that carried in the received Response packet, and returns a response based on the check result.



IPCP negotiates the IP addresses of two devices so that IP packets can be transmitted over PPP links.

IPCP and LCP have the same negotiation mechanism, packet types, and working process.

Topology
Configure the IP addresses 12.1.1.1/24 and 12.1.1.2/24 for the two ends. (IPCP can be used to negotiate IP addresses even if they are not on the same network segment.)

The static IP address negotiation process is as follows:

R1 and R2 each send a Configure-Request packet carrying the local IP address to the other end.

After receiving the Configure-Request packet from the peer, R1 and R2 check the IP address in the packet. If the IP address is a valid unicast IP address and is different from the locally configured IP address, the receiver determines that the peer can use this address and returns a Configure-Ack packet.

IPCP uses Configure-Request and Configure-Ack packets to allow the two ends of a PPP link to discover each other's 32-bit IP address.

As shown in the figure, R1 requests the peer to allocate an IP address for it, and R2 is configured with the static IP address 12.1.1.2/24. R2 is enabled to allocate the IP address 12.1.1.1 to R1.

The dynamic IP address negotiation process is as follows:

R1 sends a Configure-Request packet carrying the IP address 0.0.0.0 to R2, requesting R2 to allocate an IP address for it.

After receiving the Configure-Request packet, R2 determines that the IP address 0.0.0.0 is invalid and returns a Configure-Nak packet carrying the new IP address 12.1.1.1 to R1.

After receiving the Configure-Nak packet, R1 updates its local IP address and then sends a Configure-Request packet carrying the new IP address 12.1.1.1 to R2.

After receiving the Configure-Request packet, R2 determines that the IP address 12.1.1.1 is valid and returns a Configure-Ack packet to R1.

In addition, R2 sends a Configure-Request packet carrying the IP address 12.1.1.2 to R1. R1 determines that the IP address 12.1.1.2 is valid and returns a Configure-Ack packet to R2.
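The Request/Nak/Request/Ack sequence above can be modelled with a small Python sketch. It is a simplified illustration, not an IPCP implementation: the function names are hypothetical, only the address option is modelled, and the validity check (a unicast address that differs from the receiver's own) follows the rule stated for static negotiation.

```python
import ipaddress


def handle_configure_request(requested, local_ip, assignable=None):
    """Decide how the receiver answers a Configure-Request carrying an address."""
    addr = ipaddress.IPv4Address(requested)
    valid = (
        not addr.is_unspecified          # 0.0.0.0 means "please assign me one"
        and not addr.is_multicast
        and addr != ipaddress.IPv4Address("255.255.255.255")
        and addr != ipaddress.IPv4Address(local_ip)
    )
    if valid:
        return ("Configure-Ack", requested)
    if assignable is None:
        # Not covered by the text above, so this sketch does not model it.
        raise NotImplementedError("no address available to offer")
    return ("Configure-Nak", assignable)


def negotiate_r1(r2_local="12.1.1.2", r2_offer="12.1.1.1", max_rounds=5):
    """R1 starts from 0.0.0.0 and adopts whatever address R2 proposes."""
    r1_ip, log = "0.0.0.0", []
    for _ in range(max_rounds):
        reply, ip = handle_configure_request(r1_ip, r2_local, r2_offer)
        log.append((r1_ip, reply, ip))
        if reply == "Configure-Ack":
            return r1_ip, log
        r1_ip = ip                       # update the local address from the Nak
    raise RuntimeError("negotiation did not converge")


r1_address, exchange_log = negotiate_r1()
```

With the addresses from the example, the log contains one Configure-Nak round followed by one Configure-Ack round, and R1 ends up with 12.1.1.1.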



Multilink PPP fragments a packet and sends the fragments to the same
destination over multiple PPP links.

PPPoE overview
PPPoE allows a large number of hosts on an Ethernet to connect to the Internet through a remote access device and controls each host using PPP. PPPoE features a large application scale, high security, and convenient accounting.

Topology
A PPPoE session is set up between each PC and the router on the carrier network. Each PC functions as a PPPoE client and has a unique account, which facilitates user accounting and control by the carrier. The PPPoE client software must be installed on the PCs.

The PPPoE session establishment process includes three stages: Discovery, Session, and Terminate.

Discovery stage:

A PPPoE client broadcasts a PPPoE Active Discovery Initiation (PADI) packet that contains the service information required by the PPPoE client.

After receiving the PADI packet, all PPPoE servers compare the requested service with the services they can provide. The PPPoE servers that can provide the requested service unicast PPPoE Active Discovery Offer (PADO) packets to the PPPoE client.

Depending on the network topology, the PPPoE client may receive PADO packets from more than one PPPoE server. The PPPoE client selects the PPPoE server from which the first PADO packet is received and unicasts a PPPoE Active Discovery Request (PADR) packet to that PPPoE server.

The PPPoE server generates a unique session ID to identify the PPPoE session with the PPPoE client. The PPPoE server sends a PPPoE Active Discovery Session-confirmation (PADS) packet containing this session ID to the PPPoE client. When the PPPoE session is established, the PPPoE server and PPPoE client enter the PPPoE Session stage.

When the PPPoE session is established, the PPPoE server and PPPoE client share the unique PPPoE session ID and have learned each other's Ethernet address.


Session stage:

PPP negotiation at the PPPoE Session stage is the same as common PPP negotiation.

When PPP negotiation succeeds, PPP data packets can be forwarded.

At the PPPoE Session stage, the PPPoE server and client send all Ethernet data packets in unicast mode.

Terminate stage:

After a PPPoE session is established, the PPPoE client or the PPPoE server can unicast a PADT packet to terminate the PPPoE session at any time. When a PADT packet is received, no further PPP traffic can be sent using this session.
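As a rough illustration of the Discovery stage, the Python sketch below builds the 6-byte PPPoE header for each discovery packet and lists the PADI/PADO/PADR/PADS exchange. The code values and EtherTypes are those defined in RFC 2516; the function names, the server labels, and the session ID 1 are assumptions made for the example, and service tags are omitted.

```python
import struct

ETHERTYPE_DISCOVERY = 0x8863   # carries PADI, PADO, PADR, PADS, and PADT
ETHERTYPE_SESSION = 0x8864     # carries PPP frames in the Session stage
CODES = {"PADI": 0x09, "PADO": 0x07, "PADR": 0x19, "PADS": 0x65, "PADT": 0xA7}


def pppoe_header(code_name, session_id=0, payload=b""):
    """Build the PPPoE header (version 1, type 1) followed by the payload."""
    return struct.pack("!BBHH", 0x11, CODES[code_name],
                       session_id, len(payload)) + payload


def discovery(pado_arrivals, session_id=1):
    """Return the chosen server, the session ID, and the four discovery packets.

    pado_arrivals lists the servers in the order their PADO packets arrived;
    the client picks the first one, as described in the text.
    """
    if not pado_arrivals:
        return None
    server = pado_arrivals[0]
    exchange = [
        ("client", "broadcast", pppoe_header("PADI")),
        (server, "unicast", pppoe_header("PADO")),
        ("client", "unicast", pppoe_header("PADR")),
        (server, "unicast", pppoe_header("PADS", session_id)),
    ]
    return server, session_id, exchange


# Hypothetical servers: server-B answered first, so the client selects it.
server, sid, exchange = discovery(["server-B", "server-A"])
```

The session ID stays 0 until the server assigns one in the PADS packet; every later packet of the session, including the terminating PADT, carries that ID.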


Four types of FR interfaces are available:

A user's device is called a DTE, and the corresponding interface type is DTE.

A network device that provides access services for DTE devices is called a DCE, and the corresponding interface type is DCE or NNI.

A UNI interface interconnects the DTE and DCE.

An NNI interface interconnects two FR switches.

A Virtual Circuit (VC) is a logical circuit established between two network devices on the same network.

Based on the establishment mode, VCs are classified into two types:

PVC: a VC that is created manually.

SVC: a VC that can be created or deleted automatically through negotiation.

The PVC status of the DTE is determined by the DCE. The PVC status of the DCE is determined by the network.

VCs are identified by DLCIs. A DLCI takes effect only on a local interface and its directly connected interface. On an FR network, the same DLCI can therefore identify different VCs established on different physical interfaces.

LMI: Local Management Interface, used to monitor the PVC status.

The system supports three LMI protocols: ITU-T Q.933 Annex A, ANSI T1.617 Annex D, and a non-standard compatible protocol. The non-standard compatible protocol is used for interconnection with devices from vendors other than Huawei.

The PVC status of the DTE is determined by the DCE. The PVC status of the DCE is determined by the network.

When two network devices are directly connected, the PVC status of the DCE is set by the device administrator.

The LMI negotiation process is as follows:

The DTE periodically sends Status Enquiry messages.

After receiving a Status Enquiry message, the DCE replies with a Status message.

The DTE parses the received Status message to obtain the link status and PVC status.

When the DTE and DCE can send and receive LMI negotiation messages normally, the link protocol status changes to Up, and the PVC status changes to Active.

The FR LMI negotiation succeeds.

After the FR LMI negotiation succeeds and the PVC status changes to Active, the two devices on a PVC start the InARP negotiation process:

If a protocol address is configured on the local interface, the local device (for example, R1) sends an Inverse ARP Request packet to the peer device (for example, R2) over the VC. The Inverse ARP Request packet carries the protocol address of R1.

After receiving the Inverse ARP Request packet, R2 obtains the protocol address of R1, generates an address mapping, and sends an Inverse ARP Response packet to R1.

After receiving the Inverse ARP Response packet, R1 parses the address of R2 in the packet and generates an address mapping.

R1 generates the address mapping from 12.1.1.2 to DLCI 100, while R2 generates the address mapping from 12.1.1.1 to DLCI 100.

If a static mapping has been configured manually or a dynamic mapping has already been created, the local device does not send an InARP Request packet to the remote device over the VC, regardless of whether the remote address in the address mapping is correct. The local device sends an InARP Request packet to the remote device only when no mapping exists.

Sub-interfaces can solve the problem caused by split horizon on an FR network. One physical interface can contain multiple logical sub-interfaces. Each sub-interface can connect to a remote router over one or multiple DLCIs. The routers are connected over the FR network.

You can define logical sub-interfaces on the serial line. Every sub-interface uses one or multiple DLCIs to connect to the remote router. After a DLCI is configured on a sub-interface, the mapping between the destination protocol address and this DLCI needs to be created.

As shown in the figure, R4 has only one physical serial interface, S0. However, DLCIs are defined on S0 to connect the sub-interfaces S0.1, S0.2, and S0.3 to R1, R2, and R3 respectively.

Two types of sub-interfaces are available:

P2P sub-interface: used to connect to a single remote device. Each P2P sub-interface can be configured with only one PVC. In this case, the remote device can be determined uniquely without a static address mapping. Therefore, when the PVC is configured for the sub-interface, the peer address is identified.

P2MP sub-interface: used to connect to multiple remote devices. Each P2MP sub-interface can be configured with multiple PVCs. Each PVC is mapped to the protocol address of its connected remote device. In this way, different PVCs can reach different remote devices. You can manually configure the address mapping, or use InARP to dynamically create the address mapping.


Case description
The NCP protocol can be used to allocate an IP address to the peer.

You need to configure the ppp chap user Huawei command on R1's interface to enable R1 to send a Challenge packet carrying the user name Huawei to R2.

Command usage
ppp authentication-mode: Configures the PPP authentication mode in which the local device authenticates the remote device.
ppp chap user: Configures a user name for CHAP authentication.
ppp chap password: Configures a password for CHAP authentication.
ip address ppp-negotiate: Configures IP address negotiation on an interface to allow the interface to obtain an IP address from the remote device.
remote address: Configures the local device to assign an IP address or specify an IP address pool for the remote device.

Usage scenario
Interface view

Parameters
ppp authentication-mode { chap | pap }
chap: Indicates the CHAP authentication mode.
pap: Indicates the PAP authentication mode.

ppp chap user username
username: Specifies a user name for CHAP authentication.

ppp chap password { cipher | simple } password
cipher: Indicates a ciphertext password.
simple: Indicates a plaintext password.
password: Specifies the password for CHAP authentication.

remote address { ip-address | pool pool-name }
ip-address: Specifies an IP address to be allocated to the remote device.
pool pool-name: Specifies the name of the IP address pool from which an IP address is allocated to the remote device.

Precautions
In CHAP authentication, the authenticated party does not send the password to the authenticating party.
The local device can use IPCP to learn the 32-bit host address from the remote device.


Command usage
interface mp-group: Creates an MP-Group interface and enters the MP-Group interface view.
ppp mp mp-group: Binds an interface to the MP-Group interface so that the interface works in MP mode.
restart: Restarts the current interface.

Precautions
Data frames are lost while the interface is down during a restart. Exercise caution when you use the restart command.



Case description
In this case, you need to become familiar with the configurations of the PPPoE server and PPPoE client.



Command usage
virtual-template: Creates a VT interface and enters the VT interface view.
pppoe-server bind virtual-template: Binds a specified VT interface to an Ethernet interface and enables PPPoE on the Ethernet interface.
remote address: Configures the local device to assign an IP address or specify an IP address pool for the remote device.
dialer-rule: Enters the dialer rule view.
dialer-rule dialer-rule-number: Specifies a dialer ACL for a dialer access group and defines the conditions for initiating calls.
interface dialer: Creates a dialer interface and enters the dialer interface view.
dialer user: Enables the resource-shared DCC and specifies the remote user name of the dialer interface.
dialer-group: Adds an interface to a dialer access group, that is, specifies the number of the dialer rule to use.
dialer bundle: Specifies a dialer bundle for a dialer interface in the resource-shared DCC.
pppoe-client dial-bundle-number: Specifies a dialer bundle for a PPPoE session.

Parameters
remote address { ip-address | pool pool-name }
ip-address: Specifies an IP address to be allocated to the remote device.
pool pool-name: Specifies the name of the IP address pool from which an IP address is allocated to the remote device.

dialer-rule dialer-rule-number { acl { acl-number | name acl-name } | ip { deny | permit } | ipv6 { deny | permit } }
dialer-rule-number: Specifies the number of a dialer access group. The number is the same as the value of group-number in the dialer-group command.
acl { acl-number | name acl-name }: Indicates the number or name of the dialer ACL.
ip { deny | permit }: Indicates whether the dialer ACL allows or forbids IPv4 packets.

Precautions
To configure the local device to allocate an IP address to the remote device, run the ppp ipcp remote-address forced command in the interface view.


Case description
On an FR network, you do not need to manually configure the mapping relationship for a P2P sub-interface.



Precautions
You do not need to manually configure the mapping relationship for a P2P sub-interface, regardless of whether InARP is enabled or disabled on it.

Topology Description

Broadcast storm
Assume that STP is not enabled on the switching devices. If PC1 broadcasts a request, the request is received on port1 and forwarded out of port2 on both S1 and S2. Port2 on each switch then receives the request forwarded by the other switch, and port1 forwards it again. This transmission repeats endlessly and exhausts the resources of the entire network, causing the network to break down.

MAC address table flapping
Port2 on S1 can learn the MAC address of PC2. Because S2 forwards data frames sent by PC2 out of its other ports, S1 may also learn the MAC address of PC2 on port1. S1 continuously modifies its MAC address table, causing the MAC address table to flap.

STP
STP can eliminate network loops. STP is used to build a loop-free network (tree) to ensure a unique data transmission path and prevent infinite looping of packets. STP works at the data link layer of the OSI model.

STP-capable switches exchange BPDUs and perform distributed calculation to determine which ports need to be blocked to prevent loops.

Root bridge
The root bridge is the bridge with the smallest BID, which is composed of the priority and MAC address.

Root port
The root port is the port with the smallest root path cost to the root bridge, and is responsible for forwarding data to the root bridge. The root port is determined based on the path cost. Among all STP-capable ports on a network bridge, the port with the smallest root path cost is the root port. There is only one root port on an STP-capable device, and there is no root port on the root bridge.

Designated port and bridge
The bridge closest to the root bridge on each network segment is used as the designated bridge. The port on the designated bridge that connects to the network segment is called the designated port. The designated port is responsible for forwarding traffic, and the designated bridge is responsible for forwarding configuration BPDUs.

After the root bridge, root ports, and designated ports are selected successfully, the entire tree topology is set up. When the topology is stable, only the root ports and designated ports forward traffic. All the other ports are in Blocking state: they receive STP BPDUs but do not forward user traffic.

A configuration BPDU is generated in one of the following three scenarios:

When ports are enabled with STP, the designated ports send configuration BPDUs at intervals specified by the Hello timer.

When a root port receives configuration BPDUs, the device where the root port resides sends a copy of the configuration BPDUs to its designated ports.

When receiving a configuration BPDU with a lower priority, the designated port immediately sends its own configuration BPDU to the downstream device.

Root identifier
The root identifier is composed of the priority and MAC address of the root bridge. The default priority is 32768.

Root path cost
Cumulative cost of all links to the root bridge.

Bridge Identifier (BID)
BID of the device sending configuration BPDUs. On a LAN, the BID is the ID of the designated bridge.

Port Identifier (PID)
PID of the port sending configuration BPDUs. The PID consists of the port priority and port number. On a LAN, the PID is the ID of the designated port.
Hello Time
The Hello timer specifies the interval at which an STP-capable device sends configuration BPDUs to detect link faults.
When the network topology is stable, a change to the interval takes effect only after a new root bridge takes over.
After the topology changes, TCN BPDUs are sent. The Hello interval does not govern the transmission of TCN BPDUs.
The default value is 2 seconds.

Max Age
After a non-root bridge running STP receives a configuration BPDU, the non-root bridge compares the Message Age value with the Max Age value in the received configuration BPDU.
If the Message Age value is smaller than or equal to the Max Age value, the non-root bridge forwards the configuration BPDU.
If the Message Age value is larger than the Max Age value, the configuration BPDU has aged and the non-root bridge discards it. In this case, the network size is considered too large and the non-root bridge disconnects from the root bridge.
In practice, each time a configuration BPDU passes through a bridge, the value of Message Age increases by 1.
The default value is 20 seconds.

Forward Delay
The Forward Delay timer specifies the delay for interface status transition. The default value is 15 seconds.
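The Max Age rule can be expressed as a one-line check. The Python sketch below is a simplified model that assumes Message Age starts at 0 at the root bridge and grows by exactly 1 per bridge, as stated above; the function name is hypothetical.

```python
DEFAULT_MAX_AGE = 20  # default Max Age from the text


def bpdu_is_forwarded(bridges_passed, max_age=DEFAULT_MAX_AGE):
    """Return True if a configuration BPDU is still forwarded after the hops.

    Assumption: Message Age equals the number of bridges the BPDU has
    passed through since it left the root bridge.
    """
    message_age = bridges_passed
    return message_age <= max_age


# With the default Max Age, a BPDU ages out once Message Age exceeds 20.
results = [bpdu_is_forwarded(n) for n in (0, 20, 21)]
```

Under this model the Max Age value effectively bounds the diameter of the switching network that a single root bridge can serve.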



STP Topology Calculation

After all devices on the network are enabled with STP, each device considers itself the root bridge. Each device only transmits and receives BPDUs but does not forward user traffic. All ports are in Listening state. After exchanging configuration BPDUs, all devices participate in the selection of the root bridge, root port, and designated port.

During network initialization, every device considers itself the root bridge and sets the root bridge ID to its own device ID. Devices exchange configuration BPDUs to compare the root bridge IDs. The device with the smallest BID is elected as the root bridge.

The switch priority is configurable. The value ranges from 0 to 65535. The default priority is 32768.

Topology Description
Assume that the priorities of S1 and S2 are 0 and 1. Port A on S1 connects to Port B on S2. S1 sends the configuration BPDU {0, 0, 0, Port A} and S2 sends the configuration BPDU {1, 0, 1, Port B}. After the two switches compare the configuration BPDUs, S1 is deemed to have a higher priority than S2, so S1 becomes the root bridge.
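Root bridge election reduces to picking the smallest BID, where the priority is compared first and the MAC address breaks ties. The Python sketch below illustrates this; the MAC addresses are made-up values for the example, and the helper names are hypothetical.

```python
def bridge_id(priority, mac):
    """Return a comparable BID: (priority, MAC address as an integer)."""
    mac_value = int(mac.replace("-", "").replace(":", ""), 16)
    return (priority, mac_value)


def elect_root(bridges):
    """Return the name of the bridge with the smallest BID."""
    return min(bridges, key=lambda name: bridges[name])


# Priorities 0 and 1 as in the example; the MAC addresses are illustrative.
example = {
    "S1": bridge_id(0, "00e0-fc00-0001"),
    "S2": bridge_id(1, "00e0-fc00-0002"),
}
root = elect_root(example)

# With equal priorities, the smaller MAC address decides the election.
tie = {
    "S1": bridge_id(32768, "00e0-fc00-0002"),
    "S2": bridge_id(32768, "00e0-fc00-0001"),
}
tie_root = elect_root(tie)
```

Representing the BID as a tuple lets ordinary tuple comparison reproduce the "priority first, then MAC address" rule.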



Topology Description
The priorities of S1, S2, and S3 are 0, 1, and 2, and the path costs between S1 and S2, between S1 and S3, and between S2 and S3 are 5, 10, and 4 respectively.

Initial configuration BPDUs on the ports of S1, S2, and S3:
S1: {0, 0, 0, Port A1} on Port A1 and {0, 0, 0, Port A2} on Port A2
S2: {1, 0, 1, Port B1} on Port B1 and {1, 0, 1, Port B2} on Port B2
S3: {2, 0, 2, Port C1} on Port C1 and {2, 0, 2, Port C2} on Port C2

First exchange of configuration BPDUs

Ports on S1, S2, and S3 send their configuration BPDUs. Each network bridge considers itself the root bridge, so the root path cost (RPC) is 0.

Comparison for the first exchange of configuration BPDUs

S1
Port A1 receives the configuration BPDU {1, 0, 1, Port B1} from Port B1 and finds that its own configuration BPDU {0, 0, 0, Port A1} has a higher priority than the received one, so Port A1 discards the configuration BPDU {1, 0, 1, Port B1}.

Port A2 receives the configuration BPDU {2, 0, 2, Port C1} from Port C1 and finds that its own configuration BPDU {0, 0, 0, Port A2} has a higher priority than the received one, so Port A2 discards the configuration BPDU {2, 0, 2, Port C1}.

After finding that both the root ID and the designated bridge ID in the configuration BPDU on each port refer to itself, S1 considers itself the root bridge. S1 then sends configuration BPDUs from each port periodically without modifying them.

The configuration BPDU {0, 0, 0, Port A1} on Port A1 and the configuration BPDU {0, 0, 0, Port A2} on Port A2 are optimal.

Because S1 is the root bridge, all ports on S1 are designated ports.
S2
Port B1 receives the configuration BPDU {0, 0, 0, Port A1} from Port A1 and finds that the received configuration BPDU has a higher priority than its own configuration BPDU {1, 0, 1, Port B1}, so Port B1 updates its configuration BPDU.

Port B2 receives the configuration BPDU {2, 0, 2, Port C2} from Port C2 and finds that its own configuration BPDU {1, 0, 1, Port B2} has a higher priority than the received one, so Port B2 discards the configuration BPDU {2, 0, 2, Port C2}.

The configuration BPDU {0, 0, 0, Port A1} on Port B1 and the configuration BPDU {1, 0, 1, Port B2} on Port B2 are optimal.

Comparison of configuration BPDUs on ports:

S2 compares the configuration BPDU on each port and finds that the configuration BPDU on Port B1 has the highest priority, so Port B1 is used as the root port and the configuration BPDU on Port B1 remains unchanged.

S2 calculates the configuration BPDU {0, 5, 1, Port B2} for Port B2 based on the configuration BPDU and path cost of the root port, and compares it with the configuration BPDU {1, 0, 1, Port B2} currently on Port B2. S2 finds that the calculated configuration BPDU has a higher priority, so Port B2 is used as the designated port. Its configuration BPDU is replaced by the calculated configuration BPDU, which is then sent periodically.


S3
Port C1 receives the configuration BPDU {0, 0, 0, Port A2} from Port A2 and finds that the received configuration BPDU has a higher priority than its own configuration BPDU {2, 0, 2, Port C1}, so Port C1 updates its configuration BPDU.

Port C2 receives the configuration BPDU {1, 0, 1, Port B2} from Port B2 and finds that the received configuration BPDU has a higher priority than its own configuration BPDU {2, 0, 2, Port C2}, so Port C2 updates its configuration BPDU.


The configuration BPDU {0, 0, 0, Port A2} on Port C1 and the configuration BPDU {1, 0, 1, Port B2} on Port C2 are optimal.

Comparison of configuration BPDUs on ports:

S3 compares the configuration BPDU on each port and finds that the configuration BPDU on Port C1 has the highest priority, so Port C1 is used as the root port and the configuration BPDU on Port C1 remains unchanged.

S3 calculates the configuration BPDU {0, 10, 2, Port C2} for Port C2 based on the configuration BPDU and path cost of the root port, and compares it with the configuration BPDU {1, 0, 1, Port B2} currently on Port C2. S3 finds that the calculated configuration BPDU has a higher priority, so Port C2 is used as the designated port and its configuration BPDU is replaced by the calculated configuration BPDU.
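The comparisons in this first exchange follow one rule: a configuration BPDU {root ID, root path cost, bridge ID, port ID} with the smaller value in the first differing field has the higher priority. The Python sketch below models a BPDU as a tuple so that ordinary tuple comparison applies this rule. The class and function names are hypothetical, and bridge IDs are reduced to the bare priority values used in the example.

```python
from typing import NamedTuple


class Bpdu(NamedTuple):
    root_id: int
    root_path_cost: int
    bridge_id: int
    port_id: str


def has_higher_priority(a: Bpdu, b: Bpdu) -> bool:
    """A smaller vector means a higher priority (fields compared left to right)."""
    return a < b


def derive_for_port(root_port_bpdu: Bpdu, link_cost: int,
                    bridge_id: int, port_id: str) -> Bpdu:
    """BPDU a bridge would send on a port, based on its root port's BPDU."""
    return Bpdu(root_port_bpdu.root_id,
                root_port_bpdu.root_path_cost + link_cost,
                bridge_id, port_id)


# S2: root port B1 holds {0, 0, 0, Port A1}; the S1-S2 link costs 5.
s2_b2 = derive_for_port(Bpdu(0, 0, 0, "Port A1"), 5, 1, "Port B2")
s2_b2_is_designated = has_higher_priority(s2_b2, Bpdu(1, 0, 1, "Port B2"))

# S3: root port C1 holds {0, 0, 0, Port A2}; the S1-S3 link costs 10.
s3_c2 = derive_for_port(Bpdu(0, 0, 0, "Port A2"), 10, 2, "Port C2")
s3_c2_is_designated = has_higher_priority(s3_c2, Bpdu(1, 0, 1, "Port B2"))
```

Both derived BPDUs win because they advertise root 0, while the BPDUs they are compared against still advertise root 1.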

Second exchange of configuration BPDUs

S1 is the root bridge.

Configuration BPDUs sent by S1
The configuration BPDU sent by Port A1 is {0, 0, 0, Port A1}.
The configuration BPDU sent by Port A2 is {0, 0, 0, Port A2}.

Configuration BPDUs sent by S2
S1 is the root bridge, so S2 does not send configuration BPDUs to S1.
The configuration BPDU sent by Port B2 is {0, 5, 1, Port B2}.

Configuration BPDUs sent by S3
S1 is the root bridge, so S3 does not send configuration BPDUs to S1.
The configuration BPDU sent by Port C2 is {0, 10, 2, Port C2}.

Comparison for the second exchange of configuration BPDUs

S2
Port B1 receives the configuration BPDU {0, 0, 0, Port A1} from Port A1 and finds that the received configuration BPDU is the same as its own configuration BPDU, so Port B1 discards the received one.

Port B2 receives the configuration BPDU {0, 10, 2, Port C2} from Port C2 and finds that its own configuration BPDU {0, 5, 1, Port B2} has a higher priority, so Port B2 discards the received one.

After comparison, the optimal configuration BPDUs on Port B1 and Port B2 are {0, 0, 0, Port A1} and {0, 5, 1, Port B2} respectively.

Because the optimal configuration BPDU on each port remains unchanged, the port roles do not change.

S3
Port C1 receives the configuration BPDU {0, 0, 0, Port A2} from S1 and finds that the received configuration BPDU is the same as its own configuration BPDU, so Port C1 discards the received one.

Port C2 receives the configuration BPDU {0, 5, 1, Port B2} from S2 and compares it with its own configuration BPDU {0, 10, 2, Port C2}.
Because the root bridge ID is the same, the root path

m/
costs are compared. Port C2 finds that the received
configuration BPDU has a higher priority(10>9), so Port

co
C2 updates its BPDU as {0, 5, 1, Port B2}.
After comparison, the optimal configuration BPDUs

.
on Port C1 and Port C2 are {0, 0, 0, Port A2} and {0,

ei
5, 1, Port B2} respectively.

w
Comparison of configuration BPDUs on each port:
S3 compares the root path cost of Port C1 (root path cost of 0 in the received configuration BPDU + path cost 10 of the link) with the root path cost of Port C2 (root path cost of 5 in the received configuration BPDU + path cost 4 of the link). The root path cost of Port C2 is smaller, so the configuration BPDU of Port C2 is preferred. Port C2 is used as the root port and its configuration BPDU remains unchanged.
S3 calculates the configuration BPDU {0, 9, 2, Port C1} for Port C1 according to the configuration BPDU and path cost of the root port, and compares the calculated configuration BPDU with the configuration BPDU saved on Port C1. S3 finds that the saved configuration BPDU has a higher priority, so Port C1 is blocked and its configuration BPDU remains unchanged. In this case, Port C1 does not forward data. Spanning tree calculation may be triggered again later, for example, when the link between S2 and S3 goes Down.
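The comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Bpdu` tuple, the `better` and `pick_root_port` helpers, and the use of strings as port IDs are hypothetical names chosen for this example, and the values are the ones from the S3 walkthrough.

```python
from typing import NamedTuple

class Bpdu(NamedTuple):
    root_id: int         # root bridge ID
    root_path_cost: int  # cost from the sender to the root
    bridge_id: int       # ID of the bridge that sent the BPDU
    port_id: str         # ID of the port that sent the BPDU (string for readability)

def better(a: Bpdu, b: Bpdu) -> bool:
    """True if a has a higher priority than b: fields compared in order, smaller wins."""
    return tuple(a) < tuple(b)

def pick_root_port(received: dict, link_cost: dict) -> str:
    """Pick the local port whose received BPDU, plus the link cost, is best."""
    def key(port):
        b = received[port]
        return (b.root_id, b.root_path_cost + link_cost[port],
                b.bridge_id, b.port_id, port)
    return min(received, key=key)

# Values from the walkthrough: what S3 has received on each port.
received = {"C1": Bpdu(0, 0, 0, "Port A2"), "C2": Bpdu(0, 5, 1, "Port B2")}
link_cost = {"C1": 10, "C2": 4}

root_port = pick_root_port(received, link_cost)
print(root_port)
```

With these inputs, Port C1 offers a total cost of 0 + 10 and Port C2 a total cost of 5 + 4, so the sketch selects C2 as the root port, matching the text.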

Topology on the Left Side
According to the root bridge selection principle of STP, S1 is the root bridge. Then determine the root port, designated port, and alternate port.
E0 and E1 on S2 receive BPDUs {0, 0, 0, E0} and {0, 0, 0, E1} from S1. In the two BPDUs, only the transmit port is different. The port with the smaller PID has a higher priority, so E0 is the root port and E1 is the alternate port.

Topology on the Right Side
According to the root bridge selection principle of STP, S1 is the root bridge. Then determine the root port, designated port, and alternate port.
E0 and E1 on S2 receive BPDUs {0, 0, 0, E0} and {0, 0, 0, E1} from S1. The two BPDUs are otherwise equal in priority, so only the PIDs are compared. E0 has the smaller PID, so E0 is the root port and E1 is the alternate port.

Generally, only the root bridge generates and sends configuration BPDUs. A non-root bridge only relays the configuration BPDU received on its root port out of its designated ports. A designated port on a non-root bridge sends the optimal BPDU on its own only after it receives a BPDU with a lower priority.

Topology description:
After S2 receives a BPDU with a lower priority from S4, S2 sends a configuration BPDU. It can do so because every network bridge saves the optimal configuration BPDU.



Topology Description
The figure on the left side shows the initial topology. The path costs are the same. S1, S2, and S3 are connected, S1 is the root bridge, and the interconnected ports are in the Forwarding state. In the figure on the right side, a link between S1 and S2 is added.
After S2 receives BPDUs from S1 and S3, S2 considers that the port connected to S1 is the new root port and the port connected to S3 is the designated port. All ports are then root ports or designated ports in the Forwarding state, so a loop occurs. The loop can be eliminated only after configuration BPDUs have been transmitted to each network bridge and S2 blocks the port connected to S3 through calculation.
For this reason, a port (for example, port E on S2) changes from non-forwarding to forwarding only after a delay, so that ports that need to enter the non-forwarding state can complete the spanning tree calculation first.

Forward Delay
The default interval for port status transition is 15 seconds.
The Forward Delay, Hello timer, and Max Age values are related to one another; the default values are calculated for a network diameter of 7.

Port Status Description
After a port is enabled, the port enters the Listening state and starts the spanning tree calculation.
If the calculation determines that the port is an alternate port, the port enters the Blocking state.
If the calculation determines that the port is a root port or designated port, the port enters the Learning state from the Listening state after a Forward Delay period. The port then enters the Forwarding state from the Learning state after another Forward Delay period. A port in the Forwarding state can forward data frames.
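The timing described above can be written down as a short Python sketch. It assumes the default values (Forward Delay 15 s, Max Age 20 s) and a Hello time of 2 s; the Hello value and the consistency check `2 * (Forward Delay - 1) >= Max Age >= 2 * (Hello + 1)` come from IEEE 802.1D rather than from this text, and the function names are hypothetical.

```python
HELLO_TIME = 2       # seconds; IEEE 802.1D default, not stated in the text
MAX_AGE = 20         # seconds
FORWARD_DELAY = 15   # seconds

def timers_consistent(hello, max_age, forward_delay):
    """Relationship that IEEE 802.1D requires between the three timers."""
    return 2 * (forward_delay - 1) >= max_age >= 2 * (hello + 1)

def port_timeline(role, forward_delay=FORWARD_DELAY):
    """Return (seconds since the port was enabled, state) pairs.

    For an alternate port the moment of blocking depends on when the
    calculation finishes, so the time is left as None.
    """
    if role in ("root", "designated"):
        return [(0, "Listening"),
                (forward_delay, "Learning"),
                (2 * forward_delay, "Forwarding")]
    if role == "alternate":
        return [(0, "Listening"), (None, "Blocking")]
    raise ValueError(f"unknown role: {role}")

print(timers_consistent(HELLO_TIME, MAX_AGE, FORWARD_DELAY))
print(port_timeline("designated"))
```

The timeline makes the cost of the two Forward Delay periods explicit: a root or designated port starts forwarding 30 seconds after it is enabled.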



Huawei switch port status
Huawei datacom devices use MSTP by default. After a device transitions from the MSTP mode to the STP mode, its STP-capable ports support the same port states as MSTP-capable ports: Forwarding, Learning, and Discarding.



Port status transition
The port is initialized or enabled.
The port is blocked or the link fails.
The port is selected as the root port or designated port.
The port is no longer the root port or designated port.
The Forward Delay timer expires.



TCN BPDU processing:
1. After the network topology changes, the downstream device whose port has changed to the Forwarding state keeps sending TCN BPDUs to its upstream device.
2. After the upstream device receives the TCN BPDU from the downstream device, only the designated port processes it. The other ports may receive the TCN BPDU but do not process it.
3. The upstream device sets the TCA bit of the Flags field in the configuration BPDU to 1 and returns the configuration BPDU to instruct the downstream device to stop sending TCN BPDUs.
4. The upstream device sends a copy of the TCN BPDU toward the root bridge.
5. Steps 1 to 4 repeat until the root bridge receives the TCN BPDU.
6. After receiving the TCN BPDU, the root bridge sets the TCA bit in the subsequent configuration BPDU for acknowledgment and sets the TC bit of the Flags field in the configuration BPDU to 1 to notify all network bridges of the topology change.
For a period of Max Age plus Forward Delay, the root bridge sends configuration BPDUs with the TC bit set. A network bridge that receives such a BPDU reduces the aging time of its MAC address entries to the Forward Delay period.
Topology Description:
Through STP calculation, S1 is the root bridge and port E1 on S4 is blocked.
When the link of port E1 on S3 fails, STP recalculates the topology. Port E1 on S4 becomes a designated port in the Forwarding state, and S4 immediately sends a TCN BPDU upstream.
After S2 receives the TCN BPDU from S4, S2 sets the TCA bit in the subsequent configuration BPDU and sends it to S4 from port E3. S2 also sends the TCN BPDU toward the root from its root port E1.
After S1 receives the TCN BPDU from S2, S1 sets the TCA and TC bits in the subsequent configuration BPDU and sends it to S2 from the designated port E1. S1 keeps the TC bit set in its configuration BPDUs for 35 seconds (20 seconds + 15 seconds). After receiving a configuration BPDU with the TC bit set, each network bridge changes the aging time of its MAC address entries to 15 seconds.
When the topology changes, the MAC address table is therefore rebuilt quickly, which avoids wasting bandwidth.
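The hop-by-hop TCN exchange can be sketched as a list of events. This is a simplified model, not an implementation: the function name `propagate_tcn` and the event tuples are hypothetical, and the path S4, S2, S1 is the one from the topology description.

```python
MAX_AGE = 20        # seconds
FORWARD_DELAY = 15  # seconds

def propagate_tcn(path_to_root):
    """Model the TCN exchange along a path that ends at the root bridge.

    Each hop produces a TCN BPDU going up and a configuration BPDU with
    TCA=1 coming back. The root then advertises TC=1 to all bridges.
    """
    events = []
    for downstream, upstream in zip(path_to_root, path_to_root[1:]):
        events.append((downstream, upstream, "TCN BPDU"))
        events.append((upstream, downstream, "configuration BPDU, TCA=1"))
    root = path_to_root[-1]
    events.append((root, "all bridges", "configuration BPDU, TC=1"))
    return events

# How long the root keeps TC=1 in its configuration BPDUs.
tc_duration = MAX_AGE + FORWARD_DELAY

events = propagate_tcn(["S4", "S2", "S1"])
for sender, receiver, message in events:
    print(f"{sender} -> {receiver}: {message}")
```

The acknowledgment at every hop is what lets each downstream device stop repeating its TCN BPDU without waiting for the root to react.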

Root bridge failure:
When S1 becomes faulty, S2 and S3 no longer receive BPDUs from the root bridge. S2 and S3 detect the root bridge failure only after a Max Age period, and then determine the new root bridge, root port, and designated port. The topology convergence period is 50 seconds (the BPDU aging period plus twice the Forward Delay period).

Link failure:
When the link between S3 and S1 fails, S3 detects the event immediately. The blocked port on S3 immediately enters the Listening state and sends a configuration BPDU with itself as the root. After S2 receives this lower-priority BPDU from S3, S2 sends a configuration BPDU with S1 as the root. The port on S3 connected to S2 therefore becomes the root port, and the port on S2 connected to S3 becomes the designated port. The S3 port takes 30 seconds to move from Listening through Learning to Forwarding.

When a link fails or is added, the network recovers after about 30 seconds.
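The two convergence figures follow directly from the default timers. A minimal Python sketch of the arithmetic, with hypothetical function names:

```python
MAX_AGE = 20        # seconds, BPDU aging period
FORWARD_DELAY = 15  # seconds

def root_failure_convergence(max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Wait for the stored BPDU to age out, then Listening and Learning."""
    return max_age + 2 * forward_delay

def link_failure_convergence(forward_delay=FORWARD_DELAY):
    """The failure is detected at once, so only Listening and Learning remain."""
    return 2 * forward_delay

print(root_failure_convergence())  # 50
print(link_failure_convergence())  # 30
```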

STP Limitation:
Port states and port roles are not distinguished in a fine-grained manner. For example, ports in both the Listening and Blocking states neither forward user traffic nor learn MAC addresses.
The STP algorithm determines topology changes only after timers expire, which slows down network convergence.
The STP algorithm requires a stable network topology: the root bridge sends configuration BPDUs, and the other devices process them so that the configuration BPDUs are advertised to the entire network.



RSTP has all the functions of STP, and RSTP-capable and STP-capable network bridges can work together.

RSTP defines four port roles: root port, designated port, alternate port, and backup port.
The functions of the root port and designated port are the same as those defined in STP. The alternate port and backup port are described as follows.
From the perspective of configuration BPDU transmission:
An alternate port is blocked after learning a configuration BPDU with a higher priority sent by another bridge.
A backup port is blocked after learning a configuration BPDU with a higher priority sent by its own bridge.
From the perspective of user traffic:
An alternate port backs up the root port and provides an alternate path from the designated bridge to the root bridge.
A backup port backs up the designated port and provides an alternate path from the root bridge to a network segment.
After all RSTP-capable ports are assigned roles, topology convergence is completed.
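The distinction between the two blocked roles comes down to who sent the better BPDU. A minimal Python sketch, assuming bridge IDs are plain integers and using the hypothetical name `blocked_port_role`:

```python
def blocked_port_role(own_bridge_id, better_bpdu_bridge_id):
    """Classify a port that was blocked by a higher-priority BPDU.

    If the better BPDU was sent by this same bridge (through another of
    its ports on the shared segment), the port is a backup port.
    Otherwise the better BPDU came from another bridge and the port is
    an alternate port.
    """
    if better_bpdu_bridge_id == own_bridge_id:
        return "backup"
    return "alternate"

print(blocked_port_role(own_bridge_id=2, better_bpdu_bridge_id=1))  # alternate
print(blocked_port_role(own_bridge_id=2, better_bpdu_bridge_id=2))  # backup
```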

Port states are simplified from five types to three types. Based on whether a port forwards user traffic and learns MAC addresses, the port is in one of the following states:
If a port neither forwards user traffic nor learns MAC addresses, the port is in the Discarding state.
If a port does not forward user traffic but learns MAC addresses, the port is in the Learning state.
If a port forwards user traffic and learns MAC addresses, the port is in the Forwarding state.
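The three-state rule can be expressed as a small lookup. The sketch below uses hypothetical names; the mapping from the five STP states (Disabled, Blocking, Listening, Learning, Forwarding) follows from applying the same two questions to each of them.

```python
def rstp_state(forwards_traffic, learns_macs):
    """Derive the RSTP port state from the two behaviors named in the text."""
    if forwards_traffic:
        if not learns_macs:
            raise ValueError("a forwarding port always learns MAC addresses")
        return "Forwarding"
    return "Learning" if learns_macs else "Discarding"

# The five STP states collapsed onto the three RSTP states.
STP_TO_RSTP = {
    "Disabled": "Discarding",
    "Blocking": "Discarding",
    "Listening": "Discarding",
    "Learning": "Learning",
    "Forwarding": "Forwarding",
}

print(rstp_state(False, False))  # Discarding
print(rstp_state(False, True))   # Learning
print(rstp_state(True, True))    # Forwarding
```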


RSTP Calculation
After the roles of ports in the Discarding state are determined:
The root port and designated port enter the Learning state after a Forward Delay period. A port in the Learning state learns MAC addresses and enters the Forwarding state after another Forward Delay period. RSTP accelerates this process using a separate mechanism.
An alternate port remains in the Discarding state.

RSTP defines configuration BPDUs differently. Port roles are described using the Flags field defined in STP. Compared with STP, RSTP slightly redefines the format of configuration BPDUs.
The value of the Type field is no longer 0 but 2. An STP-capable device therefore always discards the configuration BPDUs sent by an RSTP-capable device.
The six middle bits of the Flags field, which are reserved in STP, are now used.
Such a configuration BPDU is called an RST BPDU.

Flags field in an RST BPDU:
Bit 0 indicates the TC bit, which is the same as that in STP.
Bit 1 indicates the Proposal flag, which marks the BPDU as the Proposal packet in the fast convergence mechanism.
Bit 2 and bit 3 indicate the port role. Read as a two-bit number with bit 3 as the high-order bit, the value 00 indicates an unknown port; the value 01 indicates the alternate or backup port; the value 10 indicates the root port; the value 11 indicates the designated port.
Bit 4 indicates that the port is in the Learning state.
Bit 5 indicates that the port is in the Forwarding state.
Bit 6 indicates the Agreement flag, which marks the BPDU as the Agreement packet in the fast convergence mechanism.
Bit 7 indicates the TCA bit, which is the same as that in STP.
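The bit layout above can be decoded with a few masks. This Python sketch assumes the IEEE 802.1D-2004 role encoding, with bits 2 and 3 read as a two-bit number whose high-order bit is bit 3 (1 = alternate or backup, 2 = root, 3 = designated); the function name `decode_rst_flags` is hypothetical.

```python
# Port role values, read from bits 3..2 of the Flags octet.
ROLES = {0: "unknown", 1: "alternate/backup", 2: "root", 3: "designated"}

def decode_rst_flags(flags: int) -> dict:
    """Split the one-byte Flags field of an RST BPDU into named fields."""
    return {
        "tc": bool(flags & 0x01),          # bit 0
        "proposal": bool(flags & 0x02),    # bit 1
        "role": ROLES[(flags >> 2) & 0x03],  # bits 2 and 3
        "learning": bool(flags & 0x10),    # bit 4
        "forwarding": bool(flags & 0x20),  # bit 5
        "agreement": bool(flags & 0x40),   # bit 6
        "tca": bool(flags & 0x80),         # bit 7
    }

# 0x0E: a designated port that is still discarding and sends a Proposal.
print(decode_rst_flags(0x0E))
# 0x78: a root port that is learning and forwarding and sends an Agreement.
print(decode_rst_flags(0x78))
```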

Configuration BPDUs are processed in a different manner.
Transmission of configuration BPDUs after the topology becomes stable
In STP, after the topology becomes stable, the root bridge sends configuration BPDUs at an interval set by the Hello timer. A non-root bridge does not send configuration BPDUs until it receives configuration BPDUs sent from the upstream device. This makes the STP calculation complicated and time-consuming.
In RSTP, after the topology becomes stable, a non-root bridge sends configuration BPDUs at an interval set by the Hello timer, regardless of whether it has received the configuration BPDUs sent from the root bridge. Each device performs this operation independently.
Shorter timeout interval of BPDUs
In STP, a device has to wait for the Max Age period before determining a negotiation failure. In RSTP, if a port does not receive configuration BPDUs sent from the upstream device for three consecutive intervals set by the Hello timer, the negotiation between the local device and its peer fails.
Processing of RST BPDUs with lower priority
In RSTP, when a port receives an RST BPDU from the upstream designated bridge, the port compares the received RST BPDU with its own RST BPDU. If its own RST BPDU has a higher priority than the received one, the port discards the received RST BPDU and immediately responds to the upstream device with its own RST BPDU. After receiving the RST BPDU, the upstream device updates its own RST BPDU based on the corresponding fields in the received RST BPDU. In this manner, RSTP processes BPDUs with lower priority more rapidly, independent of any timer that is used in STP.
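The difference in timeout can be made concrete with the default timers. A minimal Python sketch, assuming a Hello time of 2 seconds and a Max Age of 20 seconds; the function name `bpdu_timeout` is hypothetical.

```python
def bpdu_timeout(protocol, hello=2, max_age=20):
    """Seconds without BPDUs from upstream before the neighbor is declared lost."""
    if protocol == "stp":
        return max_age       # wait for the stored BPDU to age out
    if protocol == "rstp":
        return 3 * hello     # three consecutive missed Hello intervals
    raise ValueError(f"unknown protocol: {protocol}")

print(bpdu_timeout("stp"))   # 20
print(bpdu_timeout("rstp"))  # 6
```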


STP convergence
To eliminate loops, STP uses timers to complete convergence. By default, a port takes 30 seconds from the time it is enabled to the time it enters the Forwarding state. Shortening the timer values may make the network unstable.

RSTP fast convergence
Edge port
In RSTP, a designated port on the network edge is called an edge port. An edge port directly connects to a terminal and does not connect to any other switching device. An edge port does not receive configuration BPDUs, so it does not participate in the RSTP calculation. It can directly change from the Disabled state to the Forwarding state without any delay, just like an STP-incapable port. If an edge port receives bogus configuration BPDUs from attackers, it becomes a common STP port. STP recalculation is then performed, causing network flapping.
Fast switching of the root port
If the root port fails, the optimal alternate port becomes the root port and enters the Forwarding state. This is safe because the network segment connected to the alternate port must have a designated port with a path to the root bridge.
Proposal/Agreement mechanism
When a port is selected as a designated port, in STP the port does not enter the Forwarding state until the Forward Delay periods expire. In RSTP, the port first enters the Discarding state, and the Proposal/Agreement mechanism then allows the port to enter the Forwarding state immediately. The Proposal/Agreement mechanism can be applied only on P2P links in full-duplex mode.
P/A is short for Proposal/Agreement.


Edge port
An edge port directly connects to a terminal. When the network topology changes, loops do not occur on the edge port. The edge port can therefore directly enter the Forwarding state without waiting for two Forward Delay periods.
An edge port does not receive configuration BPDUs, so it does not participate in the RSTP calculation. It can directly change from the Disabled state to the Forwarding state without any delay, just like an STP-incapable port. If an edge port receives bogus configuration BPDUs from attackers, it becomes a common STP port. STP recalculation is then performed, causing network flapping.

Fast switching of the root port
In RSTP, an alternate port is the backup of the root port. When the root port of a network bridge enters the Discarding state, the optimal alternate port becomes the new root port and enters the Forwarding state. This is possible because the network segment connected to the alternate port must have a designated port that can reach the root bridge.



P/A mechanism
The Proposal/Agreement (P/A) mechanism enables a designated port to rapidly enter the Forwarding state.
The P/A mechanism requires that the link between two switching devices be P2P and work in full-duplex mode. When P/A negotiation fails, the designated port enters the Forwarding state only after two Forward Delay periods, and the negotiation process is the same as that in STP.
After a new link is established, the negotiation process of the P/A mechanism is as follows:

p0 and p1 become designated ports and send RST BPDUs.
After receiving an RST BPDU with a higher priority, p1 on S2 determines that it will become a root port rather than a designated port. p1 then stops sending RST BPDUs.
p0 on S1 enters the Discarding state and sends RST BPDUs with the Proposal field set to 1.
After receiving an RST BPDU with the Proposal field set to 1, S2 sets the sync variable to 1 for all its ports.
As p2 has been blocked, its status remains unchanged; p4 is an edge port and does not participate in calculation. Therefore, only the non-edge designated port p3 needs to be blocked.
After p3 enters the Discarding state, the synced variables of p2, p3, and p4 are set to 1. The synced variable of the root port p1 is then set to 1, and p1 sends an RST BPDU with the Agreement field set to 1 to S1. Except for the Agreement field, which is set to 1, and the Proposal field, which is set to 0, this RST BPDU is the same as the one received.
After receiving this RST BPDU, S1 identifies it as a response to the Proposal packet that it just sent, and p0 immediately enters the Forwarding state.
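The sync step on S2 can be sketched as follows. This is a simplified model under stated assumptions: the `Port` class and `handle_proposal` function are hypothetical names, the port names p1 to p4 follow the steps above, and moving the root port to Forwarding when the Agreement is sent is an assumption of this sketch.

```python
from dataclasses import dataclass

@dataclass
class Port:
    name: str
    role: str            # "root", "designated", or "alternate"
    state: str           # "Discarding" or "Forwarding"
    edge: bool = False
    synced: bool = False

def handle_proposal(ports):
    """Run the sync step on the bridge that received a Proposal.

    Returns True when every port is synced, meaning an Agreement can
    be sent back through the root port.
    """
    root = next(p for p in ports if p.role == "root")
    for p in ports:
        if p is root:
            continue
        # Only a non-edge designated port that is forwarding must be blocked.
        if p.role == "designated" and not p.edge:
            p.state = "Discarding"
        p.synced = True
    root.synced = all(p.synced for p in ports if p is not root)
    if root.synced:
        root.state = "Forwarding"  # assumption: root port forwards on agreement
    return root.synced

s2_ports = [
    Port("p1", "root", "Discarding"),
    Port("p2", "alternate", "Discarding"),
    Port("p3", "designated", "Forwarding"),
    Port("p4", "designated", "Forwarding", edge=True),
]
agreement = handle_proposal(s2_ports)
# On receiving the Agreement, S1 moves p0 straight to Forwarding.
p0_state = "Forwarding" if agreement else "Discarding"
print(agreement, [(p.name, p.state) for p in s2_ports], p0_state)
```

Blocking p3 before agreeing is what makes the fast transition safe: the loop-free cut simply moves one hop downstream, where the same handshake can repeat.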


The P/A negotiation with downstream devices proceeds as follows.
When a link between S1 and S2 is added, the P/A mechanism works as follows:
S1 sends an RST BPDU with the Proposal field set to 1 to S2.
After receiving the RST BPDU, S2 determines that E2 is the root port. S2 blocks its designated ports E1 and E3, sets the root port to the Forwarding state, and sends an Agreement packet to S1.
After S1 receives the Agreement packet, its designated port E1 immediately enters the Forwarding state.
The non-edge designated ports E1 and E3 on S2 send Proposal packets.
After S3 receives the Proposal packet from S2, S3 determines that E1 is the root port and starts synchronization. Because the downstream port of S3 is an edge port, S3 directly sends an Agreement packet.
After S2 receives the Agreement packet from S3, its port E1 immediately enters the Forwarding state.
The process on S4 is similar to that on S3.
After S2 receives the Agreement packet from S4, its port E3 immediately enters the Forwarding state.
The P/A process is completed.

In RSTP, the topology is considered changed when a non-edge port changes to the Forwarding state.
After a switching device detects the topology change (TC), it performs the following operations:
Start a TC While timer for every non-edge port. The TC While timer value is twice the Hello timer value. All MAC address entries learned by the ports whose status changes are cleared before the timer expires. These ports send RST BPDUs with the TC field set to 1. Once the TC While timer expires, the ports stop sending these RST BPDUs.
After another switching device receives the RST BPDU, it clears the MAC addresses learned by all ports except the one that received the RST BPDU. The switching device then starts a TC While timer for all its non-edge designated ports and its root port, and repeats the process.
In this manner, RST BPDUs flood the network.
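The MAC flush performed by a device that receives a TC-flagged RST BPDU can be sketched with a dictionary. The table layout (MAC address mapped to the port it was learned on), the port names, and the function name `on_tc_bpdu` are assumptions made for this illustration; a Hello time of 2 seconds is assumed as well.

```python
HELLO_TIME = 2               # seconds (assumed default)
TC_WHILE = 2 * HELLO_TIME    # seconds during which TC-flagged BPDUs are sent

def on_tc_bpdu(mac_table, receiving_port):
    """Keep only the entries learned on the port that received the BPDU.

    Entries learned on every other port are cleared, as described above.
    """
    return {mac: port for mac, port in mac_table.items()
            if port == receiving_port}

mac_table = {
    "00e0-fc00-0001": "E1",
    "00e0-fc00-0002": "E2",
    "00e0-fc00-0003": "E3",
    "00e0-fc00-0004": "E1",
}
remaining = on_tc_bpdu(mac_table, receiving_port="E1")
print(TC_WHILE, remaining)
```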

When a port switches from RSTP to STP, the port loses RSTP features such as fast convergence.
On a network where both STP-capable and RSTP-capable devices are deployed, STP-capable devices ignore RST BPDUs. If a port on an RSTP-capable device receives a configuration BPDU from an STP-capable device, the port switches to the STP mode after two intervals specified by the Hello timer and starts to send configuration BPDUs. In this manner, RSTP and STP are interoperable.
After STP-capable devices are removed, Huawei RSTP-capable datacom devices can switch back to the RSTP mode.

RSTP, an enhancement to STP, implements fast convergence of the network topology. RSTP and STP share one defect: all VLANs on a LAN use one spanning tree, so VLAN-based load balancing cannot be performed. Once a link is blocked, it no longer transmits traffic, which wastes bandwidth and can prevent packets of certain VLANs from being forwarded.

Topology Description
STP or RSTP is deployed on the LAN. The broken line shows the spanning tree; S6 is the root switching device; the links between S1 and S4 and between S2 and S5 are blocked.
VLAN packets are transmitted by using only the links marked with "VLAN2" or "VLAN3." PC2 and PC3 belong to VLAN 2 but they cannot communicate with each other because the link between S2 and S5 is blocked and the link between S3 and S6 rejects packets from VLAN 2.
MSTP can be used to address this issue. MSTP implements fast convergence and provides multiple paths to load balance VLAN traffic.

A Multiple Spanning Tree (MST) region contains multiple switching devices and the network segments between them. The switching devices of one MST region have the following identical characteristics:
MSTP-enabled
Region name
VLAN-MSTI mappings
MSTP revision level

An instance is a collection of VLANs. Binding multiple VLANs to an instance saves communication costs and reduces resource usage. The topology of each MSTI is calculated independently, and traffic can be balanced among MSTIs. Multiple VLANs that have the same topology can be mapped to one instance. The forwarding status of the VLANs for a port is determined by the port status in the MSTI.
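How a VLAN's forwarding status follows from the instance it is bound to can be sketched with two lookups. All names and values here are hypothetical: the mapping table, the port name E1, and the per-instance states are made up for the illustration, and VLANs without an explicit mapping are assumed to fall back to instance 0.

```python
# Hypothetical VLAN-to-MSTI mapping; unmapped VLANs belong to instance 0.
vlan_to_msti = {2: 2, 3: 2, 4: 4}

# Hypothetical per-instance state of one port, keyed by (port, MSTI ID).
port_state = {
    ("E1", 0): "Forwarding",
    ("E1", 2): "Discarding",
    ("E1", 4): "Forwarding",
}

def msti_for_vlan(vlan):
    """Return the instance a VLAN is bound to (0 if it has no mapping)."""
    return vlan_to_msti.get(vlan, 0)

def vlan_forwards(port, vlan):
    """A VLAN is forwarded on a port only if the port forwards in its MSTI."""
    return port_state[(port, msti_for_vlan(vlan))] == "Forwarding"

for vlan in (2, 3, 4, 10):
    print(vlan, msti_for_vlan(vlan), vlan_forwards("E1", vlan))
```

Because VLANs 2 and 3 share an instance, they share a topology, while VLAN 4 can take a different path through the same port. This is the basis of the load balancing described above.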

The Common and Internal Spanning Tree (CIST), calculated using STP or RSTP, connects all switching devices on a switching network.
The CIST root is the network bridge with the highest priority on the entire network, that is, the root bridge of the CIST.
In the preceding topology, the lines in red in MSTIs and the lines in blue between MSTIs form a CIST. The root bridge of the CIST is S1 in MST region 1.

A Common Spanning Tree (CST) connects all the MST regions on a switching network.
The CST is calculated by all nodes using STP or RSTP.
In the preceding topology, the lines in blue form a CST. The CST root is MST region 1.

An Internal Spanning Tree (IST) resides within an MST region.
Each spanning tree in an MST region has an MSTI ID. An IST is a special MSTI with the MSTI ID of 0, called MSTI 0. The VLANs that do not map to other MSTIs map to MSTI 0.
An IST is a segment of the CIST in an MST region.
In the preceding topology, the lines in red form an IST.

The master bridge is the IST master, which is the switching device closest to the CIST root in a region.
If the CIST root is in an MST region, the CIST root is the master bridge of the region.
In the preceding topology, S1, S4, and S7 are master bridges.

A Single Spanning Tree (SST) is formed in either of the following situations:
A switching device running STP or RSTP belongs to only one spanning tree.
An MST region has only one switching device.
There is no SST in the preceding topology.


MSTI
An MST region can contain multiple spanning trees, each called an MSTI. An MSTI regional root is the root of the MSTI. Each MSTI has its own regional root.
MSTIs are independent of each other. An MSTI can map to one or more VLANs, but one VLAN can map to only one MSTI.
Each MSTI has an MSTI ID. MSTI IDs start from 1 to distinguish MSTIs from the IST (MSTI 0).
In the preceding topology, VLAN 2 maps to MSTI 2 and VLAN 4 to MSTI 4.

MSTI regional root
The MSTI regional root is the network bridge with the highest priority in each MSTI. You can specify different roots in different MSTIs.
In the preceding topology, assuming that S9 has the highest priority in MSTI 2, S9 is the regional root in MSTI 2. Assuming that S8 has the highest priority in MSTI 4, S8 is the regional root in MSTI 4.

Compared with RSTP, MSTP has two additional port types. MSTP ports include the root port, designated port, alternate port, backup port, edge port, master port, and regional edge port.
Master port
A master port is on the shortest path connecting an MST region to the CIST root.
BPDUs of an MST region are sent to the CIST root through the master port.
Master ports are special regional edge ports, functioning as root ports in the CIST and master ports in instances.
In the preceding topology, the port on S7 connected to MST region 1 is the master port.
Regional edge port
A port connecting a network bridge in an MST region to another MST region or to an STP-enabled or RSTP-enabled network bridge is a regional edge port.
In the preceding topology, the port on S8 connected to MST region 2 is the regional edge port.

Network bridges may have different roles in different MSTIs, so ports other than the master port may also have different roles in different MSTIs. The master port retains its role in all MSTIs.

Currently, there are two MST BPDU formats:
dot1s: BPDU format defined in IEEE 802.1s
legacy: private BPDU format
Using the stp compliance command, you can configure a port on a Huawei datacom device to automatically adjust the MST BPDU format.

With the exception of MSTP-specific fields, the fields in an intra-region or inter-region MST BPDU are the same as those in an RST BPDU.
The Root ID field of an RST BPDU carries the CIST root ID in an MST BPDU.
The EPC field in an MST BPDU indicates the total path cost from the MST region where the network bridge sending the BPDU resides to the MST region where the CIST root resides.
The Bridge ID field in an MST BPDU indicates the regional root ID in the CIST.
The Port ID field in an MST BPDU indicates the ID of the designated port in the CIST.

MSTP-specific fields:
Version 3 Length: indicates the BPDUv3 length, which is used to check received MST BPDUs.
MST Configuration Identifier: indicates the MST configuration identifier, which has four fields. This field identifies the MST region where a network bridge is located. Neighboring switches are in the same MST region only when the following fields on the switches are the same:
Format Selector: indicates the 802.1s-defined protocol selector. It has a fixed value of 0.
Name: indicates the configuration name, that is, the MST region name of a switch. The value has 32 bytes. Each switch has an MST region name configured. The default value is the switch's MAC address.
Config Digest: indicates the configuration digest, which has 16 bytes. Switches in an MST region should maintain the same mapping between VLANs and MSTIs. However, the MST configuration table is too large (8192 bytes) to be transmitted easily between switches. This field is the digest calculated from the MST configuration table using the MD5 algorithm.
Revision Level: indicates the revision level of an MST region, which has two bytes. The default value is all 0s. Because the Config Digest field is only a digest of the MST configuration table, there is a low probability that two different MST configuration tables produce the same digest. In this case, switches in different MST regions may be incorrectly considered to be in the same MST region. It is recommended that different MST regions use different revision levels to prevent this problem.

CIST Internal Root Path Cost: indicates the total path cost from the local port to the IST master. This value is calculated based on link bandwidth.
CIST Bridge Identifier: indicates the ID of the designated switching device on the CIST.
CIST Remaining Hops: indicates the remaining hops of a BPDU in the CIST. This field is used to limit the MST scale. A BPDU starts with the maximum hop count at the CIST regional root. The hop count decreases by 1 every time the BPDU passes a network bridge. A network bridge discards a BPDU whose hop count is 0.
MSTI Configuration Messages (may be absent): indicates an MSTI configuration message.
MSTI Flag: has eight bits. Bits 1 to 7 are the same as those in RSTP. Bit 8 indicates whether the network bridge is the master bridge, and replaces the TCA bit in RSTP.
MSTI region Root ID: indicates the regional root ID of the MSTI.
MSTI IRPC: indicates the path cost from the network bridge sending the BPDU to the MSTI regional root.
MSTI Bridge Priority: indicates the priority of the network bridge that sends the BPDU.
MSTI Port Priority: indicates the priority of the port that sends the BPDU.
MSTI Remaining Hops: indicates the remaining number of hops in an MSTI.
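The idea behind the Config Digest field described above can be illustrated in Python. This sketch builds an 8192-byte table (4096 two-byte entries, one per VLAN ID) and reduces it to a 16-byte MD5-based digest. It is an illustration only: the key below is a placeholder invented for this example, not the fixed key defined by the standard, so the digests it produces do not match those of real switches. The function names are hypothetical.

```python
import hashlib
import hmac
import struct

# Placeholder key for illustration only; NOT the key defined by the standard.
ILLUSTRATIVE_KEY = b"example-key-not-from-the-standard"

def mst_config_table(vlan_to_msti):
    """Build the 8192-byte table: one 2-byte MSTI ID for each of 4096 VLAN IDs."""
    entries = [0] * 4096
    for vlan, msti in vlan_to_msti.items():
        if not 1 <= vlan <= 4094:
            raise ValueError(f"invalid VLAN ID: {vlan}")
        entries[vlan] = msti
    return struct.pack(">4096H", *entries)

def config_digest(vlan_to_msti, key=ILLUSTRATIVE_KEY):
    """Return a 16-byte keyed MD5 digest of the configuration table."""
    return hmac.new(key, mst_config_table(vlan_to_msti), hashlib.md5).digest()

table = mst_config_table({2: 2, 4: 4})
digest_a = config_digest({2: 2, 4: 4})
digest_b = config_digest({2: 2, 4: 3})
print(len(table), len(digest_a), digest_a == digest_b)
```

Two switches only need to compare 16 bytes to decide whether their VLAN-MSTI mappings agree, which is why the table itself never has to be transmitted.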


MSTP Topology Calculation
In MSTP, the entire Layer 2 network is divided into multiple MST regions, which are interconnected by a single CST. In an MST region, multiple spanning trees are calculated, each of which is called an MSTI. Among these MSTIs, MSTI 0 is also known as the internal spanning tree (IST). Like STP, MSTP uses configuration BPDUs to calculate spanning trees, but the configuration BPDUs are MSTP-specific.

Vectors
Root switching device ID: identifies the root switching device in the CIST. The root switching device ID consists of the priority value (16 bits) and MAC address (48 bits). The priority value is the priority of MSTI 0.
External root path cost (ERPC): indicates the external root path cost from the CIST regional root to the CIST root. ERPCs saved on all switching devices in an MST region are the same. If the CIST root is in an MST region, ERPCs saved on all switching devices in the MST region are 0s.
Regional root ID: identifies the MSTI regional root. The regional root ID consists of the priority value (16 bits) and MAC address (48 bits).
Internal root path cost (IRPC): indicates the path cost from the local bridge to the regional root.
Designated switching device ID: indicates the network bridge that sends the BPDU.
Designated port ID: identifies the port on the designated switching device connected to the root port on the local device. The port ID consists of the priority value (4 bits) and port number (12 bits). The priority value must be a multiple of 16.
Receiving port ID: identifies the port that receives the BPDU. The port ID consists of the priority value (4 bits) and port number (12 bits). The priority value must be a multiple of 16.

.h
If the priority of a vector carried in the configuration message of a
BPDU received by a port is higher than the priority of the vector in the

g
configuration message saved on the port, the port replaces the saved

in
configuration message with the received one. In addition, the port

rn
updates the global configuration message saved on the device. If the
priority of a vector carried in the configuration message of a BPDU

ea
received on a port is equal to or lower than the priority of the vector in
the configuration message saved on the port, the port discards the
BPDU.
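
The comparison rule above can be sketched with Python tuples, which compare field by field in the order listed. This is a minimal illustration, not device behavior: all names are hypothetical, the MAC addresses are invented, and it relies on the spanning tree convention that a numerically smaller value means a higher priority.

```python
from typing import NamedTuple, Tuple

BridgeId = Tuple[int, int]  # (priority, MAC address as an integer)


def bridge_id(priority: int, mac: str) -> BridgeId:
    """Build a comparable bridge ID from a priority and a MAC string."""
    return (priority, int(mac.replace("-", "").replace(":", ""), 16))


def port_id(priority: int, number: int) -> int:
    """Pack a 4-bit priority (given as a multiple of 16) and a 12-bit port number."""
    if priority % 16 or not 0 <= priority <= 240:
        raise ValueError("port priority must be a multiple of 16 in 0..240")
    if not 0 <= number < 4096:
        raise ValueError("port number must fit in 12 bits")
    return ((priority >> 4) << 12) | number


class CistVector(NamedTuple):
    root_id: BridgeId
    erpc: int
    regional_root_id: BridgeId
    irpc: int
    designated_bridge_id: BridgeId
    designated_port_id: int
    receiving_port_id: int


def on_bpdu(saved: CistVector, received: CistVector) -> CistVector:
    """Keep the received vector only if it is strictly better (smaller)."""
    return received if received < saved else saved


# Hypothetical bridges: S1 has priority 0, S4 has priority 32768.
s1 = bridge_id(0, "00-00-00-00-00-01")
s4 = bridge_id(32768, "00-00-00-00-00-04")
saved = CistVector(s4, 0, s4, 0, s4, port_id(128, 1), port_id(128, 1))
better = CistVector(s1, 20000, s4, 0, s4, port_id(128, 2), port_id(128, 1))
print(on_bpdu(saved, better) is better)
```

Because the root ID is the first field, a BPDU that advertises a better root wins even when its path cost is larger, which matches the field order of the vector.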

CST Calculation
CST and IST calculation is similar to the calculation in RSTP. During CST calculation, an MST region is considered as a network bridge and the ID of the network bridge is the IST regional root ID.
CIST uses the following vectors: {root switching device ID, ERPC, regional root ID, IRPC, designated switching device ID, designated port ID, receiving port ID}. CST uses the following vectors: {CIST root, ERPC, regional root ID, designated port ID, receiving port ID}.
Topology description:
    Assume that S1, S4, and S7 are regional roots in Region1, Region2, and Region3 respectively. S1 has the highest priority, S4 has the lowest priority, and the cost of each path is the same.
    Each MST region is considered as a network bridge, and the ID of the network bridge is the regional root ID.
    Each MST region sends a BPDU with itself as the CIST root and an external cost of 0 to other MST regions.
    Through RSTP calculation, S1 is the CIST root.
    Through ERPC comparison, the port of each regional root connected to Region1 is the master port.
    Through comparison of priorities in regional root IDs, the regional edge port is determined.



IST Calculation
CST and IST calculation is similar to the calculation in RSTP. MSTP calculates an IST for each MST region, and computes a CST to interconnect MST regions. The CST and ISTs constitute a CIST for the entire network.
CIST uses the following vectors: {root switching device ID, ERPC, regional root ID, IRPC, designated switching device ID, designated port ID, receiving port ID}. IST uses the following vectors: {CIST root, IRPC, designated bridge ID, designated port ID, receiving port ID}.
Topology description:
    After CST calculation is complete, S1, S4, and S7 are regional roots in Region1, Region2, and Region3 respectively. In this case, the regional root is the network bridge closest to the CIST root, not necessarily the network bridge with the highest priority.
    The role of a port on each network bridge is determined by using the regional root as the root bridge and by comparing IRPCs, and then the IST is obtained.
    Network bridges in an MST region compare IRPCs to determine the IST root port.
    Port roles in the IST are determined based on priorities in BPDUs.

Region1 Calculation
In an MST region, MSTP calculates an MSTI for each VLAN based on mappings between VLANs and MSTIs. Each MSTI is calculated independently. The calculation process is similar to the process for STP to calculate a spanning tree.
Topology description:
    In Region1, VLAN 2 maps to MSTI 2, VLAN 4 to MSTI 4, and other VLANs to MSTI 0.
    Different priorities are specified for network bridges in different MSTIs. Assume that S2 is the root bridge in MSTI 2 and S3 is the root bridge in MSTI 4.
    In MSTI 2, S2, S1, and S3 are in descending order of priority. Through calculation, the port on S3 connected to S1 is blocked.
    In MSTI 4, S3, S1, and S2 are in descending order of priority. Through calculation, the port on S2 connected to S1 is blocked.
MSTIs have the following characteristics:
    The spanning tree is calculated independently for each MSTI, and spanning trees of MSTIs are independent of each other.
    MSTP calculates the spanning tree for an MSTI in a manner similar to STP.
    Spanning trees of MSTIs can have different roots and topologies.
    Each MSTI sends BPDUs in its spanning tree.
    The topology of each MSTI is configured by using commands.
    A port can be configured with different parameters for different MSTIs.
    A port can play different roles or have different statuses in different MSTIs.


Region2 Calculation
Topology description:
    In Region2, VLAN 2 maps to MSTI 2, VLAN 3 to MSTI 3, and other VLANs to MSTI 0.
    Different priorities are specified for network bridges in different MSTIs. Assume that S5 is the root bridge in MSTI 2 and S6 is the root bridge in MSTI 3.
    In MSTI 2, S5, S4, and S6 are in descending order of priority. Through calculation, the port on S6 connected to S4 is blocked.
    In MSTI 3, S6, S4, and S5 are in descending order of priority. Through calculation, the port on S5 connected to S4 is blocked.

Region3 Calculation
Topology description:
    In Region3, VLAN 2 maps to MSTI 2, VLAN 4 to MSTI 4, and other VLANs to MSTI 0.
    Different priorities are specified for network bridges in different MSTIs. Assume that S9 is the root bridge in MSTI 2 and S8 is the root bridge in MSTI 4.
    In MSTI 2, S9, S10, S8, and S7 are in descending order of priority. Through calculation, the port on S7 connected to S8 and the port on S8 connected to S10 are blocked.
    In MSTI 4, S8, S7, S10, and S9 are in descending order of priority. Through calculation, the port on S9 connected to S7 and the port on S10 connected to S7 are blocked.

MSTI Calculation
After CIST and MSTI calculations are complete, the mapping between VLANs and MSTIs in each MST region is independent of the mappings in other regions.
On an MSTP-aware network, a VLAN packet is forwarded along the following paths:
    Within an MST region: the MSTI (including the IST) to which the VLAN is mapped
    Between MST regions: the CST



Interoperability between MSTP and RSTP


An RSTP or STP-enabled network bridge treats an MST region as a single RSTP-enabled bridge whose bridge ID is the regional root ID.
When an RSTP or STP-enabled network bridge receives an MST BPDU, it reads the CIST root, ERPC, regional root ID, and designated port ID in the MST BPDU as the root ID (RID), root path cost (RPC), bridge ID (BID), and port ID (PID), respectively.
When an MSTP-enabled network bridge receives an STP or RST BPDU, it reads the RID, RPC, BID, and PID as the CIST root, ERPC, regional root ID, and designated port ID, respectively. The BID is used as both the regional root ID and the designated switch ID, and the IRPC is 0.



In MSTP, the P/A mechanism works as follows:


    The upstream device sends a Proposal packet to the downstream device, requesting fast switching. After receiving the Proposal packet, the downstream device sets its port connecting to the upstream device to the root port and blocks all non-edge ports.
    The upstream device continues to send an Agreement packet. After receiving the Agreement packet, the root port enters the Forwarding state.
    The downstream device replies with an Agreement packet. After receiving the Agreement packet, the upstream device sets its port connecting to the downstream device to the designated port, and the port enters the Forwarding state.

By default, Huawei datacom devices use the enhanced P/A mechanism. To enable a Huawei datacom device to communicate with third-party devices that use the ordinary P/A mechanism, run the stp no-agreement-check command to configure the ordinary P/A mechanism on the Huawei datacom device.



Case description
S1, S2, and S3 must be in descending order of priority to meet requirements 2 and 3.

Command usage
The stp mode command sets the working mode of a spanning tree protocol on a switching device.
The stp root command configures a switching device as the root bridge or secondary root bridge of a spanning tree.
The stp priority command sets the priority of the switching device in a spanning tree.
The stp cost command sets the path cost of a port in a spanning tree.

Parameters
stp mode { mstp | rstp | stp }
    mstp: indicates the MSTP mode.
    rstp: indicates the RSTP mode.
    stp: indicates the STP mode.
stp [ instance instance-id ] root { primary | secondary }
    instance instance-id: specifies the ID of a spanning tree instance. It needs to be specified in MSTP.
    primary: indicates that the switching device functions as the primary root bridge of a spanning tree.
    secondary: indicates that the switching device functions as the secondary root bridge of a spanning tree.
stp [ instance instance-id ] priority priority
    priority priority: specifies the priority of the switching device in a spanning tree. The priority ranges from 0 to 61440. The value is a multiple of 4096, such as 0, 4096, and 8192. The default is 32768.
stp [ instance instance-id ] cost cost
    cost: specifies the path cost of a port. When the path cost of a port changes, spanning tree recalculation will be performed.
Precautions
On an STP/RSTP/MSTP network, each spanning tree has only one root bridge, which is responsible for sending BPDUs and connecting devices on the entire network. Because the root bridge is so important, a high-performance switching device at a high network layer should be selected as the root bridge. Such a device may not have the highest priority, so you can run the stp root command to configure a switching device as the root bridge in a spanning tree.
A switching device in a spanning tree cannot function as both the primary and secondary root bridges.
After the stp root command is run to configure a switching device as the primary root bridge, the priority value of the switching device is 0 in the spanning tree and the priority cannot be modified.
After the stp root command is run to configure a switching device as the secondary root bridge, the priority value of the switching device is 4096 in the spanning tree and the priority cannot be modified.
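
The priority rules above (multiples of 4096 from 0 to 61440, default 32768, and the fixed values 0 and 4096 set by stp root) can be checked with a short sketch. It is a simplified model, not device code: the function names and MAC addresses are hypothetical, and the root election assumes the usual rule that the smallest priority wins, with the smallest MAC address as the tie-breaker.

```python
from typing import Dict, Optional, Tuple

ROOT_ROLE_PRIORITY = {"primary": 0, "secondary": 4096}


def validate_priority(priority: int) -> int:
    """Accept only 0..61440 in steps of 4096."""
    if not 0 <= priority <= 61440 or priority % 4096:
        raise ValueError("priority must be a multiple of 4096 in 0..61440")
    return priority


def effective_priority(configured: int = 32768,
                       root_role: Optional[str] = None) -> int:
    """Priority used in the election; a root role overrides the configured value."""
    if root_role is not None:
        return ROOT_ROLE_PRIORITY[root_role]
    return validate_priority(configured)


def elect_root(bridges: Dict[str, Tuple[int, str]]) -> str:
    """Pick the bridge with the smallest (priority, MAC) pair.

    MAC strings must share one format (same case and separators).
    """
    return min(bridges, key=lambda name: bridges[name])


# Hypothetical switches and MAC addresses.
bridges = {
    "S1": (effective_priority(root_role="primary"), "00e0-fc00-0003"),
    "S2": (effective_priority(root_role="secondary"), "00e0-fc00-0002"),
    "S3": (effective_priority(), "00e0-fc00-0001"),
}
print(elect_root(bridges))
```

In this example S3 has the smallest MAC address but keeps the default priority, so it loses the election to the primary root S1. If S1 fails, the secondary root S2 takes over.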

Case description
In the preceding topology:
    Requirement 1 involves interoperability between RSTP and STP.
    Requirement 2 involves the stp root command usage.
    Requirement 3 involves the edge port, BPDU filtering, and BPDU protection.



Command usage
The stp mcheck command configures a port to automatically switch from the STP mode back to the RSTP/MSTP mode.
The stp edged-port default command configures all ports on a switching device as edge ports.
The stp bpdu-filter default command configures all ports on a switching device as BPDU-filter ports.
The stp bpdu-protection command enables BPDU protection on a switching device.
The stp root-protection command enables root protection on a port.
Precautions
After the stp bpdu-filter default and stp edged-port default commands are run in the system view, none of the ports on the device will initiate any BPDUs or negotiate with the directly connected port on the remote device, and all the ports are in Forwarding state. This may lead to a loop and cause a broadcast storm. Exercise caution when using the stp bpdu-filter default and stp edged-port default commands in the system view.
After BPDU protection is enabled on a switching device, the switching device sets an edge port to the error-down state if the edge port receives a BPDU. The port remains an edge port.
The role of a designated port enabled with root protection cannot be changed. When a designated port enabled with root protection receives a BPDU with a higher priority, the port enters the Discarding state and does not forward packets. If the port does not receive any BPDUs with a higher priority within a given period of time (generally two Forward Delay periods), the port automatically returns to the Forwarding state.


Case description
To meet requirement 3 (the alternate ports shown in the preceding figure), S1 must be configured as the root bridge in MSTI 2 and S3 must be configured as the root bridge in MSTI 3.
In MSTI 2, S1 is the root bridge, and S2, S3, and S4 must be in descending order of priority.
In MSTI 3, S3 is the root bridge, and S1, S4, and S2 must be in descending order of priority.

Command usage
The region-name command configures the MST region name of a switching device.
The instance command maps a VLAN to an MSTI.
The revision-level command configures the revision level of an MST region of a switching device. The default value is 0.
The active region-configuration command activates the configuration of an MST region.
The stp loop-protection command enables loop protection on a port.

Precautions
Two switching devices belong to the same MST region only when they have the following identical configurations:
    MST region name
    Mappings between MSTIs and VLANs
    MST region revision level
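
The three conditions above can be expressed as a simple comparison. The sketch below is illustrative only: the class and field names are hypothetical, and it assumes that any VLAN without an explicit mapping belongs to MSTI 0, as in the topology descriptions earlier in this section.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MstRegionConfig:
    region_name: str
    revision_level: int = 0  # default revision level is 0
    vlan_to_instance: Dict[int, int] = field(default_factory=dict)

    def instance_of(self, vlan: int) -> int:
        """Unmapped VLANs are assumed to belong to MSTI 0."""
        return self.vlan_to_instance.get(vlan, 0)


def same_region(a: MstRegionConfig, b: MstRegionConfig) -> bool:
    """True only if name, revision level, and VLAN mappings all match."""
    vlans = set(a.vlan_to_instance) | set(b.vlan_to_instance)
    return (
        a.region_name == b.region_name
        and a.revision_level == b.revision_level
        and all(a.instance_of(v) == b.instance_of(v) for v in vlans)
    )


# Hypothetical configurations for three switches.
s1 = MstRegionConfig("Region1", 0, {2: 2, 4: 4})
s2 = MstRegionConfig("Region1", 0, {2: 2, 4: 4})
s3 = MstRegionConfig("Region1", 0, {2: 2, 4: 3})  # VLAN 4 mapped differently
print(same_region(s1, s2), same_region(s1, s3))
```

A single mismatched VLAN mapping is enough to place two switches in different regions, which is a common cause of unexpected region boundaries.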


Loop protection
    On a network running a spanning tree protocol, a switching device maintains the status of the root port and blocked port by continuously receiving BPDUs from the upstream switching device.
    If ports cannot receive BPDUs from the upstream switching device due to link congestion or unidirectional link failure, the switching device will re-select a root port. The original root port then becomes a designated port and the original blocked port enters the Forwarding state. As a result, loops may occur on the network.
    Loop protection can be deployed to prevent this problem. If the root port or alternate port cannot receive BPDUs from the upstream device for a long period of time after loop protection is enabled, the port sends a notification message to the NMS. The root port enters the Discarding state, and the alternate port remains in the Blocking state and does not forward packets. This prevents loops on the network. After the port receives BPDUs again, it returns to its original role and state.


If the topology of an MSTI changes, the forwarding paths of VLANs that are mapped to this MSTI change. As a result, ARP entries relevant to these VLANs need to be updated. Based on methods for processing ARP entries, the convergence modes of a spanning tree protocol are classified into fast and normal:
    In fast mode, the switch directly deletes the ARP entries that need to be updated in the ARP table.
    In normal mode, the switch ages the ARP entries that need to be updated in the ARP table. If the number of ARP probes for aging ARP entries is larger than 0, the switch probes these ARP entries before aging them.
    In fast mode, frequent ARP entry deletion affects services and may even drive CPU usage to 100%. As a result, packet processing times out, causing network flapping.

Unicast
In unicast mode, the amount of data transmitted on a network is proportional to the number of users that require the data. If a large number of users require the same data, the source must send a separate copy of the data to each user, consuming high bandwidth on the source and the network. Therefore, the unicast mode is not suitable for batch data transmission and is applicable only to networks with a small number of users.


Broadcast
In broadcast mode, data is sent to all hosts on a network segment regardless of whether they require the data. This threatens information security and causes broadcast storms on the network segment. Therefore, the broadcast mode is not suitable for data transmission from a source to specified destinations. In addition, the broadcast mode wastes network bandwidth.
Multicast has the following advantages over unicast and broadcast:
    Compared with the unicast mode, the multicast mode starts to copy data and distribute data copies on the network node as far from the source as possible. Therefore, the amount of data and the level of network resource consumption will not increase greatly when the number of receivers increases.
    Compared with the broadcast mode, the multicast mode transmits data only to receivers that require the data. This saves network resources and enhances data transmission security.


Multicast basic concepts


Multicast group: A group of receivers identified by an IP multicast address. User hosts (or other receiver devices) that have joined a multicast group become members of the group and can identify and receive the IP packets destined for the multicast group address.
Multicast source: A sender of multicast data. The server in the topology is a multicast source. A multicast source can simultaneously send data to multiple multicast groups. Multiple multicast sources can simultaneously send data to the same multicast group. A multicast source does not need to join any multicast groups.
Multicast group member: A host that has joined a multicast group. PC1 and PC2 in the following topology are multicast group members. Memberships in a multicast group change dynamically. Hosts can join or leave a multicast group at any time. Members of a multicast group can be located anywhere on a network.
Multicast router: A router or Layer 3 switch that supports IP multicast. The routers in the following topology are multicast routers. In addition to multicast routing functions, multicast routers connected to user network segments provide multicast membership management.

Multicast service models are classified for receiver hosts and do not affect multicast sources. All multicast data packets sent from a multicast source use the IP address of the multicast source as the source IP address and use a multicast group address as the destination address. Depending on whether receiver hosts can select multicast sources, two multicast models are defined: any-source multicast (ASM) model and source-specific multicast (SSM) model. The two models use multicast group addresses in different ranges.
    ASM model: Receiver hosts can only specify the group they want to join and cannot select multicast sources.
    SSM model: Receiver hosts can specify the multicast sources from which they want to receive multicast data when they join a group. After joining the group, the hosts receive only the data sent from the specified sources.

Multicast addresses
IP addresses 224.0.0.0 to 224.0.0.255 are reserved as permanent group addresses by the Internet Assigned Numbers Authority (IANA). In this address range, 224.0.0.0 is not allocated, and the other addresses are used by routing protocols for topology discovery and maintenance. These addresses are locally valid. Packets with these addresses will not be forwarded by routers regardless of the time-to-live (TTL) values in the packets.
Addresses in the range of 224.0.1.0 to 231.255.255.255 and 233.0.0.0 to 238.255.255.255 are ASM group addresses and are globally valid.
Addresses 232.0.0.0 to 232.255.255.255 are SSM group addresses available to users and are globally valid.
Addresses 239.0.0.0 to 239.255.255.255 are local administrative multicast addresses and are valid only in the local administrative domain. Local administrative group addresses are private addresses. A local administrative group address can be used in different administrative domains.
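
The address ranges above can be turned into a small lookup. The following sketch uses Python's standard ipaddress module; the function name and the label strings are hypothetical and chosen only for illustration.

```python
import ipaddress

# (first address, last address, label), taken from the ranges listed above.
_RANGES = [
    ("224.0.0.0", "224.0.0.255", "permanent group (local segment only)"),
    ("224.0.1.0", "231.255.255.255", "ASM group"),
    ("232.0.0.0", "232.255.255.255", "SSM group"),
    ("233.0.0.0", "238.255.255.255", "ASM group"),
    ("239.0.0.0", "239.255.255.255", "local administrative group"),
]


def classify_group(address: str) -> str:
    """Return the multicast range that an IPv4 address falls into."""
    ip = ipaddress.IPv4Address(address)
    for first, last, label in _RANGES:
        if ipaddress.IPv4Address(first) <= ip <= ipaddress.IPv4Address(last):
            return label
    return "not a multicast address"


for addr in ("224.0.0.5", "225.1.1.1", "232.1.1.1", "239.1.1.1", "10.0.0.1"):
    print(addr, "->", classify_group(addr))
```

A check like this is useful when planning group addresses, for example to confirm that a group intended for SSM really falls in the 232.0.0.0 to 232.255.255.255 range.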



Mapping from IPv4 multicast addresses to MAC addresses


The first four bits of an IPv4 multicast address are always 1110. The leftmost 25 bits of a multicast MAC address are fixed (prefix 01-00-5e followed by a 0 bit). Only the last 23 bits of the remaining 28 bits of the IP address are mapped to the MAC address, which means that 5 bits of the IP address are lost. As a result, 32 multicast IP addresses are mapped to the same MAC address. For example, IP multicast addresses 224.0.1.1, 224.128.1.1, 225.0.1.1, and 239.128.1.1 are all mapped to MAC multicast address 01-00-5e-00-01-01. Address conflicts must be considered in address assignment.
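
The mapping can be reproduced with a few bit operations. This is a minimal sketch using Python's standard ipaddress module; the function name is hypothetical.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast address to its multicast MAC address."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(ip) & 0x7FFFFF        # keep only the last 23 bits of the IP
    mac = 0x01005E000000 | low23      # fixed 25-bit prefix plus those 23 bits
    return "-".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


# The four example addresses from the text share one MAC address.
for addr in ("224.0.1.1", "224.128.1.1", "225.0.1.1", "239.128.1.1"):
    print(addr, "->", multicast_mac(addr))
```

Each of the four addresses differs from the others only in the five bits that the mapping drops, which is why a Layer 2 device cannot tell their frames apart.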

IGMP
IGMP is deployed between multicast routers and user hosts. On a multicast router, IGMP is configured on interfaces connected to hosts.
On hosts, IGMP allows group members to dynamically join and leave multicast groups. On routers, IGMP manages and maintains group memberships and exchanges information with upper-layer multicast routing protocols.

PIM
PIM has two modes: PIM-DM and PIM-SM.
PIM must be enabled on all interfaces of all multicast routers.
PIM provides multicast routing and forwarding, and maintains the multicast routing table based on network topology changes.

IGMP snooping
IGMP snooping is deployed in VLANs on Layer 2 switches between multicast routers and hosts.
IGMP snooping listens to IGMP messages exchanged between routers and hosts to create and maintain a Layer 2 multicast forwarding table. In this manner, multicast data can be forwarded on a Layer 2 network.

IGMP
IGMP is an IPv4 group membership management protocol in the TCP/IP protocol suite. IP hosts use IGMP to report their group memberships to any immediately neighboring multicast routers.
IGMP is deployed between multicast routers and hosts. On a multicast router, IGMP is configured on interfaces connected to hosts.
On hosts, IGMP allows group members to dynamically join and leave multicast groups. On routers, IGMP manages and maintains group memberships and exchanges information with upper-layer multicast routing protocols.
The IGMP versions are backward compatible. Therefore, a multicast router running a later IGMP version can identify Membership Report messages sent from hosts running an earlier IGMP version, although the IGMP messages in different versions use different formats.
All of the IGMP versions support the any-source multicast (ASM) model. IGMPv3 can be independently used in the source-specific multicast (SSM) model, whereas IGMPv1 and IGMPv2 must be used with SSM mapping.



IGMP messages are encapsulated in IP packets. IGMPv1 defines the following types of messages:
    General Query: Sent by a querier to all hosts and routers on the shared network segment to discover which multicast groups have members on the network segment.
    Report: Sent by a host to request to join a multicast group or respond to a General Query message.


How IGMPv1 works
IGMPv1 uses a query-report mechanism to manage multicast groups. When multiple multicast routers exist on a network segment, one router is elected as the IGMP querier to send Query messages. In IGMPv1 implementation, a unique Assert winner or designated router (DR) is elected by Protocol Independent Multicast (PIM) to work as the querier. (The election mechanism will be described later.) The querier is the only device that sends Membership Query messages on the local network segment.


General query and report
In the multicast network, R1 and R2 connect to a user network segment with three receivers: PC1, PC2, and PC3. R1 is the querier on the network segment. PC1 and PC2 want to receive data sent to group G1, and PC3 wants to receive data sent to group G2. The general query and report process is as follows:
    The IGMP querier (R1) sends a General Query message with the destination address 224.0.0.1 (indicating all hosts and routers on the same network segment). The IGMP querier sends General Query messages at intervals. The interval can be configured using a command, and the default interval is 60 seconds.
    All hosts on the network segment receive the General Query message. PC1 and PC2 then start a timer for G1 (Timer-G1), and PC3 starts a timer for G2 (Timer-G2). The timer length is a random value between 0 and 10, in seconds.
    The host whose timer expires first sends a Report message for the multicast group. In this example, Timer-G1 on PC1 expires first, and PC1 sends a Report message with the destination address G1. When PC2 detects the Report message sent by PC1, PC2 stops Timer-G1 and does not send any Report messages for G1. This mechanism reduces the number of Report messages transmitted on the network segment, lowering loads on multicast routers.
    When Timer-G2 on PC3 expires, PC3 sends a Report message with the destination address G2 to the network segment.
    After the routers receive the Report messages, they know that multicast groups G1 and G2 have members on the local network segment. The routers use the multicast routing protocol to create (*, G1) and (*, G2) entries, in which * stands for any multicast source. Once the routers receive data sent to G1 and G2, they forward the data to this network segment.
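
The report suppression in the steps above can be simulated in a few lines. This is a simplified sketch, not a protocol implementation: the function name is hypothetical, every host is assumed to hear every Report instantly, and each host is a member of only one group.

```python
import random
from typing import Dict, List, Optional, Tuple


def general_query_round(
    memberships: Dict[str, str],
    max_response_time: float = 10.0,
    rng: Optional[random.Random] = None,
) -> List[Tuple[str, str]]:
    """Simulate one General Query round and return the Reports actually sent.

    memberships maps a host name to the group it has joined.
    """
    rng = rng or random.Random()
    # Each host starts a random timer between 0 and max_response_time seconds.
    timers = sorted(
        (rng.uniform(0, max_response_time), host, group)
        for host, group in memberships.items()
    )
    reported_groups = set()
    reports = []
    for _, host, group in timers:
        if group in reported_groups:
            continue  # another member already reported: stop timer, stay silent
        reported_groups.add(group)
        reports.append((host, group))
    return reports


reports = general_query_round({"PC1": "G1", "PC2": "G1", "PC3": "G2"})
print(reports)  # one Report per group; which G1 member reports is random
```

Whichever timer expires first, the routers see exactly one Report per group, which is all they need to create the (*, G1) and (*, G2) entries.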


A member joins a group
A new host PC4 connects to the network segment. PC4 wants to join multicast group G3 but detects no multicast data for G3. In this case, PC4 immediately sends a Report message for G3 without waiting for a General Query message. After receiving the Report message, the routers know that a member of G3 has connected to the network segment, and they create a (*, G3) entry. When the routers receive data sent to G3, they forward the data to this network segment.

A member leaves a group


IGMPv1 does not define a Leave message. After a host leaves a multicast group, it no longer responds to General Query messages. Assume that PC4 has left group G3. It does not send Report messages for G3 when receiving General Query messages.
Because there is no other member of G3, the routers no longer receive Report messages for G3. After the membership timeout interval expires, the routers delete the multicast forwarding entry of G3. The membership timeout interval is calculated as follows and is 130 seconds by default: Membership timeout interval = IGMP general query interval x Robustness variable + Maximum response time.
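
The timeout formula can be checked with a one-line calculation. The sketch below assumes a general query interval of 60 seconds and a maximum response time of 10 seconds, as stated earlier, and a robustness variable of 2, which is the value implied by the 130-second result; the function name is hypothetical.

```python
def membership_timeout(query_interval: int = 60,
                       robustness: int = 2,
                       max_response_time: int = 10) -> int:
    """Membership timeout = query interval x robustness + max response time."""
    return query_interval * robustness + max_response_time


print(membership_timeout())                    # defaults: 60 x 2 + 10
print(membership_timeout(query_interval=30))   # a shorter query interval
```

The formula shows the cost of having no Leave message in IGMPv1: routers keep forwarding traffic for a group for up to the full timeout after the last member has gone.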


IGMPv2 defines two types of new messages in addition to General Query and Report messages:
    Group-Specific Query: sent by a querier to a specified group on the local network segment to check whether the group has members.
    Leave: sent by a host to notify routers on the local network segment that it has left a group.
IGMPv2 modifies the General Query message format by adding the Max Response Time field in the message. The field value controls the response speed of group members and is configurable.
Querier election
IGMPv2 defines an independent querier election mechanism. When multiple multicast routers are available on a shared network segment, the router with the smallest IP address is elected as the querier. IGMPv1 depends on upper-layer multicast protocols such as PIM for querier election.


Topology description
    Each IGMPv2 router considers itself a querier when it starts and sends a General Query message to all hosts and routers on the local network segment.
    When other routers receive the General Query message, they compare the source IP address of the message with their own interface IP addresses. The router with the smallest IP address becomes the querier, and the other routers are non-queriers. In this network, R1 has a smaller interface IP address than R2, so R1 becomes the querier.
    All non-querier routers start the Other Querier Present Timer. Timer length = Robustness variable x IGMP general query interval + (1/2) x Maximum response time. If the robustness variable, IGMP general query interval, and maximum response time all use default values, the timer length is 125 seconds.
    If non-querier routers receive a Query message from the querier before the timer expires, they reset the timer. If non-querier routers receive no Query message from the querier before the timer expires, they trigger election of a new querier.
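
The election rule and the timer formula can be sketched as follows. The interface addresses are hypothetical, the function names are invented for illustration, and the defaults (robustness variable 2, query interval 60 seconds, maximum response time 10 seconds) are the values that produce the 125-second result quoted above.

```python
import ipaddress
from typing import Dict


def elect_querier(interface_ips: Dict[str, str]) -> str:
    """Return the router whose interface IP address is numerically smallest."""
    return min(interface_ips,
               key=lambda name: ipaddress.IPv4Address(interface_ips[name]))


def other_querier_present_timer(robustness: int = 2,
                                query_interval: int = 60,
                                max_response_time: int = 10) -> float:
    """Robustness x query interval + half of the maximum response time."""
    return robustness * query_interval + max_response_time / 2


# Hypothetical interface addresses on the shared segment.
routers = {"R1": "192.168.1.1", "R2": "192.168.1.2"}
print(elect_querier(routers))
print(other_querier_present_timer())
```

Comparing addresses as IPv4Address objects rather than strings matters: as plain strings, "192.168.1.10" would sort before "192.168.1.2" and the wrong router would win.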
Leave mechanism
In IGMPv2, the following process occurs when PC3 wants to leave multicast group G2 and PC3 was the last member to respond to a query for the group:
PC3 sends a Leave message for G2 to all multicast routers on the local network segment. The destination address of the Leave message is 224.0.0.2.
When the querier receives the Leave message, it sends Group-Specific Query messages for G2 at intervals to check whether G2 has other members on the network segment. The sending interval and the number of Group-Specific Query messages are configurable. By default, the querier sends two Group-Specific Query messages at an interval of 1 second. In addition, the querier starts the membership timer (Timer-Membership, Timer length = Interval for sending Group-Specific Query messages x Number of messages sent).
If G2 has no other member on the network segment, the routers receive no Report message for G2. After Timer-Membership expires, the routers delete the downstream interface connected to the network segment from the (*, G2) entry and no longer forward data of G2 to the network segment.
If G2 has other members on the network segment, the members send a Report message for G2 within the maximum response time. The routers continue maintaining membership of G2.
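As a rough illustration of the leave process above, the following Python sketch computes Timer-Membership from the stated defaults and shows what happens to the downstream interface list of the (*, G2) entry when the timer expires. The function names and interface names are hypothetical.

```python
def timer_membership(query_interval=1, query_count=2):
    """Timer length = Group-Specific Query interval x number of queries sent (seconds)."""
    return query_interval * query_count


def downstream_after_leave(downstream_interfaces, leave_interface, report_received):
    """Return the downstream interface list of the (*, G) entry after Timer-Membership expires.

    If any Report for the group arrived before expiry, membership is kept;
    otherwise the interface on which the Leave was received is removed.
    """
    interfaces = set(downstream_interfaces)
    if not report_received:
        interfaces.discard(leave_interface)
    return interfaces


if __name__ == "__main__":
    print(timer_membership())
    print(downstream_after_leave({"GE0/0/1", "GE0/0/2"}, "GE0/0/1", report_received=False))
```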

IGMPv3 was developed to support the source-specific multicast (SSM) model. IGMPv3 messages can contain multicast source information so that hosts can receive data sent from a specific source to a specific group.
IGMPv3 also defines two types of messages: Query and Report. Compared with IGMPv2, IGMPv3 has the following changes:
In addition to General Query and Group-Specific Query messages, IGMPv3 defines a new Query message type: Group-and-Source-Specific Query. A querier sends a Group-and-Source-Specific Query message to members of a specific group on the shared network segment, to check whether the group members want data from specific sources. A Group-and-Source-Specific Query message carries one or more multicast source addresses.
A host can send a Report message to notify a multicast router that it wants to join a multicast group and receive data from specified multicast sources. IGMPv3 supports source filtering and defines two filter modes: INCLUDE and EXCLUDE. Group-source mappings are represented as (G, INCLUDE, (S1, S2...)) or (G, EXCLUDE, (S1, S2...)). The (G, INCLUDE, (S1, S2...)) entry indicates that a host only wants to receive data sent from the listed multicast sources to group G. The (G, EXCLUDE, (S1, S2...)) entry indicates that a host wants to receive data sent from all multicast sources except the listed ones to group G.

Group Record types in IGMPv3 Report messages
IS_IN: Indicates that the source filter mode is INCLUDE for a multicast group. That is, members of the group want to receive only data sent from the specified sources.
IS_EX: Indicates that the source filter mode is EXCLUDE for a multicast group. That is, members of the group want to receive data sent from multicast sources except the specified sources.
TO_IN: Indicates that the source filter mode for a multicast group has changed from EXCLUDE to INCLUDE. If the source list is empty, the members have left the multicast group.
TO_EX: Indicates that the source filter mode for a multicast group has changed from INCLUDE to EXCLUDE.
ALLOW: Indicates that members of a multicast group want to receive data from the specified multicast sources in addition to the current sources. If the source filter mode for the multicast group is INCLUDE, the specified sources are added to the source list. If the source filter mode is EXCLUDE, the specified sources are deleted from the source list.
BLOCK: Indicates that members of a multicast group no longer want to receive data from the specified multicast sources. If the source filter mode for the multicast group is INCLUDE, the specified sources are deleted from the source list. If the source filter mode is EXCLUDE, the specified sources are added to the source list.
An IGMPv3 Report message can carry multiple groups, whereas an IGMPv1 or IGMPv2 Report message can carry only one group. IGMPv3 greatly reduces the number of messages transmitted on a network.
Unlike IGMPv2, IGMPv3 does not define a Leave message. Group members send Report messages of a specified type to notify multicast routers that they have left a group. For example, if a member of group 225.1.1.1 wants to leave the group, it sends a Report message with (225.1.1.1, TO_IN, (0)).
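The effect of each Group Record type on a group's source filter can be sketched as a small state update. The Python below is a simplified illustration that tracks a single (mode, source list) pair exactly as described above; it is not the full router-side state machine of the IGMPv3 specification, and the function names are hypothetical.

```python
def apply_record(state, record_type, sources):
    """Apply one Group Record to a (filter_mode, source_set) state and return the new state."""
    mode, current = state
    sources = set(sources)
    if record_type in ("IS_IN", "TO_IN"):
        return ("INCLUDE", sources)
    if record_type in ("IS_EX", "TO_EX"):
        return ("EXCLUDE", sources)
    if record_type == "ALLOW":
        # INCLUDE: add to the wanted list. EXCLUDE: remove from the unwanted list.
        return (mode, current | sources) if mode == "INCLUDE" else (mode, current - sources)
    if record_type == "BLOCK":
        # INCLUDE: remove from the wanted list. EXCLUDE: add to the unwanted list.
        return (mode, current - sources) if mode == "INCLUDE" else (mode, current | sources)
    raise ValueError(f"unknown record type: {record_type}")


def has_left(state):
    """A group in INCLUDE mode with an empty source list has no remaining interest."""
    return state == ("INCLUDE", set())


if __name__ == "__main__":
    state = ("INCLUDE", {"S1"})
    state = apply_record(state, "ALLOW", ["S2"])
    state = apply_record(state, "BLOCK", ["S1"])
    print(state)
    print(has_left(apply_record(state, "TO_IN", [])))
```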


If IGMPv1 or IGMPv2 is running between a host and its upstream router, the host cannot select multicast sources when it joins group G. The host receives data from both S1 and S2, regardless of whether it requires the data. If IGMPv3 is running between the host and its upstream router, the host can choose to receive only data from S1 using either of the following methods:
Method 1: Send an IGMPv3 Report (G, IS_IN, (S1)), requesting to receive only the data sent from S1 to G.
Method 2: Send an IGMPv3 Report (G, IS_EX, (S2)), notifying the upstream router that it does not want to receive data from S2. Only data sent from S1 is then forwarded to the host.
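The two methods can be compared with a short Python sketch of source filtering. The function name is hypothetical, and the source names S1, S2, and S3 are placeholders; S3 is an extra source added here only to show where the two methods differ.

```python
def accepts(mode, listed_sources, source):
    """Return True if a host with this filter wants data from the given source."""
    if mode == "INCLUDE":
        return source in listed_sources
    if mode == "EXCLUDE":
        return source not in listed_sources
    raise ValueError(f"unknown filter mode: {mode}")


if __name__ == "__main__":
    sources = ["S1", "S2"]
    method_1 = [s for s in sources if accepts("INCLUDE", {"S1"}, s)]
    method_2 = [s for s in sources if accepts("EXCLUDE", {"S2"}, s)]
    print(method_1, method_2)
```

The two methods give the same result only while S1 and S2 are the only sources sending to G. If a third source started sending to the group, the EXCLUDE filter would accept it and the INCLUDE filter would not.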



Compatibility with IGMPv1 routers
When IGMPv2 hosts discover an IGMPv1 router, they must send IGMPv1 Report messages to the router and cannot send Leave messages.
If there are both IGMPv1 and IGMPv2 routers on a network segment, the querier must send IGMPv1 messages.
Compatibility with IGMPv1 hosts
IGMPv2 hosts must allow their Report messages to be suppressed by IGMPv1 Report messages. Otherwise, the querier will not know of the existence of IGMPv1 hosts on the shared network segment. If the querier is an IGMPv2 router and acts on a Leave message for a group that still has IGMPv1 hosts, those IGMPv1 hosts will stop receiving traffic for the group.
Therefore, if an IGMPv2 router detects IGMPv1 hosts on the local network segment, the router ignores any subsequent Leave messages received.

SSM mapping is implemented based on static SSM mapping entries. A multicast router converts (*, G) information in IGMPv1 and IGMPv2 Report messages to (S, G) information according to static SSM mapping entries, so as to provide the SSM service for IGMPv1 and IGMPv2 hosts. By default, SSM group addresses range from 232.0.0.0 to 232.255.255.255.
IGMP SSM mapping does not apply to IGMPv3 Report messages. To enable hosts running any IGMP version on a network segment to obtain the SSM service, IGMPv3 must run on interfaces of multicast routers on the network segment.
With SSM mapping entries configured, a router checks the group address G in each IGMPv1 or IGMPv2 Report message received, and processes the message based on the check result:
If G is in the range of any-source multicast (ASM) group addresses, the router provides the ASM service for the host.
If G is in the range of SSM group addresses:
When the router has no SSM mapping entry matching G, it does not provide the SSM service and drops the Report message.
If the router has an SSM mapping entry matching G, it converts (*, G) information in the Report message into (S, G) information and provides the SSM service for the host.

Topology description
On an SSM network, PC1 runs IGMPv3, PC2 runs IGMPv2, and PC3 runs IGMPv1. PC2 and PC3 cannot run IGMPv3. To provide the SSM service for all the hosts on the network segment, IGMP SSM mapping must be configured on R1.
Before SSM mapping is enabled, the group-source mappings on R1 are as follows:
Group 232.0.0.0/8 mapped to source 10.10.1.1
Group 232.1.0.0/16 mapped to source 10.10.2.2
Group 232.1.1.0/24 mapped to source 10.10.3.3
After SSM mapping is enabled on R1, R1 checks the group addresses of received packets to see whether they are in the SSM group address range. If they are, R1 generates multicast entries according to the configured SSM mapping entries. If a group address is mapped to multiple sources, R1 generates multiple (S, G) entries. The following entries are generated according to information in Report messages sent from PC2 and PC3:
(10.10.1.1, 232.1.2.2)
(10.10.2.2, 232.1.2.2)
(10.10.1.1, 232.1.3.3)
(10.10.2.2, 232.1.3.3)
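The lookup that turns a (*, G) Report into (S, G) entries can be sketched with Python's standard `ipaddress` module. The mapping table below copies the three entries from the topology description; the function name and the return conventions (`None` for ASM handling, an empty list for a dropped Report) are choices made for this illustration only.

```python
import ipaddress

SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")  # default SSM group range from the text

# (group prefix, source) pairs taken from the topology description.
MAPPINGS = [
    ("232.0.0.0/8", "10.10.1.1"),
    ("232.1.0.0/16", "10.10.2.2"),
    ("232.1.1.0/24", "10.10.3.3"),
]


def map_report(group, mappings=MAPPINGS):
    """Convert the group of an IGMPv1/IGMPv2 Report into a list of (S, G) entries.

    Returns None when the group is outside the SSM range (ASM service applies),
    and an empty list when no mapping matches (the Report is dropped).
    """
    address = ipaddress.ip_address(group)
    if address not in SSM_RANGE:
        return None
    return [(source, group) for prefix, source in mappings
            if address in ipaddress.ip_network(prefix)]


if __name__ == "__main__":
    for group in ("232.1.2.2", "232.1.3.3"):
        print(map_report(group))
```

Every matching mapping contributes an entry, which is why 232.1.2.2 and 232.1.3.3 each produce two (S, G) entries while neither matches 232.1.1.0/24.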

Report message to the upstream device. The upstream device can send multicast packets to the host after receiving the Report message.
IGMP messages are encapsulated in IP packets (Layer 3 packets). Layer 2 devices between hosts and multicast routers, however, cannot process Layer 3 information carried in IP packets. In addition, Layer 2 devices cannot learn any MAC multicast address because the source MAC addresses of link layer data frames are not MAC multicast addresses. When a Layer 2 device receives a data frame with a multicast destination MAC address, the device cannot find a matching entry in its MAC address table. Consequently, the device broadcasts the multicast packet. This wastes bandwidth resources and poses threats to network security.

Concepts
A router port is a link layer device's port towards a multicast router. The link layer multicast device receives multicast packets through the router port. Router ports are classified into two types:
Dynamic router port: A port that receives IGMP Query messages or PIM Hello messages whose source addresses are not 0.0.0.0. Dynamic router ports are dynamically maintained based on protocol packets exchanged between multicast devices and hosts. Each dynamic router port has a timer. When the timer expires, the router port ages out.
Static router port: Manually specified using a command. Static router ports will not age out.
A group member port is a port towards user hosts. A link layer multicast device sends multicast packets to receiver hosts through group member ports. Group member ports are classified into two types:
Dynamic member port: A port that receives IGMP Report messages. Dynamic member ports are dynamically maintained based on protocol packets exchanged between multicast devices and hosts. Each dynamic member port has a timer. When the timer expires, the member port ages out.
Static member port: Manually specified using a command. Static member ports will not age out.
The outbound port list is the key information in a Layer 2 multicast forwarding entry. It includes both router ports and member ports.
Working mechanisms
IGMP General Query message: When a router port on an Ethernet switch receives an IGMP General Query message, the switch resets the aging timer of the router port. If the port that receives the General Query message is not a router port, the switch starts the aging timer for the port. (The aging time is 180 seconds or the Holdtime value carried in PIM Hello messages received by the switch. The default Holdtime value is 105 seconds.)
IGMP Report message: When an Ethernet switch receives an IGMP Report message, it checks whether there is a MAC multicast group matching the IP multicast group that the user wants to join.
If the MAC multicast group does not exist, the switch creates the MAC multicast group, adds the port that receives the Report message to the MAC multicast group, and starts the aging timer on the port (Timer length = Robustness variable x General query interval + Maximum response time). In addition, the switch adds all router ports in the same VLAN as the member port to the MAC multicast forwarding entry. It then creates an IP multicast group and adds the port that receives the Report message to the IP multicast group.
If the MAC multicast group exists but the port that receives the IGMP Report message is not in the group, the switch adds the port to the MAC multicast group and starts the aging timer on the port. The switch then checks whether the IP multicast group exists. If the IP multicast group does not exist, the switch creates the IP multicast group and adds the port to it. If the IP multicast group exists, the switch adds the port to the group directly.
If the MAC multicast group exists and the port that receives the IGMP Report message is already in the group, the switch resets the aging timer on the port.
IGMP Leave message: When an Ethernet switch receives an IGMP Leave message for a group on a port, it sends an IGMP Group-Specific Query message to the port to check whether the group has other members on the port. At the same time, the switch starts the query response timer (Timer length = Group-specific query interval x Robustness variable). If the switch does not receive any IGMP Report message for the group when the query response timer expires, it deletes the port from the matching MAC multicast group. If the MAC multicast group has no member port, the switch requests the upstream multicast router to delete this branch from the multicast tree.
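The Report and Leave handling above can be sketched as a small table keyed by MAC multicast group. The Python below is a simplified illustration: the class, method, and port names are hypothetical, router ports and the separate IP multicast group table are omitted, and the timer defaults (robustness variable 2, general query interval 60 seconds, maximum response time 10 seconds) are assumptions. The IP-to-MAC mapping uses the standard rule of copying the low-order 23 bits of the group address into the 01-00-5e prefix.

```python
import ipaddress


def multicast_mac(group):
    """Map an IPv4 multicast group address to its Ethernet multicast MAC address."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
    mac = 0x01005E000000 | low23
    return "-".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


def member_port_aging_time(robustness=2, query_interval=60, max_response_time=10):
    """Timer length = robustness x general query interval + maximum response time (assumed defaults)."""
    return robustness * query_interval + max_response_time


class SnoopingTable:
    """Member ports per MAC multicast group, with the aging time started on each port."""

    def __init__(self):
        self.groups = {}  # MAC multicast group -> {member port: aging time in seconds}

    def on_report(self, group, port):
        """Create the group if needed, add the port, and (re)start its aging timer."""
        members = self.groups.setdefault(multicast_mac(group), {})
        members[port] = member_port_aging_time()

    def on_leave_timeout(self, group, port):
        """Query response timer expired without a Report: delete the port.

        Returns True when the group has no member port left, which is when the
        switch would ask the upstream router to delete the branch.
        """
        mac = multicast_mac(group)
        members = self.groups.get(mac, {})
        members.pop(port, None)
        if not members:
            self.groups.pop(mac, None)
            return True
        return False


if __name__ == "__main__":
    table = SnoopingTable()
    table.on_report("224.1.1.1", "GE0/0/1")
    print(multicast_mac("224.1.1.1"), table.groups)
```

Because only 23 bits of the group address are copied, several IP multicast groups share one MAC multicast group, which is one reason the text distinguishes MAC multicast groups from IP multicast groups.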


Layer 2 multicast
If users in different VLANs require the same multicast data, the upstream router still has to send multiple copies of identical multicast data to different VLANs.
Users in VLAN 2 and VLAN 3 need to receive the same multicast data flow. Multicast router R1 replicates the multicast data in each VLAN and sends two copies of data to downstream switch S1. This wastes bandwidth between the router and Layer 2 device and increases loads on the router.
Multicast VLAN
The multicast VLAN feature allows Layer 2 network devices to replicate multicast data across VLANs.
After the multicast VLAN function is configured on S1, R1 replicates multicast data in the multicast VLAN (VLAN 4) and sends only one copy to S1. As the router does not need to replicate multicast data in VLAN 2 and VLAN 3, network bandwidth is conserved and loads on the router are reduced.
Concepts
Multicast VLAN: VLAN to which a network-side interface belongs. A multicast VLAN is used to aggregate multicast data flows. One multicast VLAN can be bound to multiple user VLANs.
User VLAN: VLAN to which a user-side interface belongs. A user VLAN is used to receive multicast data flows from the multicast VLAN. A user VLAN can be bound only to one multicast VLAN.
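The binding rule (one multicast VLAN to many user VLANs, but each user VLAN to only one multicast VLAN) can be modelled as a dictionary keyed by user VLAN. This Python sketch is illustrative only; the class and method names are hypothetical and do not correspond to any device command.

```python
class MulticastVlanBindings:
    """Track which multicast VLAN each user VLAN is bound to."""

    def __init__(self):
        self._multicast_vlan_of = {}  # user VLAN ID -> multicast VLAN ID

    def bind(self, multicast_vlan, user_vlan):
        """Bind a user VLAN to a multicast VLAN, rejecting a second, different binding."""
        existing = self._multicast_vlan_of.get(user_vlan)
        if existing is not None and existing != multicast_vlan:
            raise ValueError(
                f"user VLAN {user_vlan} is already bound to multicast VLAN {existing}")
        self._multicast_vlan_of[user_vlan] = multicast_vlan

    def user_vlans(self, multicast_vlan):
        """Return the user VLANs that receive copies of data from this multicast VLAN."""
        return sorted(u for u, m in self._multicast_vlan_of.items() if m == multicast_vlan)


if __name__ == "__main__":
    bindings = MulticastVlanBindings()
    bindings.bind(4, 2)  # VLAN 4 is the multicast VLAN in the example above
    bindings.bind(4, 3)
    print(bindings.user_vlans(4))
```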

We have learned about the Internet Group Management Protocol (IGMP). The IGMP protocol runs between receiver hosts and multicast routers, whereas a multicast routing protocol needs to run between routers.
A multicast routing protocol is used to create and maintain multicast routes, and to forward multicast data packets correctly and efficiently. Multicast routes construct a unidirectional loop-free data transmission path from a data source to multiple receivers. This transmission path is a multicast distribution tree. Multicast routing protocols can be intra-domain or inter-domain protocols. This course introduces PIM, a typical intra-domain multicast routing protocol.



PIM router
Routers with PIM enabled on interfaces are called PIM routers. A multicast distribution tree contains the following types of PIM routers:
Leaf router: The PIM router directly connected to user hosts, which are not necessarily multicast group members.
First-hop router: The PIM router directly connected to a multicast source on the multicast forwarding path and responsible for forwarding multicast data from the multicast source.
Last-hop router: The PIM router directly connected to a multicast group member on the multicast forwarding path and responsible for forwarding multicast data to the member.

Multicast distribution tree
On a PIM network, a point-to-multipoint multicast forwarding path is set up for each multicast group on routers. The multicast forwarding path is in a tree topology, so it is also called a multicast distribution tree.
There are two types of multicast distribution trees: source tree and shared tree.
Source tree
A source tree is rooted at a multicast source and combines the shortest paths from the source to receivers. Therefore, a source tree is also called a shortest path tree (SPT). For a multicast group, routers need to establish an SPT from each multicast source that sends packets to the group.
In this example, there are two multicast sources (S1 and S2) and two receivers (PC1 and PC2). Therefore, two source trees are established on the network.
PIM routing entry
PIM routing entries are created by the PIM protocol to guide multicast forwarding.
An (S, G) entry contains a known multicast source for a group, and is used to establish an SPT on PIM routers. (S, G) entries apply to both PIM-DM and PIM-SM networks.
If an (S, G) entry exists on a PIM router, the router forwards multicast packets according to the (S, G) entry.





Shared tree
A shared tree is rooted at a rendezvous point (RP) and combines the shortest paths from the RP to all receivers. It is therefore also called a rendezvous point tree (RPT). Each multicast group has only one shared tree. All multicast sources and receivers of a group send and receive multicast data packets along the shared tree. A multicast source first sends data packets to the RP, which then forwards the packets to all receivers.
In this example, multicast sources S1 and S2 share one RPT.
PIM routing entry
PIM routing entries are created by the PIM protocol to guide multicast forwarding.
A (*, G) entry contains a known multicast group, with the multicast source unknown. It is used to establish an RPT on PIM routers. (*, G) entries apply only to PIM-SM networks.
If no (S, G) entry is available and only a (*, G) entry exists on a router, the router creates an (S, G) entry based on this (*, G) entry, and then forwards multicast packets according to the (S, G) entry.

PIM-DM overview
PIM-DM uses the push mode to forward multicast packets and is often used on small-scale networks with densely distributed multicast group members. PIM-DM assumes that each network segment has multicast group members. When a multicast source sends multicast packets, PIM-DM floods the multicast packets to all PIM routers on the network and prunes the branches with no members. PIM-DM establishes and maintains a unidirectional loop-free SPT (source-specific shortest path tree) through periodic flood-and-prune processes. If a new group member connects to a leaf router on a pruned branch, the router can initiate a grafting process to restore multicast forwarding before the next flood-and-prune process.
PIM-DM uses the following mechanisms: neighbor discovery, flooding, pruning, grafting, assert, and state refresh. The flooding, pruning, and grafting mechanisms are used to establish an SPT.



PIM routers send Hello messages through all PIM-enabled interfaces. The multicast packet encapsulating a Hello message has a destination IP address of 224.0.0.13 (indicating all PIM routers on a network segment), and the source IP address is the IP address of the interface sending the multicast packet. The TTL value of the multicast packet is 1.
Hello messages are used to discover PIM neighbors, adjust PIM protocol parameters, and maintain neighbor relationships.
Discovering PIM neighbors
PIM routers on the same network segment must receive multicast packets with the destination address 224.0.0.13. By exchanging Hello messages, directly connected PIM routers learn neighbor information and establish neighbor relationships.
A PIM router can receive other PIM messages to create multicast routing entries only after it establishes neighbor relationships with other PIM routers.
Adjusting PIM protocol parameters
A Hello message carries the following PIM protocol parameters to control PIM message exchange between PIM neighbors:
DR_Priority: indicates the priority used by an interface in DR election. The interface with the highest priority becomes the DR. This parameter is used for DR election only on PIM-SM networks.
Holdtime: indicates the timeout interval of a neighbor relationship. A PIM router considers its neighbor reachable within the Holdtime interval.
LAN_Delay: indicates the delay in transmitting Prune messages on a shared network segment.
Neighbor-Tracking: indicates the neighbor tracking function.
Override-Interval: indicates the interval for overriding a pruning operation.
Maintaining neighbor relationships
PIM routers periodically send Hello messages to each other. If a PIM router does not receive any Hello message from a PIM neighbor within the Holdtime interval, the router considers the neighbor unreachable and deletes the neighbor from the neighbor list.
Changes of PIM neighbors lead to changes in the multicast network topology. If an upstream or downstream neighbor in the multicast distribution tree is unreachable, multicast routes need to re-converge, and the multicast distribution tree will change.
IGMPv1 querier election
Routers on a PIM-DM network compare the priorities and IP addresses carried in Hello messages to elect a DR for each network segment. The DR functions as the IGMPv1 querier on the network segment.
If the DR fails, neighboring routers trigger a new DR election process when the Hello timeout timer expires.
Hello timers
The default Hello interval is 30 seconds.
The default Hello timeout interval is 105 seconds.
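Neighbor maintenance with the Hello timers above can be sketched as a table of expiry times. The Python below is an illustration under the stated defaults (Hello interval 30 seconds, Holdtime 105 seconds); the class name, method names, and the neighbor address are hypothetical, and time is passed in explicitly as seconds rather than read from a clock.

```python
class PimNeighborTable:
    """Keep PIM neighbors alive while Hello messages arrive within the Holdtime."""

    def __init__(self):
        self._expires_at = {}  # neighbor address -> time at which the neighbor times out

    def on_hello(self, neighbor, now, holdtime=105):
        """Record or refresh a neighbor using the Holdtime carried in its Hello."""
        self._expires_at[neighbor] = now + holdtime

    def expire(self, now):
        """Delete and return the neighbors whose Holdtime has elapsed."""
        dead = sorted(n for n, t in self._expires_at.items() if t <= now)
        for neighbor in dead:
            del self._expires_at[neighbor]
        return dead

    def neighbors(self):
        return sorted(self._expires_at)


if __name__ == "__main__":
    table = PimNeighborTable()
    table.on_hello("10.1.12.2", now=0)
    table.on_hello("10.1.12.2", now=30)   # next periodic Hello refreshes the entry
    print(table.expire(now=100), table.neighbors())
```

With a 30-second Hello interval, a 105-second Holdtime means a neighbor is removed only after more than three consecutive Hello messages are missed.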

On a PIM-DM network, multicast packets sent from a multicast source are flooded throughout the entire network. When a PIM router receives a multicast packet, the router performs an RPF check on the packet against the unicast routing table. If the packet passes the RPF check, the router creates an (S, G) entry, in which the downstream interface list contains all the interfaces connected to downstream PIM neighbors. The router then forwards subsequent multicast packets through each downstream interface.
When multicast packets reach a leaf router, the leaf router processes the packets as follows:
If the network segment connected to the leaf router has group members, the leaf router adds its interface connected to the network segment to the downstream interface list of the (S, G) entry, and forwards subsequent multicast packets to the group members.
If the network segment connected to the leaf router has no group member and the leaf router does not need to forward multicast packets to downstream PIM neighbors, the leaf router initiates a pruning process.

Topology description
Multicast source S sends a multicast packet to multicast group G.
When R1 receives the multicast packet, it performs an RPF check on the packet against the unicast routing table. After the packet passes the RPF check, R1 creates an (S, G) entry, in which the downstream interface list contains interfaces connected to R2 and R5. R1 then forwards subsequent packets to R2 and R5.
R2 receives the multicast packet from R1. After the packet passes the RPF check, R2 creates an (S, G) entry, in which the downstream interface list contains the interfaces connected to R3 and R4. R2 then forwards subsequent packets to R3 and R4.
R5 receives the multicast packet from R1. Because the downstream network segment does not have group members or PIM neighbors, R5 triggers a pruning process.
R3 receives the multicast packet from R2. After the packet passes the RPF check, R3 creates an (S, G) entry, in which the downstream interface list contains the interface connected to PC1. R3 then forwards subsequent packets to PC1.
R4 receives the multicast packet from R2. Because the downstream network segment does not have group members or PIM neighbors, R4 triggers a pruning process.
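The flooding steps above depend on the RPF check, which in general accepts a packet only if it arrives on the interface that the unicast routing table would use to reach the source. The Python sketch below illustrates that check and the resulting downstream interface list. The routing table, prefixes, and interface names are hypothetical, and the unicast lookup is reduced to a longest-prefix match over a dictionary.

```python
import ipaddress


def rpf_interface(unicast_routes, source):
    """Return the outbound interface of the longest-prefix route towards the source."""
    address = ipaddress.ip_address(source)
    matches = [(ipaddress.ip_network(prefix), interface)
               for prefix, interface in unicast_routes.items()
               if address in ipaddress.ip_network(prefix)]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


def rpf_check(unicast_routes, source, arrival_interface):
    """A packet passes when it arrives on the interface used to reach its source."""
    return rpf_interface(unicast_routes, source) == arrival_interface


def downstream_list(pim_neighbor_interfaces, member_interfaces, arrival_interface):
    """Downstream interfaces of the (S, G) entry; an empty list means a prune is triggered."""
    candidates = set(pim_neighbor_interfaces) | set(member_interfaces)
    return sorted(candidates - {arrival_interface})


if __name__ == "__main__":
    routes = {"10.1.0.0/16": "GE0/0/0", "0.0.0.0/0": "GE0/0/1"}
    print(rpf_check(routes, "10.1.1.1", "GE0/0/0"))
    print(downstream_list(["GE0/0/0", "GE0/0/1", "GE0/0/2"], [], "GE0/0/0"))
```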

When a PIM router receives a multicast packet, it performs an RPF check on the packet. If the packet passes the RPF check but the downstream network segment does not have any group member, the PIM router sends a Prune message to the upstream router. After receiving the Prune message from the downstream interface, the upstream router deletes the downstream interface from the downstream interface list of the (S, G) entry. The multicast packets will not be forwarded to this downstream interface. A pruning operation is initiated by a leaf router. The Prune message is sent upstream hop by hop, and PIM routers receiving the Prune message delete the downstream interface from the (S, G) entry. Finally, the multicast distribution tree contains only branches with group members.
A PIM router starts a prune timer (210 seconds by default) for the pruned downstream interface and resumes multicast forwarding on the interface after the timer expires. Multicast packets are then flooded on the entire network, and new group members can receive multicast packets. Subsequently, leaf routers without group members attached trigger pruning processes. PIM-DM updates the SPT through periodic flood-and-prune processes.
After a downstream interface of a leaf router is pruned:
If new members join the multicast group on the interface and want to receive multicast packets before the next flood-and-prune process, the leaf router initiates a grafting process.
If no member joins the multicast group and multicast forwarding still needs to be suppressed on the interface, the leaf router initiates a state refresh process.

Topology description
R5 sends a Prune message to R1 to notify R1 that the downstream network segment no longer needs to receive multicast data.
After receiving the Prune message, R1 stops forwarding data through its downstream interface connecting to R5, and deletes this downstream interface from the (S, G) entry. R1 has another downstream interface in forwarding state, so the pruning process ends. Subsequent multicast packets are only forwarded to R2.
R4 sends a Prune message to R2 to notify R2 that the downstream network segment no longer needs to receive multicast data.
After receiving the Prune message, R2 waits for 3 seconds (LAN-delay + override-interval). R3 also receives the Prune message sent by R4. Because R3 connects to a downstream receiver, R3 sends a Join message to override the Prune message.
After R2 receives the Join message, it ignores the Prune message sent from R4 and continues forwarding multicast traffic to the downstream interface.

The LAN-delay and override-interval are explained as follows:
Hello messages carry the LAN-delay and override-interval parameters. The LAN-delay parameter specifies the packet transmission delay (500 milliseconds by default), and the override-interval specifies the interval during which downstream routers can override a pruning operation (2500 milliseconds by default).
If a router sends a Prune message upstream but other routers on the same network segment still need to receive multicast data, they must send a Join message to override the pruning operation within the override-interval.
If routers on a link have different override-interval values, the maximum override-interval value among the routers is used on the link.
The total of LAN-delay and override-interval is the prune-pending timer (PPT). After a router receives a Prune message from a downstream interface, it waits until the PPT expires, and then prunes the downstream interface. If the router receives a Join message from the downstream interface before the PPT expires, it cancels the pruning operation.
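The prune-pending timer arithmetic above can be checked with a short Python sketch. The function names are hypothetical and all times are in milliseconds. With the default values the timer is 500 + 2500 = 3000 ms, which is the 3-second wait in the topology description.

```python
def prune_pending_timer(lan_delay_ms=500, override_intervals_ms=(2500,)):
    """PPT = LAN-delay + the largest override-interval advertised on the link."""
    return lan_delay_ms + max(override_intervals_ms)


def prune_takes_effect(prune_time_ms, join_times_ms, ppt_ms):
    """Return False if any Join arrives before the PPT expires, which cancels the prune."""
    deadline = prune_time_ms + ppt_ms
    return not any(prune_time_ms <= t < deadline for t in join_times_ms)


if __name__ == "__main__":
    ppt = prune_pending_timer()
    print(ppt)
    # A Join from another downstream router 1.2 s after the Prune overrides it.
    print(prune_takes_effect(0, [1200], ppt))
```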


Multicast routers prune branches without group members to establish a new SPT according to received Prune messages. Although routers no longer forward multicast packets to pruned branches, the corresponding (S, G) entry still exists on each router. Once new members join the group on the pruned branches, the downstream interfaces can be quickly added to the entry to resume multicast forwarding.

PIM-DM uses the grafting mechanism to enable new group members on a pruned network segment to rapidly obtain multicast data. A leaf router can determine that a multicast group G has new members on a network segment according to IGMP messages. The leaf router then sends a Graft message to notify the upstream router that the downstream network segment needs multicast data. After receiving the Graft message, the upstream router adds the downstream interface to the downstream interface list of the (S, G) entry.
A grafting process is initiated by a leaf router and ends on the router that can receive multicast packets.
Topology description
Pruned downstream nodes can resume multicast forwarding when the prune timer expires, but by default the timer takes 210 seconds to expire. This is quite a long time for new group members. To reduce the waiting time, a pruned downstream router can send a Graft message to notify the upstream router.
When the network segment connected to R5 has a new group member, R5 sends a Graft message towards the multicast source S. When R1 receives the Graft message, it replies with a Graft ACK message. After that, multicast data can be forwarded to the previously pruned branch.

To prevent pruned interfaces from resuming multicast forwarding after the prune timer expires, the first-hop router nearest to the multicast source periodically sends a State-Refresh message throughout the entire PIM-DM network. Other PIM routers reset the prune timer after receiving the State-Refresh message. In this way, pruned downstream interfaces remain suppressed if leaf routers connected to the interfaces have no new group members attached.
Topology description
R1 sends a State-Refresh message to R2 and R5 to initiate a state refresh process.
R5 has a pruned interface and resets the prune timer on the interface. If R5 still has no group member on the connected network segment when the next flood-and-prune process starts, the pruned interface is still suppressed.



If multiple PIM routers forward multicast packets to the same network segment after the packets pass the RPF check, the assert mechanism selects only one PIM router to forward multicast packets to that network segment. When a PIM router receives a multicast packet that is the same as the multicast packet it sends to other neighbors, the PIM router sends an Assert message with the destination address 224.0.0.13 to all other PIM routers on the same network segment. When the other PIM routers receive the Assert message, they compare local parameters with those carried in the Assert message for assert election. The assert election is performed according to the following rules:
The router with the highest priority of the unicast routing protocol wins.
If these routers have the same priority, the router with the smallest route cost to the multicast source wins.
If these routers have the same priority and the same route cost to the multicast source, the router with the largest downstream interface IP address wins.

The PIM routers perform the following operations based on assert election results:
The downstream interface of the router that wins the election is the assert winner and forwards multicast packets to the shared network segment.
The downstream interfaces of the PIM routers that lose the election are assert losers and no longer forward multicast packets to the shared network segment. The PIM routers delete these downstream interfaces from the downstream interface list of their (S, G) entries.

After the assert election is complete, only one downstream interface is active on the network segment, so only one copy of multicast packets is transmitted to the network segment. All assert losers can resume multicast packet forwarding after a specified interval (180 seconds by default), triggering periodic assert elections.

Topology description
In this example, R2 has a smaller cost to the multicast source than R3.
R2 and R3 receive a multicast packet from each other through their downstream interfaces, but both packets fail the RPF check and are dropped. R2 and R3 then send an Assert message to the network segment.
R2 compares its routing information with that carried in the Assert message sent by R3 and finds that its own route cost to the multicast source is smaller. Therefore, R2 wins the election. R2 continues to forward multicast packets to the network segment, whereas R3 drops subsequent multicast packets because these packets fail the RPF check.
R3 compares its routing information with that carried in the Assert message sent by R2 and finds that its own route cost to the multicast source is larger. Therefore, R3 loses the election. R3 then blocks multicast forwarding on its downstream interface and deletes the interface from the downstream interface list of the (S, G) entry.
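
The three assert election rules can be sketched as a single ordered comparison. This is a minimal illustration, not the router implementation. It assumes that the priority of the unicast routing protocol is expressed as a preference value in which a smaller value means a higher priority. The class name, function name, preference values, and interface addresses are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class AssertCandidate:
    name: str
    preference: int       # assumed: smaller value = higher routing protocol priority
    cost: int             # route cost to the multicast source
    address: IPv4Address  # downstream interface address on the shared segment


def assert_winner(candidates):
    """Apply the rules in order: priority, then cost, then largest address."""
    return min(
        candidates,
        key=lambda c: (c.preference, c.cost, -int(c.address)),
    )


# Hypothetical values: same preference, R2 has the smaller cost.
r2 = AssertCandidate("R2", preference=10, cost=2, address=IPv4Address("10.0.0.2"))
r3 = AssertCandidate("R3", preference=10, cost=3, address=IPv4Address("10.0.0.3"))
print(assert_winner([r2, r3]).name)  # R2
```

If preference and cost were both equal, the negated address in the sort key would make the router with the largest interface address win.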



PIM-SM applies to the any-source multicast (ASM) and source-specific multicast (SSM) models. In the ASM model, PIM-SM uses the pull mode to forward multicast packets. This mode suits networks with many sparsely distributed group members. PIM-SM is implemented as follows:
A PIM router works as the rendezvous point (RP) to serve group members or multicast sources that appear on the network. All PIM routers on the network know the RP's position.
When a new group member appears on the network (a host sends an IGMP message to request to join a multicast group G), the last-hop router sends a Join message to the RP. The Join message is transmitted hop by hop, and all the routers receiving the message create a (*, G) entry. Finally, an RPT rooted at the RP is set up.
When an active multicast source appears on the network (the multicast source sends the first multicast packet to a multicast group G), the first-hop router encapsulates the multicast data in a Register message and sends the Register message to the RP in unicast mode. The RP then creates an (S, G) entry, and the multicast source is registered on the RP.

PIM-SM uses the following mechanisms in the ASM model: neighbor discovery, DR election, RP discovery, RPT setup, multicast source registration, SPT switchover, pruning, and assert. You can also configure a bootstrap router (BSR) to implement fine-grained management in a PIM-SM domain.

The network segment of a multicast source or receivers may connect to multiple PIM routers. The PIM routers exchange Hello messages to set up PIM neighbor relationships. The Hello message sent by a router carries the DR priority of the router and the IP address of the interface connected to the network segment. Each PIM router compares its own information with the information carried in the Hello messages received from its neighbors. The DR elected among the PIM routers is responsible for forwarding multicast packets for the multicast source or receivers. The DR is elected according to the following rules:
The PIM router with the highest DR priority wins (this rule applies only if all routers on the network segment support the DR priority).
If PIM routers have the same DR priority, or at least one PIM router does not support the DR priority field in Hello messages, the PIM router with the largest IP address wins.
If the current DR fails, other PIM routers trigger a new DR election when the PIM neighbor timeout timer expires (105 seconds by default).

In the ASM model, the DR provides the following functions:
The DR on the shared network segment connected to a multicast source sends Register messages to the RP. This DR is called the source DR.
The DR connected to the shared network segment of group members sends Join messages to the RP. This DR is called the receiver DR.
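
The DR election rules can be sketched as follows. This is a minimal illustration under stated assumptions: a missing DR priority option in a Hello message is modeled as None, and the class name, function name, priorities, and addresses are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import Optional


@dataclass(frozen=True)
class PimNeighbor:
    name: str
    address: IPv4Address
    dr_priority: Optional[int]  # None: the Hello carries no DR priority field


def elect_dr(routers):
    """Highest DR priority wins if every router advertises one.

    Otherwise, or on a priority tie, the largest IP address wins.
    """
    if all(r.dr_priority is not None for r in routers):
        return max(routers, key=lambda r: (r.dr_priority, int(r.address)))
    return max(routers, key=lambda r: int(r.address))


# Hypothetical routers on one shared segment.
r2 = PimNeighbor("R2", IPv4Address("10.0.0.2"), dr_priority=5)
r3 = PimNeighbor("R3", IPv4Address("10.0.0.3"), dr_priority=1)
r4 = PimNeighbor("R4", IPv4Address("10.0.0.1"), dr_priority=None)

print(elect_dr([r2, r3]).name)      # R2: highest DR priority
print(elect_dr([r2, r3, r4]).name)  # R3: R4 lacks the field, so largest address wins
```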



On a PIM-SM network, an RPT is a multicast distribution tree with the RP as the root and PIM routers that have group memberships as leaves. In the topology shown in the figure, when a group member appears on the network (a user sends an IGMP message to join a multicast group G), the receiver DR sends a Join message to the RP. The Join message is transmitted hop by hop, and routers receiving the message create a (*, G) entry. Finally, an RPT rooted at the RP is set up.

On a PIM-SM network, any new multicast source must register on the RP so that the RP can forward multicast data from the multicast source to group members. The multicast source registration process is as follows:
A multicast source sends a multicast packet to the source DR (R1).
After receiving the multicast packet, the source DR encapsulates the multicast packet into a Register message and sends the Register message to the RP (R2).
The RP decapsulates the received Register message, creates an (S, G) entry, and forwards the multicast packet to group members along the RPT.
The RP no longer needs any Register message sent from R1, so it sends a Register-Stop message to R1. R1 then stops sending Register messages to the RP.



On a PIM-SM network, each multicast group can have only one RP and one RPT. Before an SPT switchover, all multicast packets destined for a multicast group must be encapsulated in Register messages and then sent to the RP. The RP decapsulates Register messages and forwards multicast packets along the RPT. All multicast packets pass through the RP. As the rate of multicast packets increases, the RP faces heavy loads. To resolve this problem, PIM-SM allows the RP or the receiver DR to trigger an SPT switchover.

SPT switchover conditions
When the multicast traffic rate exceeds the specified threshold, PIM-SM triggers an RPT-to-SPT switchover.
According to the default configuration of the VRP, routers connected to receivers join the SPT immediately after receiving the first multicast data packet from a multicast source.

The receiver DR periodically checks the rate of multicast packets for an (S, G) and triggers an SPT switchover when the rate exceeds the specified threshold.
The receiver DR sends a Join message to the source DR. The Join message is transmitted hop by hop, and routers receiving the message create an (S, G) entry. Finally, an SPT is set up from the source DR to the receiver DR.
After the SPT is set up, the receiver DR sends a Prune message to the RP. The Prune message is transmitted hop by hop along the RPT, and routers receiving the message delete their downstream interfaces from the (S, G) entry. After the pruning process is complete, the RP no longer forwards multicast packets along the RPT.
If the SPT does not pass through the RP, the RP continues to send a Prune message to the source DR, so that routers along the path between the RP and source DR delete their downstream interfaces from the (S, G) entry. After the pruning process is complete, the source DR no longer forwards multicast packets along the SPT to the RP.


On a PIM-SM network, the root of a shared tree is an RP.

An RP provides the following functions:
Forwards all multicast packets transmitted in the shared tree to receivers.
Forwards multicast data of several or all multicast groups. A network can have one or multiple RPs. You can configure an RP to serve multicast groups in a specified range. An RP can serve multiple multicast groups, but each multicast group can have only one RP. Multicast packets sent from a multicast source to all receivers of a group are aggregated on the RP.

RP discovery:
Static RP: A static RP address is specified on all PIM routers in the PIM domain using the static-rp rp-address command.
Dynamic RP: Several PIM routers in a PIM domain are configured as candidate-RPs (C-RPs), among which an RP is elected. Candidate bootstrap routers (C-BSRs) also need to be configured. A BSR is elected among the C-BSRs.

An RP is the core router in a PIM-SM domain. If a small and simple network needs to transmit light multicast traffic and one RP is enough, you can specify the RP address statically on all routers in the PIM-SM domain. In most cases, PIM-SM networks are large and need to transmit heavy multicast traffic. To reduce loads on each RP and optimize the shared tree topology, different multicast groups should have different RPs. Dynamic RP election is required in this case, and a BSR is required for RP election.

During a BSR election, each C-BSR considers itself the BSR and sends a Bootstrap message to the entire network. The Bootstrap message carries the C-BSR address and priority. Each PIM router receives Bootstrap messages from all C-BSRs and compares C-BSR information to elect a BSR. The BSR is elected according to the following rules:
The C-BSR with the highest priority wins (larger priority value, higher priority).
If C-BSRs have the same priority, the C-BSR with the largest IP address wins.

An RP election process is as follows:
Each C-RP sends an Advertisement message to the BSR. An Advertisement message carries the C-RP address, the range of multicast groups the C-RP serves, and the C-RP priority.
The BSR summarizes the C-RP information in an RP-Set, encapsulates the RP-Set in a Bootstrap message, and advertises the message to all PIM-SM routers on the network.
PIM routers follow the same rules to compare RP information in the RP-Set and elect an RP from multiple C-RPs for the same group. The RP election rules are as follows:
    The C-RP interface with the longest address mask wins.
    The C-RP with the highest priority wins (larger priority value, lower priority).
    If C-RPs have the same priority, routers use a hash algorithm, and the C-RP with the largest hash value wins.
    If all the preceding parameters are the same, the C-RP with the largest IP address wins.
All PIM routers use the same RP-Set and election rules, so they obtain the same mappings between RPs and multicast groups. The PIM routers save the mappings for subsequent multicast forwarding.
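
The BSR and RP election rules can be sketched as two ordered comparisons. This is a minimal illustration, not the protocol implementation: the hash algorithm itself is not reproduced, so each C-RP carries a hypothetical precomputed hash_value. Class names, function names, and all addresses and values are assumptions made for the example.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class CBsr:
    address: IPv4Address
    priority: int  # larger value = higher priority


@dataclass(frozen=True)
class CRp:
    address: IPv4Address
    mask_length: int  # mask length compared by the first RP election rule
    priority: int     # larger value = lower priority
    hash_value: int   # hypothetical precomputed hash result for the group


def elect_bsr(c_bsrs):
    """Highest priority wins; the largest IP address breaks a tie."""
    return max(c_bsrs, key=lambda b: (b.priority, int(b.address)))


def elect_rp(c_rps):
    """Longest mask, then best priority, then largest hash, then largest address."""
    return max(
        c_rps,
        key=lambda r: (r.mask_length, -r.priority, r.hash_value, int(r.address)),
    )


c_bsrs = [
    CBsr(IPv4Address("1.1.1.1"), priority=0),
    CBsr(IPv4Address("2.2.2.2"), priority=10),
    CBsr(IPv4Address("3.3.3.3"), priority=10),
]
c_rps = [
    CRp(IPv4Address("1.1.1.1"), mask_length=32, priority=0, hash_value=5),
    CRp(IPv4Address("2.2.2.2"), mask_length=32, priority=0, hash_value=9),
    CRp(IPv4Address("3.3.3.3"), mask_length=32, priority=192, hash_value=99),
]
print(elect_bsr(c_bsrs).address)  # 3.3.3.3
print(elect_rp(c_rps).address)    # 2.2.2.2
```

Because every router evaluates the same RP-Set with the same ordering, each one arrives at the same RP for a given group without further negotiation.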


The SSM model is implemented based on PIM-SM and IGMPv3/MLDv2. In this model, an SPT can be established from a multicast source to group members without the need to maintain an RP, establish an RPT, or register the multicast source.
In the SSM model, hosts know the location of the multicast sources in advance. Therefore, they can specify the multicast sources from which they want to receive multicast data when joining a multicast group. After the receiver DR receives the request from a host, it sends a Join message to the source DR. The Join message is then transmitted upstream hop by hop. An SPT is then set up from the multicast source to the host.
In the SSM model, PIM-SM uses the following mechanisms: neighbor discovery, DR election, and SPT setup.

An SPT setup process is as follows:
R3 and R5 learn through IGMPv3 that hosts in the same multicast group request data from different multicast sources. Therefore, R3 and R5 send Join messages toward the sources.
PIM routers that receive the Join messages create (S1, G) and (S2, G) entries accordingly. In this way, they set up an SPT from S1 to PC1 and an SPT from S2 to PC2.
Multicast packets from the two multicast sources are then forwarded to the respective receivers along the SPTs.

RPF check
When a router receives a multicast packet, it searches the unicast routing table for the route to the source address of the packet. After finding the route, the router checks whether the outbound interface of the route is the same as the inbound interface of the multicast packet. If they are the same, the router considers that the multicast packet is received from a correct interface. This process is called an RPF check, which ensures correct forwarding paths for multicast packets.
If multiple equal-cost routes are available, the route with the largest next-hop address is used as the RPF route.
RPF checks can be performed based on unicast routes, Multiprotocol Border Gateway Protocol (MBGP) routes, or static multicast routes. The priority order of these routes is static multicast routes > MBGP routes > unicast routes.

Topology description
A multicast stream sent from the source 152.10.2.2 arrives at interface S1 of the router. The router checks the routing table and finds that the multicast stream from this source should arrive at interface S0. Therefore, the RPF check fails and the multicast stream is dropped by the router.
A multicast stream sent from the source 152.10.2.2 arrives at interface S0 of the router. The router checks the routing table and finds that the RPF interface is also S0. The RPF check succeeds, and the multicast stream is correctly forwarded.
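
The RPF check in the topology above can be sketched as a longest-prefix lookup followed by an interface comparison. This is a minimal illustration that considers unicast routes only. The routing table contents are hypothetical, apart from the source address 152.10.2.2 and the interface names S0 and S1 taken from the example.

```python
from ipaddress import ip_address, ip_network

# Hypothetical unicast routing table: (destination prefix, outbound interface).
ROUTES = [
    (ip_network("152.10.0.0/16"), "S0"),
    (ip_network("198.14.32.0/24"), "S1"),
    (ip_network("0.0.0.0/0"), "E0"),
]


def rpf_interface(source):
    """Return the outbound interface of the longest-match route to the source."""
    src = ip_address(source)
    matches = [(net, ifname) for net, ifname in ROUTES if src in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rpf_check(source, inbound_interface):
    """Pass only if the packet arrived on the interface used to reach its source."""
    return rpf_interface(source) == inbound_interface


print(rpf_check("152.10.2.2", "S1"))  # False: the stream is dropped
print(rpf_check("152.10.2.2", "S0"))  # True: the stream is forwarded
```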



Static multicast routing
For R3, the RPF neighbor towards the multicast source (Source) is R1. Therefore, multicast packets sent from Source are forwarded along the path Source -> R1 -> R3. If you configure a multicast static route on R3 and specify R2 as the RPF neighbor, the transmission path of multicast packets sent from Source changes to Source -> R1 -> R2 -> R3. The multicast path then diverges from the unicast path.

Case description
In this case, interconnection IP addresses are configured according to the following rule:
If RTX connects to RTY, the interface IP addresses used to connect them to each other are XY.1.1.X and XY.1.1.Y, with a 24-bit network mask.



Command usage
The multicast routing-enable command enables the multicast routing function.
The pim dm command enables PIM-DM on an interface.
The pim hello-option dr-priority command sets the DR priority for a PIM interface.
The igmp enable command enables IGMP on an interface.
The igmp version command specifies the IGMP version running on an interface.

Precautions
In this network topology, R2 is the IGMP querier, and R3 forwards multicast packets to downstream receivers because R3 is the assert winner.
The display pim routing-table command displays entries in the PIM routing table.
The display pim routing-table fsm command displays detailed information about the finite state machine (FSM) in the PIM routing table.

Case description
The network topology is the same as that in the PIM-DM configuration. The network runs PIM-SM, and the transmission scope of Bootstrap messages needs to be limited.

Command usage
The pim sm command enables PIM-SM on an interface.
The c-rp command configures a router to notify the BSR that it is a C-RP.
The c-bsr command configures a C-BSR.
The pim bsr-boundary command configures the BSR boundary of the PIM-SM domain on an interface.

Precautions
In this network topology, R2 is the IGMP querier, and R3 forwards multicast packets to downstream receivers because R3 is the assert winner.
The display pim routing-table command displays entries in the PIM routing table.
The display pim routing-table fsm command displays detailed information about the FSM in the PIM routing table.



The method for checking the SPT in a PIM-SM network is similar to the
method for checking the RPT.


Case description
In this case, interconnection IP addresses are configured according to the following rules:
If RTX connects to RTY, the interface IP addresses used to connect them to each other are XY.1.1.X and XY.1.1.Y, with a 24-bit network mask.
The loopback interface address of RTX is X.X.X.X/32.
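
The addressing rules can be expressed as two small helper functions. This is a sketch under stated assumptions: router numbers are single digits, and the smaller router number is written first in the XY octet, which the rule itself does not state. The function names are hypothetical.

```python
def link_addresses(x, y):
    """Return the interconnection addresses for the link between RTx and RTy."""
    low, high = sorted((x, y))  # assumed: the smaller number comes first in XY
    first_octet = f"{low}{high}"
    return {
        f"RT{x}": f"{first_octet}.1.1.{x}/24",
        f"RT{y}": f"{first_octet}.1.1.{y}/24",
    }


def loopback_address(x):
    """Return the loopback interface address of RTx."""
    return f"{x}.{x}.{x}.{x}/32"


print(link_addresses(1, 2))  # {'RT1': '12.1.1.1/24', 'RT2': '12.1.1.2/24'}
print(loopback_address(3))   # 3.3.3.3/32
```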

Pre-configuration
This page provides the basic OSPF configuration. In this case, R1 is the DR in the FR network.



Results:
A Bootstrap message is transmitted from R1 to R2 and fails the RPF check on R2, so R2 drops the message. To enable Bootstrap messages to be forwarded by R2, configure a static multicast route on R2 to change the RPF path.


Results:
The ACL restricts the multicast address range.

IPv6 characteristics are as follows:
Address space: An IPv6 address is 128 bits long. A 128-bit address structure allows for 2^128 (about 4.3 billion x 4.3 billion x 4.3 billion x 4.3 billion) possible addresses. The biggest advantage of IPv6 is its almost infinite address space.
Packet format: IPv6 uses a new protocol header format rather than simply extending the address field of an IPv4 packet to 128 bits. An IPv6 packet header consists of a basic header and extension headers. Some optional fields are moved to the extension headers that follow the basic header. This enables intermediate routers on the network to process IPv6 packet headers more efficiently.
Autoconfiguration and readdressing: IPv6 provides address autoconfiguration, which allows hosts to automatically discover networks and obtain IPv6 addresses. This significantly improves network manageability.
Hierarchical network structure: A huge address space allows for hierarchical network design in IPv6. The hierarchical network design facilitates route summarization and improves forwarding efficiency.
End-to-end security support: IPv6 supports IP Security (IPSec) authentication and encryption at the network layer, so it provides end-to-end security.
Quality of Service (QoS) support: IPv6 defines the Flow Label field in the packet header. This field enables network routers to differentiate data flows and provide special processing for the identified data flows. With this field, the routers can identify data flows without checking the inner data packets being transmitted. In this way, QoS can be implemented even if the payloads of data packets are encrypted.
Mobility: With support for the Routing header and the Destination Options header, IPv6 provides built-in mobility.
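
The address space figure can be checked with a few lines of arithmetic. The variable names are illustrative.

```python
ipv4_space = 2 ** 32    # number of possible 32-bit IPv4 addresses
ipv6_space = 2 ** 128   # number of possible 128-bit IPv6 addresses

# 2^128 is exactly (2^32)^4, which is where
# "4.3 billion x 4.3 billion x 4.3 billion x 4.3 billion" comes from.
assert ipv6_space == ipv4_space ** 4

print(f"{ipv4_space:,}")     # 4,294,967,296
print(f"{ipv6_space:.3e}")   # 3.403e+38
```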


Note that an IPv6 address can contain only one double colon (::). Otherwise, a computer cannot determine how many all-zero groups each double colon stands for when restoring the compressed address to the original 128-bit address.
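
The compression rule can be observed with the ipaddress module from the Python standard library. The addresses below are illustrative examples, and is_valid is a hypothetical helper name.

```python
from ipaddress import AddressValueError, IPv6Address

addr = IPv6Address("2001:0db8:0000:0000:0008:0800:200c:417a")
print(addr.compressed)  # 2001:db8::8:800:200c:417a
print(addr.exploded)    # 2001:0db8:0000:0000:0008:0800:200c:417a


def is_valid(text):
    """Return True if the text parses as an IPv6 address."""
    try:
        IPv6Address(text)
        return True
    except AddressValueError:
        return False


print(is_valid("2001:db8::8:800:200c:417a"))  # True: one double colon
print(is_valid("2001::25de::cade"))           # False: two double colons are ambiguous
```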

If the first 3 bits of an IPv6 unicast address are not 000, the interface ID must be 64 bits long. If the first 3 bits are 000, there is no such limitation.

IEEE EUI-64 standards
The length of an interface ID is 64 bits. IEEE EUI-64 defines a method to convert a 48-bit MAC address into a 64-bit IPv6 interface ID. In the MAC address, the c bits indicate the vendor ID, the d bits indicate the vendor-assigned number, the bit shown as 0 is the global/local bit, and the g bit specifies whether the address identifies a single host or a host group. The conversion algorithm is as follows: change the global/local bit from 0 to 1 and insert two bytes (FFFE) between the c bits and the d bits.
The method for converting MAC addresses into IPv6 interface IDs reduces the configuration workload. When stateless address autoconfiguration (explained in the following pages) is used, you only need an IPv6 network prefix to obtain an IPv6 address.
The drawback of this method is that an IPv6 address can be easily calculated based on a MAC address.
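
The conversion can be sketched as follows. This is a minimal illustration: it inverts the global/local bit with an XOR, which matches the 0-to-1 change described above for a globally administered MAC address. The function name and the sample MAC addresses are hypothetical.

```python
def mac_to_interface_id(mac):
    """Convert a 48-bit MAC address into a 64-bit interface ID in EUI-64 format."""
    digits = mac.replace(":", "").replace("-", "").replace(".", "")
    octets = bytes.fromhex(digits)
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    first = octets[0] ^ 0x02  # flip the global/local bit (0 becomes 1)
    eui64 = bytes([first]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    # Render the 8 bytes as four 16-bit hexadecimal groups.
    return ":".join(
        f"{(eui64[i] << 8) | eui64[i + 1]:04x}" for i in range(0, 8, 2)
    )


print(mac_to_interface_id("00:1e:10:2b:3c:4d"))  # 021e:10ff:fe2b:3c4d
print(mac_to_interface_id("00e0-fc12-3456"))     # 02e0:fcff:fe12:3456
```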



IPv4 addresses are classified into unicast, multicast, and broadcast addresses. Compared to IPv4, IPv6 has no broadcast address and introduces a new address type: anycast address. IPv6 addresses are classified into unicast, multicast, and anycast addresses.
An IPv6 unicast address identifies an interface. Packets sent to an IPv6 unicast address are delivered to the interface identified by the unicast address.
An IPv6 multicast address identifies a group of interfaces. Packets sent to an IPv6 multicast address are delivered to all the interfaces identified by the multicast address.
An IPv6 anycast address identifies multiple interfaces. Packets sent to an anycast address are delivered to the nearest interface that is identified by the anycast address, depending on the routing protocols. In fact, anycast addresses and unicast addresses use the same address space. The router determines whether to send a packet in unicast mode or anycast mode.

Global unicast address
An IPv6 global unicast address is an IPv6 address with a global unicast prefix, which is similar to an IPv4 public address. IPv6 global unicast addresses support route prefix summarization, helping limit the number of global routing entries.
A global unicast address consists of a global routing prefix, subnet ID, and interface ID.
    Global routing prefix: assigned by a service provider to an organization. A global routing prefix is at least 48 bits long. Currently, the first 3 bits of all the assigned global routing prefixes are 001.
    Subnet ID: used by organizations to construct a local network (site). It is similar to an IPv4 subnet number. The global routing prefix and subnet ID together occupy a maximum of 64 bits.
    Interface ID: the interface identifier. It can be used to identify a device (host).

Link-local address
Link-local addresses have a limited application scope. An IPv6 link-local address can be used only for communication between nodes on the same link. A link-local address uses the link-local prefix FE80::/10 as the first 10 bits (1111111010 in binary) and an interface ID as the last 64 bits.
When IPv6 runs on a node, each interface of the node is automatically assigned a link-local address that consists of a fixed prefix and an interface ID in EUI-64 format. This mechanism enables two IPv6 nodes on the same link to communicate without any additional configuration. Therefore, link-local addresses are widely used in neighbor discovery and stateless address autoconfiguration.
Routing devices do not forward IPv6 packets with a link-local address as the source or destination address to devices on non-local links.
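
A link-local address can be assembled from the fixed prefix and a 64-bit interface ID, as in the following sketch. The function name and the sample interface ID are hypothetical. The bits between the 10-bit prefix and the interface ID are left as 0.

```python
from ipaddress import IPv6Address, IPv6Network

LINK_LOCAL = IPv6Network("fe80::/10")


def link_local_from_interface_id(interface_id):
    """Combine the FE80::/10 prefix with a 64-bit interface ID."""
    if not 0 <= interface_id < 2 ** 64:
        raise ValueError("the interface ID must fit in 64 bits")
    return IPv6Address(int(LINK_LOCAL.network_address) | interface_id)


addr = link_local_from_interface_id(0x021E10FFFE2B3C4D)
print(addr)                # fe80::21e:10ff:fe2b:3c4d
print(addr in LINK_LOCAL)  # True
print(addr.is_link_local)  # True
```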

Unique local address
Unique local addresses are used only within a site. Site-local addresses are deprecated in RFC 3879 and replaced by unique local addresses in RFC 4193.
Unique local addresses are similar to IPv4 private addresses. Any organization that does not obtain a global unicast address from a service provider can use a unique local address. Unique local addresses are routable only within a local network but not on the Internet.
Fields in a unique local address can be described as follows:
    Prefix: fixed as FC00::/7.
    L: set to 1 if the address is valid within a local network. The value 0 is reserved for future expansion.
    Global ID: indicates a globally unique prefix, which is pseudo-randomly allocated (for details, see RFC 4193).
    Subnet ID: identifies a subnet within the site.
    Interface ID: identifies an interface.
A unique local address has the following characteristics:
    Has a globally unique prefix. The prefix is pseudo-randomly allocated and has a high probability of uniqueness.
    Allows private connections between sites without creating address conflicts.
    Has a well-known prefix (FC00::/7) that allows for easy route filtering by edge routers.
    Does not conflict with any other addresses or cause Internet route conflicts if it is leaked outside of the site through routing.
    Functions as a global unicast address to upper-layer applications.
    Is independent of the Internet Service Provider (ISP).
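
The field layout can be illustrated by building a unique local /48 prefix. This is a simplified sketch: it draws the global ID from a random number generator, whereas RFC 4193 describes its own suggested generation procedure, which is not reproduced here. It also assumes a 40-bit global ID that directly follows the 7-bit prefix and the L bit. The function name is hypothetical.

```python
import secrets
from ipaddress import IPv6Network

ULA_RANGE = IPv6Network("fc00::/7")


def random_ula_prefix():
    """Return a /48 prefix: FC00::/7, L = 1, and a random 40-bit global ID."""
    global_id = secrets.randbits(40)
    # 0xFD is the 7-bit prefix 1111110 followed by L = 1.
    prefix = (0xFD << 120) | (global_id << 80)
    return IPv6Network((prefix, 48))


site_prefix = random_ula_prefix()
print(site_prefix)                               # differs on every run
print(site_prefix.network_address in ULA_RANGE)  # True
```

The remaining 16 bits before the interface ID are then available as the subnet ID within the site.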


Unspecified address
An IPv6 unspecified address is 0:0:0:0:0:0:0:0/128 or ::/128, indicating that an interface or a node does not have an IP address. It can be used as the source IP address of some packets, such as the Neighbor Solicitation (NS) message in duplicate address detection. Devices do not forward packets whose source IP address is an unspecified address.

Loopback address
An IPv6 loopback address is 0:0:0:0:0:0:0:1/128 or ::1/128. Similar to the IPv4 loopback address 127.0.0.1, the IPv6 loopback address is used when a node needs to send IPv6 packets to itself. This IPv6 loopback address is usually used as the IP address of a virtual interface (a loopback interface, for example). The loopback address cannot be used as the source or destination IP address of packets that need to be forwarded.
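
Both special addresses can be recognized with the Python standard library, as this short sketch shows. The variable names are illustrative.

```python
from ipaddress import IPv6Address

unspecified = IPv6Address("0:0:0:0:0:0:0:0")
loopback = IPv6Address("0:0:0:0:0:0:0:1")

print(unspecified, unspecified.is_unspecified)  # :: True
print(loopback, loopback.is_loopback)           # ::1 True
```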

IPv6 multicast address
Like an IPv4 multicast address, an IPv6 multicast address identifies a group of interfaces, which usually belong to different nodes. A node may belong to any number of multicast groups. Packets sent to an IPv6 multicast address are delivered to all the interfaces identified by the multicast address.
An IPv6 multicast address is composed of a prefix, flag, scope, and group ID (global ID):
    Prefix: is fixed as FF00::/8 (1111 1111).
    Flag: is 4 bits long. Currently, only the last bit is used. The high-order 3 bits are reserved and must be set to 0s. The last bit 0 indicates a permanently-assigned multicast address allocated by the Internet Assigned Numbers Authority (IANA). The last bit 1 indicates a non-permanently-assigned (transient) multicast address.
    Scope: is 4 bits long. It limits the scope where multicast data flows are sent on the network.
    Group ID (global ID): is 112 bits long. It identifies a multicast group. RFC 2373 does not define all the 112 bits as a group ID but recommends using the low-order 32 bits as the group ID and setting all the remaining 80 bits to 0s.
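
The field layout can be illustrated by extracting the flag, scope, and group ID from an address. This is a minimal sketch that follows the layout described above, in which only the last flag bit is interpreted. The function name and the dictionary keys are hypothetical.

```python
from ipaddress import IPv6Address


def parse_multicast(text):
    """Split an IPv6 multicast address into its flag, scope, and group ID fields."""
    addr = IPv6Address(text)
    if not addr.is_multicast:
        raise ValueError(f"{text} is not in FF00::/8")
    value = int(addr)
    flags = (value >> 116) & 0xF        # 4 bits after the 8-bit prefix
    scope = (value >> 112) & 0xF        # next 4 bits
    group_id = value & ((1 << 112) - 1)  # low-order 112 bits
    return {
        "transient": bool(flags & 0x1),  # last flag bit: 0 = permanent, 1 = transient
        "scope": scope,
        "group_id": group_id,
    }


print(parse_multicast("ff02::1"))
# {'transient': False, 'scope': 2, 'group_id': 1}
```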

Mo
en
m/
. co
w ei
ua
g .h
in
rn
ea
/l
:/
tp

IPv6 anycast address

Anycast is an address type introduced in IPv6. An anycast address
identifies a group of interfaces, and these interfaces
often belong to different nodes. Packets sent to an anycast
address are delivered to the nearest interface that is identified
by the anycast address, as determined by the routing protocols.

IPv6 anycast addresses can be used in One-to-One-of-Many
communications. The receiver can be any one interface of a
group. For example, a mobile subscriber needs to connect to
the nearest receiving station. Using anycast addresses, the
mobile subscriber is not limited by physical location.

Anycast addresses are allocated from the unicast address
space, using any of the unicast address formats. Thus,
anycast addresses are syntactically indistinguishable from
unicast addresses. The nodes to which an anycast address is
assigned must be explicitly configured to know that it is an
anycast address. Currently, anycast addresses are used only
as destination addresses, and are assigned only to routers.

A subnet-router anycast address is predefined in RFC 3513.
The interface ID of a subnet-router anycast address is all 0s.
Packets addressed to a subnet-router anycast address are
delivered to one router (the nearest router that is
identified by the address) in the subnet specified by the prefix
of the address. The nearest router is the one that is closest
in terms of routing distance.
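
Because the interface ID of a subnet-router anycast address is all 0s, the address can be derived from the subnet prefix alone. The following Python sketch shows this with the standard ipaddress module; the function name and the 2001:db8::/32 documentation prefix are assumptions chosen for the example.

```python
import ipaddress


def subnet_router_anycast(prefix):
    """Return the subnet-router anycast address of a subnet.

    Hypothetical helper: keeps the prefix bits and sets every
    interface ID bit to 0.
    """
    # strict=False lets the caller pass any address inside the subnet.
    network = ipaddress.IPv6Network(prefix, strict=False)
    return network.network_address
```

Calling subnet_router_anycast("2001:db8:1:2::5/64") yields 2001:db8:1:2::, which has the same syntax as a unicast address, as the text notes.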



An IPv6 packet has three parts: an IPv6 basic header, zero or more
IPv6 extension headers, and an upper-layer protocol data unit (PDU).

IPv6 basic header
Each IPv6 packet must have an IPv6 basic header,
which is fixed at 40 bytes long.
The IPv6 basic header provides basic packet
forwarding information and is parsed by all routers
on the forwarding path.

Extension headers
An IPv6 extension header is an optional header that
may follow the IPv6 basic header. An IPv6 packet may
carry zero, one, or more extension headers. The
extension headers may differ in length. The
IPv6 basic header and IPv6 extension headers replace the
IPv4 header and its options. Extension headers
enhance IPv6 functions and offer great
extensibility. Unlike the Options of an IPv4 header, the
maximum length of an IPv6 extension header is not
limited. Therefore, an IPv6 extension header can
contain all the extension data required by IPv6
communications.
The extension information about packet forwarding in
an IPv6 extension header is not parsed by all the
routers on the path, and is generally parsed only by the
destination node.

Upper-layer protocol data unit
An upper-layer PDU is composed of the upper-layer
protocol header and its payload, such as an ICMPv6
packet, a TCP packet, or a UDP packet.

Fields in an IPv6 packet header are described as follows:

Version: is 4 bits long. In IPv6, the Version field value is 6.

Traffic Class: is 8 bits long. It indicates the class or priority of
an IPv6 packet. The Traffic Class field is similar to the TOS
field in an IPv4 packet and is mainly used in QoS control.

Flow Label: is 20 bits long. This field is added in IPv6 to
differentiate traffic. A flow label and source IP address identify
a data flow. Intermediate network devices can effectively
differentiate data flows based on this field.

Payload Length: is 16 bits long. It indicates the length of
the IPv6 payload. The payload is the rest of the IPv6 packet
following the basic header, including the extension headers and
upper-layer PDU. Because the field is 16 bits long, it can express
a payload length of at most 65535 bytes. If the payload length
exceeds 65535 bytes, the field is set to 0 and the actual payload
length is carried in the Jumbo Payload option in the Hop-by-Hop
Options header.

Next Header: is 8 bits long. This field identifies the type of the
first extension header that follows the IPv6 basic header or the
protocol type in the upper-layer PDU.

Hop Limit: is 8 bits long. This field is similar to the Time to Live
field in an IPv4 packet, defining the maximum number of hops
that an IP packet can pass through. The field value is
decremented by 1 by each router that forwards the IP packet.
When the field value becomes 0, the packet is discarded.

Source Address: is 128 bits long. It indicates the address
of the packet originator.

Destination Address: is 128 bits long. It indicates the
address of the packet recipient.
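
The fixed 40-byte layout makes the basic header easy to decode. The following Python sketch unpacks the eight fields listed above from raw bytes. It is a minimal illustration only; the function name and the dictionary keys are assumptions made for this example.

```python
import ipaddress
import struct


def parse_ipv6_basic_header(data):
    """Decode the 40-byte IPv6 basic header (hypothetical helper)."""
    if len(data) < 40:
        raise ValueError("an IPv6 basic header is 40 bytes long")
    # First 32 bits: Version (4) | Traffic Class (8) | Flow Label (20).
    first_word, payload_length, next_header, hop_limit = struct.unpack(
        "!IHBB", data[:8]
    )
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": ipaddress.IPv6Address(data[8:24]),
        "destination": ipaddress.IPv6Address(data[24:40]),
    }
```

A Payload Length of 0 in the result would indicate that the real length has to be read from the Jumbo Payload option, which this sketch does not parse.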



IPv6 extension header

An IPv4 packet header has an optional field (Options), which
includes security, timestamp, and record route options. The
variable length of the Options field makes the IPv4 packet
header length range from 20 bytes to 60 bytes. When routers
forward IPv4 packets with the Options field, many resources
need to be used. Therefore, these IPv4 packets are rarely
used in practice.

IPv6 uses extension headers to replace the Options field in the
IPv4 header. Extension headers are placed between the IPv6
basic header and upper-layer PDU. An IPv6 packet may carry
zero, one, or more extension headers. The sender of a packet
adds one or more extension headers to the packet only when
the sender requests other routers or the destination device to
perform special handling. Unlike IPv4 options, IPv6 extension
headers are of variable length and are not limited to 40 bytes. This
facilitates further extension. To improve extension header
processing efficiency and transport protocol performance, IPv6
requires that the extension header length be an integer
multiple of 8 bytes.

When multiple extension headers are used, the Next Header
field of an extension header indicates the type of the next
header following this extension header.

An IPv6 extension header contains the following fields:

Next Header: is 8 bits long. It is similar to the Next Header field
in the IPv6 basic header, indicating the type of the next
extension header (if any) or the upper-layer protocol type.

Extension Header Len: is 8 bits long. It indicates the
extension header length in 8-byte units, not counting the
first 8 bytes of the extension header.

Extension Header Data: is of variable length. It includes a
series of options and the padding field.
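
The chaining of Next Header fields can be sketched in Python as follows. This is an illustration under stated assumptions, not a complete parser: it handles only the Hop-by-Hop Options (0), Routing (43), and Destination Options (60) headers, whose length byte counts 8-byte units beyond the first 8 bytes. The Fragment and Authentication headers encode their length differently and are deliberately left out. The function and table names are hypothetical.

```python
# Next Header values (IANA protocol numbers) of the headers handled here.
EXTENSION_HEADERS = {
    0: "Hop-by-Hop Options",
    43: "Routing",
    60: "Destination Options",
}


def walk_extension_headers(first_next_header, payload):
    """Follow the Next Header chain through the extension headers.

    Returns (chain, upper_layer_protocol, upper_layer_offset), where
    chain is a list of (header name, length in bytes) tuples.
    """
    chain = []
    next_header = first_next_header
    offset = 0
    while next_header in EXTENSION_HEADERS:
        if offset + 2 > len(payload):
            raise ValueError("truncated extension header")
        following = payload[offset]        # Next Header field
        ext_len = payload[offset + 1]      # Extension Header Len field
        length = (ext_len + 1) * 8         # always a multiple of 8 bytes
        chain.append((EXTENSION_HEADERS[next_header], length))
        offset += length
        next_header = following
    return chain, next_header, offset
```

The loop stops at the first value that is not in the table, which in this sketch is treated as the upper-layer protocol (for example, 58 for ICMPv6).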


Each extension header can occur only once in an IPv6 packet, except
for the Destination Options header. The Destination Options header
may occur at most twice (once before a Routing header and once
before the upper-layer header).

The Internet Control Message Protocol version 6 (ICMPv6) is one of the
basic IPv6 protocols.

In IPv4, ICMP reports IP packet forwarding information and
errors to the source node. ICMP defines messages
such as Destination Unreachable, Time Exceeded, and
Echo Request or Echo Reply to facilitate fault
diagnosis and information management. In addition to the
common functions provided by ICMPv4, ICMPv6 provides
mechanisms such as Neighbor Discovery (ND), stateless
address configuration including duplicate address detection,
and Path Maximum Transmission Unit (PMTU) discovery.

The protocol number of ICMPv6, namely, the value of the Next
Header field in an IPv6 packet, is 58.

Some fields in the packet are described as follows:

Type: specifies the message type. Values 0 to 127
indicate error messages, and values 128 to 255
indicate informational messages.

Code: further subdivides the message type.

Checksum: indicates the checksum of an ICMPv6
packet.
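
The split of the Type field into error and informational ranges can be expressed as a small lookup. The Python sketch below is illustrative only. The type numbers in the table are the ones this document gives for each message; the function name and the "unknown" fallback are assumptions.

```python
# Type values as given in this document for each ICMPv6 message.
ICMPV6_TYPES = {
    1: "Destination Unreachable",
    2: "Packet Too Big",
    3: "Time Exceeded",
    4: "Parameter Problem",
    128: "Echo Request",
    129: "Echo Reply",
    133: "Router Solicitation",
    134: "Router Advertisement",
    135: "Neighbor Solicitation",
    136: "Neighbor Advertisement",
    137: "Redirection",
}


def classify_icmpv6(msg_type):
    """Return (class, name) for an ICMPv6 Type value (hypothetical helper)."""
    if not 0 <= msg_type <= 255:
        raise ValueError("the Type field is 8 bits long")
    # Values 0 to 127 are error messages, 128 to 255 informational.
    kind = "error" if msg_type <= 127 else "informational"
    return kind, ICMPV6_TYPES.get(msg_type, "unknown")
```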

Destination Unreachable message

When a data packet cannot be delivered to the destination node
or the upper-layer protocol, the router or destination node
sends an ICMPv6 Destination Unreachable message to the
source node. In an ICMPv6 Destination Unreachable message,
the value of the Type field is 1. The value of the Code field can
be 0, 1, 2, 3, or 4. Each value has a specific meaning
(defined in RFC 2463):
Code=0: No route to the destination device.
Code=1: Communication with the destination device is
administratively prohibited.
Code=2: Not assigned.
Code=3: Destination IP address is unreachable.
Code=4: Destination port is unreachable.

Packet Too Big message

If a data packet cannot be sent to the destination node
because the size of the packet exceeds the link MTU of the
outbound interface, the router sends an ICMPv6 Packet Too
Big message to the source node. The link MTU of the
outbound interface is carried in the message. PMTU discovery
is implemented based on Packet Too Big messages. In a
Packet Too Big message, the value of the Type field is 2 and
the value of the Code field is 0.
Time Exceeded message

If a router receives a packet whose hop limit is 0, or
decrements the hop limit of a packet to 0, it
discards the data packet and sends an ICMPv6 Time
Exceeded message to the source node. In a Time Exceeded
message, the value of the Type field is 3. The value of the
Code field can be 0 or 1.
Code=0: Hop limit exceeded in packet transmission
Code=1: Fragment reassembly timeout

Parameter Problem message

If an IPv6 node detects an error in the IPv6 packet header or
extension header, the IPv6 node discards the data packet and
sends an ICMPv6 Parameter Problem message to the source
node, specifying the location and type of the error. In a
Parameter Problem message, the value of the Type field is 4.
The value of the Code field can be 0, 1, or 2. The 32-bit Pointer
field indicates the location of the error. The Code field is
defined as follows:
Code=0: A field in the IPv6 basic header or extension
header is incorrect.
Code=1: The Next Header field in the IPv6 basic
header or extension header cannot be identified.
Code=2: Unknown options exist in the extension
header.

Echo Request messages

Echo Request messages are sent to destination nodes. After
receiving an Echo Request message, the destination node
responds with an Echo Reply message. In an Echo Request
message, the value of the Type field is 128 and the value of
the Code field is 0. The Identifier and Sequence Number fields
are set by the source host so that Echo Reply messages can be
matched to Echo Request messages.

Echo Reply messages

After receiving an Echo Request message, the destination
node responds with an Echo Reply message. In an
Echo Reply message, the value of the Type field is 129 and
the value of the Code field is 0. The Identifier and Sequence
Number fields in the Echo Reply message are assigned the
same values as those in the Echo Request message.
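
The following Python sketch builds an Echo Request or Echo Reply message and matches a reply to its request by Identifier and Sequence Number. It is a minimal illustration, not a ping implementation. The function names are assumptions, and the checksum follows the general ICMPv6 rule (a 16-bit one's complement sum over an IPv6 pseudo-header plus the message), which this document does not describe in detail.

```python
import ipaddress
import struct

ICMPV6_NEXT_HEADER = 58
ECHO_REQUEST, ECHO_REPLY = 128, 129


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo(msg_type, src, dst, identifier, sequence, payload=b""):
    """Build an ICMPv6 Echo message with Code 0 and a valid checksum."""
    body = struct.pack("!BBHHH", msg_type, 0, 0, identifier, sequence) + payload
    # Pseudo-header: source, destination, length, 3 zero bytes, Next Header.
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", len(body), ICMPV6_NEXT_HEADER)
    )
    checksum = internet_checksum(pseudo + body)
    return body[:2] + struct.pack("!H", checksum) + body[4:]


def reply_matches(request, reply):
    """True if reply is an Echo Reply with the request's ID and sequence."""
    return reply[0] == ECHO_REPLY and reply[4:8] == request[4:8]
```

A receiver can verify a message by running internet_checksum over the pseudo-header plus the received message; a result of 0 means the checksum is consistent.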



IPv6 address resolution is completed at Layer 3. Layer 3 address
resolution brings the following advantages:

Layer 3 address resolution enables different Layer 2 media to
use the same address resolution protocol.
Layer 3 security mechanisms, for example, IPSec, can be used to
prevent address resolution attacks.
Request packets are sent in multicast mode, reducing
performance requirements on Layer 2 networks.

Neighbor Solicitation (NS) packets and Neighbor Advertisement (NA)
packets are used during address resolution.

In an NS packet, the value of the Type field is 135 and the
value of the Code field is 0. An NS packet is similar to the ARP
Request packet in IPv4.
In an NA packet, the value of the Type field is 136 and the
value of the Code field is 0. An NA packet is similar to the ARP
Reply packet in IPv4.

The address resolution process is as follows:

PC1 needs to resolve the link-layer address of PC2 before
sending packets to PC2. Therefore, PC1 sends an NS
message on the network.
In the NS message, the source IP address is the IPv6 address
of PC1, and the destination IP address is the solicited-node
multicast address of PC2 (this address is composed of the
prefix FF02::1:FF00:0/104 and the last 24 bits of the
corresponding unicast address).
The target address to be resolved is the IPv6 address of
PC2. This indicates that PC1 wants to know the link-layer
address of PC2. The Options field in the NS message carries
the link-layer address of PC1.
After receiving the NS message, PC2 replies with an NA
message. In the NA message, the source address is the
IPv6 address of PC2, and the destination address is the IPv6
address of PC1 (the NA message is sent to PC1 in unicast
mode using the link-layer address of PC1). The Options field
carries the link-layer address of PC2. This completes the
address resolution process.
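
The solicited-node multicast address used as the NS destination can be computed directly from the unicast address. The Python sketch below combines the prefix FF02::1:FF00:0/104 with the last 24 bits of the unicast address; the function name is an assumption made for this example.

```python
import ipaddress

SOLICITED_NODE_PREFIX = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(unicast):
    """Return the solicited-node multicast address of a unicast address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_PREFIX | low24)
```

For example, solicited_node_multicast("2000::1") returns ff02::1:ff00:1. Only nodes whose addresses end in the same 24 bits listen to this group, which is why an NS message disturbs far fewer nodes than a broadcast ARP Request.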


An IPv6 unicast address that is assigned to an interface but has not
been verified by DAD is called a tentative address. An interface cannot
use the tentative address for unicast communication but will join two
multicast groups: the All-nodes multicast group and the Solicited-node
multicast group of the tentative address.

IPv6 DAD is similar to IPv4 gratuitous ARP. A node sends an NS
message whose target address is the tentative address
to the Solicited-node multicast group of that address. If the node
receives an NA message in reply, the tentative address is being used
by another node, and the node will not use this tentative address for
communication.

DAD process

An IPv6 address 2000::1 is assigned to PC1 as a tentative
IPv6 address. To check the validity of 2000::1, PC1 sends an
NS message to the Solicited-node multicast group to which
2000::1 belongs. The NS message contains the requested
address 2000::1. Since 2000::1 has not been verified yet, the source
address of the NS message is the unspecified address (::). After
receiving the NS message, PC2 processes the message in the
following ways:

If 2000::1 is also a tentative address of PC2, PC2 will not
use this address as an interface address and does not send
an NA message.

If 2000::1 is being used on PC2, PC2 sends an NA
message to the All-nodes multicast group (FF02::1).
The NA message carries IP address
2000::1. In this way, PC1 can find that the tentative
address is a duplicate after receiving the message and
will not use the address.


IPv6 supports stateless address autoconfiguration. Hosts obtain IPv6
prefixes and automatically generate interface IDs. Router Discovery is
the basis of IPv6 address autoconfiguration and is implemented
through the following two messages:

Router Advertisement (RA) message: Each router periodically
sends multicast RA messages that carry network prefixes and
flag bits to declare its existence to the hosts and routers
on the link. An RA message has a value of 134 in the
Type field.

Router Solicitation (RS) message: After being connected to the
network, a host immediately sends an RS message to obtain
network prefixes. Routers on the network reply with an RA
message. An RS message has a value of 133 in the Type field.

Address autoconfiguration

The process of IPv6 stateless autoconfiguration is as follows:

A host automatically configures the link-local address
based on the interface ID.

The host sends an NS message for duplicate address
detection.

If an address conflict occurs, the host stops address
autoconfiguration. Then, the host address needs to be
configured manually.

If no address conflict occurs, the link-local address takes
effect. The host is connected to the network and can
communicate with nodes on the local link.

The host sends an RS message or receives the RA
messages that routers periodically send.

The host obtains the IPv6 address based on the prefix
carried in the RA message and the interface ID
generated in EUI-64 format.
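
The last step, combining an advertised prefix with an EUI-64 interface ID, can be sketched in Python. The sketch assumes a 48-bit MAC address written with colons and a 64-bit prefix; the function names and the sample MAC address are made up for illustration. The EUI-64 rule used here inserts FFFE in the middle of the MAC address and inverts the universal/local bit of the first byte.

```python
import ipaddress


def eui64_interface_id(mac):
    """Derive a 64-bit interface ID from a 48-bit MAC address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Invert the universal/local bit, then insert FFFE in the middle.
    eui = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return int.from_bytes(eui, "big")


def slaac_address(prefix, mac):
    """Combine a /64 prefix (for example, from an RA) with the interface ID."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch assumes a 64-bit prefix")
    return ipaddress.IPv6Address(
        int(network.network_address) | eui64_interface_id(mac)
    )
```

With the hypothetical MAC address 00:1e:10:2f:3a:4b, slaac_address("fe80::/64", ...) gives the link-local address fe80::21e:10ff:fe2f:3a4b, and the same call with a prefix learned from an RA message gives the global address.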


When a better first-hop router exists for a destination, the current
gateway router sends a Redirection message to notify the sender that
packets can be sent through another gateway router. A Redirection
message is an ICMPv6 message with a value of 137 in the
Type field. It carries the better next hop address and the destination
address of the packets that need to be redirected.

The process of redirecting packets is as follows:

PC1 needs to communicate with PC2. By default, packets sent
from PC1 to PC2 are sent through R1. After receiving packets
from PC1, R1 finds that R2 is a better next hop for these packets.
R1 sends a Redirection message to PC1 to notify PC1 that R2
is a better next hop. The destination address of PC2
is carried in the ICMPv6 Redirection message. After receiving
the Redirection message, PC1 adds a host route to its
routing table. Packets sent to PC2 will then be sent directly to R2.

A router sends a Redirection message when the following conditions
are met:

The destination address of the packet is not a multicast
address.

Packets are not routed to the router.

After route calculation, the outbound interface of the next hop
is the interface that receives the packets.

The router finds that a better next hop IP address of the packet
is on the same network segment as the source IP address of
the packet.

After checking the source address of the packet, the router
finds a neighboring device in the neighbor entries that uses
this address as the global unicast address or the link-local
address.


In IPv6, packets are fragmented only on the source node to reduce the
pressure on transit devices.

PMTU discovery is implemented through ICMPv6 Packet Too Big
messages. A source node first uses the MTU of its outbound interface
as the PMTU and sends a probe packet. If a smaller MTU exists on
the transmission path, the transit device sends a Packet Too Big
message to the source node. The Packet Too Big message contains
the MTU value of the outbound interface on the transit device. After
receiving the message, the source node changes the PMTU value to
the received MTU value and sends packets based on the new MTU.
This process is repeated until packets are sent to the destination
address. Then, the source node obtains the PMTU of the destination
address.

The process of PMTU discovery is as follows:

Packets are transmitted through four links. The MTU values of
the four links are 1500, 1500, 1400, and 1300 bytes
respectively. Before sending a packet, the source node
fragments the packet based on PMTU 1500. When the packet
is sent to the outbound interface with MTU 1400, the router
returns a Packet Too Big message that carries MTU 1400.
After receiving the message, the source node fragments the
packet based on MTU 1400 and sends the fragmented packet
again.
When the packet is sent to the outbound interface with MTU
1300, the router returns another Packet Too Big message that
carries MTU 1300. The source node receives the message
and fragments the packet based on MTU 1300. In this way, the
source node sends the packet to the destination address and
discovers the PMTU of the transmission path.
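
The iteration in this example can be simulated in a few lines of Python. The sketch below models only the MTU comparison at each hop and the resulting Packet Too Big reports; it sends no real packets, and the function name is an assumption made for illustration.

```python
def discover_pmtu(link_mtus):
    """Simulate PMTU discovery over a path given as a list of link MTUs.

    Returns (pmtu, reports), where reports lists the MTU values carried
    in the Packet Too Big messages the source node would receive.
    """
    if not link_mtus:
        raise ValueError("the path needs at least one link")
    pmtu = link_mtus[0]          # start with the outbound interface MTU
    reports = []
    while True:
        for mtu in link_mtus:
            if pmtu > mtu:
                # This hop cannot forward the packet: Packet Too Big.
                reports.append(mtu)
                pmtu = mtu
                break            # the source resends with the new PMTU
        else:
            return pmtu, reports  # the packet reached the destination
```

For the four links in the example, discover_pmtu([1500, 1500, 1400, 1300]) returns a PMTU of 1300 after two Packet Too Big messages carrying 1400 and 1300.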


RIPng made the following modifications to RIP:

RIPng uses UDP port 521 (RIP uses UDP port 520) to send
and receive routing information.

RIPng uses 128-bit destination addresses with a prefix
length (mask length).

RIPng uses 128-bit IPv6 addresses as next hop addresses.

RIPng uses a link-local address (FE80::/10) as the source
address to send RIPng Update packets.

RIPng periodically sends routing information in multicast mode
and uses FF02::9 as the multicast address.

A RIPng packet consists of a header and multiple route table
entries (RTEs). In a RIPng packet, the maximum number of
RTEs depends on the MTU on the interface.
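
The dependency on the MTU can be made concrete with a short calculation. The sizes used below (8-byte UDP header, 4-byte RIPng header, 20-byte RTE) are taken from the RIPng specification (RFC 2080), not from this document, and the sketch assumes that the packet carries no IPv6 extension headers. The function name is hypothetical.

```python
IPV6_BASIC_HEADER = 40   # bytes, fixed
UDP_HEADER = 8           # bytes
RIPNG_HEADER = 4         # bytes: command, version, reserved (RFC 2080)
RTE_SIZE = 20            # bytes per route table entry (RFC 2080)


def max_rtes(mtu):
    """Maximum number of RTEs that fit in one RIPng packet."""
    return (mtu - IPV6_BASIC_HEADER - UDP_HEADER - RIPNG_HEADER) // RTE_SIZE
```

With an interface MTU of 1500 bytes this gives 72 RTEs per packet, and with the 1280-byte IPv6 minimum MTU it gives 61.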



OSPFv3 is based on links rather than network segments.

OSPFv3 runs on IPv6, which is based on links rather than
network segments.
Therefore, the interfaces on which OSPFv3 is configured do not
need to be in the same network segment. It is only required that
the interfaces enabled with OSPFv3 be on the same link. In
addition, the interfaces can set up OSPFv3 sessions without
IPv6 global addresses.

OSPFv3 does not depend on IP addresses.

This separates topology calculation from IP addresses.
That is, OSPFv3 can calculate the OSPFv3 topology without
knowing IPv6 global addresses, which are needed only on virtual
link interfaces and for packet forwarding.

OSPFv3 packets and LSA format change.

OSPFv3 packets do not contain IP addresses.
OSPFv3 router LSAs and network LSAs do not contain IP
addresses. IP addresses are advertised by Link LSAs and Intra Area
Prefix LSAs.
In OSPFv3, Router IDs, area IDs, and LSA link state IDs no
longer indicate IP addresses, but the IPv4 address format is
still used.
Neighbors are identified by Router IDs instead of IP addresses
in broadcast, NBMA, or P2MP networks.

Information about the flooding scope is added in LSAs of OSPFv3.

Information about the flooding scope is added in the LSA Type
field of LSAs of OSPFv3. Thus, OSPFv3 routers can process
LSAs of unidentified types, which makes the processing more
flexible.
OSPFv3 can store or flood LSAs of unidentified types,
whereas OSPFv2 just discards them.
OSPFv3 floods LSAs in an OSPF area or on a link. The
U flag bit in the LSA Type field determines how an LSA of
an unidentified type is handled: it is either flooded only
on the local link, or stored and flooded as if its type
were identified.
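
The way flooding information is packed into the LSA Type field can be sketched in Python. The bit layout assumed here (one U bit, two flooding scope bits, and a 13-bit function code in a 16-bit field) and the sample values 0x0008 for the Link LSA and 0x2009 for the Intra Area Prefix LSA come from the OSPFv3 specification (RFC 5340), not from this document. The function name is hypothetical.

```python
FLOODING_SCOPES = {0: "link-local", 1: "area", 2: "AS", 3: "reserved"}


def decode_ls_type(ls_type):
    """Split a 16-bit OSPFv3 LSA Type value into its parts."""
    if not 0 <= ls_type <= 0xFFFF:
        raise ValueError("the LSA Type field is 16 bits long")
    return {
        # U bit: how a router handles an LSA whose function code it
        # does not recognize.
        "u_bit": (ls_type >> 15) & 0x1,
        "scope": FLOODING_SCOPES[(ls_type >> 13) & 0x3],
        "function_code": ls_type & 0x1FFF,
    }
```

Because the scope is readable without understanding the function code, a router can still decide how far to flood an LSA whose type it does not recognize.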

OSPFv3 supports multi-process on a link.

Only one OSPFv2 process can be configured on an OSPFv2
physical interface. In OSPFv3, one physical interface can be
configured with multiple processes that are identified by
different instance IDs.

OSPFv3 uses IPv6 link-local addresses.

As a routing protocol running on IPv6, OSPFv3 uses link-
local addresses to maintain neighbor relationships and update
LSDBs. Except for Vlink interfaces, all OSPFv3 interfaces use
link-local addresses as the source address and the next hop
address when transmitting OSPFv3 packets. The advantages are as
follows:
OSPFv3 can calculate the topology without
knowing the global IPv6 addresses so that topology
calculation is independent of IP addresses.
The packets flooded on a link are not transmitted to
other links, which prevents unnecessary flooding and
saves bandwidth.

OSPFv3 packets do not contain authentication fields.

OSPFv3 directly relies on IPv6 authentication and security
measures. Thus, OSPFv3 does not need to perform
authentication itself. It only focuses on the processing of packets.

OSPFv3 supports two new LSAs.

Link LSA: A router floods a link LSA on the link where it
resides to advertise its link-local address and the configured
global IPv6 address.
Intra Area Prefix LSA: A router advertises an intra-area prefix
LSA in the local OSPF area to inform the other routers in the
area or the network, which can be a broadcast network or an
NBMA network, of its IPv6 global address.

OSPFv3 identifies neighbors based on router IDs only.

On broadcast, NBMA, and P2MP networks, OSPFv2 identifies
neighbors based on IPv4 addresses of interfaces.
OSPFv3 identifies neighbors based on router IDs only. Thus,
even if global IPv6 addresses are not configured or they are
configured in different network segments, OSPFv3 can still
establish and maintain neighbor relationships so that topology
calculation is not based on IP addresses.


Extended IS-IS for IPv6 is defined in draft-ietf-isis-ipv6-05 of the
IETF. To process and calculate IPv6 routes, IS-IS uses two new TLVs
and one network layer protocol identifier (NLPID).

The two TLVs are as follows:

TLV 236 (IPv6 Reachability): describes network reachability by
defining the route prefix and metric.

TLV 232 (IPv6 Interface Address): is similar to the IP Interface
Address TLV of IPv4, except that it changes a 32-bit IPv4
address to a 128-bit IPv6 address.

The NLPID is an 8-bit field that identifies the protocol packets of the
network layer. The NLPID of IPv6 is 142 (0x8E). If an IS-IS router
supports IPv6, it advertises this NLPID value to indicate the support.

To support multiple network layer protocols, BGP requires NLRI and
Next_Hop attributes to carry information about network layer protocols.
Therefore, MP-BGP uses the following new optional non-transitive
attributes:

MP_REACH_NLRI: indicates the multiprotocol reachable NLRI.
It is used to advertise reachable routes and next hop
information.

MP_UNREACH_NLRI: indicates the multiprotocol unreachable
NLRI. It is used to withdraw unreachable routes.



Multicast Listener Discovery (MLD) is a protocol that manages IPv6
multicast members. Its principles and functions are similar to those
of IGMP. MLD enables each IPv6 router to discover its directly
connected multicast listeners (nodes that expect to receive multicast
data) and learn the multicast addresses that the neighbor nodes are
interested in. Then, MLD delivers the learned information to the
multicast routing protocols used by the routers to ensure that
multicast data can be sent to all links where the receivers reside.

Querier election mechanism

The working mechanism is similar to that of IGMPv2:

Each MLD router considers itself a querier when it
starts and sends a General Query message with
destination address FF02::1 to all hosts and routers on
the local network segment.

When the routers receive a General Query message,
they compare the source IPv6 address of the message
with their own interface IPv6 address. The router with
the smallest IPv6 address becomes the querier, and
the other routers are considered non-queriers.

All non-queriers start a timer (Other Querier Present
Timer). If non-queriers receive a Query message from
the querier before the timer expires, they reset the
timer. If non-queriers receive no Query message from
the querier when the timer expires, they trigger election
of a new querier.
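
The election rule itself, that the router with the smallest IPv6 address wins, can be shown in a short Python sketch. It compares addresses numerically with the standard ipaddress module and leaves out the timers. The function name and the sample addresses are assumptions made for this example.

```python
import ipaddress


def elect_querier(router_addresses):
    """Return the address of the MLD querier: the smallest IPv6 address."""
    candidates = [ipaddress.IPv6Address(a) for a in router_addresses]
    if not candidates:
        raise ValueError("no MLD routers on the segment")
    return min(candidates)
```

For example, among routers fe80::2, fe80::1, and fe80::a on one segment, fe80::1 becomes the querier and the other two become non-queriers.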

Member join mechanism

PC2 and PC3 need to receive IPv6 multicast data destined for
IPv6 multicast group G1, and PC1 needs to receive IPv6
multicast data destined for IPv6 multicast group G2. The hosts
need to join their respective multicast groups, and then the
MLD querier (R1) needs to maintain IPv6 group memberships.

The query and report process is as follows:

Hosts send Multicast Listener Report messages to the
IPv6 multicast groups that they want to join without
waiting to receive a Query message from the MLD
querier.

The MLD querier (R1) periodically multicasts General
Query messages with destination address FF02::1 to
all hosts and routers on the local network segment.

After PC2 and PC3 receive the Query message, the
host whose delay timer expires first sends a Report
message to G1. If the delay timer of PC2 expires first,
PC2 multicasts a Report message to G1, declaring that
it belongs to G1. All hosts on the local network
segment can receive the Report message sent from
PC2 to G1. When PC3 receives this Report message, it
does not send the same Report message to G1
because MLD routers (R1 and R2) already know that G1
has members on the local network segment. This
mechanism suppresses duplicate Report messages,
reducing traffic on the local network
segment.

PC1 still needs to multicast a Report message to G2,
declaring that it belongs to G2.

After receiving the Report messages, MLD routers
know that multicast groups G1 and G2 have members
on the local network segment. Then the routers use
IPv6 multicast routing protocols (such as IPv6 PIM) to
create (*, G1) and (*, G2) entries for multicast data
forwarding, in which * stands for any multicast source.

When IPv6 multicast data sent from an IPv6 multicast
source reaches the MLD routers through multicast
routes, the MLD routers forward the received multicast
data to the local network segment because they have
(*, G1) and (*, G2) entries. Subsequently, receiver
hosts can receive the IPv6 multicast data.

Member leave mechanism

The host sends a Done message with destination
address FF02::2 to all IPv6 multicast routers on the
local network segment.

When the MLD querier receives the Done message, it
sends a Multicast-Address-Specific Query message to
the IPv6 multicast group that the host wants to leave.
The destination address and group address of the
Query message are the address of this IPv6 multicast
group.

If the IPv6 multicast group has other members on the
network segment, the members send a Report
message within the maximum response time.

If the querier receives Report messages from other
members within the maximum response time, the
querier continues to maintain memberships of the IPv6
multicast group. Otherwise, the querier considers that
the IPv6 multicast group has no member on the local
network segment and stops maintaining memberships
of the IPv6 multicast group.


IPv6 multicast source filtering


MLDv2 supports IPv6 multicast source filtering and defines two filter
modes: INCLUDE and EXCLUDE. When a host joins an IPv6 multicast group G,
it can choose to accept or reject IPv6 multicast data from a specific
source S:

If the host only needs to receive data sent from sources S1, S2, and so
on, it can send a Report message with an INCLUDE Sources (S1, S2, ...)
entry.

If the host wants to reject data sent from sources S1, S2, and so on, it
can send a Report message with an EXCLUDE Sources (S1, S2, ...) entry.
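The effect of the two filter modes can be illustrated with a small Python function. It is a sketch of the acceptance rule only; the function name accepts and the string constants are assumptions chosen for the example.

```python
def accepts(filter_mode, source_list, source):
    """Return True if data from `source` is wanted under the given filter.

    INCLUDE: only the listed sources are accepted.
    EXCLUDE: every source except the listed ones is accepted.
    """
    if filter_mode == "INCLUDE":
        return source in source_list
    if filter_mode == "EXCLUDE":
        return source not in source_list
    raise ValueError("filter_mode must be INCLUDE or EXCLUDE")
```

For example, with the list {S1, S2}, a third source S3 is rejected in INCLUDE mode and accepted in EXCLUDE mode.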

IPv6 Multicast Group Status Tracking


Multicast routers running MLDv2 keep IPv6 multicast group state per
multicast address per attached link. The IPv6 multicast group state
includes:

Filter mode: The MLD querier tracks the INCLUDE or EXCLUDE state.

Source list: The MLD querier tracks the sources that are added or
deleted.

Timers: a filter timer, on whose expiry the MLD querier switches the IPv6
multicast address to the INCLUDE mode, and a source timer for each source
record.
Receiver Host Status Listening

Multicast routers running MLDv2 listen to the receiver host status to
record and maintain information about hosts that join IPv6 multicast
groups on the local network segment.

IPv4/IPv6 dual stack is an efficient technology for IPv4-to-IPv6
transition. In IPv4/IPv6 dual stack, network devices support both the
IPv4 protocol stack and the IPv6 protocol stack. The source device selects
a protocol stack according to the IP address of the destination device.
Network devices between the source and destination devices select a
protocol stack to process and forward packets according to the packet
protocol type. IPv4/IPv6 dual stack can be implemented on a single device
or on a dual-stack backbone network. On a dual-stack backbone network, all
devices must support IPv4/IPv6 dual stack, and interfaces connected to the
dual-stack network must have both IPv4 and IPv6 addresses configured.

The topology is described as follows:

The host sends a DNS request to the DNS server for the IP address of the
domain name www.huawei.com. The DNS server replies with the requested IP
address, which may be 10.1.1.1 or 3ffe:yyyy::1. If the host sends an A
record query, the DNS server replies with the IPv4 address of the domain
name. If the host sends an AAAA record query, the DNS server replies with
the IPv6 address of the domain name.

R1 in the figure supports IPv4/IPv6 dual stack. If the host needs to
access the network server at IPv4 address 10.1.1.1, it can access the
server through the IPv4 protocol stack of R1. If the host needs to access
the network server at IPv6 address 3ffe:yyyy::1, it can access the server
through the IPv6 protocol stack of R1.
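How a dual-stack source picks a protocol stack from the destination address can be sketched with Python's standard ipaddress module. The function name select_stack is an assumption for illustration, and 3ffe::1 stands in for the placeholder address 3ffe:yyyy::1 used in the text.

```python
import ipaddress


def select_stack(destination):
    """Return the protocol stack a dual-stack node would use.

    The choice depends only on the address family of the destination,
    for example the address returned by an A or AAAA DNS query.
    """
    address = ipaddress.ip_address(destination)
    return "IPv4" if address.version == 4 else "IPv6"
```

Calling select_stack("10.1.1.1") selects the IPv4 stack, while select_stack("3ffe::1") selects the IPv6 stack.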

During the early stage of transition, IPv4 networks are widely deployed,
while IPv6 networks are isolated islands. IPv6 over IPv4 tunneling allows
IPv6 packets to be transmitted over an IPv4 network, interconnecting the
IPv6 islands.

The principles are as follows:

IPv4/IPv6 dual stack is enabled and an IPv6 over IPv4 tunnel is deployed
on the edge routing devices.

After an edge routing device receives a packet from the IPv6 network, it
checks the packet. If the packet is not destined for the device itself and
its outbound interface is a tunnel interface, the device prepends an IPv4
header to the IPv6 packet to encapsulate it as an IPv4 packet.

On the IPv4 network, the encapsulated packet is transmitted to the remote
edge routing device.

The remote edge routing device decapsulates the packet by removing the
IPv4 header, and then sends the IPv6 packet to the connected IPv6 network.

The IPv4 address of the source end of an IPv6 over IPv4 tunnel must be
manually configured, but the IPv4 address of the destination end can be
manually configured or automatically obtained. An IPv6 over IPv4 tunnel is
a manual tunnel or an automatic tunnel depending on how the IPv4 address
of the destination end is obtained.

Manual tunnel: The edge routing device cannot automatically obtain the
IPv4 address of the destination end. The address must be manually
configured so that packets can be correctly forwarded to the tunnel end.

Automatic tunnel: The edge routing device can automatically obtain the
IPv4 address of the destination end, so you do not need to configure it
manually. In most cases, the interfaces on both ends of an automatic
tunnel use IPv6 addresses that contain embedded IPv4 addresses, so that
the destination IPv4 address can be extracted from the destination IPv6
address of an IPv6 packet.

If an edge routing device needs to set up manual tunnels with multiple
devices, multiple tunnels must be configured on the edge routing device,
which is complex. For this reason, a manual tunnel is usually set up
between two edge routing devices to connect two IPv6 networks.

The manual tunnel has advantages and disadvantages:

Advantage: applies to any environment in which IPv6 traverses IPv4.

Disadvantage: must be manually configured.

Packets are transmitted in an IPv6 over IPv4 manual tunnel as follows:

When an edge device of the tunnel receives an IPv6 packet from an IPv6
network, the device searches the IPv6 routing table according to the
destination address of the IPv6 packet. If the packet is to be forwarded
from the virtual tunnel interface, the device encapsulates the packet
according to the tunnel source and destination IPv4 addresses configured
for the tunnel interface. The encapsulated packet becomes an IPv4 packet,
which is then processed by the IPv4 protocol stack and forwarded to the
destination end of the tunnel over the IPv4 network. After the destination
end of the tunnel receives the IPv4 packet, it decapsulates the packet and
sends the decapsulated packet to the IPv6 protocol stack.
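The encapsulation and decapsulation steps of a manual tunnel can be sketched in Python as follows. This is a simplified model under stated assumptions: it builds a minimal 20-byte IPv4 header with no options, uses a fixed TTL of 64 and an Identification value of 0, and the function names are hypothetical. IPv4 protocol number 41 identifies an encapsulated IPv6 packet.

```python
import socket
import struct

IPV6_IN_IPV4 = 41  # IPv4 protocol number for an encapsulated IPv6 packet


def checksum(header):
    """Standard 16-bit one's-complement checksum over an IPv4 header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate(ipv6_packet, tunnel_src, tunnel_dst):
    """Prepend an IPv4 header built from the configured tunnel endpoints."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,              # version 4, header length 5 x 32 bits
        0,                         # type of service
        20 + len(ipv6_packet),     # total length
        0, 0,                      # identification, flags/fragment offset
        64,                        # TTL (assumed value)
        IPV6_IN_IPV4,
        0,                         # checksum placeholder
        socket.inet_aton(tunnel_src),
        socket.inet_aton(tunnel_dst),
    )
    header = header[:10] + struct.pack("!H", checksum(header)) + header[12:]
    return header + ipv6_packet


def decapsulate(ipv4_packet):
    """Strip the IPv4 header and return the inner IPv6 packet."""
    if ipv4_packet[9] != IPV6_IN_IPV4:
        raise ValueError("payload is not an IPv6 packet")
    header_length = (ipv4_packet[0] & 0x0F) * 4
    return ipv4_packet[header_length:]
```

The tunnel source and destination arguments correspond to the IPv4 addresses configured on the tunnel interface; decapsulate() hands back exactly the bytes that were passed to encapsulate().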

An IPv6 over IPv4 GRE tunnel uses standard GRE tunneling technology to
provide a point-to-point connection and requires the tunnel endpoint
addresses to be manually configured. GRE tunnels place no limitations on
the encapsulation protocol and transport protocol, which can be any
protocol such as IPv4, IPv6, OSI, or Multiprotocol Label Switching (MPLS).

Packet forwarding on an IPv6 over IPv4 GRE tunnel is similar to that on an
IPv6 over IPv4 manual tunnel.

The destination address of IPv6 packets transmitted over an automatic
IPv4-compatible IPv6 tunnel is an IPv4-compatible IPv6 address (the
special address used by the automatic tunnel). An IPv4-compatible IPv6
address is an IPv6 unicast address that has zeros in the high-order 96
bits and an IPv4 address in the low-order 32 bits.

Disadvantages of an automatic IPv4-compatible IPv6 tunnel:

An automatic IPv4-compatible IPv6 tunnel requires every host on both ends
to have a valid IP address and to support IPv4/IPv6 dual stack and
automatic IPv4-compatible IPv6 tunnels. Therefore, such tunnels cannot be
deployed on a large scale. Automatic IPv4-compatible IPv6 tunnels have now
been replaced by automatic 6to4 tunnels.

The packet forwarding process is as follows:

After R1 receives an IPv6 packet destined for R2, R1 searches for an IPv6
route according to destination address ::2.1.1.1 and finds that the next
hop is a tunnel interface. The tunnel configured on R1 is an automatic
IPv4-compatible IPv6 tunnel, so R1 encapsulates the IPv6 packet into an
IPv4 packet. In the IPv4 packet, the source address is the tunnel source
address 1.1.1.1, and the destination address is the low-order 32 bits of
the IPv4-compatible IPv6 address ::2.1.1.1, namely 2.1.1.1. The tunnel
interface on R1 forwards the IPv4 packet over the IPv4 network to R2 at
2.1.1.1.

After R2 receives the IPv4 packet, it decapsulates the packet to obtain
the IPv6 packet and sends the IPv6 packet to the IPv6 protocol stack for
processing. An IPv6 packet is sent from R2 to R1 following a similar
process.
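The way R1 derives the tunnel destination from an IPv4-compatible IPv6 address can be checked with a short Python sketch based on the standard ipaddress module. The function name tunnel_destination is an assumption for the example.

```python
import ipaddress


def tunnel_destination(ipv6_destination):
    """Extract the IPv4 tunnel endpoint from an IPv4-compatible address.

    The high-order 96 bits must be zero; the low-order 32 bits are the
    IPv4 address of the remote tunnel end.
    """
    value = int(ipaddress.IPv6Address(ipv6_destination))
    if value >> 32:
        raise ValueError("not an IPv4-compatible IPv6 address")
    return ipaddress.IPv4Address(value & 0xFFFFFFFF)
```

For the destination ::2.1.1.1 used in the forwarding example, the function returns the IPv4 address 2.1.1.1.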

An automatic 6to4 tunnel is another kind of automatic tunnel and is set up
using the IPv4 address embedded in an IPv6 address. Unlike an automatic
IPv4-compatible IPv6 tunnel, an automatic 6to4 tunnel can be set up from a
router to a router, from a host to a router, from a router to a host, and
from a host to a host.

The address format is as follows:

FP: the format prefix of aggregatable global unicast addresses, fixed as
001.

TLA: top level aggregator, fixed as 0x0002.

SLA: site level aggregator.

A 6to4 address starts with the prefix 2002::/16 and has the format
2002:IPv4-address::/48. A 6to4 address has a 64-bit network prefix. The
first 48 bits (2002:a.b.c.d) are determined by the IPv4 address assigned
to the router interface and cannot be changed, and the remaining 16 bits
(SLA) can be configured by the user.
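The 6to4 address structure described above can be reproduced with Python's ipaddress module. The sketch below is illustrative only; the function names sixto4_prefix, sixto4_subnet, and tunnel_destination are assumptions.

```python
import ipaddress


def sixto4_prefix(ipv4):
    """Return the 48-bit 6to4 prefix 2002:IPv4-address::/48."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    return ipaddress.IPv6Network(((0x2002 << 112) | (v4 << 80), 48))


def sixto4_subnet(ipv4, sla_id):
    """Return the 64-bit prefix for one site subnet (16-bit SLA ID)."""
    if not 0 <= sla_id <= 0xFFFF:
        raise ValueError("SLA ID must fit in 16 bits")
    base = int(sixto4_prefix(ipv4).network_address)
    return ipaddress.IPv6Network((base | (sla_id << 64), 64))


def tunnel_destination(ipv6_destination):
    """Extract the embedded IPv4 address from a 6to4 destination."""
    address = ipaddress.IPv6Address(ipv6_destination)
    if address not in ipaddress.IPv6Network("2002::/16"):
        raise ValueError("not a 6to4 address")
    return ipaddress.IPv4Address((int(address) >> 80) & 0xFFFFFFFF)
```

With the tunnel source 1.1.1.1, the site prefix is 2002:101:101::/48, and different SLA IDs yield different /64 subnets under that prefix.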

An IPv4 address can be used as the source address of only one 6to4 tunnel.
If one edge router connects to multiple 6to4 networks and uses the same
IPv4 address as the tunnel source address, the SLA IDs in the 6to4
addresses are used to differentiate the 6to4 networks. These 6to4
networks, however, share the same 6to4 tunnel.

Common IPv6 networks need to communicate with 6to4 networks over IPv4
networks. This requirement can be met through 6to4 relays. A 6to4 relay is
a next-hop device that forwards IPv6 packets whose destination address is
not a 6to4 address but whose next-hop address is a 6to4 address. The
tunnel destination IPv4 address is obtained from the next-hop 6to4
address.

If a host on 6to4 network 2 needs to communicate with devices on the IPv6
network, a route must be configured on the edge router, and the next-hop
address of the route to the IPv6 network is specified as the 6to4 address
of the 6to4 relay. The 6to4 address of the relay matches the source
address of the 6to4 tunnel. Packets to be sent from 6to4 network 2 to the
IPv6 network are first sent to the 6to4 relay according to the next hop
specified in the routing table. The 6to4 relay then forwards the packets
to the IPv6 network. When a packet needs to be sent from the IPv6 network
to 6to4 network 2, the 6to4 relay encapsulates the packet as an IPv4
packet according to the destination address (a 6to4 address) of the packet
so that the packet can be sent to 6to4 network 2.

Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) is another
automatic tunneling mechanism. An ISATAP tunnel uses an IPv6 address with
an embedded IPv4 address. An ISATAP address uses the IPv4 address as the
interface identifier, whereas a 6to4 address uses the IPv4 address as the
network prefix.

The address is described as follows:

If the IPv4 address is globally unique, the u bit is 1. Otherwise, the u
bit is 0. The g bit indicates whether the IPv4 address is unicast or
multicast. An ISATAP address can be a global unicast address, link-local
address, unique local address, or multicast address. The first 64 bits of
an ISATAP address are obtained through a request sent to an ISATAP router
and can be automatically configured. The Neighbor Discovery (ND) protocol
can run between the edge devices on both ends of an ISATAP tunnel. An
ISATAP tunnel regards the IPv4 network as a non-broadcast multi-access
(NBMA) network.

The forwarding process is described as follows:

The IPv4 network has two dual-stack hosts, PC2 and PC3, each of which has
a private IPv4 address. To implement the ISATAP function, perform the
following operations:

Configure ISATAP tunnel interfaces. The hosts generate ISATAP interface
IDs according to their IPv4 addresses.

The hosts then generate link-local IPv6 addresses according to the ISATAP
interface IDs. The two hosts can now communicate over IPv6 on the local
link.

The hosts perform address autoconfiguration and obtain IPv6 global unicast
addresses and ULA addresses.

To communicate with another IPv6 host, a host extracts the IPv4 address
embedded in the next-hop IPv6 address, uses it as the tunnel destination
address, and forwards packets through the tunnel interface. If the
destination host is within the local site, the next hop is the destination
host. If the destination host is in a different site, the next hop is the
ISATAP router.
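How a host derives its ISATAP interface ID and link-local address can be sketched in Python. The 32-bit value 0000:5EFE (or 0200:5EFE when the u bit is set) that precedes the IPv4 address is taken from the ISATAP specification (RFC 5214), not from the text above, and the function names are assumptions for illustration.

```python
import ipaddress


def isatap_interface_id(ipv4, globally_unique=False):
    """Build the 64-bit ISATAP interface ID for an IPv4 address.

    The u bit is set (0200:5EFE) only for a globally unique IPv4
    address; for a private address it stays clear (0000:5EFE).
    """
    high = 0x02005EFE if globally_unique else 0x00005EFE
    return (high << 32) | int(ipaddress.IPv4Address(ipv4))


def isatap_link_local(ipv4, globally_unique=False):
    """Combine the fe80::/64 prefix with the ISATAP interface ID."""
    interface_id = isatap_interface_id(ipv4, globally_unique)
    return ipaddress.IPv6Address((0xFE80 << 112) | interface_id)


def embedded_ipv4(isatap_address):
    """Recover the tunnel destination from an ISATAP IPv6 address."""
    value = int(ipaddress.IPv6Address(isatap_address))
    return ipaddress.IPv4Address(value & 0xFFFFFFFF)
```

A host with the private address 10.1.1.2 (an example value) obtains the link-local address fe80::5efe:a01:102, and a peer can recover 10.1.1.2 from that address to use as the tunnel destination.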

During the later stage of IPv4-to-IPv6 transition, IPv6 networks are
widely deployed, while IPv4 networks are isolated islands. You can create
a tunnel on an IPv6 network to connect isolated IPv4 sites so that they
can access other IPv4 networks through the IPv6 public network.

The forwarding process is described as follows:

IPv4/IPv6 dual stack is enabled and an IPv4 over IPv6 tunnel is deployed
on the edge routing devices.

After an edge routing device receives a packet from the connected IPv4
network, it adds an IPv6 header to the IPv4 packet to encapsulate it as an
IPv6 packet if the packet is not destined for the routing device itself.

On the IPv6 network, the encapsulated packet is transmitted to the remote
edge routing device.

The remote edge routing device decapsulates the packet by removing the
IPv6 header, and then sends the IPv4 packet to the connected IPv4 network.

Example description:
The device addresses are determined as follows:
If RTX connects to RTY, the addresses of the two devices are
2001:XY::X/64 and 2001:XY::Y/64 respectively.

The commands and their functions are as follows:

ripng: creates a RIPng process.

ripng enable: enables RIPng on an interface.

ripng metricout: sets the metric that is added to the RIPng routes sent by
an interface.

import-route: configures RIPng to import routes from other routing
protocols. You can use the route-policy parameter to filter the routes to
be imported and to set route attributes.

Precautions:
The policy usage is similar to that in IPv4.

Example description:
The device addresses are determined as follows:
If RTX connects to RTY, the addresses of the two devices are
2001:XY::X/64 and 2001:XY::Y/64 respectively.

The commands and their functions are as follows:

router-id: configures the ID of the router running OSPFv3.

ospfv3 area: enables the OSPFv3 process on an interface and specifies the
area to which the interface belongs.

nssa: configures an OSPFv3 area as an NSSA.

undo ipv6 nd ra halt: enables the system to send RA packets.

ipv6 address auto global: enables a device to automatically generate a
global IPv6 address through stateless autoconfiguration.

Precautions:
OSPFv3 has features similar to those of OSPFv2.

Example description:
The device addresses are determined as follows:
If RTX connects to RTY, the addresses of the two devices are
2001:XY::X/64 and 2001:XY::Y/64 respectively.

The commands and their functions are as follows:

ipv6 enable: enables the IPv6 capability of an IS-IS process.

ipv6 nd ra prefix: configures the prefix carried in RA packets.

isis ipv6 enable: enables the IS-IS IPv6 capability on an interface and
specifies the ID of the IS-IS process to be associated with the interface.

ipv6 import-route isis level-2 into level-1: configures IPv6 route leaking
from Level-2 areas to Level-1 areas.

Precautions:
IS-IS for IPv6 has features similar to those of IS-IS for IPv4.

Example description:
The device addresses are determined as follows:
If RTX connects to RTY, the addresses of the two devices are
2001:XY::X/64 and 2001:XY::Y/64 respectively.

The commands and their functions are as follows:

peer { ipv6-address | group-name } as-number as-number: creates a peer or
configures an AS number for a specified peer group.

ipv6-family: enters the IPv6 address family view of BGP.

peer enable: enables a BGP device to exchange routes with a specified peer
or peer group in the address family view.

peer connect-interface: specifies the source interface from which BGP
packets are sent and the source address used for initiating a connection.

peer password: enables a BGP device to perform MD5 authentication on BGP
messages exchanged when a TCP connection is established with a peer.

Precautions:
BGP4+ has features similar to those of BGP.

Example description:
IPv6 and IPv4 addresses have been specified.

The commands and their functions are as follows:

interface tunnel: creates a tunnel interface and enters the tunnel
interface view.

tunnel-protocol ipv6-ipv4: sets the tunnel mode to IPv6 over IPv4 manual
tunnel.

source { ipv4-address | interface-type interface-number }: specifies the
source address or source interface of the tunnel.

destination { ipv4-address }: specifies the destination address of the
tunnel.

ipv6 address { ipv6-address prefix-length }: configures an IPv6 address
for the tunnel interface.

Example description:
IPv6 and IPv4 addresses have been specified.

The commands and their functions are as follows:

interface tunnel: creates a tunnel interface and enters the tunnel
interface view.

tunnel-protocol gre: sets the tunnel mode to GRE, used here for an IPv6
over IPv4 GRE tunnel.

source { ipv4-address | interface-type interface-number }: specifies the
source address or source interface of the tunnel.

destination { ipv4-address }: specifies the destination address of the
tunnel.

ipv6 address { ipv6-address prefix-length }: configures an IPv6 address
for the tunnel interface.

MPLS VPN overview


A BGP/MPLS IP VPN is a Layer 3 Virtual Private Network (L3VPN). It uses
the Border Gateway Protocol (BGP) to advertise VPN routes and uses
Multiprotocol Label Switching (MPLS) to forward VPN packets on the
backbone network of the Service Provider (SP). This technology is called
IP VPN because IP packets are transmitted on the VPNs.

The BGP/MPLS IP VPN model consists of the following entities:

Customer Edge (CE): a device that is deployed at the edge of a customer
network and has interfaces directly connected to the SP network. A CE
device can be a router, switch, or host. Generally, CE devices are unaware
of VPNs and do not need to support MPLS.

Provider Edge (PE): a device that is deployed at the edge of an SP network
and directly connected to a CE device. On an MPLS network, PE devices
process all VPN services and must have high performance.

Provider (P): a backbone device that is deployed on an SP network and is
not directly connected to CE devices. P devices only need to provide basic
MPLS forwarding capabilities and do not maintain VPN information.

PE and P devices are managed by SPs. CE devices are managed by customers
unless customers authorize SPs to manage their CE devices.

A PE device can connect to multiple CE devices. A CE device can connect to
multiple PE devices of the same SP or of different SPs.

Site
A site is a group of IP systems with IP connectivity among themselves that
does not depend on ISP networks.

Sites are defined by the topology between devices rather than by their
geographic locations, although the devices in a site are geographically
adjacent to each other in most situations.

The devices in a site may belong to multiple VPNs. That is, a site may
belong to more than one VPN.

Different VPN sites can use overlapping address spaces.



A PE device establishes and maintains a VPN instance for each directly
connected site. A VPN instance contains the VPN member interfaces and
routes of the corresponding site. Specifically, the information in a VPN
instance includes the IP routing table, label forwarding table, interfaces
bound to the VPN instance, and VPN instance management information. VPN
instance management information includes the route distinguisher (RD),
route filtering policy, and member interface list of the VPN instance.

A public routing and forwarding table and a VRF differ in the following
aspects:

A public routing table contains the IPv4 routes of all the PE and P
devices. The routes are static routes or dynamic routes generated by
routing protocols on the backbone network.

A VPN routing table contains the routes of all sites that belong to a VPN
instance. The routes are obtained through the exchange of VPN routing
information between PE devices or between CE and PE devices.

Information in a public forwarding table is extracted from the public
routing table according to route management policies, whereas information
in a VPN forwarding table is extracted from the corresponding VPN routing
table.

VPN instances on a PE device are independent of each other, and each
maintains a VRF that is independent of the public routing and forwarding
table. Each VPN instance can be considered a virtual device that maintains
an independent address space and connects to VPNs through its interfaces.

The PE devices use Multiprotocol Extensions for BGP-4 (MP-BGP) to
advertise VPN routes and use the VPN-IPv4 address family to solve the
problem that BGP cannot distinguish VPN routes with the same IP address
prefix.

RDs distinguish IPv4 prefixes that use the same address space. The RD
format enables SPs to allocate RDs independently. When CE devices are
dual-homed to PE devices, the RD must be globally unique to ensure correct
routing.
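The role of the RD can be illustrated by building 12-byte VPN-IPv4 addresses in Python. The sketch assumes a type 0 RD (a 2-byte type field, a 2-byte AS number, and a 4-byte assigned number, as defined in RFC 4364); the function name and the sample values are hypothetical.

```python
import socket
import struct


def vpn_ipv4_address(as_number, assigned_number, ipv4_prefix):
    """Return the 12-byte VPN-IPv4 address: 8-byte RD + 4-byte IPv4 prefix.

    A type 0 RD is assumed here: type (2 bytes) = 0, administrator
    subfield = AS number (2 bytes), assigned number (4 bytes).
    """
    rd = struct.pack("!HHI", 0, as_number, assigned_number)
    return rd + socket.inet_aton(ipv4_prefix)
```

Two VPNs that both use 10.1.1.0 produce different VPN-IPv4 addresses as long as their RDs differ (for example 100:1 and 100:2), which is what allows BGP to carry both routes.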

A VPN target, also called the route target (RT), is a BGP extended
community attribute (each extended community is encoded in 8 octets).
BGP/MPLS IP VPN uses VPN targets to control the advertisement of VPN
routes.

A VPN instance is associated with one or more VPN target attributes. VPN
target attributes are classified into the following types:

Export target: After a PE device learns IPv4 routes from directly
connected sites, it converts the routes to VPN-IPv4 routes and sets the
export target attribute for those routes. The export target attribute is
advertised with the routes as a BGP extended community attribute.

Import target: After a PE device receives VPN-IPv4 routes from other PE
devices, it checks the export target attribute of the routes. If the
export target is the same as the import target of a VPN instance on the
local PE device, the local PE device adds the route to the VPN routing
table of that instance.

A VPN target defines which sites can receive a VPN route and from which
sites a PE device can receive VPN routes.

The reasons for using the VPN target instead of the RD as the extended
community attribute are as follows:

A VPN-IPv4 route has only one RD, but can be associated with multiple VPN
targets. With multiple extended community attributes, BGP can greatly
improve the flexibility and scalability of a network.

VPN targets can be used to control route advertisement between different
VPNs on a PE device. With properly configured VPN targets, different VPN
instances on a PE device can import routes from each other.
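The import check that a PE device applies to received VPN-IPv4 routes can be sketched in Python. The data layout (a list of prefix and export-target pairs) and the function name import_routes are assumptions for illustration; the rule itself is the one described above: a route is imported when at least one of its export targets equals an import target of the VPN instance.

```python
def import_routes(received_routes, import_targets):
    """Select the routes a VPN instance accepts.

    `received_routes` is a list of (prefix, export_targets) pairs, and
    `import_targets` is the set of import targets configured on the
    local VPN instance.
    """
    wanted = set(import_targets)
    accepted = []
    for prefix, export_targets in received_routes:
        if wanted & set(export_targets):   # at least one target matches
            accepted.append(prefix)
    return accepted
```

Because a route can carry several export targets, the same route can be imported into more than one VPN instance, which is the flexibility the RD alone cannot provide.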

Traditional BGP-4, defined in RFC 1771, can manage only IPv4 routes and
cannot process VPN routes that have overlapping address spaces.

To correctly process VPN routes, VPNs use MP-BGP, defined in RFC 2858
(Multiprotocol Extensions for BGP-4). MP-BGP supports multiple network
layer protocols. Network layer protocol information is contained in the
Network Layer Reachability Information (NLRI) field and the Next Hop field
of an MP-BGP Update message.

MP-BGP uses address families to differentiate network layer protocols. An
address family can be the traditional IPv4 address family or any other
address family, such as the VPN-IPv4 address family or the IPv6 address
family. For the values of address families, see RFC 1700 (Assigned
Numbers).

The PE and CE devices exchange routing information through standard BGP,
OSPF, IS-IS, RIP, or static routes. During this process, the PE device
needs to store the routes received from different CE devices in the
corresponding VRFs. Other operations are the same as those for common
route exchange. You can configure the same routing protocol for all the
CE devices. However, you must configure a separate routing protocol
instance for each VRF on a PE device. The instances do not interfere with
each other.

After PE1 receives an IPv4 route from CE1, PE1 adds the manually
configured RD of the VRF to the route to change the IPv4 route into a
VPNv4 route. PE1 then changes the Next_Hop attribute in the route
advertisement message to its own loopback address and adds a VPN label
(allocated by MP-IBGP) to the route. After that, PE1 adds the export route
target attribute to the route and sends the route to all its PE neighbors.
In VRP5.3, after MPLS is enabled on PE1, PE1 uses MP-BGP to allocate VPN
labels to private network routes. PE devices can then correctly exchange
VPN routes.

When multiple CE devices in a VPN site connect to different PE devices,
VPN routes advertised from the CE devices to the PE devices may be sent
back to the VPN site after the routes traverse the backbone network. This
may cause routing loops in the VPN site. The Site of Origin (SoO)
attribute identifies the source site of a route and prevents such routing
loops.

After PE2 receives a VPNv4 route advertised by PE1, PE2 converts the VPNv4
route into an IPv4 route and adds the IPv4 route to the corresponding VRF
based on the import target attribute of the route. The VPN label of the
route is retained for packet forwarding. PE2 advertises the IPv4 route to
the corresponding CE device through the routing protocol running between
the PE and CE devices. The next hop of the route is the IP address of
PE2's interface.

VPN data needs to be forwarded through the MPLS backbone network based on
MPLS labels. The process for allocating public network labels (outer
labels) is as follows:

The PE and P routers learn the BGP next-hop IP addresses using an IGP,
assign outer labels using LDP, and establish LSPs. A label stack is used
for packet forwarding. The outer label directs packets to the BGP next
hop. The inner label indicates the outbound interface of the packet or the
VPN instance to which the packet belongs. MPLS forwarding on the backbone
is based only on the outer label and is independent of the inner label.

CE2 sends an IP packet destined for CE1. After receiving the packet, PE2
pushes inner label 15362 and then outer label 1024 onto the packet and
forwards the packet to the P device. After receiving the packet, the P
device, which is the penultimate hop, looks up the outer label, pops it,
retains the inner label, and forwards the packet to PE1. PE1 determines
the VPN site to which the packet belongs based on the inner label, removes
the inner label, and forwards the packet to CE1.
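The label operations in this forwarding example can be traced with a small Python model. It is a deliberately simplified sketch: packets are plain dictionaries, the label stack is a list with the outer label first, and the function names and the table mapping the inner label to CE1 are assumptions. The label values 15362 and 1024 are those used in the example.

```python
INNER_LABEL = 15362   # VPN label from the example
OUTER_LABEL = 1024    # public network label from the example


def ingress_pe(ip_packet):
    """PE2: push the inner label first, then the outer label on top."""
    return {"labels": [OUTER_LABEL, INNER_LABEL], "payload": ip_packet}


def penultimate_hop(packet):
    """P: pop the outer label and keep the inner label."""
    return {"labels": packet["labels"][1:], "payload": packet["payload"]}


def egress_pe(packet, vpn_label_table):
    """PE1: use the inner label to select the site, then remove it."""
    inner = packet["labels"][0]
    return vpn_label_table[inner], packet["payload"]
```

Chaining the three functions shows that the P device never inspects the inner label and that PE1 receives a packet carrying only the inner label.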

Case description
In this case, the addresses for interconnecting devices are as follows:
If RTX interconnects with RTY, the addresses are XY.1.1.X and XY.1.1.Y,
and the mask length is 24.
Assume that PE1 is RT1, PE2 is RT2, and P is RT3.

Command usage
ip binding vpn-instance: binds the current AC interface to a specified
VPN instance.

ipv4-family: enters the IPv4 address family view of BGP.

Precautions
After a VPN instance is bound to or unbound from an interface, Layer 3
features such as the IP address and routing protocol are deleted from the
interface. If such features are required, you need to reconfigure them.

Case description
In this case, the addresses for interconnecting devices are as follows:
If RTX interconnects with RTY, the addresses are XY.1.1.X and XY.1.1.Y,
and the mask length is 24.
Assume that PE1 is RT1, PE2 is RT2, and P is RT3.

Command usage
ip binding vpn-instance: binds the current AC interface to a
ht

specified VPN instance.


ipv4-family: enters the IPv4 address family view of BGP.
s:

Precautions
Specify a VPN instance for each RIP process on the PE
ce

device.

Case description
In this case, the addresses for interconnecting devices are as follows:
If RTX interconnects with RTY, the addresses are XY.1.1.X and XY.1.1.Y, and the mask length is 24 bits.
Assume that PE1 is RT1, PE2 is RT2, and P is RT3.



Command usage
ip binding vpn-instance: binds the current AC interface to a specified VPN instance.
ipv4-family: enters the IPv4 address family view of BGP.
Precautions
Specify a VPN instance for each IS-IS process on the PE device.
Deleting a VPN instance or disabling the IPv4 address family of a VPN instance deletes all the IS-IS processes bound to that VPN instance or to its IPv4 address family on the PE.

Case description
In this case, the addresses for interconnecting devices are as follows:
If RTX interconnects with RTY, the addresses are XY.1.1.X and XY.1.1.Y, and the mask length is 24 bits.
Assume that PE1 is RT1, PE2 is RT2, and P is RT3.



Command usage
ip binding vpn-instance: binds the current AC interface to a specified VPN instance.
ipv4-family: enters the IPv4 address family view of BGP.
Precautions
Specify a VPN instance for each OSPF process on the PE device.
Deleting a VPN instance or disabling the IPv4 address family of a VPN instance deletes all the OSPF processes bound to that VPN instance or to its IPv4 address family on the PE.

Case description
In this case, the addresses for interconnecting devices are as follows:
If RTX interconnects with RTY, the addresses are XY.1.1.X and XY.1.1.Y, and the mask length is 24 bits.
Assume that PE1 is RT1, PE2 is RT2, and P is RT3.



Command usage
ip binding vpn-instance: binds the current AC interface to a specified VPN instance.
peer substitute-as: replaces the AS number of the specified peer in the AS_Path attribute with the local AS number.
Precautions
VPN sites in the same AS or with different private AS numbers can communicate over the BGP/MPLS IP VPN backbone network. When sites in the same VPN have the same AS number and a local CE device establishes an EBGP peer relationship with a PE device, you need to run the peer substitute-as command to enable AS number substitution on the PE device. If AS number substitution is not enabled, the local CE device discards VPN routes whose AS_Path contains the local AS number. As a result, VPN users cannot communicate with each other.
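The effect of AS number substitution can be illustrated with a short model of the AS_Path handling. This is a sketch only: the function names are hypothetical, and the AS numbers 65001 (both CE sites) and 100 (backbone) are assumed values, not taken from the case.

```python
def substitute_as(as_path, peer_as, local_as):
    """Replace every occurrence of the peer's AS number with the local AS number."""
    return [local_as if asn == peer_as else asn for asn in as_path]


def advertise_to_ce(as_path, local_as, ce_as, substitute):
    """Model a PE advertising a VPN route to a CE over EBGP.

    With substitution enabled, the CE's AS number is replaced in the AS_Path
    before the PE prepends its own AS number.
    """
    if substitute:
        as_path = substitute_as(as_path, peer_as=ce_as, local_as=local_as)
    return [local_as] + as_path


def ce_accepts(as_path, ce_as):
    """EBGP loop prevention: a CE rejects routes whose AS_Path contains its own AS."""
    return ce_as not in as_path


# A route originated by the remote site, which uses the same AS number as the local CE.
route_from_remote_site = [65001]
without_sub = advertise_to_ce(route_from_remote_site, 100, 65001, substitute=False)
with_sub = advertise_to_ce(route_from_remote_site, 100, 65001, substitute=True)
```

Without substitution the local CE sees its own AS number in the path and drops the route; with substitution the path contains only the backbone AS number and the route is accepted.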

To improve the high availability (HA) of a device, increase the mean time between failures (MTBF) and reduce the mean time to repair (MTTR).
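The relationship can be made concrete with the commonly used steady-state availability formula, availability = MTBF / (MTBF + MTTR). The formula is an assumption added here for illustration (the course text does not state it), and the hour values below are made up.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the device is operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


baseline = availability(mtbf_hours=10_000, mttr_hours=1.0)
longer_mtbf = availability(mtbf_hours=20_000, mttr_hours=1.0)   # fails less often
shorter_mttr = availability(mtbf_hours=10_000, mttr_hours=0.5)  # repaired faster
```

Both changes raise the result, which matches the statement above: availability improves when failures become rarer or when repairs become faster.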



Concepts
Two network devices establish a BFD session to detect the bidirectional forwarding paths between them and to serve upper-layer applications. BFD does not provide a neighbor discovery mechanism. Instead, BFD obtains neighbor information from the upper-layer applications it serves. After the BFD session is established, the local device periodically sends BFD packets. If the local device does not receive a response from the peer device within the detection time, it considers the forwarding path faulty. BFD then notifies the upper-layer application for processing.
BFD control messages are encapsulated in UDP packets. For single-hop sessions, the destination port number is 3784 and the source port number is a random value from 49152 to 65535.
BFD session establishment process
OSPF discovers neighbors using the hello mechanism and sets up connections to neighbors.
After setting up a neighbor relationship, OSPF passes the neighbor information (including destination and source addresses) to BFD.
BFD sets up a session by using the received neighbor information.
After the BFD session is set up, BFD starts to detect link faults and rapidly responds to them.
BFD fault detection process
A fault occurs on the detected link.
BFD detects the link fault and changes the BFD session status to Down.
BFD notifies the local OSPF process that the BFD peer is unreachable.
The local OSPF process tears down the connection with the OSPF neighbor.


A BFD session has four states: Down, Init, Up, and AdminDown.
Down: indicates that a BFD session is in the Down state or has just been set up.
Init: indicates that the local system can communicate with the peer system and expects the session to go Up.
Up: indicates that a session is established successfully.
AdminDown: indicates that a session has been administratively shut down.
BFD session status transition:
R1 and R2 each start a BFD state machine. The initial state of the BFD state machine is Down. R1 and R2 send BFD control messages with the State field set to Down.
After receiving a BFD message with the State field set to Down from R1, R2 switches the session status to Init and sends BFD messages with the State field set to Init.
After the local BFD session status of R2 changes to Init, R2 no longer processes received BFD messages with the State field set to Down.
The BFD session status change on R1 is the same as that on R2.
After receiving a BFD message with the State field set to Init, R2 changes the local BFD session status to Up.
The BFD session status change on R1 is the same as that on R2.
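The status transition described above can be sketched as a small state function. This is a simplified model in the spirit of the RFC 5880 state machine: it ignores timers, the detection time, and discriminators, and the function name next_state is hypothetical.

```python
def next_state(local, received):
    """Return the new local session state after a control message is received.

    local:    current local state ("Down", "Init", "Up", or "AdminDown")
    received: State field carried in the received BFD control message
    """
    if local == "AdminDown":
        return "AdminDown"          # stays down until the administrator re-enables it
    if received == "AdminDown":
        return "Down"               # the peer shut the session down
    if local == "Down":
        return {"Down": "Init", "Init": "Up"}.get(received, "Down")
    if local == "Init":
        # Messages with State = Down are ignored while in Init.
        return "Up" if received in ("Init", "Up") else "Init"
    # local == "Up"
    return "Down" if received == "Down" else "Up"


# R1 and R2 both start in Down and exchange control messages.
r1 = r2 = "Down"
history = []
for _ in range(2):
    r1, r2 = next_state(r1, r2), next_state(r2, r1)
    history.append((r1, r2))
```

The first exchange moves both routers from Down to Init, and the second moves both to Up, which is the three-way handshake the text describes.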



Common Commands
Single-hop or multi-hop detection:
  The bfd command enables BFD globally and enters the BFD view.
  The bfd bind peer-ip command creates a BFD binding and establishes a BFD session.
  The discriminator command sets the local and remote discriminators for the current BFD session.
  The commit command commits the configuration of a BFD session.
Association between BFD and interface status:
  The bfd command enables BFD globally and enters the BFD view.
  The bfd bind peer-ip default-ip command creates a BFD session that monitors the physical status of a link.
  The discriminator command sets the local and remote discriminators for the current BFD session.
  The process-interface-status command associates the status of the current BFD session with the status of the interface to which the session is bound.
The configuration is similar to the configuration of BFD and route association, and is omitted here.

When a router restarts after a failure, its neighbors detect at the routing protocol layer that the neighbor relationship goes Down and then comes Up again after a period of time. This is neighbor relationship flapping. Neighbor relationship flapping causes route flapping, which leads to black hole routes on the restarted router or causes data services from the neighbors to bypass the restarted router. This reduces network reliability.
Non-Stop Forwarding (NSF) is introduced to address the route flapping issue. The following requirements must be met:
Hardware: Dual main control boards that provide redundant route processors (RPs) must be installed. One is the active board and the other is the standby board. If the active board restarts, the standby board becomes the active one. A distributed structure is used. That is, data forwarding and control are separated, and line processing units (LPUs) are responsible for data forwarding.
System software: While the active control board is running, it synchronizes configuration and interface state information to the standby control board. When an active/standby switchover occurs, LPUs do not reset or withdraw forwarding entries, and the interfaces remain Up.
Protocols: Graceful restart (GR) must be supported by the related network protocols, such as the routing protocols OSPF, IS-IS, and BGP, and other protocols such as Label Distribution Protocol (LDP) and Resource Reservation Protocol (RSVP).

Graceful Restart (GR) is a mechanism that ensures nonstop forwarding of service data during an active/standby switchover or a protocol restart. When a device performs a protocol restart, it notifies neighboring devices of the restart so that the neighbor relationships and routes are kept stable for a certain period. After the protocol restart is complete, the neighboring devices synchronize information (including the topologies, routes, and sessions maintained by the GR-related protocols) to the GR Restarter, so that the information on the GR Restarter is quickly restored. During the protocol restart, route flapping does not occur and the packet forwarding path does not change. The entire system keeps working.

OSPF GR terms:
GR Restarter: indicates the GR-capable device on which the protocol restart occurs.
GR Helper: indicates a neighbor of the GR Restarter that helps complete the GR process.
GR Session: indicates the GR capability negotiation performed during OSPF neighbor relationship establishment. The negotiated content includes whether the two parties have the GR capability. If the GR capability negotiation is successful, the GR process starts when a protocol restart occurs.
Assume that R1 and R2 have a stable OSPF neighbor relationship and the GR capability is enabled on R1 and R2. When R1 restarts, the GR process is as follows:
After R1 restarts, it sends a Grace LSA to R2.
When R2 receives the Grace LSA sent by R1, it maintains the neighbor relationship with R1.
R1 and R2 exchange hello and DD packets and synchronize their LSDBs. R1 does not generate LSAs during GR; therefore, if R1 receives its own LSAs from R2 during LSDB synchronization, it stores them and adds the Stale tag.
After LSDB synchronization is complete, R1 sends a Grace LSA to notify R2 that GR is finished. R1 resumes the normal OSPF process and regenerates LSAs, and then deletes the LSAs that are tagged Stale and have not been regenerated.
After restoring all routing entries, R1 recalculates routes and updates the FIB table.
OSPF GR commands:
The opaque-capability enable command enables the Opaque-LSA capability. After the Opaque-LSA capability is enabled, an OSPF process can generate Opaque-LSAs and receive Opaque-LSAs from neighboring devices. Grace LSAs are Opaque-LSAs, so this capability is required for GR.
The graceful-restart command enables OSPF GR.

IS-IS GR also uses the concepts of GR Restarter, GR Helper, and GR Session, which are the same as those used in OSPF GR.
To support the GR feature, IS-IS adds the Restart TLV to hello packets and defines three timers.
T1 timer is similar to the IIH timer used in the IS-IS protocol. When a device restarts, it creates a T1 timer on each interface and periodically sends hello packets. The T1 timer on an interface is deleted only when the interface has received the hello acknowledgement and all CSNP packets.
T2 defines the timeout period of LSDB synchronization after a device restarts. The T2 timer of a Level is deleted only when the LSDB of this Level completes synchronization. If LSDB synchronization is not complete when the T2 timer expires, the T2 timer is deleted and GR fails.
T3 defines the maximum time during which the GR Restarter performs GR. If LSDB synchronization is not complete when the T3 timer expires, the T3 timer is deleted and GR fails.

Assume that R1 and R2 have a stable IS-IS neighbor relationship and the GR capability is enabled on R1 and R2. When R1 restarts, the GR process is as follows:
The T2 and T3 timers start when the IS-IS protocol on R1 is globally enabled again. When an interface of R1 goes Up again and IS-IS is enabled on it, the T1 timer starts on the interface and the interface sends a hello packet.
When R2 receives the hello packet from R1, it maintains the neighbor relationship with R1 and sends a hello packet in reply. Then R2 sends CSNP and LSP packets to R1 to help with LSDB synchronization.
When the interface of R1 has received the hello packet and all CSNP packets, R1 deletes the T1 timer; otherwise, R1 periodically sends hello packets until it has received them. If the number of times the T1 timer expires reaches the maximum value, the T1 timer is also deleted.
When the LSDB synchronization is complete, R1 deletes the T2 timer.
After all T2 timers are deleted, R1 deletes the T3 timer. When GR is complete, R1 resumes the normal IS-IS process. The IIH timer is started on all interfaces, and R1 then sends hello packets periodically.
After restoring all routing entries, R1 recalculates routes and updates the FIB table.
IS-IS GR command:
The graceful-restart command enables IS-IS GR.


LAND attack
A LAND attack exploits a vulnerability in the TCP three-way handshake. The attacker sends SYN packets whose source address and port are the same as the destination address and port, all set to those of the target host. After receiving such a SYN packet, the target host creates an empty TCP connection whose source and destination addresses are both its own address. The connection is kept until it times out. The target host creates many such empty TCP connections, which wastes resources or causes the device to break down.
After defense against malformed packet attacks is enabled, the device checks the source and destination addresses in TCP SYN packets to prevent LAND attacks. The device considers TCP SYN packets with the same source and destination addresses as malformed packets and discards them.
Commands for configuring defense against malformed packet attacks
The anti-attack abnormal enable command enables defense against malformed packets. After the command is executed, the device discards malformed packets.
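The LAND check described above amounts to one comparison per TCP SYN packet. The following is a minimal sketch of that check; the TcpSegment type and the function name are hypothetical and do not reflect how any device implements the feature.

```python
from dataclasses import dataclass


@dataclass
class TcpSegment:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    syn: bool


def is_land_packet(segment):
    """Treat a TCP SYN packet as malformed when its source and destination addresses match."""
    return segment.syn and segment.src_ip == segment.dst_ip


normal_syn = TcpSegment("10.1.1.2", "10.1.1.1", 40000, 80, syn=True)
land_syn = TcpSegment("10.1.1.1", "10.1.1.1", 80, 80, syn=True)
verdicts = [is_land_packet(normal_syn), is_land_packet(land_syn)]
```

Only the second packet is flagged, because it claims to come from the same address it is sent to.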



TCP SYN attack
The TCP SYN attack takes advantage of a vulnerability in the TCP three-way handshake. During the handshake, when the server receives the initial SYN packet from the client, it sends back a SYN+ACK packet. While the server waits for the final ACK packet from the client, the connection stays in the half-open state. If the server does not receive the ACK packet, it resends the SYN+ACK packet to the client. If the server still does not receive an ACK packet, it closes the connection and updates the session status in memory. The interval from the sending of the initial SYN+ACK packet to the closing of the connection is about 30 seconds.
During this interval, the attacker may send more than 100,000 SYN packets to the open ports of the server without responding to the SYN+ACK packets from the server. The memory of the server is then overloaded and the server cannot accept new connection requests. As a result, the server closes all active connections.
After defense against TCP SYN flood attacks is enabled, the device limits the rate of TCP SYN packets so that system resources are not exhausted by attacks.
Commands for configuring defense against TCP SYN flood attacks
The anti-attack tcp-syn enable command enables defense against TCP SYN flood attacks.
The anti-attack tcp-syn car command configures the rate limit for TCP SYN packets. If the rate of received TCP SYN packets exceeds the limit, the device discards the excess packets to ensure that the CPU works properly.

Two modes of URPF:
Strict mode
  In this mode, a packet passes the check only when the forwarding table contains an entry for the source address of the packet and the outbound interface of that entry matches the inbound interface of the packet.
  If route symmetry is ensured, you are advised to use the URPF strict check. For example, if there is only one path between two network edge devices, the URPF strict check can be used to ensure network security.
Loose mode
  In this mode, a packet passes the check as long as the source IP address of the packet matches an entry in the routing table.
  If route symmetry is not ensured, you are advised to use the URPF loose check. For example, if there are multiple paths between two network edge devices, the URPF loose check can be used to ensure network security.
Topology description
The attacker sends a bogus packet with source IP address 2.1.1.1 to S1. After receiving the bogus packet, S1 sends a response packet to the destination device at 2.1.1.1. In this situation, both S1 and PC1 are attacked by the bogus packets. If URPF is enabled on S1, when S1 receives the bogus packet with source IP address 2.1.1.1, URPF discards the packet because the interface corresponding to the source address of the packet does not match the interface that received the packet.
URPF command
The urpf command enables URPF on an interface and sets the URPF mode.
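The difference between the strict and loose checks can be sketched with a toy forwarding table. Everything below is illustrative: the function names, the prefixes, and the interface names GE0/0/1 and GE0/0/2 are assumptions, and only the source address 2.1.1.1 comes from the topology description.

```python
import ipaddress


def lookup(fib, ip):
    """Longest-prefix match: return the outbound interface for ip, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [(net, ifname) for net, ifname in fib if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


def urpf_pass(fib, src_ip, in_interface, mode):
    """Return True if a packet with this source address passes the URPF check."""
    if mode not in ("strict", "loose"):
        raise ValueError("mode must be 'strict' or 'loose'")
    out_interface = lookup(fib, src_ip)
    if out_interface is None:
        return False                      # no route back to the source
    if mode == "loose":
        return True                       # any matching route is enough
    return out_interface == in_interface  # strict: interfaces must match


fib = [
    (ipaddress.ip_network("2.1.1.0/24"), "GE0/0/1"),  # toward PC1
    (ipaddress.ip_network("3.1.1.0/24"), "GE0/0/2"),  # toward the attacker
]

# A bogus packet claiming source 2.1.1.1 arrives on the attacker-facing interface.
bogus_strict = urpf_pass(fib, "2.1.1.1", "GE0/0/2", "strict")
bogus_loose = urpf_pass(fib, "2.1.1.1", "GE0/0/2", "loose")
genuine_strict = urpf_pass(fib, "2.1.1.1", "GE0/0/1", "strict")
```

The strict check rejects the spoofed packet because the route back to 2.1.1.1 points to a different interface, while the loose check would let it through, which is why strict mode is preferred when routing is symmetric.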


IPSG principles
IP Source Guard (IPSG) matches IP packets against a static binding table or a dynamic DHCP snooping binding table. Before a network device forwards an IP packet, it compares the source IP address, source MAC address, interface, and VLAN information in the IP packet with entries in the binding table. If a matching entry is found, the device considers the IP packet valid and forwards it. Otherwise, the device considers the IP packet an attack packet and discards it.
Working process
After IPSG is configured on S1, S1 checks incoming IP packets against the binding table. Packets whose information matches the binding table are forwarded; all other packets are discarded.
IPSG commands
The binding table can be generated dynamically through DHCP or configured manually for users with static IP addresses (the user-bind static command configures static binding entries).
The ip source check user-bind enable command enables the IPSG function on an interface to check received IP packets.
The ip source check user-bind check-item command configures VLAN- or interface-based IP packet check items. This command takes effect only for dynamic binding entries.
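The binding-table check can be sketched as a set lookup on the four fields named above. The Binding type, the function name, and all addresses and interface names are hypothetical values chosen for illustration.

```python
from collections import namedtuple

Binding = namedtuple("Binding", ["ip", "mac", "interface", "vlan"])


def ipsg_permits(binding_table, src_ip, src_mac, interface, vlan):
    """Forward the packet only if all four fields match one binding entry."""
    return Binding(src_ip, src_mac, interface, vlan) in binding_table


binding_table = {
    Binding("10.1.1.10", "00e0-fc12-3456", "GE0/0/1", 10),
}

valid = ipsg_permits(binding_table, "10.1.1.10", "00e0-fc12-3456", "GE0/0/1", 10)
spoofed_ip = ipsg_permits(binding_table, "10.1.1.99", "00e0-fc12-3456", "GE0/0/1", 10)
wrong_port = ipsg_permits(binding_table, "10.1.1.10", "00e0-fc12-3456", "GE0/0/2", 10)
```

A host that changes its IP address, or a packet that arrives on a different interface than the one recorded in the table, no longer matches any entry and is dropped.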



Topology description
The figure shows a man-in-the-middle (MITM) attack scenario. The attacker sends PC1 a bogus ARP packet that uses PC3's address as the source address. PC1 records an incorrect address mapping for PC3 in its ARP table. The attacker thus obtains the data sent by PC1 to PC3 and by PC3 to PC1, so the information exchanged between PC1 and PC3 is leaked.
To prevent MITM attacks, configure Dynamic ARP Inspection (DAI) on S1.
When an attacker connects to S1 and attempts to send a bogus ARP packet to S1, S1 detects the attack according to the DHCP snooping binding table and discards the ARP packet. If the ARP discarding alarm is enabled on S1, when the number of discarded ARP packets exceeds the alarm threshold, S1 sends an alarm to notify the administrator.
DAI uses the DHCP snooping binding table to defend against MITM attacks. Before a device forwards an ARP packet, it compares the source IP address, source MAC address, interface, and VLAN information in the ARP packet with entries in the binding table. If an entry is matched, the device considers the packet valid and forwards it; otherwise, the device considers the packet an attack packet and discards it.
DAI command
The arp anti-attack check user-bind enable command enables DAI on an interface or in a VLAN. That is, the device checks ARP packets against the binding table.

QoS provides differentiated service quality for different applications, for example, dedicated bandwidth, a lower packet loss ratio, shorter packet transmission delay, and lower jitter.
Best-effort service model
Routers and switches are packet switching devices. They select a transmission path for each packet based on TCP/IP and use statistical multiplexing rather than dedicated connections such as those of TDM. Traditionally, IP provides only one service model (Best-Effort). In this model, all packets transmitted on a network have the same priority. Best-Effort means that the IP network tries its best to deliver all packets completely to the correct destination addresses, without the packets being discarded, damaged, duplicated, or reordered during transmission. However, the Best-Effort model does not guarantee any transmission indicators, such as delay and jitter.
Strictly speaking, Best-Effort is not a QoS technology, but it is the main service model used on today's Internet, so you need to understand it.
The Internet has achieved a great deal with the Best-Effort model. However, as the Internet develops, the Best-Effort model cannot meet the increasing requirements of emerging applications. Therefore, SPs have to provide more types of service on top of the Best-Effort model to meet the requirements of each application.
IntServ model
The IntServ model, developed by the IETF in 1993, supports various types of service on IP networks. It provides both real-time service and best-effort service on IP networks. The IntServ model reserves resources for each information flow. The source and destination hosts exchange RSVP messages to establish packet classification and forwarding state on each node along the transmission path. Because the model maintains a forwarding state for each flow, it scales poorly. There are millions of flows on the Internet, and maintaining them consumes a large amount of device resources. Therefore, this model is not widely used. In recent years, the IETF has modified the RSVP protocol and defined how RSVP can be used together with the DiffServ model, especially in the MPLS VPN field, which has given RSVP new applications. However, the IntServ model itself still has not been widely used. The DiffServ model addresses the problems of the IntServ model and is therefore a widely used QoS technology.

DiffServ model
The IntServ model scales poorly. After 1995, SPs and research organizations developed a new, highly scalable mechanism that supports various services. In 1997, the IETF recognized that the service model in use was not suitable for network operation and that there should be a way to classify information flows and provide differentiated service for users and applications. Therefore, the IETF developed the DiffServ model, which classifies flows on the Internet and provides differentiated service for them. The DiffServ model supports various applications and is applicable to many business models.



Precedence field
The 8-bit Type of Service (ToS) field in an IP packet header contains a 3-bit IP precedence field.
Bits 0 to 2 constitute the Precedence field, which represents precedence values 7, 6, 5, 4, 3, 2, 1, and 0 in descending order of priority. The highest priorities (values 7 and 6) are reserved for routing and network control traffic. User-level applications can use only priority values 0 to 5. Bits 6 and 7 are reserved.
Apart from the Precedence field, the ToS field also contains the D, T, and R sub-fields (bits 3 to 5):
  Bit D indicates the delay. The value 0 represents a normal delay and the value 1 represents a short delay.
  Bit T indicates the throughput. The value 0 represents normal throughput and the value 1 represents high throughput.
  Bit R indicates the reliability. The value 0 represents normal reliability and the value 1 represents high reliability.

DSCP field
RFC 2474 redefines the ToS field. The left-most 6 bits (bits 0 to 5) form the DSCP, which identifies the service type, and the right-most 2 bits are reserved. DSCP can classify traffic into 64 categories.
Each DSCP value matches a Behavior Aggregate (BA) and each BA matches a PHB (such as forward and discard). The PHB is then implemented using QoS mechanisms such as traffic policing and queuing technologies.
A DiffServ network defines four types of PHB: Expedited Forwarding (EF), Assured Forwarding (AF), Class Selector (CS), and Default PHB (BE PHB). EF PHB is applicable to services that have high requirements on delay, packet loss, jitter, and bandwidth. AF PHBs are classified into four categories, and each AF PHB category has three discard priorities to classify services more finely. The performance of AF PHB is lower than that of EF PHB. CS PHBs originate from the IP ToS field and are classified into 8 categories. BE PHB is a special type of CS PHB and does not provide any guarantee. Traffic on IP networks belongs to this category by default.
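The field layouts described above can be checked with a few bit operations on the 8-bit ToS byte. This is a small sketch with hypothetical function names; the AF code point formula (8 x class + 2 x drop precedence) and the EF value 46 follow the standard DSCP assignments rather than anything stated in the course text.

```python
def ip_precedence(tos):
    """Precedence: the three left-most bits of the ToS byte."""
    return (tos >> 5) & 0x7


def dtr_bits(tos):
    """D, T, and R flags: bits 3 to 5, counting from the left-most bit."""
    return bool(tos & 0x10), bool(tos & 0x08), bool(tos & 0x04)


def dscp(tos):
    """DSCP: the six left-most bits of the same byte."""
    return (tos >> 2) & 0x3F


def af_dscp(af_class, drop_precedence):
    """DSCP value of AF<class><drop>, for example AF11 or AF43."""
    return 8 * af_class + 2 * drop_precedence


def cs_dscp(selector):
    """Class Selector code points reuse the precedence bits, with the low three bits zero."""
    return selector << 3


ef_tos = 0xB8  # ToS byte of a packet marked EF
summary = (dscp(ef_tos), ip_precedence(ef_tos), af_dscp(1, 1), cs_dscp(6))
```

Because the precedence bits are the top three bits of the DSCP, an EF packet (DSCP 46) is seen as precedence 5 by a device that only understands IP precedence, and CS6 (DSCP 48) maps back to precedence 6.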
Priority mapping configuration
Configure the trusted packet priorities: Run the trust command to specify the packet priority to be mapped.
Configure the priority mapping table: Run the qos map-table command to enter the 802.1p or DSCP mapping table view, and run the input command to set the priority mappings.



Token bucket
A token bucket with a certain capacity stores tokens. The system places tokens into the token bucket at the configured rate. When the token bucket is full, excess tokens overflow and no more tokens are added.
Packets are forwarded according to the number of tokens in the token bucket. If there are sufficient tokens in the bucket to forward a packet, the traffic rate is within the rate limit. Otherwise, the traffic rate exceeds the rate limit.
Single-rate-single-bucket
The token bucket is called bucket C. Tc indicates the number of tokens in the bucket. Single-rate-single-bucket has two parameters:
  Committed Information Rate (CIR): indicates the rate of putting tokens into bucket C, that is, the average traffic rate permitted by bucket C.
  Committed Burst Size (CBS): indicates the capacity of bucket C, that is, the maximum volume of burst traffic allowed by bucket C each time.
The system places tokens into the bucket at the CIR. If Tc is smaller than the CBS, Tc increases; otherwise, Tc does not increase.
B indicates the size of an arriving packet:
  If B is smaller than or equal to Tc, the packet is colored green, and Tc decreases by B.
  If B is greater than Tc, the packet is colored red, and Tc remains unchanged.
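The single-rate-single-bucket rules translate almost directly into code. The sketch below is illustrative only: the class name is hypothetical, tokens are counted in bytes, the bucket is assumed to start full, and time advances through an explicit refill call instead of a real clock.

```python
class SingleRateSingleBucket:
    def __init__(self, cir, cbs):
        self.cir = cir   # tokens (bytes) added per second
        self.cbs = cbs   # capacity of bucket C
        self.tc = cbs    # assumption: the bucket starts full

    def refill(self, seconds):
        """Add tokens at the CIR; tokens beyond the CBS overflow and are lost."""
        self.tc = min(self.cbs, self.tc + self.cir * seconds)

    def color(self, size):
        """Color an arriving packet of the given size (B)."""
        if size <= self.tc:
            self.tc -= size
            return "green"
        return "red"     # Tc remains unchanged


bucket = SingleRateSingleBucket(cir=1000, cbs=1500)
colors = [bucket.color(1000), bucket.color(1000)]  # the second packet finds 500 tokens
bucket.refill(1)                                   # 500 + 1000 tokens, capped at the CBS
colors.append(bucket.color(1500))
```

The second packet is red even though the long-term rate is within the CIR, because the burst exceeded what the CBS allows at that moment.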

Single-Rate-Double-Bucket
Two token buckets are available: bucket C and bucket E. Tc and Te indicate the number of tokens in bucket C and bucket E respectively. Single-rate-double-bucket has three parameters:
  Committed Information Rate (CIR): indicates the rate of putting tokens into bucket C, that is, the average traffic rate permitted by bucket C.
  Committed Burst Size (CBS): indicates the capacity of bucket C, that is, the maximum volume of burst traffic allowed by bucket C each time.
  Excess Burst Size (EBS): indicates the capacity of bucket E, that is, the maximum volume of excess burst traffic allowed by bucket E each time.
The system places tokens into the buckets at the CIR:
  If Tc is smaller than the CBS, Tc increases.
  If Tc is equal to the CBS and Te is smaller than the EBS, Te increases.
  If Tc is equal to the CBS and Te is equal to the EBS, Tc and Te do not increase.
B indicates the size of an arriving packet:
  If B is smaller than or equal to Tc, the packet is colored green, and Tc decreases by B.
  If B is greater than Tc and smaller than or equal to Te, the packet is colored yellow and Te decreases by B.
  If B is greater than both Tc and Te, the packet is colored red, and Tc and Te remain unchanged.
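The single-rate-double-bucket rules can be sketched in the same way. The class name is hypothetical, both buckets are assumed to start full, and time is advanced by explicit refill calls; tokens fill bucket C first and only the overflow goes to bucket E.

```python
class SingleRateDoubleBucket:
    def __init__(self, cir, cbs, ebs):
        self.cir, self.cbs, self.ebs = cir, cbs, ebs
        self.tc, self.te = cbs, ebs  # assumption: both buckets start full

    def refill(self, seconds):
        """Fill bucket C at the CIR; overflow from C goes to bucket E up to the EBS."""
        tokens = self.cir * seconds
        to_c = min(tokens, self.cbs - self.tc)
        self.tc += to_c
        self.te = min(self.ebs, self.te + tokens - to_c)

    def color(self, size):
        """Color an arriving packet of the given size (B)."""
        if size <= self.tc:
            self.tc -= size
            return "green"
        if size <= self.te:
            self.te -= size
            return "yellow"
        return "red"                 # Tc and Te remain unchanged


meter = SingleRateDoubleBucket(cir=1000, cbs=1000, ebs=2000)
colors = [meter.color(800), meter.color(800), meter.color(1500)]
meter.refill(1)  # 800 tokens top up bucket C, the remaining 200 go to bucket E
```

Bucket E lets an occasional burst beyond the CBS through as yellow instead of red, while sustained excess traffic still drains both buckets and is colored red.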

Double-Rate-Double-Bucket
Two token buckets are available: bucket P and bucket C. Tp and Tc indicate the number of tokens in bucket P and bucket C respectively. Double-rate-double-bucket has four parameters:
  Peak Information Rate (PIR): indicates the rate at which tokens are put into bucket P, that is, the maximum traffic rate permitted by bucket P. The PIR must be greater than the CIR.
  Committed Information Rate (CIR): indicates the rate of putting tokens into bucket C, that is, the average traffic rate permitted by bucket C.
  Peak Burst Size (PBS): indicates the capacity of bucket P, that is, the maximum volume of burst traffic allowed by bucket P each time. The PBS is greater than the CBS.
  Committed Burst Size (CBS): indicates the capacity of bucket C, that is, the maximum volume of burst traffic allowed by bucket C each time.
The system places tokens into bucket P at the PIR and places tokens into bucket C at the CIR:
  If Tp is smaller than the PBS, Tp increases. If Tp is greater than or equal to the PBS, Tp remains unchanged.
  If Tc is smaller than the CBS, Tc increases. If Tc is greater than or equal to the CBS, Tc remains unchanged.
B indicates the size of an arriving packet:
  If B is greater than Tp, the packet is colored red.
  If B is greater than Tc and smaller than or equal to Tp, the packet is colored yellow and Tp decreases by B.
  If B is smaller than or equal to Tc, the packet is colored green, and Tp and Tc decrease by B.
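The double-rate-double-bucket rules differ from the single-rate case in that each bucket has its own fill rate and the peak bucket is checked first. The following sketch uses a hypothetical class name, assumes both buckets start full, and advances time through explicit refill calls.

```python
class DoubleRateDoubleBucket:
    def __init__(self, pir, cir, pbs, cbs):
        self.pir, self.cir, self.pbs, self.cbs = pir, cir, pbs, cbs
        self.tp, self.tc = pbs, cbs  # assumption: both buckets start full

    def refill(self, seconds):
        """Bucket P fills at the PIR and bucket C at the CIR, independently."""
        self.tp = min(self.pbs, self.tp + self.pir * seconds)
        self.tc = min(self.cbs, self.tc + self.cir * seconds)

    def color(self, size):
        """Color an arriving packet of the given size (B)."""
        if size > self.tp:
            return "red"             # exceeds the peak rate; no tokens are consumed
        if size > self.tc:
            self.tp -= size
            return "yellow"          # within the peak rate but above the committed rate
        self.tp -= size
        self.tc -= size
        return "green"


meter = DoubleRateDoubleBucket(pir=2000, cir=1000, pbs=3000, cbs=1500)
colors = [meter.color(1000), meter.color(1000), meter.color(1500)]
```

Green packets consume tokens from both buckets, so committed traffic also counts against the peak rate; yellow packets consume only peak tokens.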


Traffic policing discards excess traffic to limit traffic within a proper range and to protect network resources and enterprises' interests.
Traffic policing consists of:
Meter: measures the network traffic using the token bucket mechanism and sends the measurement result to the marker.
Marker: colors packets green, yellow, or red based on the measurement result received from the meter.
Action: takes actions based on the packet coloring results received from the marker (by default, green and yellow packets are forwarded and red packets are discarded). The following actions are defined:
  Pass: forwards the packets that meet network requirements.
  Remark + pass: changes the local priorities of packets and forwards them.
  Discard: discards the packets that do not meet network requirements.
If the rate of a type of traffic exceeds the threshold, the device either lowers the packet priority and then forwards the packets, or discards the packets directly. By default, these packets are discarded.
Traffic policing commands:
Configure interface-based traffic policing: Run the qos car command to create a QoS CAR profile and configure QoS CAR parameters. The parameters of the command differ depending on whether it is executed on a WAN interface or a LAN interface.
Configure rate limiting on a WAN interface: Run the qos lr command to set the ratio of the rate at which a physical interface sends packets to the total interface bandwidth.

Traffic shaping buffers excess traffic instead of discarding it immediately, so that outgoing traffic is limited to a proper range and sent at an even rate.

Traffic shaping process:
When packets arrive, the device classifies packets into different types and places them into different queues.
If the queue that packets enter is not configured with traffic shaping, the packets are immediately sent. Packets requiring queuing proceed to the next step.
The system places tokens into the bucket at the specified rate (CIR):
If there are sufficient tokens in the bucket, the device forwards the packets and the number of tokens decreases.
If there are insufficient tokens in the bucket, the device places the packets into the buffer queue. When the buffer queue is full, packets are discarded.
When there are packets in the buffer queue, the system extracts the packets from the queue and sends them periodically. Each time the system sends a packet, it compares the number of packets with the number of tokens, until the tokens are insufficient to send packets or all the packets are sent.
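
The shaping process above can be modeled with one token bucket and one buffer queue. The following Python sketch is a simplified illustration: the class name Shaper, the packet-count buffer limit, and the assumption that the bucket starts full are all choices made for the example.

```python
from collections import deque


class Shaper:
    """Single token bucket with a bounded buffer queue (sizes in bytes)."""

    def __init__(self, cir, cbs, buffer_limit):
        self.cir = cir                    # token rate, bytes per second
        self.cbs = cbs                    # bucket capacity, bytes
        self.tokens = cbs                 # assumption: bucket starts full
        self.buffer = deque()             # packets waiting for tokens
        self.buffer_limit = buffer_limit  # maximum number of buffered packets
        self.sent = []
        self.dropped = []

    def enqueue(self, size):
        # Send directly only when nothing is waiting, to preserve packet order.
        if not self.buffer and size <= self.tokens:
            self.tokens -= size
            self.sent.append(size)
        elif len(self.buffer) < self.buffer_limit:
            self.buffer.append(size)      # insufficient tokens: buffer the packet
        else:
            self.dropped.append(size)     # buffer queue full: discard

    def tick(self, elapsed):
        # Add tokens, then drain the buffer while tokens are sufficient.
        self.tokens = min(self.cbs, self.tokens + self.cir * elapsed)
        while self.buffer and self.buffer[0] <= self.tokens:
            size = self.buffer.popleft()
            self.tokens -= size
            self.sent.append(size)


shaper = Shaper(cir=1000, cbs=1000, buffer_limit=2)
for size in [600, 600, 600, 600]:
    shaper.enqueue(size)
print(shaper.sent, list(shaper.buffer), shaper.dropped)

shaper.tick(1.0)
shaper.tick(1.0)
print(shaper.sent, list(shaper.buffer))
```

Unlike policing, only the packet that arrives when the buffer queue is already full is lost. The buffered packets leave later, at the pace set by the CIR.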

Traffic shaping commands:
Configure interface-based traffic shaping: Run the qos gts command to configure traffic shaping on the interface.
Configure queue-based traffic shaping.
Run the qos queue-profile queue-profile-name command to create a queue profile and enter the queue profile view.
Run the queue { start-queue-index [ to end-queue-index ] } &<1-10> length { bytes bytes-value | packets packets-value } command to set the length of each queue.
Run the queue { start-queue-index [ to end-queue-index ] } &<1-10> gts cir cir-value [ cbs cbs-value ] command to configure queue-based traffic shaping. By default, traffic shaping is not performed for queues.
Run the qos queue-profile queue-profile-name command to apply the queue profile to an interface.


If the rate of incoming packets on an interface is higher than the rate of outgoing packets, the interface is congested. If there is insufficient space for storing the packets, some packets are discarded. When packets are discarded, hosts or routers retransmit the packets, leading to a vicious circle.

When congestion occurs, multiple packets preempt resources. The packets that cannot obtain resources are discarded, and the bandwidth, delay, and jitter of key services cannot be ensured. The core of congestion management is to decide the resource scheduling policy that specifies the packet forwarding sequence. Generally, devices use the queue technology to cope with congestion. The queue technology involves queue creation, traffic classification, and queue scheduling. Initially, there was only one queue scheduling policy, that is, First In First Out (FIFO). To meet different service requirements, more scheduling policies were developed.

Queue scheduling mechanisms include hardware queue scheduling and software queue scheduling. The hardware queue is also called the transmit queue (TxQ). The interface driver uses this queue when transmitting packets one by one. The hardware queue is a FIFO queue. The software queue schedules data packets to the hardware queue according to QoS requirements. It can use multiple scheduling methods.

Data packets enter the software queue only when the hardware queue is full.
The hardware queue length depends on the bandwidth setting on the interface. If the interface bandwidth is high, transmission delay is short, so the queue length can be long. An appropriate hardware queue length is important. If the hardware queue length is too long, the policy execution performance of the software queue degrades because the hardware queue uses the FIFO mechanism for scheduling. If the hardware queue length is too short, scheduling efficiency is low, link use efficiency is low, and the CPU usage is high.
LAN ports support the PQ, DRR, and WRR queues.
WAN ports support the PQ and WFQ queues.

Configuration commands:
Run the qos queue-profile queue-profile-name command to create a queue profile and enter the queue profile view.
On the WAN-side interface, run the schedule { { pq start-queue-index [ to end-queue-index ] } | { wfq start-queue-index [ to end-queue-index ] } } command to set a scheduling mode for each queue on the WAN-side interface.
On the LAN-side interface, run the schedule { { pq start-queue-index [ to end-queue-index ] } | { drr start-queue-index [ to end-queue-index ] } | { wrr start-queue-index [ to end-queue-index ] } } command to set a scheduling mode for each queue on the LAN-side interface.
Run the qos queue-profile queue-profile-name command to apply the queue profile to an interface.



FIFO characteristics:
Advantages:
Simple.
Disadvantages:
Unfair, with no separation between flows. A large flow occupies the bandwidth of other flows, which prolongs the delay of other flows.
When congestion occurs, FIFO discards some packets. When TCP detects packet loss, it lowers its transmission speed to avoid congestion. However, UDP does not lower its transmission speed because it is a connectionless protocol. As a result, the TCP and UDP packets in FIFO are not equally processed, and the TCP packet rate becomes too low.
A flow may occupy all the buffer space and block other types of traffic.



RR
Advantages:
Different flows are separated, and bandwidth is equally allocated to queues.
Available bandwidth is equally allocated to other queues.
Disadvantages:
Weights cannot be configured for the queues.
When queues have different packet lengths, scheduling is inaccurate.
When the scheduling rate is low, delay and jitter indicators deteriorate. For example, when a packet arrives at an empty queue that has just been scheduled, this packet can be processed only after all the other queues are scheduled. In this situation, jitter is serious. However, if the scheduling rate is high, the delay is short. The RR mode is widely used on high-speed routers.
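
The inaccuracy caused by different packet lengths can be seen in a small Python sketch. It assumes a packet-based RR that takes one packet from each non-empty queue per round. The function name round_robin and the packet sizes are made up for the example.

```python
from collections import deque


def round_robin(queues):
    """Serve one packet per non-empty queue per round.

    Returns the send order as (queue_index, packet_size) tuples.
    """
    queues = [deque(q) for q in queues]
    order = []
    while any(queues):                    # an empty deque is falsy
        for index, queue in enumerate(queues):
            if queue:
                order.append((index, queue.popleft()))
    return order


# Queue 0 carries 1500-byte packets, queue 1 carries 100-byte packets.
order = round_robin([[1500, 1500, 1500], [100, 100, 100]])
bytes_sent = [sum(size for index, size in order if index == k) for k in range(2)]
print(order)
print(bytes_sent)
```

Both queues get the same number of turns, yet queue 0 sends fifteen times as many bytes. Packet-based RR is therefore fair in packets but not in bandwidth.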



Compared with RR, WRR can set the weights of queues. During WRR scheduling, the scheduling chance obtained by a queue is in direct proportion to the weight of the queue, and an empty queue is directly skipped. Therefore, when there is a small volume of traffic in a queue, the remaining bandwidth of the queue is used by the other queues according to a certain proportion.

Advantages:
Bandwidth is allocated based on weights, and the remaining bandwidth of a queue is allocated to other queues. Low-priority queues are also scheduled in a timely manner.
It is easy to implement.
Applicable to DiffServ ports.
Disadvantages:
Similar to RR, WRR is inaccurate when queues have different packet lengths.
When the scheduling rate is low, packet delay is unstable and the delay and jitter indicators cannot be lowered to the expected values.
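
A minimal Python sketch of WRR follows. It assumes that a weight is the maximum number of packets a queue may send per round, which is one common way to implement WRR and not necessarily how a given device counts. The function name and packet labels are hypothetical.

```python
from collections import deque


def weighted_round_robin(queues, weights):
    """Serve up to weights[i] packets from queue i in each round.

    Returns the send order as (queue_index, packet) tuples.
    """
    queues = [deque(q) for q in queues]
    order = []
    while any(queues):
        for index, queue in enumerate(queues):
            for _ in range(weights[index]):
                if not queue:             # an empty queue is skipped at once
                    break
                order.append((index, queue.popleft()))
    return order


wrr_order = weighted_round_robin(
    [["a1", "a2", "a3", "a4"], ["b1", "b2"], []],
    weights=[2, 1, 1],
)
print([packet for _, packet in wrr_order])
```

Queue 0 gets two turns for every turn of queue 1, and the empty third queue consumes no scheduling time, so its share goes to the other queues.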



PQ
PQ has four-level queues, including Top, Middle, Normal, and Bottom. However, most devices support eight-level queues. Packets in queues with a low priority can be scheduled only after all packets in queues with a high priority have been scheduled. Therefore, PQ has obvious advantages and disadvantages.
PQ ensures that the packets in high-priority queues obtain high bandwidth, low delay, and low jitter. However, the packets in low-priority queues cannot be scheduled in a timely manner or may not be scheduled at all. As a result, the lower-priority queues starve.
PQ has the following characteristics:
Uses ACLs to classify packets into different types and adds packets to the corresponding queues.
Packets are discarded only by using the Tail Drop mechanism.
When the queue length is set to 0, the queue length is unlimited. That is, the packets entering this queue are not discarded by Tail Drop unless the memory space is exhausted.
The FIFO logic is used within each queue.
The packets in low-priority queues are scheduled only after all packets in high-priority queues are scheduled.
PQ ensures high quality for specified service traffic, but does not care about the quality of other services.
Advantages:
Precisely controls the delay of high-priority queues.
Easy to implement, and differentiates services.
Disadvantages:
Cannot allocate bandwidth as required. When high-priority queues have many packets, the packets in low-priority queues cannot be scheduled.
It shortens the delay of high-priority queues by compromising the service quality of low-priority queues.
If a high-priority queue transmits TCP packets and a low-priority queue transmits UDP packets, the TCP packets are transmitted at a high speed, while UDP packets cannot obtain sufficient bandwidth.
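
The starvation effect of strict priority scheduling can be illustrated with a short Python sketch. The function name priority_schedule, the budget parameter (the number of packets the link can send in the observed interval), and the packet labels are assumptions made for the example.

```python
from collections import deque


def priority_schedule(queues, budget):
    """Strict priority: queues[0] is the highest priority.

    Sends at most `budget` packets and returns (sent, remaining_queues).
    """
    queues = [deque(q) for q in queues]
    sent = []
    for _ in range(budget):
        for queue in queues:
            if queue:                     # first non-empty queue wins
                sent.append(queue.popleft())
                break
        else:
            break                         # every queue is empty
    return sent, [list(q) for q in queues]


pq_sent, pq_left = priority_schedule([["h1", "h2", "h3"], ["l1", "l2"]], budget=3)
print(pq_sent)
print(pq_left)
```

With a budget of three packets, only the high-priority queue is served. The low-priority packets wait until the high-priority queue is empty.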


CQ
The number of bytes to be scheduled in each round must be specified for each queue. In each round, a queue is served until the number of bytes sent reaches or exceeds the specified byte count, and a packet that crosses the limit is still sent in full. If the configured byte count is too small, the queue may be congested, and bandwidth allocation is inaccurate. For example, 500 bytes is specified for a queue, while most packets in the queue exceed 1000 bytes. Therefore, the bandwidth actually allocated is higher than the expected bandwidth. If the number of bytes specified is large, it is difficult to control the delay. CQ can schedule multiple packets each time. The number of packets to be scheduled is the number of packets that can be accommodated by the bytes scheduled each time.
Advantages:
Allocates bandwidth according to certain percentages. When the traffic volume of a queue is small, other queues can occupy the bandwidth of this queue.
Easy to implement.
Disadvantages:
When the specified number of bytes is small, bandwidth allocation is inaccurate. When the specified number of bytes is large, delay and jitter are serious.
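
The effect of a byte count that is small compared with the packet size can be reproduced in Python. The sketch assumes that a queue is served until its byte count is reached or exceeded and that the last packet is always sent in full. The function name custom_queue_round and the packet sizes are hypothetical.

```python
from collections import deque


def custom_queue_round(queues, byte_counts):
    """Run one CQ round and return the bytes actually sent per queue."""
    sent_bytes = []
    for packets, quota in zip(queues, byte_counts):
        queue = deque(packets)
        sent = 0
        # Keep sending while the quota is not yet reached. A packet that
        # crosses the quota is still sent in full.
        while queue and sent < quota:
            sent += queue.popleft()
        sent_bytes.append(sent)
    return sent_bytes


# Both queues are configured with 500 bytes per round (an intended 1:1 split).
cq_sent = custom_queue_round([[1200, 1200], [400, 400, 400]], [500, 500])
print(cq_sent)
```

Although both queues are configured with the same byte count, the queue with 1200-byte packets sends 1200 bytes per round and the other sends 800 bytes, so the real split is 60:40 instead of 50:50.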

WFQ
Weighted Fair Queuing (WFQ) classifies packets by flow. On an IP network, the packets with the same source IP addresses, destination IP addresses, protocol numbers, and IP precedence belong to the same flow. On an MPLS network, the packets with the same labels and EXP fields belong to the same flow. WFQ assigns each flow to a queue, and tries to assign different flows to different queues. When packets leave the queues, WFQ allocates the bandwidth on the outbound interface for each flow according to the weights. The smaller the weight value of the flow is, the smaller the bandwidth the flow obtains. The greater the weight value of the flow is, the greater the bandwidth the flow obtains. In this manner, services of the same priority are treated equally, and services of different priorities are allocated different weights.
For example, there are eight flows on the interface, with weights of 1, 2, 3, 4, 5, 6, 7, and 8 respectively. The total bandwidth quota is the sum of weights, that is, 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 = 36. The bandwidth occupied by each flow is: Weight of each flow/Total bandwidth quota. That is, the flows obtain 1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 7/36, and 8/36 of the bandwidth. Thus, WFQ assigns different scheduling weights to services of different priorities while ensuring fairness between services of the same priority.
Advantages:
The queues are scheduled fairly based on the granularity of bytes.
Differentiates services and allocates weights.
Properly controls delay and reduces jitter.
Disadvantages:
Difficult to implement.
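
The bandwidth arithmetic in the eight-flow example can be checked with a few lines of Python. The function name wfq_shares is hypothetical. Exact fractions are used so that the shares match the 1/36 to 8/36 values without rounding.

```python
from fractions import Fraction


def wfq_shares(weights):
    """Return each flow's share of the bandwidth: weight / sum of weights."""
    weights = list(weights)
    total = sum(weights)
    return [Fraction(weight, total) for weight in weights]


shares = wfq_shares([1, 2, 3, 4, 5, 6, 7, 8])
print(shares)
print(sum(shares))
```

Fraction reduces each value automatically, so 2/36 is printed as 1/18. The shares add up to exactly 1.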


Congestion Avoidance
Tail drop is the traditional congestion avoidance method. When the length of a queue reaches the maximum value, all newly arriving packets are discarded. If too many TCP packets are dropped, TCP times out. This triggers TCP slow start and congestion avoidance, so that the sender slows down the transmission of TCP packets. When a queue drops packets of several TCP connections at the same time, these TCP connections enter congestion avoidance and slow start simultaneously, which is referred to as global TCP synchronization. These TCP connections then simultaneously send fewer packets to the queue, so that the rate of incoming packets is smaller than the rate of outgoing packets, reducing the bandwidth usage. Moreover, the volume of traffic sent to the queue varies greatly from time to time. As a result, the volume of traffic over the link fluctuates between the bottom and the peak, and the delay and jitter of certain traffic are affected.
The traditional packet loss policy uses the tail drop method. When the queue length reaches the upper limit, the excess packets (buffered at the queue tail) are discarded.
To prevent global TCP synchronization, Random Early Detection (RED) is used. The RED technique randomly discards packets to prevent the transmission speed of multiple TCP connections from being reduced simultaneously. The TCP rate and network traffic volume thus remain stable.
The device provides Weighted Random Early Detection (WRED) based on the RED technology. WRED discards packets in queues based on the DSCP field or IP precedence. The upper drop threshold, lower drop threshold, and drop probability can be set for each priority. When the number of packets of a priority reaches the lower drop threshold, the device starts to discard packets. When the number of packets reaches the upper drop threshold, the device discards all the packets. Between the two thresholds, a longer queue indicates a higher drop probability, but the drop probability does not exceed the configured maximum drop probability. WRED discards packets in queues based on the drop probability, thereby relieving congestion.
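
The relationship between queue length, the two thresholds, and the drop probability can be sketched as a small Python function. The linear ramp between the thresholds follows the classic RED model and is an assumption here, as are the function name and the use of the instantaneous queue length (classic RED works on an averaged queue length).

```python
def wred_drop_probability(queue_len, low, high, max_prob):
    """Drop probability for one priority under a simplified WRED model.

    low, high: lower and upper drop thresholds (same unit as queue_len)
    max_prob:  maximum drop probability reached just below the upper threshold
    """
    if queue_len < low:
        return 0.0                        # below the lower threshold: no drops
    if queue_len >= high:
        return 1.0                        # at or above the upper threshold: drop all
    # Between the thresholds the probability grows linearly up to max_prob.
    return max_prob * (queue_len - low) / (high - low)


profile = {"low": 20, "high": 80, "max_prob": 0.5}
probabilities = [wred_drop_probability(n, **profile) for n in (10, 50, 80)]
print(probabilities)
```

Giving a high-priority class a higher lower threshold or a smaller maximum drop probability than a low-priority class makes WRED drop the low-priority packets first.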

WRED configuration:
Configure a drop profile.
Run the drop-profile drop-profile-name command to create a drop profile and enter the drop profile view.
Run the dscp { dscp-value1 [ to dscp-value2 ] } &<1-10> low-limit low-limit-percentage high-limit high-limit-percentage discard-percentage discard-percentage command to set DSCP-based WRED parameters.
Run the ip-precedence { ip-precedence-value1 [ to ip-precedence-value2 ] } &<1-10> low-limit low-limit-percentage high-limit high-limit-percentage discard-percentage discard-percentage command to set IP precedence-based WRED parameters.
Apply the drop profile.
Run the qos queue-profile queue-profile-name command to enter the queue profile view.
Run the schedule wfq start-queue-index [ to end-queue-index ] command to set the scheduling mode of a queue to WFQ.
Run the queue { start-queue-index [ to end-queue-index ] } &<1-10> drop-profile drop-profile-name command to bind a drop profile to a queue in a queue profile.
Run the qos queue-profile queue-profile-name command to apply the queue profile to an interface.

Traffic classification is used to identify the packets with certain characteristics according to a rule, and is the prerequisite and basis for differentiated services. You can define rules to classify packets and specify the relationships between rules:
AND: Packets match a traffic classifier only when the packets match all the rules. If a traffic classifier contains ACL rules, packets match the traffic classifier only when the packets match one ACL rule and all the non-ACL rules. If a traffic classifier does not contain ACL rules, packets match the traffic classifier only when the packets match all the non-ACL rules.
OR: Packets match a traffic classifier as long as the packets match a rule.
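
The AND and OR relationships can be expressed as a short Python function. The sketch models every rule as a function that returns True or False for a packet. The function name matches_classifier, the packet fields, and the sample rules are hypothetical.

```python
def matches_classifier(packet, acl_rules, other_rules, operator):
    """Return True if the packet matches the traffic classifier.

    acl_rules and other_rules are lists of predicates taking the packet.
    operator is "and" or "or".
    """
    if operator == "or":
        return any(rule(packet) for rule in acl_rules + other_rules)
    # AND: one matching ACL rule is enough (if ACL rules are configured),
    # but every non-ACL rule must match.
    acl_ok = any(rule(packet) for rule in acl_rules) if acl_rules else True
    return acl_ok and all(rule(packet) for rule in other_rules)


packet = {"src": "10.1.1.1", "dscp": 46, "vlan": 10}
acl_rules = [
    lambda p: p["src"].startswith("10.1."),
    lambda p: p["src"].startswith("192.168."),
]
other_rules = [lambda p: p["dscp"] == 46, lambda p: p["vlan"] == 20]

and_result = matches_classifier(packet, acl_rules, other_rules, "and")
or_result = matches_classifier(packet, acl_rules, other_rules, "or")
print(and_result, or_result)
```

The packet matches one ACL rule and the DSCP rule but not the VLAN rule, so it fails the AND classifier and passes the OR classifier.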
A traffic behavior refers to an action taken for packets. The purpose of traffic classification is to provide differentiated services. A traffic classifier takes effect only when it is associated with a traffic control action or a resource allocation action.

A traffic policy is configured by binding traffic classifiers to traffic behaviors. After a traffic policy is applied to an interface, globally, to a board, or to a VLAN, differentiated service is provided.

Traffic policy configuration commands
Configure a traffic classifier.
Run the traffic classifier classifier-name [ operator { and | or } ] command to create a traffic classifier and enter the traffic classifier view.
Configure a traffic behavior.
Run the traffic behavior behavior-name command to create a traffic behavior and enter the traffic behavior view.
Configure a traffic policy.
Run the traffic policy policy-name command to create a traffic policy and enter the traffic policy view.
Run the classifier behavior command to bind a traffic classifier to a traffic behavior in the traffic policy.
Run the traffic-policy policy-name { inbound | outbound } command to apply a traffic policy to the interface or sub-interface in the inbound or outbound direction.


SNMP model
The NMS station is the manager in a network management system. It uses the SNMP protocol to manage and monitor the network. The NMS software runs on an NMS server.
The agent is a process on the managed device. The agent maintains data on the managed device, receives and processes the request packets from the NMS, and then sends the response packets to the NMS.
A management object is an object to be managed. A device may have multiple management objects, including a hardware component (such as an interface board) and parameters (such as a routing protocol) configured for the hardware or software.
The MIB is a database specifying variables that are maintained by the managed device and can be queried or set by the agent. The MIB defines attributes of the managed device, including the name, status, access rights, and data type of objects.

Operations of SNMPv1 and SNMPv2c
Get: reads one or several parameter values from the MIB of the agent process.
GetNext: reads the next parameter value from the MIB of the agent process.
Set: sets one or several parameter values in the MIB of the agent process.
Response: returns one or more queried values. The agent performs this operation in response to the GetRequest, GetNextRequest, SetRequest, and GetBulkRequest operations. Upon receiving a Get or Set request, the agent performs the Query or Modify operation using MIB tables and then sends the responses to the NMS.
Trap: sent by an agent process to notify the NMS of a fault or event on the managed device.

New Operation Types of SNMPv2c
GetBulk: The NMS queries managed devices in batches. It is implemented based on the GetNext operation. A GetBulk operation is equivalent to a series of GetNext operations. You can specify the number of times the GetNext operation is executed on the managed device during a GetBulk interaction.
InformRequest: sent by a managed device to notify the NMS of an alarm on the managed device. After the managed device sends an inform, the NMS must send an InformResponse packet to the managed device.
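
The statement that a GetBulk operation is equivalent to a series of GetNext operations can be illustrated with a toy in-memory MIB in Python. This is not an SNMP implementation: no packets are encoded or sent, the MIB values are invented, and only a single variable with a max_repetitions count is modeled. OIDs are stored as tuples so that they sort in lexicographic order.

```python
import bisect

# Toy MIB: sysDescr.0, sysUpTime.0, and sysName.0 with made-up values.
TOY_MIB = {
    (1, 3, 6, 1, 2, 1, 1, 1, 0): "Example router",
    (1, 3, 6, 1, 2, 1, 1, 3, 0): 123456,
    (1, 3, 6, 1, 2, 1, 1, 5, 0): "R1",
}


def get_next(mib, oid):
    """Return (next_oid, value) for the first OID greater than oid, or None."""
    keys = sorted(mib)
    position = bisect.bisect_right(keys, oid)
    if position == len(keys):
        return None                       # end of the MIB view
    return keys[position], mib[keys[position]]


def get_bulk(mib, oid, max_repetitions):
    """Repeat get_next up to max_repetitions times, starting after oid."""
    results = []
    for _ in range(max_repetitions):
        entry = get_next(mib, oid)
        if entry is None:
            break
        results.append(entry)
        oid = entry[0]                    # continue from the OID just returned
    return results


bulk = get_bulk(TOY_MIB, (1, 3, 6, 1, 2, 1, 1), max_repetitions=2)
print(bulk)
```

One GetBulk request with max_repetitions set to 2 returns the same two entries that two separate GetNext requests would return, with only one request and one response exchanged.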



Operations related to SNMPv3:
The NMS sends a Get request without security parameters to the agent.
The agent responds and returns the requested parameters to the NMS.
The NMS sends a Get request carrying security parameters to the agent.
The agent encrypts the response packet and returns the required parameters to the NMS.

NQA Principles
Creating a test instance
NQA requires two test ends, an NQA client and an NQA server (also called the source and destination). The NQA client (or the source) initiates an NQA test. You can configure test instances through command lines or the NMS. Then NQA places the test instances into test queues for scheduling.
Starting the test instance
When starting an NQA test instance, you can choose to start the test instance immediately, at a specified time, or after a delay. A test packet is generated based on the type of a test instance when the timer expires. If the size of the generated test packet is smaller than the minimum size of a protocol packet, the test packet is generated and sent out with the minimum size of the protocol packet.
Processing a test instance
After a test instance starts, the protocol-related running status can be collected according to response packets. The client adds a timestamp to a test packet based on the local system time before sending the packet to the server. After receiving the test packet, the server sends a response packet to the client. The client then adds a timestamp to the received response packet based on the current local system time. This helps the client calculate the round-trip time (RTT) of the test packet based on the two timestamps.

An NQA ICMP test instance checks whether a route from the NQA client to the destination is reachable. The ICMP test is similar in function to the ping command, but provides more output information:
By default, the command output shows the results of the latest five tests.
The output includes the average delay, the packet loss ratio, and the time the last packet is correctly received.

Test Procedure
The source (R1) sends an ICMP echo request packet to the destination (R2).
After receiving the ICMP echo request packet, the destination (R2) responds to the source (R1) with an ICMP echo reply packet.
The source (R1) then can calculate the time of communication between the source (R1) and the destination (R2) by subtracting the time the source sends the ICMP echo request packet from the time the source receives the ICMP echo reply packet. The calculated data can reflect the network performance and operating status.

NTP synchronization process
R1 sends an NTP packet to R2. The packet carries a timestamp, 10:00:00 am (T1), indicating the time it leaves R1.
When the NTP packet reaches R2, R2 adds a timestamp, 11:00:01 am (T2), to the NTP packet, indicating the time R2 receives the packet.
When the NTP packet leaves R2, R2 adds a transmit timestamp, 11:00:02 am (T3), to the NTP packet, indicating the time it leaves R2.
When R1 receives this response packet, it adds a new receive timestamp, 10:00:03 am (T4), to the packet. R1 uses the received information to calculate the following two important parameters:
Roundtrip delay of the NTP packet: Delay = (T4 - T1) - (T3 - T2)
Clock offset of R1 by taking R2 as a reference: Offset = ((T2 - T1) + (T3 - T4))/2
After the calculation, R1 knows that the roundtrip delay is 2 seconds and the clock offset of R1 is 1 hour. R1 sets its own clock based on these two parameters to synchronize its clock with that of R2.
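
The two formulas can be verified with the four timestamps from the example. In the Python sketch below, the function name ntp_delay_offset is hypothetical, and the calendar date is arbitrary because only the time of day matters for the calculation.

```python
from datetime import datetime


def ntp_delay_offset(t1, t2, t3, t4):
    """Return (roundtrip delay, clock offset) as timedelta values."""
    delay = (t4 - t1) - (t3 - t2)
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return delay, offset


t1 = datetime(2024, 1, 1, 10, 0, 0)   # request leaves R1 (R1 clock)
t2 = datetime(2024, 1, 1, 11, 0, 1)   # request reaches R2 (R2 clock)
t3 = datetime(2024, 1, 1, 11, 0, 2)   # response leaves R2 (R2 clock)
t4 = datetime(2024, 1, 1, 10, 0, 3)   # response reaches R1 (R1 clock)

delay, offset = ntp_delay_offset(t1, t2, t3, t4)
print(delay, offset)
```

The delay is (3 s) - (1 s) = 2 seconds, and the offset is (1:00:01 + 0:59:59)/2 = 1 hour, which matches the values stated in the text.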

