HCIE-R&S Huawei Certified Internetwork Expert - Routing and Switching PDF

More Learning Resources: http://learning.huawei.com/en
HCIE-R&S
Huawei Certification
Huawei Certified Internetwork Expert-Enterprise
Huawei Technologies Co., Ltd.
HUAWEI TECHNOLOGIES
HCIE
Copyright Huawei Technologies Co., Ltd. 2010. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.
Huawei Certification System
Relying on its strong technical and professional training system, and in accordance with the needs of different customers at different levels of ICT technology, Huawei certification is committed to providing customers with authentic, professional certification.
Based on the characteristics of ICT technologies and customers' needs at different levels, Huawei certification provides customers with a certification system of four levels.
HCDA (Huawei Certification Datacom Associate) is primarily intended for IP network maintenance engineers and anyone else who wants to build an understanding of IP networks. HCDA certification covers TCP/IP basics, routing, switching, and other common foundational knowledge of IP networks, together with Huawei communications products, the characteristics of the Versatile Routing Platform (VRP), and basic maintenance.
HCDP-Enterprise (Huawei Certification Datacom Professional-Enterprise) is aimed at enterprise-class network maintenance engineers, network design engineers, and anyone else who wants an in-depth grasp of routing, switching, network adjustment, and optimization technologies. HCDP-Enterprise consists of IESN (Implement Enterprise Switch Network), IERN (Implement Enterprise Routing Network), and IENP (Improving Enterprise Network Performance), covering advanced IPv4 routing and switching technology principles, network security, high availability, and QoS, as well as the configuration of Huawei products.
HCIE-Enterprise (Huawei Certified Internetwork Expert-Enterprise) is designed to equip engineers with a variety of IP technologies and proficiency in the maintenance, diagnostics, and troubleshooting of Huawei products, giving them competence in the planning, design, and optimization of large-scale IP networks.
Referenced icons
Router, L3 Switch, L2 Switch, Firewall, Net cloud
Ethernet line, Serial line
CONTENTS
RIP
IS-IS
OSPF
BGP BASICS
BGP ADVANCED AND INTERNET DESIGN
ROUTE IMPORT AND CONTROL
VLAN
LAN LAYER 2 TECHNOLOGIES
WAN LAYER 2 TECHNOLOGIES
STP
MULTICAST
IPv6
MPLS VPN
OTHER TECHNOLOGIES
RIPv1 packet format
A RIP packet consists of two parts: the Header and the Route Entries. The Header includes the Command and Version fields. The Route Entries part includes at most 25 routing entries. Each routing entry contains the Address Family Identifier field, the IP Address of the target network, and the Metric field.
The meaning of each field in a RIP packet is as follows:
Command: indicates whether the packet is a request or a response. The value is 1 or 2. The value 1 indicates a request, and the value 2 indicates a response.
Version: specifies the RIP version in use. The value 1 indicates a RIPv1 packet, and the value 2 indicates a RIPv2 packet.
Address Family Identifier: specifies the address family in use. The value is 2 for IPv4. If the packet is a request for the entire routing table, the value is 0.
IP Address: specifies the destination address of the routing entry. The value can be a network address or a host address.
Metric: indicates the number of hops to the destination. Although the 32-bit field can hold values from 0 to 2^32 - 1, the value ranges from 1 to 16 in RIP.
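The field layout described above can be illustrated with a short Python sketch. It assumes the usual on-the-wire layout of a 4-byte header (Command, Version, two zero bytes) followed by 20-byte entries; the function names parse_ripv1 and build_ripv1_response are hypothetical helpers for illustration, not part of any Huawei tool.

```python
import ipaddress
import struct

HEADER = struct.Struct("!BBH")     # Command, Version, must-be-zero
ENTRY = struct.Struct("!HH4sIII")  # AFI, zero, IP Address, zero, zero, Metric


def build_ripv1_response(routes):
    """Build a RIPv1 response (Command 2) from (ip, metric) pairs."""
    packet = HEADER.pack(2, 1, 0)
    for ip, metric in routes:
        # Address Family Identifier 2 means IPv4.
        packet += ENTRY.pack(2, 0, ipaddress.IPv4Address(ip).packed, 0, 0, metric)
    return packet


def parse_ripv1(packet):
    """Return (command, version, [(afi, ip, metric), ...])."""
    command, version, _ = HEADER.unpack_from(packet, 0)
    body = packet[HEADER.size:]
    if len(body) % ENTRY.size or len(body) // ENTRY.size > 25:
        raise ValueError("malformed RIP packet")
    entries = []
    for offset in range(0, len(body), ENTRY.size):
        afi, _, raw_ip, _, _, metric = ENTRY.unpack_from(body, offset)
        entries.append((afi, str(ipaddress.IPv4Address(raw_ip)), metric))
    return command, version, entries
```

Building a packet and parsing it again returns the same Command, Version, and entries, which makes the sketch a convenient way to check an understanding of the field order.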
RIPv1 characteristics
RIP is a UDP-based routing protocol. A RIP message consists of a 4-byte RIP header and up to 25 routing entries of 20 bytes each, so the maximum RIP message is 4+(25*20)=504 bytes. Adding the 8-byte UDP header gives at most 512 bytes, excluding the IP header. A RIPv1 packet does not carry mask information. RIPv1 sends and receives routes based on the main class network segment mask and the interface address mask. Therefore, RIPv1 does not support route summarization or discontiguous subnets. RIPv1 packets do not carry an authentication field, so RIPv1 does not support authentication.
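The size arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation; the constant and function names are illustrative.

```python
import math

RIP_HEADER_BYTES = 4
RIP_ENTRY_BYTES = 20
UDP_HEADER_BYTES = 8
MAX_ENTRIES = 25


def rip_message_bytes(n_entries):
    """Size of one RIP message (without UDP and IP headers)."""
    if not 0 <= n_entries <= MAX_ENTRIES:
        raise ValueError("a RIP message carries at most 25 entries")
    return RIP_HEADER_BYTES + n_entries * RIP_ENTRY_BYTES


def packets_needed(n_routes):
    """Number of RIP messages needed to advertise n_routes entries."""
    return math.ceil(n_routes / MAX_ENTRIES)
```

For example, a table of 60 routes does not fit into two messages and needs a third one for the remaining 10 entries.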
RIPv2 packet format
A RIPv2 packet has the same format as a RIPv1 packet, except that RIPv2 uses some fields that are unused in RIPv1 to provide extended functions.
The meaning of the new fields is as follows:
Route Tag: marks external routes, that is, routes learned from other protocols or imported into RIPv2.
Subnet Mask: identifies the subnet mask of an IPv4 address.
Next Hop: indicates a next-hop address that is better than the address of the advertising router. The value 0.0.0.0 indicates that the address of the advertising router is the optimal next-hop address.
When authentication is configured in RIPv2, RIPv2 modifies the first route entry as follows:
Changes the Address Family Identifier field to 0xFFFF.
Changes the Route Tag field to the Authentication Type field.
Changes the IP Address, Subnet Mask, Next Hop, and Metric fields to the Password field.
Compared with RIPv1, RIPv2 has the following advantages:
Supports route tags. Route tags are used in routing policies to flexibly control routes. Tags can also be used when RIP processes import routes from each other.
Supports subnet masks, route summarization, and CIDR.
Supports specified next hops to select the optimal next-hop address on a broadcast network.
Multicasts route updates. Only RIPv2-running devices receive the protocol packets, which reduces resource consumption.
Supports packet authentication to enhance security.
On a broadcast network with more than two devices, the Next Hop field can be set to optimize the path.
In MD5 authentication, a digest is computed over the RIP packet and the shared key. The router then sends the digest together with the route entries to the neighbor.
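As a rough illustration of how the first route entry is reused for authentication, the following Python sketch builds a simple-password entry. It assumes the layout described above (a 2-byte 0xFFFF marker, a 2-byte Authentication Type, and a 16-byte Password field) and the type value 2 for a plain-text password as defined in RFC 2453; the helper names are hypothetical.

```python
import struct

AUTH_AFI = 0xFFFF        # marks the entry as authentication data
AUTH_TYPE_SIMPLE = 2     # plain-text password (RFC 2453)


def build_simple_auth_entry(password):
    """Return the 20-byte first entry carrying a plain-text password."""
    raw = password.encode("ascii")
    if len(raw) > 16:
        raise ValueError("password is limited to 16 bytes")
    # "16s" pads the password with zero bytes up to 16 bytes.
    return struct.pack("!HH16s", AUTH_AFI, AUTH_TYPE_SIMPLE, raw)


def is_auth_entry(entry):
    """True if the entry holds authentication data instead of a route."""
    (afi,) = struct.unpack_from("!H", entry, 0)
    return afi == AUTH_AFI
```

Because the authentication data occupies one entry slot, a packet that carries it has room for fewer route entries than an unauthenticated packet.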
RIP mainly uses three timers:
Update timer: defines the interval between two route updates. It periodically triggers the transmission of route updates at a default interval of 30 seconds.
Aging timer: specifies the aging time of routes (180 seconds by default). If a RIP device does not receive an update for a route from its neighbor within the aging time, it considers the route unreachable. After the aging timer expires, the RIP device sets the metric of the route to 16.
Garbage-collect timer: specifies the interval between the time a route is marked as unreachable and the time the route is deleted from the routing table. The default interval is four times the update interval, namely, 120 seconds. If the RIP device does not receive an update for an unreachable route from the same neighbor within the garbage-collect time, it deletes the route from the routing table.
Relationship between the three timers:
RIP route update advertisement is controlled by the update timer. A route update is sent at a default interval of 30 seconds.
Each routing entry has two timers: the aging timer and the garbage-collect timer. When a route is learned and added to the routing table, the aging timer starts. If a RIP device has not received an update for the route from a neighbor when the aging timer expires, the device sets the metric of the route to 16 (indicating an unreachable route) and starts the garbage-collect timer.
If the device still has not received an update for the unreachable route from the neighbor when the garbage-collect timer expires, the device deletes the route from the routing table.
Precautions
If a RIP device does not have the triggered update function, it deletes an unreachable route from the routing table after a maximum of 300 seconds (aging time plus garbage-collect time).
If a RIP device has the triggered update function, it deletes an unreachable route from the routing table after a maximum of 120 seconds (the garbage-collect time).
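The relationship between the aging timer and the garbage-collect timer can be sketched as a small state function. This is a simplified model that assumes the default values of 180 and 120 seconds and ignores triggered updates; the function name route_state is illustrative.

```python
def route_state(seconds_since_last_update, age=180, garbage_collect=120):
    """Classify a route by the time elapsed since its last update."""
    if seconds_since_last_update < age:
        return "valid"
    if seconds_since_last_update < age + garbage_collect:
        # Aging timer expired: metric is 16, garbage-collect timer running.
        return "unreachable"
    # Garbage-collect timer expired: the route leaves the routing table.
    return "deleted"
```

With the defaults, a route that receives no updates stays valid for 180 seconds, is then held as unreachable, and is deleted 300 seconds after its last update.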
Split horizon
RIP uses split horizon to reduce bandwidth consumption and prevent routing loops.
Implementation
R1 sends R2 a route to network 10.0.0.0/8. If split horizon is not configured, R2 sends the route learned from R1 back to R1. In this manner, R1 learns two routes to network 10.0.0.0/8: one direct route with zero hops, and another route with two hops and R2 as the next hop.
However, only the direct route is active in the RIP routing table of R1. When the route from R1 to network 10.0.0.0/8 becomes unreachable and R2 has not yet received the unreachable-route information, R2 continues to advertise to R1 that network 10.0.0.0/8 is reachable. R1 then accepts this incorrect route information and considers that it can reach network 10.0.0.0/8 through R2, while R2 still considers that it can reach network 10.0.0.0/8 through R1. As a result, a routing loop occurs. After split horizon is configured, R2 does not send the route to network 10.0.0.0/8 back to R1, which prevents the routing loop.
Precautions
Split horizon is disabled on NBMA networks by default.
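The split horizon rule can be sketched in Python from the point of view of R2. The data layout, the interface name to_R1, and the function name advertise are assumptions made for this illustration only.

```python
INFINITY = 16


def advertise(routes, out_interface, split_horizon=True):
    """Return the (prefix, metric) pairs sent out of out_interface.

    routes is a list of (prefix, metric, learned_from) tuples, where
    learned_from is the interface the route was learned on, or None for
    a directly connected network.
    """
    adverts = []
    for prefix, metric, learned_from in routes:
        if split_horizon and learned_from == out_interface:
            continue  # never send a route back where it came from
        adverts.append((prefix, min(metric + 1, INFINITY)))
    return adverts


# R2 learned 10.0.0.0/8 from R1 with one hop and owns 20.0.0.0/8 directly.
r2_routes = [("10.0.0.0/8", 1, "to_R1"), ("20.0.0.0/8", 0, None)]
```

With split horizon disabled, R2 would send 10.0.0.0/8 back to R1 with two hops, which is exactly the second route described above.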
Poison reverse function
Poison reverse helps delete useless routes from the routing table of the peer end.
Implementation
If poison reverse is configured, after receiving route 10.0.0.0/8 from R1, R2 advertises the route back to R1 with its metric set to 16, indicating that the route is unreachable. R1 therefore does not use the route 10.0.0.0/8 learned from R2, which prevents a routing loop.
Precautions
Poison reverse is disabled by default. Generally, split horizon is enabled on Huawei devices (except on NBMA networks) and poison reverse is disabled.
Comparisons between split horizon and poison reverse
Both split horizon and poison reverse can prevent routing loops in RIP. The difference between them is as follows: split horizon avoids advertising a route back to neighbors along the same path, while poison reverse marks the route as unreachable and advertises it back to neighbors along the same path.
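The difference from split horizon can be made concrete with a sketch of the poison reverse rule. As before, the tuple layout, the interface name to_R1, and the function name are assumptions for illustration.

```python
INFINITY = 16


def advertise_with_poison_reverse(routes, out_interface):
    """Return the (prefix, metric) pairs sent out of out_interface.

    routes is a list of (prefix, metric, learned_from) tuples.
    """
    adverts = []
    for prefix, metric, learned_from in routes:
        if learned_from == out_interface:
            # Send the route back, but poisoned with the unreachable metric.
            adverts.append((prefix, INFINITY))
        else:
            adverts.append((prefix, min(metric + 1, INFINITY)))
    return adverts


# R2 learned 10.0.0.0/8 from R1 and owns 20.0.0.0/8 directly.
r2_table = [("10.0.0.0/8", 1, "to_R1"), ("20.0.0.0/8", 0, None)]
```

Unlike split horizon, the update sent back to R1 still contains 10.0.0.0/8, so the packet is larger, but R1 is told explicitly that the path through R2 is unusable.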
Triggered update
Triggered update shortens the network convergence time. When a routing entry changes, a RIP device advertises the change to other devices immediately without waiting for the periodic update. If triggered update is not configured, invalid routes are by default retained in the routing table for a maximum of 300 seconds (aging time plus garbage-collect time).
An update is not triggered when the next-hop address becomes unreachable.
Implementation
After R1 detects a network fault, it sends a route update to R2 immediately without waiting for the update timer to expire. The routing table of R2 is then updated in a timely manner.
Route summarization
RIPv2 supports route summarization. Because RIPv2 packets carry the mask, RIPv2 supports subnetting. Route summarization can improve the scalability and efficiency of large networks and reduce the routing table size.
RIPv2 process-based classful summarization implements automatic summarization.
Interface-based summarization implements manual summarization.
If the routes to be summarized carry tags, the tags are deleted after these routes are summarized into one summary route.
Case
Two routes, 10.1.0.0/16 (metric=10) and 10.2.0.0/16 (metric=2), are summarized into one natural network segment route 10.0.0.0/8 (metric=3).
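The natural network segment in the case above can be derived from the first octet of the address. The following Python sketch shows only that classful derivation; it does not model how a device computes the metric of the summary route, and the function name natural_network is illustrative.

```python
import ipaddress


def natural_network(prefix):
    """Return the classful (natural) network that contains prefix."""
    address = ipaddress.ip_network(prefix).network_address
    first_octet = int(address) >> 24
    if first_octet < 128:
        length = 8    # class A
    elif first_octet < 192:
        length = 16   # class B
    elif first_octet < 224:
        length = 24   # class C
    else:
        raise ValueError("class D and E addresses have no natural network")
    return ipaddress.ip_network(f"{address}/{length}", strict=False)
```

Both 10.1.0.0/16 and 10.2.0.0/16 map to 10.0.0.0/8, which is why automatic summarization advertises a single route for them.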
Working process analysis:
Initial state: A router starts a RIP process, associates an interface with the RIP process, and sends and receives RIP packets on the interface.
Build a routing table: The router builds its routing entries according to the received RIP packets.
Maintain the routing table: The router sends and receives route updates at an interval of 30 seconds to maintain its routing entries.
Age routing entries: The router starts a 180-second aging timer for each of its routing entries. If the router receives an update for a route within 180 seconds, it resets the aging timer of that route.
Garbage-collect entries: If the router does not receive an update for a route within 180 seconds, it sets the metric of the route to 16 and starts the 120-second garbage-collect timer.
Delete routing entries: If the router still does not receive an update for the route within 120 seconds, it deletes the route from the routing table.
Case description
In this case, R1, R2, and R3 reside on network 192.168.1.0/24; R3, R4, and R5 reside on network 192.168.2.0/24. All the routers run RIPv2 and advertise the IP addresses of their connected interfaces. To control route selection on R3, modify the metric of routes.
Remarks
In the IP routing table, only the relevant routing entries are displayed. In the Flags field of a route, R indicates an iterated route, and D indicates that the route has been delivered to the FIB table.
Route iteration works as follows: when the next hop of a route is not directly reachable through an outbound interface of the device, the device looks up the next-hop address in the routing table again. The lookup is repeated until it finds a route with a directly connected outbound interface, which is then used for forwarding.
The FIB table is the forwarding table generated from the routing table. You can run the display fib command to view the forwarding table.
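Route iteration can be illustrated with a small recursive-lookup sketch in Python. The table layout, the interface name GE0/0/0, the addresses, and the function name resolve are all assumptions for this example and do not reflect how VRP stores routes internally.

```python
import ipaddress


def resolve(destination, table, max_depth=8):
    """Return (outbound_interface, forwarding_next_hop), or None.

    table is a list of (prefix, next_hop, interface) tuples. interface is
    None when the route names a next hop that is not directly connected.
    """
    target = ipaddress.ip_address(destination)
    for _ in range(max_depth):  # bound the iteration to avoid lookup loops
        candidates = [
            entry for entry in table
            if target in ipaddress.ip_network(entry[0])
        ]
        if not candidates:
            return None
        # Longest-prefix match among the candidate routes.
        prefix, next_hop, interface = max(
            candidates, key=lambda entry: ipaddress.ip_network(entry[0]).prefixlen
        )
        if interface is not None:
            # A connected route has no next hop; forward to the target itself.
            return interface, next_hop or str(target)
        target = ipaddress.ip_address(next_hop)  # iterate on the next hop
    return None


table = [
    ("172.16.0.0/24", "10.0.0.2", None),        # needs iteration
    ("10.0.0.0/24", "192.168.1.2", "GE0/0/0"),
    ("192.168.1.0/24", None, "GE0/0/0"),        # directly connected
]
```

Here the route to 172.16.0.0/24 has no outbound interface of its own, so the lookup continues with its next hop 10.0.0.2 and ends at the route that does have one.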

Command usage
The rip metricin command increases the metric of a received route. The increased metric is the one recorded when the route is added to the routing table, so running this command affects route selection on the local device and on other devices.
The rip metricout command increases the metric of an advertised route. The metric of the route remains unchanged in the routing table. Running this command does not affect route selection on the local device but affects route selection on other devices.
View
Interface view
Parameters
rip metricout { value | { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } value1 }: sets the additional metric to be added to an advertised route.
value: specifies the metric to be added to an advertised route. The value ranges from 1 to 15 and defaults to 1.
acl-number: specifies a basic ACL number. The value ranges from 2000 to 2999.
acl-name acl-name: specifies an ACL name. The value is case-sensitive.
ip-prefix ip-prefix-name: specifies an IP prefix list name, which must be unique.
value1: specifies the metric to be added to a route that passes the filtering of the ACL or IP prefix list.
Precautions
You can specify value1 to increase the metric of an advertised RIP route that passes the filtering of an ACL or IP prefix list. If a RIP route does not pass the filtering, its metric is increased by 1.
Running the rip metricin/metricout commands will affect route selection of other devices.
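The effect of the two commands on metrics can be modeled with a short sketch. It is a simplified model that assumes metrics are capped at 16 and that a route failing the filter is increased by 1, as stated above; the function names are hypothetical and do not correspond to any device API.

```python
INFINITY = 16


def metric_after_metricin(advertised_metric, metricin=0):
    """Metric stored in the local routing table for a received route."""
    return min(advertised_metric + metricin, INFINITY)


def metric_after_metricout(table_metric, passes_filter, value1, default=1):
    """Metric placed in the update sent to neighbors."""
    added = value1 if passes_filter else default
    return min(table_metric + added, INFINITY)
```

A route that reaches 16 after either adjustment is treated as unreachable by the receiver, so large values effectively filter the route.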
Case description
The topology in this case is the same as that in the previous case. To prevent interfaces from sending or receiving route updates, suppress the interfaces or run the undo rip input/output commands.
Command usage
The silent-interface command suppresses an interface so that it can receive but not send RIP packets. If an interface is suppressed, direct routes of the network segment where the interface resides can still be advertised through other interfaces. This command can be used together with the peer (RIP) command to advertise routes to a specified device.
The undo rip output and undo rip input commands prohibit an interface from sending and receiving RIP packets, respectively.
View
silent-interface: RIP view
undo rip output/input: interface view
Parameters
silent-interface { all | interface-type interface-number }
all: suppresses all the interfaces.
Precautions
After all the interfaces are suppressed, no individual interface can be activated. That is, the silent-interface all command has the highest priority. In this case, all the interfaces of R4 are suppressed, so no interface of R4 can be activated.
Configuration verification
The display ip routing-table command output shows that R3 receives the update for route 172.16.0.0/24 from R5 but not from R4, and receives the update for route 10.0.0.0/24 from R1 but not from R2.
Case description
The topology in this case is the same as that in the previous case. To prevent a device from receiving routes from a specified neighbor, run the filter-policy gateway command.
Command usage
The filter-policy { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } import command filters received routes based on an ACL or an IP prefix list.
The filter-policy gateway ip-prefix-name import command filters routes based on the advertising gateway.
View
filter-policy { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } import: RIP view
filter-policy gateway ip-prefix-name import: RIP view
Parameters
filter-policy { acl-number | acl-name acl-name | ip-prefix ip-prefix-name } import
acl-number: specifies the number of a basic ACL used to filter the destination address of routes.
acl-name acl-name: specifies the name of an ACL. The name is case-sensitive and must start with a letter.
ip-prefix: filters routes based on an IP prefix list.
ip-prefix-name: specifies the name of an IP prefix list used to filter the destination address of routes.
filter-policy gateway ip-prefix-name import
gateway: filters routes based on the advertising gateway.
ip-prefix-name: specifies the name of the IP prefix list that matches the advertising gateway.
Run the filter-policy gateway command to filter routes from a specified neighbor. In this case, routes from R4 are filtered on R3.
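The idea of filtering by advertising gateway can be sketched as follows. This is a conceptual model only: it treats the prefix list as a plain list of denied networks, whereas a real IP prefix list has ordered permit and deny entries. The neighbor addresses used for R4 and R5 are hypothetical.

```python
import ipaddress


def filter_by_gateway(updates, denied_gateways):
    """Drop routes whose advertising gateway matches a denied prefix.

    updates is a list of (route_prefix, gateway_ip) pairs.
    """
    denied = [ipaddress.ip_network(prefix) for prefix in denied_gateways]
    accepted = []
    for route_prefix, gateway in updates:
        gateway_ip = ipaddress.ip_address(gateway)
        if any(gateway_ip in network for network in denied):
            continue  # the route came from a filtered neighbor
        accepted.append((route_prefix, gateway))
    return accepted


# Hypothetical neighbor addresses: R4 = 192.168.2.4, R5 = 192.168.2.5.
updates = [("172.16.0.0/24", "192.168.2.4"), ("172.16.0.0/24", "192.168.2.5")]
```

Filtering on the gateway rather than on the destination keeps the route itself available from other neighbors, which matches the outcome of this case.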
Case description
To reduce routing entries, Company A decides to summarize routes. RIPv2 summarization includes automatic summarization based on the main class network and manual summarization. You can perform automatic summarization on R1 and manual summarization on R3 and R4.

Command usage
summary [ always ]: enables classful summarization, so that summary routes are advertised at the natural network boundary. Classful summarization is enabled for RIPv2 by default. However, if split horizon or poison reverse is configured, summarization does not take effect. When the always parameter is configured, RIPv2 automatic summarization takes effect regardless of the split horizon or poison reverse configuration.
rip summary-address ip-address mask [ avoid-feedback ]: configures a RIP router to advertise a local summary IP address. If the avoid-feedback keyword is configured, the local interface does not learn the summary route to the advertised summary IP address. This configuration prevents routing loops.
View
summary [ always ]: RIP view
rip summary-address ip-address mask [ avoid-feedback ]: interface view
Parameters
summary [ always ]
always: If the always parameter is not configured, classful summarization becomes ineffective when split horizon or poison reverse is configured. Therefore, when summary routes are to be advertised at the natural network boundary without the always parameter, split horizon and poison reverse must be disabled in the corresponding views.
rip summary-address ip-address mask [ avoid-feedback ]
ip-address: specifies a summary IP address.
mask: specifies a network mask.
avoid-feedback: avoids learning the summary route to the advertised summary IP address from the interface.
Case description
In this case, R1 and R2 connect over network 192.168.1.0/24. R1 connects to network 10.0.0.0/24, and R2 connects to network 172.16.0.0/24. Devices on the network run RIPv2 and import the routes to networks where the devices reside. Only the display command output of R1 is provided and only information about this case is displayed.

Command usage
timers rip update age garbage-collect: adjusts the RIP timers.
rip authentication-mode md5 nonstandard password-key key-id: configures the MD5 authentication mode. The nonstandard keyword indicates that MD5 authentication packets use the nonstandard packet format (IETF standards).
rip replay-protect [ window-range ]: enables the replay-protect function. window-range specifies the receive or transmit buffer size for connections. The default value is 50.
View
timers rip update age garbage-collect: RIP view
rip authentication-mode md5 nonstandard password-key key-id: interface view
rip replay-protect [ window-range ]: interface view
Parameters
timers rip update age garbage-collect
update: specifies the interval for transmitting route updates.
age: specifies the route aging time.
garbage-collect: specifies the interval after which an unreachable route is deleted from the routing table, namely, the garbage-collect time defined in standards.
Precautions
If the three timers are configured incorrectly, routes become unstable. The update time must be shorter than the aging time. For example, if the update time is longer than the aging time, a route can age out on a neighbor before the next periodic update for it arrives.
In applications, the timeout period of the garbage-collect timer is not fixed. When the update timer is set to 30 seconds, the garbage-collect timer may range from 90 to 120 seconds. The reason is as follows: before the RIP router deletes an unreachable route from the routing table, it sends Update packets four times to advertise the route with its metric set to 16, so that all the neighbors learn that the route is unreachable. Because a route may become unreachable at any time within an update period, the garbage-collect timer is 3 to 4 times the update timer.
Assume that the Identification field (a field in the IP header) of the last RIP packet sent before a RIP interface goes Down is X. After the interface goes Up again, the Identification field of the next RIP packet sent restarts from 0, and subsequent RIP packets are discarded until a RIP packet with the Identification field X+1 is received. As a result, RIP routing information becomes unsynchronized and is lost between the two ends. To address this issue, configure the rip replay-protect command so that the RIP interface obtains the Identification field of the last RIP packet sent before it went Down and increases the Identification field in the subsequent RIP packet by 1.
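The timer precautions above can be turned into a small sanity check. This sketch only encodes the two rules stated in the text (update shorter than age, and an effective garbage-collect time of 3 to 4 update intervals); the function names are illustrative and no device validates timers through this code.

```python
def check_rip_timers(update, age):
    """Return a list of problems found in the timer values (in seconds)."""
    problems = []
    if update <= 0 or age <= 0:
        problems.append("timers must be positive")
    if update >= age:
        problems.append("update time must be shorter than aging time")
    return problems


def garbage_collect_window(update):
    """Effective garbage-collect time range: 3 to 4 update intervals."""
    return 3 * update, 4 * update
```

With the default update interval of 30 seconds, the window is 90 to 120 seconds, which matches the range given in the precautions.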
1. Check whether ARP is working properly.
2. Check whether the related interfaces are Up.
3. Check whether RIP is enabled on the interfaces. Run the display current-configuration configuration rip command to view information about the RIP-enabled network segment. Check whether the interfaces reside on the network segment. The network address specified in the network command must be a natural network address.
4. Check whether the versions of the RIP packets sent by the peer end and received by the local end match. By default, an interface sends only RIPv1 packets but can receive RIPv1 and RIPv2 packets. When an inbound interface receives RIP packets of a different version, RIP routes may fail to be correctly received.
5. Check whether a routing policy is configured to filter received RIP routes. If so, modify the routing policy.
6. Check whether UDP port 520 is disabled.
7. Check whether the undo rip input/output commands are configured on the interfaces or whether a high metric is configured using the rip metricin command.
8. Check whether the interfaces are suppressed.
9. Check whether the route metric has reached 16 or larger.
10. Check whether the interface authentication modes on the two ends match. If packet authentication fails, correctly configure the interface authentication modes.
1. Check whether RIP is enabled on the interfaces. Run the display current-configuration configuration rip command to view information about the RIP-enabled network segment. Check whether the interfaces reside on the network segment. The network address specified in the network command must be a natural network address.
2. Check whether the versions of the RIP packets sent by the peer end and received by the local end match. By default, an interface sends only RIPv1 packets but can receive RIPv1 and RIPv2 packets. When an inbound interface receives RIP packets of a different version, RIP routes may fail to be correctly received.
3. Check whether a routing policy is configured to filter received RIP routes. If so, modify the routing policy.
4. Check whether UDP port 520 is disabled.
5. Check whether the undo rip input/output commands are configured on the interfaces or whether a high metric is configured using the rip metricin command.
6. Check whether the interfaces are suppressed.
7. Check whether the route metric has reached 16 or larger.
8. Check whether the interface authentication modes on the two ends match. If packet authentication fails, correctly configure the interface authentication modes.
Case description
In this case, R1 connects to R2 through a frame relay network. R1 connects to network 10.X.X.0/24, and R2 connects to network 172.16.X.0/24.
Analysis process
In the pre-configurations of R1 and R2, the frame relay configuration supports multicast.
R1 sends version 2 Update packets to R2 in multicast.
R1 and R2 can learn routes from each other.
Results
By default, the peer command makes a router send packets in unicast but does not suppress the multicast packets. Therefore, when configuring this command, it is recommended that you also set the related interfaces to silent mode. In this way, multicast packets are suppressed and only unicast packets are sent.
Results
The display rip route command displays the RIP routes learned from other routers and the values of the timers for the routes. The Tag field indicates whether a RIP route is an internal or external route. The default value is 0. The Flags field indicates whether a RIP route is active or inactive. The value RA indicates an active RIP route, and the value RG indicates an inactive RIP route for which the garbage-collect timer has been started.
Results
After the avoid-feedback keyword is specified, the local interface does not learn the summary route to the advertised summary IP address, which prevents routing loops.
The filter-policy export command configures a filtering policy for the routes to be advertised. Only routes that pass the filter are added to the routing table and advertised through Update packets.
Case description
In this topology, R1, R2, and R3 connect to the same
broadcast domain. R3 connects to network 172.16.X.0/24 and
advertises routes to RIP.
Analysis process
In requirements 1 and 3, R1 is taken as an example. The
command output shows that R1 sends multicast packets and
does not enable authentication.
Before requirement 2 is met, R1 can receive all routes to
172.16.X.0/24.
Results
The RIP authentication command can be configured only on an
interface. Huawei devices support standard MD5
authentication and a Huawei proprietary authentication mode.
You can run the display rip process-id interface interface-type
verbose command to view the authentication mode.
Parameters
rip authentication-mode { simple password | md5 { nonstandard
{ password-key1 key-id | keychain keychain-name } | usual
password-key2 } }
simple: indicates plain-text authentication.
password: specifies the plain-text authentication password.
md5: indicates MD5 cipher-text authentication.
nonstandard: indicates that MD5 cipher-text authentication
packets use the nonstandard packet format (IETF standards).
password-key1: specifies the authentication password in
cipher text.
key-id: specifies the key ID in MD5 cipher-text authentication.
keychain keychain-name: specifies a keychain name.
usual: indicates that MD5 cipher-text authentication
packets use the universal packet format (namely,
private standards).
password-key2: specifies the cipher-text authentication
password.
Precautions
Only one authentication password is used for each
authentication. If multiple authentication passwords are
configured, only the latest one takes effect. The authentication
password cannot contain spaces.
Results
Only an ACL can be used here; an IP prefix list cannot be used.
When defining the ACL, make sure to use a wildcard mask. In this
case, set the wildcard-mask bits that must be matched to 0 and
the other bits to 1.
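
The wildcard-mask rule above can be illustrated with a minimal Python sketch. It is not device code: the function name acl_wildcard_match and the sample addresses are hypothetical, chosen only to show that a 0 bit in the wildcard mask means "must match" and a 1 bit means "ignore".

```python
import ipaddress


def acl_wildcard_match(address, rule_address, wildcard):
    """Return True if address matches rule_address under the wildcard mask.

    Hypothetical helper for illustration: wildcard bits set to 0 must match,
    wildcard bits set to 1 are ignored.
    """
    addr = int(ipaddress.IPv4Address(address))
    rule = int(ipaddress.IPv4Address(rule_address))
    wild = int(ipaddress.IPv4Address(wildcard))
    care_bits = ~wild & 0xFFFFFFFF  # invert the wildcard to get the bits to compare
    return (addr & care_bits) == (rule & care_bits)


# Assumed example: wildcard 0.0.254.0 leaves only the lowest bit of the third
# octet to be compared, so the rule selects the even-numbered 172.16.X.0 networks.
print(acl_wildcard_match("172.16.2.0", "172.16.0.0", "0.0.254.0"))
print(acl_wildcard_match("172.16.3.0", "172.16.0.0", "0.0.254.0"))
```

A prefix list matches on a contiguous prefix length, so a non-contiguous pattern like the assumed one above can only be written with a wildcard mask.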
Results
RIPv2 multicasts Update packets by default. You can run the
rip version 2 broadcast command in the interface view to
configure RIPv2 to broadcast Update packets.
IS-IS Overview
IS-IS is a dynamic routing protocol designed by the
International Organization for Standardization (ISO) for its
Connectionless Network Protocol (CLNP).
The Internet Engineering Task Force (IETF) extended and
modified IS-IS so that it can be applied to both TCP/IP and
OSI environments. This version of IS-IS is called Integrated
IS-IS.
IS-IS Terms
Connectionless network service (CLNS)
CLNS consists of the following three protocols:
CLNP: is similar to the Internet Protocol (IP) of TCP/IP.
IS-IS: is a routing protocol between intermediate systems,
that is, a protocol between routers.
ES-IS: End System to Intermediate System, is similar to the
Address Resolution Protocol (ARP) and Internet Control
Message Protocol (ICMP) of IP.
NSAP: Open Systems Interconnection (OSI) uses the
network service access point (NSAP) to locate various
services at the transport layer on OSI networks. An NSAP
is similar to an IP address.
Note for Integrated IS-IS
Integrated IS-IS applies to TCP/IP and OSI environments.
Unless otherwise specified, the IS-IS protocol in this
material refers to Integrated IS-IS.

Overall IS-IS Topology

To support large-scale routing networks, IS-IS adopts a
two-level hierarchy consisting of a backbone area and
non-backbone areas in an autonomous system (AS). Generally,
Level-1 routers are deployed in non-backbone areas,
whereas Level-2 and Level-1-2 routers are deployed in the
backbone area. Each non-backbone area connects to the
backbone area through a Level-1-2 router.
Topology Introduction
The figure shows a network that runs IS-IS. The network
topology is similar to the multi-area topology of an OSPF
network. The backbone area contains all routers in area
47.0001 and the Level-1-2 routers in other areas.
In addition, Level-2 routers can be in different areas.
The topology differences between IS-IS and OSPF are as follows:
In OSPF, a link can belong to only one area. In IS-IS, a link
can belong to different areas.
In IS-IS, no area is physically defined as the backbone or
non-backbone area. In OSPF, Area 0 is defined as the
backbone area.
In IS-IS, Level-1 and Level-2 routers each use the shortest path
first (SPF) algorithm to generate their own shortest path trees
(SPTs). In OSPF, the SPF algorithm is used only within
an area, and inter-area routes are forwarded by the
backbone area.
Level-1 Router
A Level-1 router manages intra-area routing. It establishes
neighbor relationships only with Level-1 and Level-1-2
routers in the same area. A Level-1 router maintains a
Level-1 link state database (LSDB), which contains routes
in the local area.
A Level-1 router forwards packets destined for other areas
to the nearest Level-1-2 router.
A Level-1 router connects to other areas through a Level-1-2
router.
Level-2 Router
A Level-2 router manages inter-area routing. It can
establish neighbor relationships with Level-2 routers in the
same area or in other areas, as well as with Level-1-2 routers.
A Level-2 router maintains a Level-2 LSDB, which contains
all routes in the IS-IS network.
All Level-2 routers form the backbone network of the
routing domain. They establish Level-2 neighbor
relationships and are responsible for inter-area
communication. Level-2 routers in the routing domain must
be physically contiguous to ensure the continuity of the
backbone network.
Level-1-2 Router
A router that belongs to both a Level-1 area and a Level-2
area is called a Level-1-2 router. It can establish Level-1
neighbor relationships with Level-1 and Level-1-2 routers in
the same area.
It can also establish Level-2 neighbor relationships with
Level-2 and Level-1-2 routers in the same area or in other
areas.
A Level-1 router connects to other areas through a Level-1-2
router.
A Level-1-2 router maintains a Level-1 LSDB for intra-area
routing and a Level-2 LSDB for inter-area routing.
Network Types Supported by IS-IS

For a non-broadcast multiple access (NBMA) network such
as a frame relay (FR) network, you need to configure
sub-interfaces and set the sub-interface type to point-to-point
(P2P). IS-IS cannot run on point-to-multipoint (P2MP) links.
DIS
In a broadcast network, IS-IS needs to elect a designated
intermediate system (DIS) from all the routers.
The Level-1 DIS and Level-2 DIS are elected separately.
The router with the highest DIS priority is elected as the
DIS. If multiple routers have the highest DIS
priority, the router with the largest MAC address is elected
as the DIS.
You can set different DIS priorities for electing DISs of
different levels.
A router whose DIS priority is 0 can also participate in a
DIS election, and the election supports preemption.
All routers (including non-DIS routers) of the same level
and on the same network segment establish adjacencies.
However, LSDB synchronization is ensured by the DIS.
The DIS creates and updates the pseudonode and
generates the link state protocol data units (LSPs) of the
pseudonode. These LSPs describe the network devices
on the network.
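
The DIS election rule described above (highest priority first, largest MAC address as the tie-breaker, priority 0 still eligible, preemption allowed) can be sketched in a few lines of Python. The function elect_dis, the router names, the priority values, and the MAC addresses are all hypothetical illustration values, not output from a device.

```python
def elect_dis(routers):
    """Pick the DIS from a list of (name, priority, mac) tuples.

    Illustrative sketch: the highest priority wins, and the numerically
    largest MAC address breaks a tie. A priority of 0 is still eligible.
    """
    def rank(router):
        _name, priority, mac = router
        return (priority, int(mac.replace("-", ""), 16))

    return max(routers, key=rank)[0]


# Assumed sample data for one broadcast segment and one level.
segment = [
    ("R1", 64, "00-e0-fc-00-00-01"),
    ("R2", 64, "00-e0-fc-00-00-02"),
    ("R3", 0, "00-e0-fc-00-00-03"),
]
print(elect_dis(segment))  # R1 and R2 tie on priority; R2 has the larger MAC

# Preemption: when a better candidate joins, rerunning the election replaces the DIS.
segment.append(("R4", 100, "00-e0-fc-00-00-00"))
print(elect_dis(segment))
```

Because the election is simply re-evaluated whenever the set of candidates changes, no backup DIS is needed in this model.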
Pseudonode
A pseudonode is used to simulate a virtual node in the
broadcast network. It is not a real router. In IS-IS, a
pseudonode is identified by the system ID of the DIS and
the 1-byte Circuit ID (its value is not 0). The use of
pseudonodes simplifies the network topology.
When the network changes, fewer LSPs are generated,
and the SPF calculation consumes fewer
resources.
Differences Between DIS in IS-IS and designated router (DR)/backup
designated router (BDR) in OSPF
In an IS-IS broadcast network, a router whose priority is 0
also takes part in DIS election. In an OSPF network, a
router whose priority is 0 does not take part in DR election.
In an IS-IS broadcast network, when a new router that
meets the requirements of being a DIS connects to the
network, the router is elected as the new DIS, and the
previous pseudonode is deleted. This causes a new
flooding of LSPs. In an OSPF network, when a new router
connects to the network, it is not immediately elected as the
DR even if it has the highest DR priority.
In an IS-IS broadcast network, all routers (including
non-DIS routers) of the same level and on the same network
segment establish adjacencies.
NSAP
An NSAP consists of the initial domain part (IDP) and the domain
specific part (DSP). The lengths of the IDP and DSP are variable.
The maximum length of the NSAP is 20 bytes and its minimum
length is 8 bytes.
The IDP is similar to the network ID in an IP address. It is defined
by the ISO and consists of the authority and format identifier (AFI)
and the initial domain identifier (IDI). The AFI indicates the address
allocation authority and address format, and the IDI identifies a
domain.
The DSP is similar to the subnet ID and host address in an IP
address. It consists of the High Order DSP (HODSP), system ID,
and NSAP selector (SEL). The HODSP is used to divide areas,
the system ID identifies a host, and the SEL indicates the service
type.
The area address (area ID) consists of the IDP and the HODSP of
the DSP. It identifies a routing domain and the areas in a routing
domain. An area address is similar to an area number in OSPF.
Routers in the same Level-1 area must have the same area
address, while routers that belong to the Level-2 area can have
different area addresses.
A system ID uniquely identifies a host or router in an area. On a
device, the system ID has a fixed length of 48 bits (6 bytes).
Generally, the device's router ID is converted into a system ID.
The SEL provides functions similar to those of the protocol identifier
of IP. A transport protocol matches an SEL. The SEL is always 00 in IP.
NET
A NET indicates network layer information about a device. A
NET can be regarded as a special NSAP (SEL is 00). The NET
length is the same as the NSAP length. Its maximum length is 20
bytes and its minimum length is 8 bytes. When configuring IS-IS on a
router, you only need to consider the NET, not the NSAP.
A maximum of three NETs can be configured during IS-IS
configuration. When configuring multiple NETs, ensure that their
system IDs are the same.
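
The NET layout above (area address, then a 6-byte system ID, then a 1-byte SEL of 00) can be checked with a short Python sketch. The helper names parse_net and router_id_to_system_id and the sample values are hypothetical. The router-ID conversion shown (pad each octet to three digits, then regroup into three blocks of four) is one widely used convention and is an assumption here, since the text does not spell out the conversion.

```python
def parse_net(net):
    """Split a dotted-hex NET into (area_id_hex, system_id, sel).

    Illustrative sketch: the last byte is the SEL, the 6 bytes before it are
    the system ID, and everything in front is the area address.
    """
    digits = net.replace(".", "")
    int(digits, 16)  # raises ValueError if the NET is not hexadecimal
    if len(digits) % 2 != 0 or not 16 <= len(digits) <= 40:
        raise ValueError("a NET must be 8 to 20 bytes long")
    area_id = digits[:-14]
    system_id = digits[-14:-2]
    sel = digits[-2:]
    if sel != "00":
        raise ValueError("the SEL of a NET must be 00")
    dotted_system_id = ".".join(system_id[i:i + 4] for i in range(0, 12, 4))
    return area_id, dotted_system_id, sel


def router_id_to_system_id(router_id):
    """Convert a dotted-decimal router ID into a system ID (assumed convention)."""
    padded = "".join(f"{int(octet):03d}" for octet in router_id.split("."))
    return ".".join(padded[i:i + 4] for i in range(0, 12, 4))


# Assumed sample values.
print(parse_net("47.0001.0000.0000.0001.00"))
print(router_id_to_system_id("192.168.1.1"))
```

When several NETs are configured on one router, calling parse_net on each of them and comparing the returned system IDs is a quick way to confirm that they are the same.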
Hello PDU (IIH)

Level-1 LAN IIHs apply to Level-1 routers on broadcast networks.
Level-2 LAN IIHs apply to Level-2 routers on broadcast networks.
P2P IIHs apply to non-broadcast networks.
Compared with a LAN IIH, a P2P IIH does not have the Priority and
LAN ID fields, but has a Local Circuit ID field. The Priority field
indicates the DIS priority in a broadcast network, the LAN ID field
indicates the system ID of the DIS and the pseudonode, and the
Local Circuit ID field indicates the local link ID.
IIHs are padded to the maximum size so that two neighbors can
negotiate the MTU.
LSP
LSPs are similar to link-state advertisements (LSAs) in OSPF.
Level-1 routers transmit Level-1 LSPs.
Level-2 routers transmit Level-2 LSPs.
Level-1-2 routers transmit both Level-1 and Level-2 LSPs.
The ATT, OL, and IS-Type fields are major fields in an LSP. The
ATT field indicates that the originating router is connected to
other areas. The OL field identifies the overload state. The IS-Type
field indicates whether the router that generates the LSP is a Level-1
router or Level-2 router (the value 01 indicates Level-1 and the
value 11 indicates Level-2).
The LSP update interval is 15 minutes and the aging time is 20
minutes. However, an expired LSP is kept in the database for
an additional 60 seconds (known as ZeroAgeLifetime) before it is
cleared. The LSP retransmission time is 5 seconds.
Sequence number PDU (SNP)
An SNP contains summary information of the LSDB and is used
to maintain LSDB integrity and synchronization.
Complete SNPs (CSNPs) carry summaries of all LSPs in the LSDB,
ensuring LSDB synchronization between neighboring routers. In a
broadcast network, the DIS periodically sends CSNPs. The
default interval for sending CSNPs is 10 seconds. On a P2P link,
CSNPs are sent only when the neighbor relationship is
established for the first time.
Partial SNPs (PSNPs) carry summaries of some LSPs in the LSDB,
and are used to request and acknowledge LSPs.
Initial Packet Structure of an IS-IS PDU
Intra domain routing protocol discriminator
This field has a fixed value of 0x83 in all IS-IS PDUs.
PDU header length indicator
It identifies the length of the fixed header field.
Version/protocol ID extension
It has a fixed value of 1.
System ID length
It indicates the system ID length and has a fixed
value of 6 bytes.
PDU type
It identifies the PDU type.
Version
It has a fixed value of 1.
Reserved
It is set to all zeros.
Max areas
It indicates the maximum number of areas supported
by the intermediate system (IS). If the value is 3, the
IS supports a maximum of three areas.
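
The eight one-byte fields listed above can be decoded with a short Python sketch. This is a minimal illustration, not a full IS-IS decoder: the function name parse_common_header, the sample bytes, and the PDU type table are assumptions added here. The type codes are the ISO/IEC 10589 values as best recalled and should be verified against the standard before use.

```python
import struct

# PDU type codes (assumed from ISO/IEC 10589; verify before relying on them).
PDU_TYPES = {
    15: "Level-1 LAN IIH", 16: "Level-2 LAN IIH", 17: "P2P IIH",
    18: "Level-1 LSP", 20: "Level-2 LSP",
    24: "Level-1 CSNP", 25: "Level-2 CSNP",
    26: "Level-1 PSNP", 27: "Level-2 PSNP",
}


def parse_common_header(data):
    """Decode the 8-byte common header found at the start of an IS-IS PDU."""
    if len(data) < 8:
        raise ValueError("the common header is 8 bytes long")
    (discriminator, header_length, version_ext, id_length,
     pdu_type, version, reserved, max_areas) = struct.unpack("!8B", data[:8])
    if discriminator != 0x83:
        raise ValueError("not an IS-IS PDU")
    pdu_type &= 0x1F  # the upper three bits of this byte are reserved
    return {
        "header_length": header_length,
        "version_ext": version_ext,
        "id_length": id_length,
        "pdu_type": pdu_type,
        "pdu_name": PDU_TYPES.get(pdu_type, "unknown"),
        "version": version,
        "max_areas": max_areas,
    }


# Hypothetical sample: a Level-2 LAN IIH header with the values described above.
sample = bytes([0x83, 27, 1, 6, 16, 1, 0, 3])
print(parse_common_header(sample))
```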

IIHs on a P2P link
Circuit type
It indicates the level of the router that sends the PDU.
If this field is set to 0, the PDU will be ignored.
System ID
It indicates the system ID of the originating router
that sends the IIH.
Holding time
It indicates how long the peer router waits for
the originating router to send the next IIH.
PDU length
It indicates the PDU length.
Local circuit ID
It is allocated to the local circuit by the originating
router when the router sends IIHs. This ID is unique
on the router interface. On the other end of the P2P
link, the circuit ID contained in IIHs may be the same
as or different from the local circuit ID.
Area address TLV
It indicates the area address of the originating router.
IP interface address TLV
It indicates the interface address or IP address of the
router that sends the PDU.
Protocol supported TLV
It indicates protocol types supported by the
originating router, such as IP, CLNP, and IPv6.
Restart option TLV
It is used for graceful restart.
Point-to-point adjacency state TLV
It indicates that three-way handshake is supported.
Multi topology TLV
It indicates that multi-topology is supported.
Padding TLV
It indicates that IIH padding is supported.
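
The TLVs listed above all share the same layout: a one-byte type, a one-byte length, and then the value. A minimal Python sketch of walking such a TLV area follows. The function name iter_tlvs and the sample bytes are hypothetical. The type codes used in the sample (1 for area address, 129 for protocol supported, 132 for IP interface address) and the IPv4 protocol identifier 0xCC are given from memory of the IS-IS specifications and should be treated as assumptions.

```python
def iter_tlvs(payload):
    """Yield (type, value) pairs from the variable-length part of an IS-IS PDU."""
    offset = 0
    while offset < len(payload):
        if offset + 2 > len(payload):
            raise ValueError("truncated TLV header")
        tlv_type = payload[offset]
        length = payload[offset + 1]
        value = payload[offset + 2:offset + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        yield tlv_type, value
        offset += 2 + length


# Hypothetical TLV area of an IIH:
#   area address TLV carrying area 47.0001 (length-prefixed),
#   protocol supported TLV carrying IPv4,
#   IP interface address TLV carrying 10.0.0.1.
sample = bytes.fromhex("010403470001" "8101cc" "84040a000001")
for tlv_type, value in iter_tlvs(sample):
    print(tlv_type, value.hex())
```

A receiver that does not recognize a type can skip the TLV by its length, which is how new TLVs can be added without breaking older implementations.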
LSP
PDU length
Remaining lifetime
It indicates the time before an LSP expires.
LSP ID
It consists of the system ID, pseudonode ID, and LSP
number.
The value 0000.0000.0001.00-00 indicates a
common LSP. A nonzero pseudonode ID indicates a
pseudonode LSP. A nonzero LSP number with a zero
pseudonode ID indicates a fragment of a common LSP.
Sequence number
It indicates the sequence number of the LSP. The
value starts from 0 and increases by 1. The
maximum value is 2^32-1.
Checksum
It indicates the checksum. The checksum covers the
fields after the Remaining lifetime field to the end of the LSP.
P bit
It is used to repair segmented areas and is similar to
the OSPF virtual link. Most vendors do not support
this feature.
ATT bit
It indicates that the originating router is connected to
one or multiple areas.
OL bit
It identifies the overload state.
IS type
It indicates the router type.
Protocol supported TLV
It indicates protocol types supported by the
originating router, such as IP, CLNP, and IPv6.
Area address TLV
It indicates the area address of the originating router.
IS reachability TLV
It is used to list neighbors of the originating router.
IP interface address TLV
It indicates the interface address or IP address of the
router that sends the PDU.
IP internal reachability TLV
It indicates that the IP address is internally reachable.
It is used to advertise the IP address and related
mask information of the area that directly connects to
the router that sends the LSP. A pseudonode LSP
does not contain this TLV.
CSNP and PSNP
PDU length
Source-ID
It indicates the system ID of the originating router.
Start LSP-ID
It starts from 0000.0000.0000.00-00.
End LSP-ID
It ends at ffff.ffff.ffff.ff-ff.
LSP entries
LSP summary information
Routers of different levels cannot establish neighbor relationships.
Level-2 routers cannot establish neighbor relationships with Level-1
routers. However, Level-1-2 routers can establish Level-1 neighbor
relationships with Level-1 routers in the same area, and establish
Level-2 neighbor relationships with Level-2 routers in the same area or
in different areas.
Level-1 routers can only establish Level-1 neighbor relationships with
Level-1 or Level-1-2 routers in the same area.
IP addresses of IS-IS interfaces on both ends of a link must be on the
same network segment.
According to IS-IS principles, the establishment of IS-IS neighbor
relationships is irrelevant to IP addresses. Therefore, routers that
establish neighbor relationships may be on different network
segments. To solve this problem, Huawei devices check the
network segment of routers to ensure that IS-IS neighbor
relationships are correctly established.
You can configure interfaces not to check IP addresses on a P2P
network if the network does not need to check the IP addresses.
In a broadcast network, you need to simulate Ethernet interfaces
as P2P interfaces before configuring the interfaces not to check
IP addresses.
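
The level and area rules above can be condensed into a small Python check. This is only a sketch of the level and area conditions: the function name adjacency_levels and the level labels "L1", "L2", and "L1-2" are hypothetical, and the IP network segment check described above is deliberately left out.

```python
L1_CAPABLE = {"L1", "L1-2"}
L2_CAPABLE = {"L2", "L1-2"}


def adjacency_levels(level_a, area_a, level_b, area_b):
    """Return the set of neighbor relationship levels two routers can form.

    Level-1 neighbor relationships need the same area on both routers.
    Level-2 neighbor relationships are allowed within or across areas.
    """
    levels = set()
    if level_a in L1_CAPABLE and level_b in L1_CAPABLE and area_a == area_b:
        levels.add("L1")
    if level_a in L2_CAPABLE and level_b in L2_CAPABLE:
        levels.add("L2")
    return levels


# Assumed sample routers and areas.
print(adjacency_levels("L1", "47.0001", "L2", "47.0001"))      # no relationship
print(adjacency_levels("L1", "47.0001", "L1-2", "47.0001"))    # Level-1 only
print(adjacency_levels("L1-2", "47.0001", "L2", "47.0002"))    # Level-2 only
print(adjacency_levels("L1-2", "47.0001", "L1-2", "47.0001"))  # both levels
```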
Two routers running IS-IS need to establish a neighbor relationship
before exchanging protocol packets to implement routing. On different
networks, the modes for establishing IS-IS neighbor relationships are
different.
In a broadcast network, routers exchange LAN IIHs to establish
neighbor relationships. LAN IIHs are classified into Level-1 LAN IIHs
(with the multicast MAC address 01-80-C2-00-00-14) and Level-2 LAN
IIHs (with the multicast MAC address 01-80-C2-00-00-15). Level-1
routers exchange Level-1 LAN IIHs to establish neighbor relationships.
Level-2 routers exchange Level-2 LAN IIHs to establish neighbor
relationships. Level-1-2 routers exchange Level-1 LAN IIHs and Level-2
LAN IIHs to establish neighbor relationships.
In this example, two Level-2 routers establish a neighbor relationship
on a broadcast link.
R1 multicasts a Level-2 LAN IIH (with the multicast MAC address
01-80-C2-00-00-15) with no neighbor ID specified.
R2 receives the packet and sets the status of the neighbor
relationship with R1 to Initial. R2 then responds to R1 with a
Level-2 LAN IIH, indicating that R1 is a neighbor of R2.
R1 receives the packet and sets the status of the neighbor
relationship with R2 to Up. R1 then responds to R2 with a Level-2
LAN IIH, indicating that R2 is a neighbor of R1.
R2 receives the packet and sets the status of the neighbor
relationship with R1 to Up. R1 and R2 successfully establish a
neighbor relationship.
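
The exchange above can be replayed with a toy Python model. It is a sketch under simplifying assumptions: the class LanRouter and its methods are hypothetical, an IIH is reduced to the sender's ID plus the list of neighbors it has heard, and timers, levels, and DIS election are ignored.

```python
class LanRouter:
    """Toy model of neighbor state handling on a broadcast link."""

    def __init__(self, system_id):
        self.system_id = system_id
        self.neighbors = {}  # neighbor system ID -> "Initial" or "Up"

    def build_hello(self):
        # A LAN IIH lists every neighbor this router has heard from so far.
        return {"source": self.system_id, "neighbors": sorted(self.neighbors)}

    def receive_hello(self, hello):
        # The neighbor is Up only if it reports having heard this router.
        heard_us = self.system_id in hello["neighbors"]
        self.neighbors[hello["source"]] = "Up" if heard_us else "Initial"


r1, r2 = LanRouter("R1"), LanRouter("R2")

r2.receive_hello(r1.build_hello())  # IIH with no neighbor: R2 sets R1 to Initial
print(r2.neighbors)
r1.receive_hello(r2.build_hello())  # IIH lists R1: R1 sets R2 to Up
print(r1.neighbors)
r2.receive_hello(r1.build_hello())  # IIH lists R2: R2 sets R1 to Up
print(r2.neighbors)
```

Requiring a router to see its own ID in the neighbor's IIH is what guarantees that the link works in both directions before the relationship is declared Up.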
The network is a broadcast network, so a DIS needs to be elected.
After the neighbor relationship is established, routers wait for two
Hello intervals before sending Hello PDUs to elect the DIS. Hello PDUs
exchanged by the routers contain the Priority field. The router with the
highest priority is elected as the DIS. If the routers have the same
priority, the router with the largest interface MAC address is elected as
the DIS. In an IS-IS network, the DIS sends Hello PDUs at an interval
of 10/3 seconds, and non-DIS routers send Hello PDUs at an interval of
10 seconds.
Differences between IS-IS Adjacencies and OSPF Adjacencies
In IS-IS, two neighbor routers establish an adjacency if they
exchange Hello PDUs. In OSPF, two routers establish a neighbor
relationship if they are in 2-Way state, and establish an adjacency
if they are in Full state.
In IS-IS, a router whose priority is 0 can participate in a DIS
election. In OSPF, a router whose priority is 0 does not take part
in DR election.
In IS-IS, the DIS election is based on preemption. In OSPF, a
router cannot preempt the DR or BDR role once the DR or BDR has
been elected.
Unlike the establishment of a neighbor relationship on a broadcast
network, the establishment of a neighbor relationship on a P2P network
is classified into two modes: two-way mode and three-way mode.
Two-Way Mode
Upon receiving a P2P IIH from a peer router, a router
considers the peer router Up and establishes a neighbor
relationship with the peer router.
Because the reverse direction is never verified, unidirectional
communication may occur.
Three-Way Mode
A neighbor relationship is established after P2P IIHs are
sent three times. The establishment of a neighbor
relationship on a P2P network is similar to that on a
broadcast network.
The process of synchronizing LSDBs between a newly added router
and the DIS on a broadcast link is as follows:
Assume that the newly added router R3 has established neighbor
relationships with R2 (DIS) and R1.
R3 sends an LSP to a multicast address (01-80-C2-00-00-14 in a
Level-1 area and 01-80-C2-00-00-15 in a Level-2 area). All
neighbors on the network can receive the LSP.
The DIS on the network segment adds the received LSP to its
LSDB. After the CSNP timer expires, the DIS sends CSNPs at an
interval of 10 seconds to synchronize the LSDBs on the network.
R3 receives the CSNPs from the DIS, checks its LSDB, and
sends a PSNP to the DIS to request the LSPs it does not have.
The DIS receives the PSNP and sends the required LSPs to R3
for LSDB synchronization.
The process of updating the LSDB of the DIS is as follows:
The DIS receives an LSP and searches for the matching record in
the LSDB. If no matching record exists, the DIS adds the LSP to
the LSDB and multicasts the new LSDB.
If the sequence number of the received LSP is larger than that of
the corresponding LSP in the LSDB, the DIS replaces the local
LSP with the received LSP and multicasts the new LSDB. If the
sequence number of the received LSP is smaller than that of the
LSP in the LSDB, the DIS sends the local LSP to the inbound
interface.
If the sequence number of the received LSP is the same as that of
the corresponding LSP in the LSDB, the DIS compares the remaining
lifetime of the two LSPs. If the remaining lifetime of the received
LSP is smaller than that of the LSP in the LSDB, the DIS replaces
the local LSP with the received LSP and multicasts the new
LSDB. If the remaining lifetime of the received LSP is larger than
that of the LSP in the LSDB, the DIS sends the local LSP to the
inbound interface.
If the sequence number and the remaining lifetime of the received
LSP and those of the corresponding LSP in the LSDB are the
same, the DIS compares the checksums of the two LSPs. If the
checksum of the received LSP is larger than that of the LSP in the
LSDB, the DIS replaces the local LSP with the received LSP and
multicasts the new LSDB. If the checksum of the received LSP is
smaller than that of the LSP in the LSDB, the DIS sends the local
LSP to the inbound interface.
If the sequence number, remaining lifetime, and checksum of the
received LSP and those of the corresponding LSP in the LSDB
are the same, the LSP is not forwarded.
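
The comparison order described above (sequence number, then remaining lifetime, then checksum) can be written as one decision function. The Python sketch below follows the rules exactly as stated in this text. The names LspSummary and compare_lsp and the returned action strings are hypothetical, and real implementations may treat special cases such as a remaining lifetime of zero differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LspSummary:
    sequence_number: int
    remaining_lifetime: int
    checksum: int


def compare_lsp(received, local):
    """Decide what to do with a received LSP.

    Returns "install" (store the received LSP and flood it), "send_local"
    (reply with the newer local copy), or "ignore" (both copies are identical).
    """
    if local is None:
        return "install"
    if received.sequence_number != local.sequence_number:
        newer = received.sequence_number > local.sequence_number
        return "install" if newer else "send_local"
    if received.remaining_lifetime != local.remaining_lifetime:
        newer = received.remaining_lifetime < local.remaining_lifetime
        return "install" if newer else "send_local"
    if received.checksum != local.checksum:
        newer = received.checksum > local.checksum
        return "install" if newer else "send_local"
    return "ignore"


# Assumed sample values.
local_copy = LspSummary(sequence_number=5, remaining_lifetime=1000, checksum=0x1234)
print(compare_lsp(LspSummary(6, 1200, 0x0001), local_copy))
print(compare_lsp(LspSummary(4, 1200, 0xFFFF), local_copy))
print(compare_lsp(LspSummary(5, 1000, 0x1234), local_copy))
```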
The process of synchronizing LSDBs on a P2P network is as follows:
After establishing a neighbor relationship, R1 and R2 send a
CSNP to each other. If the LSDB of the neighbor and the received
CSNP are not synchronized, the neighbor sends a PSNP to
request the required LSP.
Assume that R2 requests the required LSP from R1. R1 sends
the required LSP to R2, starts the LSP retransmission timer, and
waits for a PSNP from R2 as an acknowledgement for the
received LSP.
If R1 does not receive a PSNP from R2 after the LSP
retransmission timer expires, R1 resends the LSP until it receives
a PSNP from R2.
The process of updating LSDBs on a P2P link is as follows:
If the sequence number of the received LSP is smaller than that of
the corresponding LSP in the LSDB, the router directly sends the
local LSP to the neighbor and waits for a PSNP from the neighbor.
If the sequence number of the received LSP is larger than that of
the corresponding LSP in the LSDB, the router adds the received
LSP to its LSDB, sends a PSNP to acknowledge the received
LSP, and then sends the received LSP to all its neighbors except
the neighbor that sent the LSP.
If the sequence number of the received LSP is the same as that of
the corresponding LSP in the LSDB, the router compares the
remaining lifetime of the two LSPs.
If the remaining lifetime of the received LSP is smaller than that of
the LSP in the LSDB, the router replaces the local LSP with the
received LSP, sends a PSNP to acknowledge the received LSP,
and sends the received LSP to all neighbors except the neighbor
that sent the LSP. If the remaining lifetime of the received LSP
is larger than that of the LSP in the LSDB, the router sends the
local LSP to the neighbor and waits for a PSNP.
If the sequence number and remaining lifetime of the received
LSP are the same as those of the corresponding LSP in the LSDB,
the router compares the checksums of the two LSPs. If the
checksum of the received LSP is larger than that of the LSP in the
LSDB, the router replaces the local LSP with the received LSP,
sends a PSNP to acknowledge the received LSP, and sends the
received LSP to all neighbors except the neighbor that sent the
LSP. If the checksum of the received LSP is smaller than that of
the LSP in the LSDB, the router sends the local LSP to the
neighbor and waits for a PSNP.
If the sequence number, remaining lifetime, and checksum of the
received LSP and those of the corresponding LSP in the LSDB
are the same, the LSP is not forwarded.
On a P2P network, a PSNP has the following functions:
It is used to acknowledge a received LSP.
It is used to request a required LSP.

Assume that R1 sends packets to R6. The default situation is as
follows:
As a Level-1 router, R1 does not know routes outside its area, so
it sends packets to other areas through the default route
generated by the nearest Level-1-2 router (R3). Therefore, R1
selects the route R1->R3->R5->R6, which is not the optimal
route, to forward the packets.
To solve this problem, IS-IS provides route leaking. You can
configure access control lists (ACLs) and routing policies and mark
routes with tags on Level-1-2 routers to select eligible routes. Then a
Level-1-2 router can advertise routing information of other Level-1
areas and the backbone area to its Level-1 area.
If route leaking is enabled on Level-1-2 routers (R3 and R4), Level-1
routers in area 47.0001 can learn routes outside area 47.0001 and
routes passing through the two Level-1-2 routers. After route calculation,
the forwarding path becomes R1->R2->R4->R5->R6, which is the
optimal route from R1 to R6.

Principles
LSPs with the overload bit are still flooded on the network,
but they are not used when routes that pass through a
router configured with the overload bit are calculated. That is,
after the overload bit is set on a router, other routers do not use
this router as a transit node when performing SPF calculation and
calculate only the direct routes of the router.
Topology
R2 forwards the packets from R1 to R3. If the overload bit on R2
is set to 1, R1 considers the LSDB of R2 incomplete and sends
packets to R3 through R4 and R5. This process does not affect
packets sent to the directly connected address of R2.
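
The effect of the overload bit on path selection can be sketched with a small SPF run in Python. The topology follows the description above (R1-R2-R3 and R1-R4-R5-R3), but the link cost of 10 on every link is an assumption, and the functions spf and path are hypothetical illustrations rather than the algorithm of any particular device.

```python
import heapq


def spf(graph, source, overloaded=frozenset()):
    """Dijkstra SPF that never transits through an overloaded router.

    An overloaded router stays reachable as a destination, but its links
    are not expanded, so no path passes through it.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    done = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node != source and node in overloaded:
            continue  # reachable itself, but not used as a transit node
        for neighbor, link_cost in graph[node].items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    return dist, prev


def path(prev, source, dest):
    """Rebuild the hop list from source to dest using the predecessor map."""
    hops = [dest]
    while hops[-1] != source:
        hops.append(prev[hops[-1]])
    return hops[::-1]


# Assumed topology with a cost of 10 on every link.
graph = {
    "R1": {"R2": 10, "R4": 10},
    "R2": {"R1": 10, "R3": 10},
    "R3": {"R2": 10, "R5": 10},
    "R4": {"R1": 10, "R5": 10},
    "R5": {"R4": 10, "R3": 10},
}

_, prev = spf(graph, "R1")
print(path(prev, "R1", "R3"))  # normal case: through R2

dist, prev = spf(graph, "R1", overloaded={"R2"})
print(path(prev, "R1", "R3"))  # R2 overloaded: through R4 and R5
print(dist["R2"])              # R2 itself is still reachable
```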
A device enters the overload state in the following situations:
A device automatically enters the overload state due to
exceptions.
You can manually configure a device to enter the overload
state.
Results of entering the overload state
If the system enters the overload state due to exceptions, the
system deletes all the imported or leaked routes.
If the system is configured to enter the overload state, the system
determines whether to delete all the imported or leaked routes
based on the configuration.

Fast Convergence
Incremental SPF (I-SPF): recalculates only the routes of the
changed nodes rather than all the nodes when the network
topology changes, which speeds up route calculation. The only
exception is the first calculation, in which all nodes are involved.
I-SPF improves the SPF
algorithm. The shortest path tree (SPT) generated is the same as
that generated by the SPF algorithm. This decreases CPU usage
and speeds up network convergence.
Partial route calculation (PRC): calculates only the changed
routes when the network topology changes. Similar to I-SPF, PRC
calculates only the changed routes, but it does not calculate the
shortest path. It updates routes based on the SPT
calculated by I-SPF. In route calculation, a leaf represents a
route, and a node represents a router. If the SPT changes
after I-SPF calculation, PRC processes all the leaves only on the
changed node. If the SPT remains unchanged, PRC processes
only the changed leaves. For example, if IS-IS is enabled on an
interface of a node, the SPT calculated by I-SPF remains
unchanged. PRC updates only the routes of this interface,
consuming fewer CPU resources.
Intelligent Timer
LSP generation intelligent timer: There is a minimum
interval restriction on LSP generation to prevent frequent
flapping of LSPs from affecting the network. The same LSP
cannot be generated repeatedly within the minimum
interval, which is 5 seconds by default. This restriction
significantly affects route convergence speed.
In IS-IS, if local routing information changes,
a router generates a new LSP to advertise this change.
When local routing information changes frequently, the
newly generated LSPs consume a lot of system resources.
If the delay in generating an LSP is too long, the router
cannot advertise changed routing information to neighbors
in time, reducing the network convergence speed. The
delay in generating an LSP for the first time is determined
by init-interval, and the delay in generating an LSP for the
second time is determined by incr-interval. From the third
time on, the delay in generating an LSP doubles
each time until the delay reaches the value specified by
max-interval. After the delay remains at the value specified
by max-interval for three times or the IS-IS process is
restarted, the delay decreases to the value specified by
init-interval. When only max-interval is specified, the intelligent
timer functions as an ordinary one-time triggering timer.

SPF calculation intelligent timer: In IS-IS, routes are calculated
when the LSDB changes. Frequent route calculations, however, consume
a lot of system resources and degrade system performance. Delaying
SPF calculation improves route calculation efficiency, but if the
delay is too long, route convergence slows down. The delay in SPF
calculation for the first time is determined by init-interval, and
the delay for the second time is determined by incr-interval. From
the third time on, the delay doubles each time until it reaches the
value specified by max-interval. After the delay has remained at
max-interval three times, or after the IS-IS process is restarted,
the delay returns to the value specified by init-interval.
If incr-interval is not specified, the delay in SPF calculation for
the first time is determined by init-interval, and from the second
time on the delay is determined by max-interval. After the delay has
remained at max-interval three times, or after the IS-IS process is
restarted, the delay returns to the value specified by init-interval.
When only max-interval is specified, the intelligent timer functions
as an ordinary one-time triggering timer.
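
The back-off behavior described for both intelligent timers can be sketched as a small generator. This is only an illustration under stated assumptions, not the device implementation: the function and helper names are hypothetical, the interval values are in arbitrary units, and the doubled delay is assumed to be capped at max-interval.

```python
from itertools import islice


def intelligent_timer_delays(max_interval, init_interval=None, incr_interval=None):
    """Yield successive delays of an intelligent timer (hypothetical model).

    Assumptions: every trigger arrives while the timer is still backing off,
    and a doubled delay is capped at max_interval.
    """
    if max_interval <= 0:
        raise ValueError("max_interval must be positive")
    if incr_interval is not None and incr_interval <= 0:
        raise ValueError("incr_interval must be positive")
    while True:
        if init_interval is None:
            # Only max-interval is specified: ordinary fixed-delay timer.
            yield max_interval
            continue
        yield init_interval                 # first time
        if incr_interval is not None:
            delay = incr_interval           # second time
            while delay < max_interval:
                yield delay
                delay *= 2                  # third time on: double
        for _ in range(3):                  # stay at max-interval three times
            yield max_interval
        # then fall back to init-interval on the next loop iteration


def first_delays(count, **kwargs):
    """Return the first `count` delays as a list."""
    return list(islice(intelligent_timer_delays(**kwargs), count))


print(first_delays(9, max_interval=1000, init_interval=50, incr_interval=100))
```

With init-interval 50, incr-interval 100, and max-interval 1000, the sequence climbs through the doubled values, holds at the maximum three times, and then restarts from init-interval.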

LSP fast flooding: Because the number of LSPs can be huge, IS-IS
floods LSPs periodically in batches to reduce the impact of LSP
flooding on network devices. By default, the minimum interval for
sending LSPs on an interface is 50 milliseconds and the maximum
number of LSPs sent at a time is 10. After the flash-flood function
is enabled, IS-IS immediately floods the LSPs that trigger SPF
recalculation instead of waiting for the periodic transmission. When
the network topology changes, the LSDBs of the devices on the network
are temporarily inconsistent. Flash flooding shortens the time during
which LSDBs are inconsistent and therefore speeds up network
convergence. When a network fault occurs, only a small number of LSPs
change even though a large number of LSPs exist, so IS-IS needs to
flood only the changed LSPs and consumes few system resources.
Priority-based Convergence
You can use an IP prefix list to filter routes and configure different
convergence priorities for different routes so that important routes
converge first, improving network reliability.
The convergence priorities of IS-IS routes are critical, high, medium,
and low, in descending order.
In area authentication and routing domain authentication, you can
configure a router to authenticate LSPs and SNPs separately in the
following ways:
The router sends LSPs and SNPs carrying the authentication TLV and
verifies the authentication information of the received LSPs and
SNPs.
The router sends LSPs carrying the authentication TLV and verifies
the authentication information of the received LSPs. The router sends
SNPs carrying the authentication TLV but does not verify the
authentication information of the received SNPs.
The router sends LSPs carrying the authentication TLV and verifies
the authentication information of the received LSPs. The router sends
SNPs without the authentication TLV and does not verify the
authentication information of the received SNPs.
The router sends LSPs and SNPs carrying the authentication TLV but
does not verify the authentication information of the received LSPs
and SNPs.
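
The four combinations above can be summarized as a lookup table. The sketch below is illustrative only: the mode labels (`verify_all`, `skip_snp_verify`, `no_snp_auth`, `send_only`) and the function name are hypothetical and are not command keywords.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthBehavior:
    lsp_send_tlv: bool   # sent LSPs carry the authentication TLV
    lsp_verify: bool     # received LSPs are verified
    snp_send_tlv: bool   # sent SNPs carry the authentication TLV
    snp_verify: bool     # received SNPs are verified


# Hypothetical labels for the four behaviors, in the order listed above.
AUTH_MODES = {
    "verify_all":      AuthBehavior(True, True, True, True),
    "skip_snp_verify": AuthBehavior(True, True, True, False),
    "no_snp_auth":     AuthBehavior(True, True, False, False),
    "send_only":       AuthBehavior(True, False, True, False),
}


def accepts(mode, pdu_type, auth_ok):
    """Return True if a received PDU is accepted under the given mode."""
    if pdu_type not in ("LSP", "SNP"):
        raise ValueError("pdu_type must be 'LSP' or 'SNP'")
    behavior = AUTH_MODES[mode]
    verify = behavior.lsp_verify if pdu_type == "LSP" else behavior.snp_verify
    # A PDU is rejected only when verification is on and the check fails.
    return auth_ok or not verify


print(accepts("skip_snp_verify", "SNP", auth_ok=False))
```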

Concepts
Originating system: a router that runs the IS-IS protocol. After LSP
fragment extension is enabled, you can configure virtual systems for
the router. The originating system refers to the IS-IS process.
System ID: the system ID of the originating system.
Additional system ID: a system ID configured for a virtual system
after IS-IS LSP fragment extension is enabled. A maximum of 256
extended LSP fragments can be generated for each additional system ID.
Like a normal system ID, an additional system ID must be unique in a
routing domain.
Virtual system: a system identified by an additional system ID. It is
used to generate extended LSP fragments.

Principles

IS-IS floods LSPs to advertise link state information. Because one
LSP can carry only a limited amount of information, IS-IS fragments
LSPs. Each LSP fragment is uniquely identified by its LSP ID, which
consists of the system ID, the pseudonode ID (0 for a common LSP and
a non-zero value for a pseudonode LSP), and the LSP number (LSP
fragment number) of the node or pseudonode that generates the LSP.
The LSP number is 1 byte long, so an IS-IS router can generate a
maximum of 256 LSP fragments, which restricts the amount of link
information that the router can advertise.
The LSP fragment extension feature enables an IS-IS router to
generate more LSP fragments. You can configure up to 50 virtual
systems for the router, and each virtual system can generate a
maximum of 256 LSP fragments. Together with the 256 fragments of the
originating system, an IS-IS router can therefore generate a maximum
of 13,056 LSP fragments (51 x 256).
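
The fragment limits can be checked with a short calculation. This is a sketch only: the function names are hypothetical, and the printed LSP ID layout (system ID, then pseudonode ID, then fragment number) follows the common display form and is an assumption about presentation.

```python
LSP_NUMBER_BITS = 8                              # the LSP number is 1 byte
FRAGMENTS_PER_SYSTEM_ID = 2 ** LSP_NUMBER_BITS   # 256
MAX_VIRTUAL_SYSTEMS = 50


def max_lsp_fragments(virtual_systems):
    """Fragments available to the originating system plus its virtual systems."""
    if not 0 <= virtual_systems <= MAX_VIRTUAL_SYSTEMS:
        raise ValueError("virtual_systems must be between 0 and 50")
    return (1 + virtual_systems) * FRAGMENTS_PER_SYSTEM_ID


def format_lsp_id(system_id, pseudonode_id, lsp_number):
    """Render an LSP ID as <system ID>.<pseudonode ID>-<LSP number>."""
    if not 0 <= pseudonode_id <= 0xFF or not 0 <= lsp_number <= 0xFF:
        raise ValueError("pseudonode_id and lsp_number are 1-byte values")
    return f"{system_id}.{pseudonode_id:02x}-{lsp_number:02x}"


print(max_lsp_fragments(0), max_lsp_fragments(50))
print(format_lsp_id("0000.0000.0001", 0, 255))
```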
An IS-IS router can run the LSP fragment extension feature in two
modes.
Mode-1
It is used when some routers on the network do not support LSP
fragment extension.
Virtual systems participate in SPF calculation. The originating
system advertises LSPs containing information about links to each
virtual system. Similarly, each virtual system advertises LSPs
containing information about links to the originating system. To
other routers, virtual systems look like physical routers that
connect to the originating system.
The LSP sent by a virtual system contains the same area address and
overload bit as those in a common LSP. If the LSPs sent by a virtual
system contain TLVs specified by other features, these TLVs must be
the same as those in common LSPs.
The virtual system carries neighbor information indicating that the
neighbor is the originating system, with the metric equal to the
maximum value (63 for the narrow metric) minus 1. The originating
system carries neighbor information indicating that the neighbor is
the virtual system, with the metric 0. This ensures that the virtual
system is a downstream node of the originating system when other
routers calculate routes.
As shown in the topology, R2 does not support LSP fragment extension,
and R1 is configured to support LSP fragment extension in mode-1.
R1-1 and R1-2 are virtual systems of R1 and send LSPs carrying some
routing information of R1. After receiving LSPs from R1, R1-1, and
R1-2, R2 considers that there are three individual routers at the
remote end and calculates routes accordingly. Because the cost of the
route from R1 to R1-1 and the cost of the route from R1 to R1-2 are
both 0, the cost of the route from R2 to R1 is the same as the cost
of the route from R2 to R1-1.
The LSPs generated by virtual systems contain only the originating
system as the neighbor (the neighbor type is P2P). In addition,
virtual systems are considered only as leaves.
Mode-2
It is used when all the routers on the network support LSP fragment
extension. In this mode, virtual systems do not participate in SPF
calculation. All the routers on the network know that the LSPs
generated by virtual systems actually belong to the originating
system.
As shown in the topology, R2 supports LSP fragment extension, and R1
is configured to support LSP fragment extension in mode-2. R1-1 and
R1-2 are virtual systems of R1 and send LSPs carrying some routing
information of R1. When receiving LSPs from R1-1 and R1-2, R2 obtains
the IS Alias ID TLV and learns that the originating system of R1-1
and R1-2 is R1. R2 then treats the information advertised by R1-1 and
R1-2 as belonging to R1.
Precautions
After LSP fragment extension is configured, the system prompts you to
restart the IS-IS process if information is lost because LSPs
overflow. After the restart, the originating system loads as much
routing information as possible into its own LSPs and adds the
overflow information to the LSPs of the virtual systems for
transmission.
If there are devices of other vendors on the network, LSP fragment
extension must be set to mode-1. Otherwise, devices of other vendors
cannot identify the LSPs.
It is recommended that you configure LSP fragment extension and
virtual systems before establishing IS-IS neighbor relationships or
importing routes. If you establish neighbor relationships or import
routes first and IS-IS then has to carry more information than 256
fragments can hold, you must configure LSP fragment extension and
virtual systems afterwards, and the configuration takes effect only
after you restart IS-IS. Therefore, exercise caution when you
establish IS-IS neighbor relationships or import routes.
IS-IS Administrative Tag

Administrative tags control the advertisement of IP prefixes in an
IS-IS routing domain to simplify route management. You can use
administrative tags to control the import of routes of different
levels and different areas and to control IS-IS multi-instances
(tags) running on the same router.

Topology
Assume that R1 needs to receive only Level-1 routing information from
R2, R3, and R4. To meet this requirement, configure the same
administrative tag for the IS-IS interfaces on R2, R3, and R4. Then
configure the Level-1-2 router in area 47.0003 to leak only the
routes matching the configured administrative tag from the Level-2
area to Level-1 areas. With this configuration, R1 receives only the
Level-1 routing information from R2, R3, and R4.
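
The tag-based leaking rule can be pictured as a simple filter on the Level-1-2 router. The sketch below is a toy model under stated assumptions: the `Route` type, the function name, the prefixes, and the tag value 100 are all hypothetical.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    prefix: str
    tag: Optional[int]   # administrative tag, None when the route is untagged


def leak_level2_to_level1(level2_routes, allowed_tag):
    """Return only the Level-2 routes whose tag matches the configured tag."""
    return [route for route in level2_routes if route.tag == allowed_tag]


level2_routes = [
    Route("10.2.0.0/24", 100),   # tagged on R2 (hypothetical prefix)
    Route("10.3.0.0/24", 100),   # tagged on R3 (hypothetical prefix)
    Route("10.9.0.0/24", None),  # untagged route, not leaked
]
for route in leak_level2_to_level1(level2_routes, allowed_tag=100):
    print(route.prefix)
```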
Precautions
To use administrative tags, you must enable the IS-IS wide metric
attribute.
Case Description
In this case, the addresses for interconnecting devices are as
follows:
If RX interconnects with RY, their interconnection addresses are
XY.1.1.X and XY.1.1.Y respectively, with a 24-bit network mask.
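
The addressing convention can be expressed as a small helper. This is a sketch under assumptions: the function name is hypothetical, router numbers are assumed to be single digits so that XY forms a valid first octet, and the order of X and Y is taken as given by the caller.

```python
import ipaddress


def interconnect_addresses(x, y, prefix_len=24):
    """Return the interface addresses of RX and RY on their shared link."""
    if not (1 <= x <= 9 and 1 <= y <= 9) or x == y:
        raise ValueError("x and y must be different digits from 1 to 9")
    first_octet = f"{x}{y}"   # "XY", for example "45" for R4 and R5
    rx = ipaddress.ip_interface(f"{first_octet}.1.1.{x}/{prefix_len}")
    ry = ipaddress.ip_interface(f"{first_octet}.1.1.{y}/{prefix_len}")
    return rx, ry


r4, r5 = interconnect_addresses(4, 5)
print(r4, r5, r4.network)
```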
Remarks
R4 and R5 are Level-1-2 routers. They take part in both Level-1 and
Level-2 route calculation and maintain both a Level-1 LSDB and a
Level-2 LSDB.
Command Usage
The is-level command sets the level of an IS-IS router. By default,
the level of an IS-IS router is Level-1-2.
The isis circuit-level command sets the link type of an interface.
View
is-level: IS-IS view
isis circuit-level: interface view
Parameters
is-level { level-1 | level-1-2 | level-2 }
level-1: sets a router as a Level-1 router, which calculates only
intra-area routes and maintains a Level-1 LSDB.
level-1-2: sets a router as a Level-1-2 router, which calculates
Level-1 and Level-2 routes and maintains a Level-1 LSDB and a Level-2
LSDB.
level-2: sets a router as a Level-2 router, which exchanges only
Level-2 LSPs, calculates only Level-2 routes, and maintains a Level-2
LSDB.
isis circuit-level [ level-1 | level-1-2 | level-2 ]
level-1: specifies the Level-1 link type. That is, only Level-1
neighbor relationships can be established on the interface.
level-1-2: specifies the Level-1-2 link type. That is, both Level-1
and Level-2 neighbor relationships can be established on the
interface.
level-2: specifies the Level-2 link type. That is, only Level-2
neighbor relationships can be established on the interface.
Precautions
If a router is a Level-1-2 router and needs to establish a neighbor
relationship at a specified level (Level-1 or Level-2) with a peer
router, you can run the isis circuit-level command to allow the local
interface to send and receive only Hello packets of the specified
level on the P2P link. This configuration prevents the router from
processing too many Hello packets and saves bandwidth.
The configuration of the isis circuit-level command takes effect on
the interface only when the IS-IS system type is Level-1-2.
Otherwise, the level configured using the is-level command is used as
the link type.
On a P2P network, the circuit ID uniquely identifies a local
interface. On a broadcast network, the circuit ID consists of the
system ID and pseudonode ID.
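
The interaction between is-level and isis circuit-level can be modeled in a few lines. The sketch is illustrative only: the function names are hypothetical, and the default circuit level of level-1-2 is an assumption.

```python
LEVELS = {"level-1": {1}, "level-2": {2}, "level-1-2": {1, 2}}


def effective_circuit_level(is_level, circuit_level="level-1-2"):
    """Return the link type actually used on an interface."""
    if is_level not in LEVELS or circuit_level not in LEVELS:
        raise ValueError("level must be level-1, level-2, or level-1-2")
    if is_level != "level-1-2":
        # circuit-level is ignored unless the system type is Level-1-2.
        return is_level
    return circuit_level


def hello_levels(is_level, circuit_level="level-1-2"):
    """Return the levels for which Hello packets are exchanged."""
    return sorted(LEVELS[effective_circuit_level(is_level, circuit_level)])


print(effective_circuit_level("level-1-2", "level-2"))
print(effective_circuit_level("level-1", "level-2"))
```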
Case Description
The topology in this case is the same as that in the previous case.
It is required that no DIS be elected between R4 and R6 or between R5
and R6. That is, the links between R4 and R6 and between R5 and R6
cannot be broadcast links.
The smallest DIS priority is 0, and a router with priority 0 can
still participate in the DIS election.

Command Usage
The isis dis-priority command sets the priority of the interface that
is a candidate for the DIS at a specified level.
The isis circuit-type command simulates the network type of an
interface as P2P.
View
isis dis-priority: interface view
isis circuit-type: interface view
Parameters
isis dis-priority priority [ level-1 | level-2 ]
priority: specifies the priority for electing the DIS. The value
ranges from 0 to 127. The default value is 64. A larger value
indicates a higher priority.
level-1: indicates the priority for electing the Level-1 DIS.
level-2: indicates the priority for electing the Level-2 DIS.
isis circuit-type p2p
p2p: sets the interface network type to P2P.
Precautions
The isis dis-priority command takes effect only on a broadcast link.
The isis circuit-type command takes effect only on a broadcast
interface. The network types of the IS-IS interfaces on both ends of
a link must be the same. Otherwise, the two interfaces cannot
establish a neighbor relationship.
Configuration Verification
Run the display isis interface process-id command, and view
the DIS field in the command output.
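
DIS election on a broadcast link can be sketched as follows. The comparison by priority comes from the text above; the tie-break on the highest MAC address is standard IS-IS behavior that this section does not spell out. The type and function names and the MAC addresses are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    priority: int   # 0 to 127, default 64
    mac: str        # interface MAC address, for example "00e0-fc00-0004"


def _mac_value(mac):
    """Convert a MAC address string to an integer for comparison."""
    return int(mac.replace("-", "").replace(":", "").replace(".", ""), 16)


def elect_dis(candidates):
    """Return the name of the DIS: highest priority, then highest MAC."""
    candidates = list(candidates)
    if not candidates:
        raise ValueError("at least one candidate is required")
    for candidate in candidates:
        if not 0 <= candidate.priority <= 127:
            raise ValueError("priority must be between 0 and 127")
    best = max(candidates, key=lambda c: (c.priority, _mac_value(c.mac)))
    return best.name


print(elect_dis([
    Candidate("R4", 64, "00e0-fc00-0004"),
    Candidate("R6", 0, "00e0-fc00-0006"),
]))
```

Unlike OSPF, a candidate with priority 0 is not excluded here; it simply loses to any candidate with a higher priority.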
Case Description
Company A requires route control. When configuring tags, you must
also enable IS-IS wide metric on all devices in the network so that
the tags can be transmitted across the entire network. In addition,
Level-2 routes are not leaked to Level-1 areas by default, so route
leaking must be configured manually.

Command Usage
The import-route command configures IS-IS to import routes from other
routing protocols.
The import-route isis level-2 into level-1 command controls route
leaking from Level-2 areas to Level-1 areas. The command needs to be
configured on Level-1-2 routers that are connected to external areas.
The cost-style command sets the cost style of routes sent and
received by an IS-IS router.
View
import-route: IS-IS view
import-route isis level-2 into level-1: IS-IS view
cost-style: IS-IS view

Parameters

import-route isis level-2 into level-1 [ filter-policy { acl-number |
acl-name acl-name | ip-prefix ip-prefix-name | route-policy
route-policy-name } | tag tag ]
filter-policy: indicates the route filtering policy.
acl-number: specifies the number of a basic ACL.
acl-name acl-name: specifies the name of a named ACL.
ip-prefix ip-prefix-name: specifies the name of an IP prefix list.
Only the routes that match the IP prefix list can be imported.
route-policy route-policy-name: specifies the name of a routing
policy.
tag tag: assigns administrative tags to the imported routes.
cost-style { narrow | wide | wide-compatible }
narrow: indicates that the device can receive and send routes with
cost style narrow.
wide: indicates that the device can receive and send routes with cost
style wide.
wide-compatible: indicates that the device can receive routes with
cost style narrow or wide but sends only routes with cost style wide.
Precautions
To transmit tags across the entire network, run the cost-style wide
command on all devices in the network.
Run the display isis route command to view tag information.
Case Description
Company A reconstructs its network. IS-IS uses ACLs, IP prefix lists,
and tags to control routes.
Command Usage
The filter-policy import command allows IS-IS to filter the received
routes to be added to the IP routing table.
View
filter-policy import: IS-IS view
Parameters

filter-policy { acl-number | acl-name acl-name | ip-prefix
ip-prefix-name | route-policy route-policy-name } import
acl-number: specifies the number of a basic ACL.
acl-name acl-name: specifies the name of a named ACL.
ip-prefix ip-prefix-name: specifies the name of an IP prefix list.
route-policy route-policy-name: specifies the name of a routing
policy that filters routes based on tags and other protocol
parameters.

Precautions
IS-IS can control routes and determine whether a route is added to
the routing table. However, LSP transmission is not affected.
The filter-policy export command takes effect only when it is used
together with the import-route command.
+IP-Extended* indicates that wide metric is supported. The symbol *
indicates that the route is learned through route leaking.
Case Description
IS-IS authentication is classified into area authentication, routing
domain authentication, and interface authentication.

Command Usage
The area-authentication-mode command configures an IS-IS area to
authenticate received Level-1 packets (LSPs and SNPs) using the
specified authentication mode and password, or adds authentication
information to Level-1 packets to be sent.
The isis authentication-mode command configures an IS-IS interface to
authenticate Hello packets using the specified mode and password.
View
area-authentication-mode: IS-IS view
isis authentication-mode: interface view
Parameters
isis authentication-mode { simple password | md5 password-key }
[ level-1 | level-2 ] [ ip | osi ] [ send-only ]
simple password: indicates that the password is transmitted in plain
text.
md5 password-key: indicates that the password to be transmitted is
encrypted using MD5.
keychain keychain-name: specifies a keychain that changes with time.
level-1: sets Level-1 authentication.
level-2: sets Level-2 authentication.
ip: indicates the IP authentication password. This parameter cannot
be configured in the keychain authentication mode.
osi: indicates the OSI authentication password. This parameter cannot
be configured in the keychain authentication mode.
send-only: indicates that the router encapsulates sent Hello packets
with authentication information but does not authenticate received
Hello packets.
area-authentication-mode { simple password | md5 password-key }
[ ip | osi ] [ snp-packet { authentication-avoid | send-only } |
all-send-only ]
simple password: indicates that the password is transmitted in plain
text.
md5 password-key: indicates that the password to be transmitted is
encrypted using MD5.
keychain keychain-name: specifies a keychain that changes with time.
ip: indicates the IP authentication password. This parameter cannot
be configured in the keychain authentication mode.
osi: indicates the OSI authentication password. This parameter cannot
be configured in the keychain authentication mode.
snp-packet: specifies how SNPs are authenticated.
authentication-avoid: indicates that the router neither encapsulates
generated SNPs with authentication information nor authenticates
received SNPs. The router still encapsulates generated LSPs with
authentication information and authenticates received LSPs.
send-only: indicates that the router encapsulates generated LSPs and
SNPs with authentication information and authenticates received LSPs,
but does not authenticate received SNPs.
all-send-only: indicates that the router encapsulates generated LSPs
and SNPs with authentication information and does not authenticate
received LSPs and SNPs.
Precautions
The area-authentication-mode command takes effect only on Level-1 and
Level-1-2 routers.

Case Description
In this case, the addresses for interconnecting devices are as
follows:
If RX interconnects with RY, their interconnection addresses are
XY.1.1.X and XY.1.1.Y respectively, with a 24-bit network mask.
R2 connects to R3 and R1 through serial interfaces. R1 and R3 connect
through Ethernet interfaces. R1 connects to network 10.0.0.0/24
through G0/0/1.

Results
You can run the display isis peer command to check whether neighbor
relationships are established successfully.

Results
You can run the display isis interface command to view IS-IS
interface information.
Results
You can run the display ip routing-table command to view the routing
table.
Case Description
In this case, the network runs IS-IS.
Requirement analysis
The log prompt function of IS-IS is disabled by default.
Results
The nexthop command sets the preferences of equal-cost routes. After
IS-IS calculates equal-cost routes using the SPF algorithm, the next
hop is chosen from these equal-cost routes based on the weight value.
The smaller the value, the higher the preference.
Parameters
nexthop ip-address weight value
ip-address: indicates the next hop address.
weight value: indicates the next hop weight. The value is an integer
that ranges from 1 to 254. The default value is 255.
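
The weight-based choice among equal-cost next hops can be sketched as below. This is a simplified model, not the device logic: the function name and addresses are hypothetical, and next hops without a configured weight are assumed to use the default of 255.

```python
DEFAULT_WEIGHT = 255


def preferred_next_hops(next_hops, configured_weights=None):
    """Return the equal-cost next hops that have the smallest weight."""
    configured_weights = configured_weights or {}
    next_hops = list(next_hops)
    if not next_hops:
        return []
    weight = {nh: configured_weights.get(nh, DEFAULT_WEIGHT) for nh in next_hops}
    best = min(weight.values())   # a smaller weight means a higher preference
    return [nh for nh in next_hops if weight[nh] == best]


equal_cost = ["10.1.1.2", "10.1.2.2"]
print(preferred_next_hops(equal_cost, {"10.1.2.2": 1}))
print(preferred_next_hops(equal_cost))
```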
Results
In the summary ip-address mask command, the avoid-feedback keyword
prevents the router from learning the summary route again, and the
generate_null0_route keyword generates a route to the Null0 interface
to prevent loops.
Neighbor logs are not output automatically. You need to enable them
manually.

OSPF topology:
OSPF divides an autonomous system (AS) into one or more logical
areas. All areas are connected to Area 0, which is the backbone area.
Router type:
Internal router: All interfaces on an internal router belong to the
same OSPF area.
Area Border Router (ABR): An ABR belongs to two or more areas, one of
which must be the backbone area. An ABR is used to connect the
backbone area and non-backbone areas. It can be physically or
logically connected to the backbone area.
Backbone router: At least one interface on a backbone router belongs
to the backbone area. Internal routers in Area 0 and all ABRs are
backbone routers.
AS Boundary Router (ASBR): An ASBR exchanges routing information with
other ASs. An ASBR does not necessarily reside on the border of an
AS. It can be an internal router or an ABR. An OSPF device that has
imported external routing information becomes an ASBR.
Differences between OSPF and IS-IS in the topology:

In OSPF, a link can belong to only one area. In IS-IS, a link can
belong to different areas.
In IS-IS, no area is physically defined as the backbone or
non-backbone area. In OSPF, Area 0 is defined as the backbone area.
In IS-IS, Level-1 and Level-2 routers each use the shortest path
first (SPF) algorithm to generate their own shortest path trees
(SPTs). In OSPF, the SPF algorithm is used only within the same area,
and inter-area routes are forwarded by the backbone area.
OSPF supports the following network types:

P2P: A network where the link layer protocol is PPP or HDLC is a P2P
network by default. On a P2P network, protocol packets such as Hello
packets, DD packets, LSR packets, LSU packets, and LSAck packets are
sent in multicast mode using the multicast address 224.0.0.5.
P2MP: No network is a P2MP network by default, no matter what type of
link layer protocol is used on the network. A network can be changed
to a P2MP network. The common practice is to change a non-fully
meshed NBMA network to a P2MP network. On a P2MP network, Hello
packets are sent in multicast mode using the multicast address
224.0.0.5, and other types of protocol packets, such as DD packets,
LSR packets, LSU packets, and LSAck packets, are sent in unicast
mode.
NBMA: A network where the link layer protocol is ATM or FR is an NBMA
network by default. On an NBMA network, protocol packets such as
Hello packets, DD packets, LSR packets, LSU packets, and LSAck
packets are sent in unicast mode.
Broadcast: A network where the link layer protocol is Ethernet or
FDDI is a broadcast network by default. On a broadcast network, Hello
packets, LSU packets, and LSAck packets are usually sent in multicast
mode. The multicast address 224.0.0.5 is the address that all OSPF
devices listen on. The multicast address 224.0.0.6 is reserved for
the OSPF designated router (DR). DD and LSR packets are transmitted
in unicast mode.
DR/BDR functions
Reduces the number of adjacencies and therefore the number of times
that link-state information and routing information are updated. A
DRother sets up full adjacencies only with the DR and BDR. The DR and
BDR set up a full adjacency with each other.
The DR generates Network-LSAs to describe information about the NBMA
or broadcast network segment.
DR/BDR election rules
The DR and BDR are elected through Hello packets, based on the Router
Priority of interfaces.
If Router Priority is set to 0, the router cannot be elected as the
DR or BDR.
A larger Router Priority value indicates a higher priority. If the
value of Router Priority is the same on two interfaces, the interface
with the larger router ID is elected.
DR/BDR election is non-preemptive. An existing DR or BDR is not
replaced when a router with a higher priority joins the network.
If the DR is faulty, the BDR automatically becomes the new DR, and a
new BDR is elected on the network. If the BDR is faulty, the DR does
not change, and a new BDR is elected.
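
The election rules above can be sketched in a few lines. This is a deliberately simplified model and not the full election procedure of the OSPF specification: it only covers priority, router ID tie-break, non-preemption, and BDR promotion. All names are hypothetical.

```python
import ipaddress
from typing import NamedTuple


class Router(NamedTuple):
    router_id: str
    priority: int   # 0 means the router cannot become DR or BDR


def _rank(router):
    # Higher priority wins; the larger router ID breaks ties.
    return (router.priority, int(ipaddress.IPv4Address(router.router_id)))


def elect_dr_bdr(routers, current_dr=None, current_bdr=None):
    """Return (dr, bdr) router IDs for the routers currently on the segment."""
    eligible = {r.router_id: r for r in routers if r.priority > 0}
    # Non-preemptive: keep the existing DR and BDR while they are still present.
    dr = current_dr if current_dr in eligible else None
    bdr = current_bdr if current_bdr in eligible else None
    if bdr == dr:
        bdr = None
    if dr is None and bdr is not None:
        dr, bdr = bdr, None   # the BDR takes over when the DR is gone

    def best(exclude):
        pool = [r for rid, r in eligible.items() if rid not in exclude]
        return max(pool, key=_rank).router_id if pool else None

    if dr is None:
        dr = best(set())
    if bdr is None:
        bdr = best({dr})
    return dr, bdr


segment = [Router("1.1.1.1", 1), Router("2.2.2.2", 1), Router("3.3.3.3", 0)]
print(elect_dr_bdr(segment))
```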
Differences between IS-IS DIS and OSPF DR/BDR
On an IS-IS broadcast network, routers with priority 0 still
participate in DIS election. On an OSPF network, routers with
priority 0 do not participate in DR election.
On an IS-IS broadcast network, when a new router meeting DIS
conditions joins the network, the router is elected as the new DIS,
and the original pseudonode is deleted. This causes LSP flooding. On
an OSPF network, a new router will not immediately become the DR on
the network segment even if the router has the highest DR priority.
On an IS-IS broadcast network, routers of the same level on the same
network segment form adjacencies with each other, including all
non-DIS routers.
Overview of OSPF packets

OSPF packets are transmitted at the network layer. The protocol
number is 89. There are five types of OSPF packets, whose packet
headers are in the same format.
OSPF packets except the Hello packet carry LSA information.
OSPF packet header information

All OSPF packets have the same OSPF packet header.
Version: specifies the OSPF version number. This field must be set to
2.
Type: specifies the OSPF packet type. There are five types of OSPF
packets.
Packet length: specifies the total length of an OSPF packet,
including the packet header. The unit is byte.
Router ID: specifies the router ID of the router generating the
packet.
Area ID: specifies the area to which the packet belongs.
Checksum: specifies the standard IP checksum of the entire packet
(including the packet header but excluding the Authentication field).
AuType: specifies the authentication mode.
Authentication: specifies information for authenticating packets,
such as the password.
Hello packet
Network Mask: specifies the network mask of the interface sending
Hello packets.
HelloInterval: specifies the interval for sending Hello packets, in
seconds.
Options: specifies optional functions supported by the OSPF router
sending the Hello packet. Detailed functions are not mentioned in
this course.
Rtr Pri: specifies the router priority on the interface sending Hello
packets. This field is used for electing the DR and BDR.
RouterDeadInterval: specifies the time, in seconds, after which a
neighbor is considered down if no Hello packet is received from it.
In most cases, the value of this field is four times HelloInterval.
Designated Router: specifies the IP address of the DR elected by
routers sending Hello packets. The value 0.0.0.0 of this field
indicates that the DR is not elected.
Backup Designated Router: specifies the IP address of the BDR elected
by routers sending Hello packets. The value 0.0.0.0 of this field
indicates that the BDR is not elected.
Neighbor: specifies the neighbor router ID, indicating that the
router has received valid Hello packets from neighbors.
DD packet
Interface MTU: specifies the maximum IP data packet size that an
interface on the originating router can send without fragmentation.
The value of this field is 0x0000 on a virtual link.
Options: is the same as that of the Hello packet.
I-bit: is set to 1 for the first DD packet in a series of sent DD
packets. The I-bit fields of subsequent DD packets are 0.
M-bit: is set to 1 when the sent DD packet is not the last one. The
M-bit field of the last DD packet is set to 0.
MS-bit: is set to 1 when the sending router declares itself the
master router.
DD Sequence Number: specifies the sequence number of the DD packet.
LSA header information
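
In OSPFv2 the I, M, and MS bits are the three low-order bits of a one-byte flags field in the DD packet. A minimal sketch for reading them (the function name is hypothetical):

```python
I_BIT = 0x04    # first DD packet in the series
M_BIT = 0x02    # more DD packets follow
MS_BIT = 0x01   # sender declares itself the master


def parse_dd_flags(flags):
    """Split the DD flags byte into its I, M, and MS bits."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in one byte")
    return {
        "I": bool(flags & I_BIT),
        "M": bool(flags & M_BIT),
        "MS": bool(flags & MS_BIT),
    }


print(parse_dd_flags(0x07))   # first packet, more to follow, sender is master
print(parse_dd_flags(0x01))   # last packet sent by the master
```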

Re
LSR packet
Link State Advertisement Type: specifies the LSA type, which can be
router-LSA, network-LSA, or another LSA type.
Link State ID: varies depending on the LSA type.
Advertising Router: specifies the router ID of the originating
router that advertises the LSA.
LSU packet
Number of LSAs: specifies the number of LSAs in an LSU packet.
LSA: specifies detailed LSA information.
LSAck packet
Header of LSA: specifies LSA header information.

LSA header information contained in all OSPF packets excluding Hello
packets
LS age: specifies the age of the LSA, in seconds.
Options: specifies the optional capabilities supported by the part of
the OSPF domain that the LSA describes.
LS type: identifies the format and function of the LSA. There are
five commonly used LSA types.
Link State ID: varies with the LSA type.
Advertising Router: specifies the router ID of the router that
originated the LSA.
Sequence Number: increases with each new instance of the LSA. This
field allows other routers to identify the latest LSA instance.
Checksum: indicates the checksum of the entire LSA except the LS age
field. Because LS age is excluded, the checksum does not need to be
recalculated as the LSA ages.
Length: specifies the length of an LSA, including the LSA header.
Router-LSA (describing all interfaces or links on the originating router)
Link State ID: specifies the router ID of the originating router.
V: is set to 1 when the originating router is an endpoint of one or
more fully adjacent virtual links.
E: is set to 1 when the originating router is an ASBR.
B: is set to 1 when the originating router is an ABR.
Number of links: specifies the number of router links described in
the LSA.
Link Type: indicates the link type. The value of this field can be:
1: point-to-point connection to another router
2: link to a transit network, such as a broadcast or NBMA network
3: link to a stub network, such as a loopback interface
4: virtual link
Link ID: specifies the link ID. Its meaning depends on Link Type:
1: neighbor router ID
2: IP address of the interface on the DR
3: IP network or subnet address
4: neighbor router ID
Link Data: provides more information about a link. When Link Type is
1 or 2, this field specifies the IP address of the interface that
connects the originating router to the network. When Link Type is 3,
this field specifies the subnet mask of the network.
ToS: is not supported.
Metric: specifies the metric of a link or interface.
Network-LSA
Link State ID: specifies the IP address of the interface on the DR.
Network Mask: specifies the subnet mask used on the network.
Attached Router: lists the router IDs of the DR and of all routers
that have set up adjacency relationships with the DR on the
broadcast or NBMA network.
Network-summary-LSA and ASBR-summary-LSA
Link State ID: specifies the IP address of the network or subnet in
a Type 3 LSA. In a Type 4 LSA, this field specifies the router ID of
the ASBR.
Network Mask: specifies the subnet mask of the network in a Type 3
LSA. In a Type 4 LSA, this field has no meaning and is set to
0.0.0.0.
Metric: specifies the metric of a route to the destination.
AS-external-LSA
Link State ID: indicates the IP address of the advertised network or
subnet.
Network Mask: specifies the subnet mask of the destination.
E: specifies the type of the external route. The value 1 indicates
the E2 metric, and the value 0 indicates the E1 metric.
Metric: specifies the metric of a route and is set by an ASBR.
Forwarding Address: specifies the forwarding address (FA) of a
packet destined for the advertised destination. When this field is
set to 0.0.0.0, the packet is forwarded to the originating router.
External Route Tag: identifies an external route.
NSSA LSA
Forwarding Address: When the network between the NSSA ASBR and the
neighboring AS is advertised as an internal OSPF route, this field
is set to the next-hop address on that network. When that network is
not advertised as an internal route, this field is set to the
interface IP address of a stub network on the ASBR, such as a
loopback interface. If there are multiple stub networks, the largest
IP address is chosen.

Options field:
DN: prevents loops on an MPLS VPN network. When a Type 3, 5, or 7
LSA is sent from a PE to a CE, the DN bit MUST be set. When the PE
receives, from a CE router, a Type 3, 5, or 7 LSA with the DN bit
set, the information from that LSA MUST NOT be used during the OSPF
route calculation.
O: indicates that the originating router supports Opaque LSAs
(Type 9, 10, and 11 LSAs).
DC-bit: indicates that the originating router supports OSPF over
on-demand links.
EA: indicates that the originating router can receive and forward
External-Attributes-LSAs (Type 8 LSAs).
N-bit: exists only in Hello packets. The value 1 indicates that the
router supports Type 7 LSAs. The value 0 indicates that the router
does not receive or send NSSA LSAs.
P-bit: exists only in NSSA LSAs. This field instructs an NSSA ABR to
translate the Type 7 LSA into a Type 5 LSA.
MC-bit: is set when the originating router supports multicast
forwarding (MOSPF).
E-bit: indicates that the originating router can receive AS external
LSAs. This field is set to 1 in all Type 5 LSAs and in LSAs
originated in the backbone area and other non-stub areas. This field
is set to 0 in LSAs originated in stub areas. In a Hello packet,
this field indicates that the interface can receive and send Type 5
LSAs.
MT-bit: indicates that the originating router supports
multi-topology OSPF.

Neighbor status:
Down: This is the initial state of a neighbor session. In this
state, a router has received no message from its neighbor.
Init: A router has received Hello packets from its neighbor but is
not in the neighbor list of the received Hello packets, so
bidirectional communication has not been established. In this state,
the router lists the neighbor in the Hello packets that it sends.
2-Way: Bidirectional communication has been established, but the
router has not established the adjacency relationship with the
neighbor. This is the highest state before the adjacency
relationship is established. On a broadcast or NBMA network, the
routers elect the DR and BDR in this state.
When the neighbor relationship is established, routers check the
parameters carried in Hello packets:
If the network type of the interface receiving Hello packets is
broadcast, P2MP, or NBMA, the Network Mask field in Hello packets
must be the same as the network mask of the receiving interface. If
the network type of the interface is P2P or virtual link, the
Network Mask field is not checked.
The HelloInterval and RouterDeadInterval fields in a Hello packet
must be the same as those on the interface receiving the Hello
packet.
The Authentication field in a Hello packet must be the same as that
on the interface receiving the Hello packet.
The E-bit option in a Hello packet must match the area configuration
of the interface receiving the Hello packet.
The Area ID field in a Hello packet must be the same as that on
the interface receiving the Hello packet.
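
The Hello checks listed above can be sketched as a small comparison routine. This is only an illustrative model, not device code: the class name HelloParams, the function hello_mismatches, the network-type strings, and the way authentication is encoded as a single string are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelloParams:
    """Fields of a Hello packet (or of the receiving interface) that must agree."""
    area_id: str
    network_mask: str
    hello_interval: int
    dead_interval: int
    auth: str      # hypothetical encoding of authentication type and key
    e_bit: bool    # True when the area accepts AS external LSAs


def hello_mismatches(local, received, network_type):
    """Return the names of the fields that prevent the neighbor relationship."""
    problems = []
    if local.area_id != received.area_id:
        problems.append("area_id")
    # The mask is not checked on P2P links and virtual links.
    if network_type not in ("p2p", "vlink") and local.network_mask != received.network_mask:
        problems.append("network_mask")
    for name in ("hello_interval", "dead_interval", "auth", "e_bit"):
        if getattr(local, name) != getattr(received, name):
            problems.append(name)
    return problems


local = HelloParams("0.0.0.1", "255.255.255.0", 10, 40, "md5:key1", True)
peer = HelloParams("0.0.0.1", "255.255.0.0", 30, 40, "md5:key1", True)
print(hello_mismatches(local, peer, "nbma"))  # mask and Hello interval differ
print(hello_mismatches(local, peer, "p2p"))   # mask is ignored on a P2P link
```

An empty result means that, in this simplified model, nothing in the Hello packet blocks the neighbor relationship.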

Neighbor relationship setup:
When the neighbor state machine is ExStart on R1, R1 sends the
first DD packet to R2. Assume that the fields in this DD packet are
set as follows:
DD Sequence Number is set to 552A.
I-bit is set to 1, indicating that the DD packet is the first DD packet.
M-bit is set to 1, indicating that more DD packets are to be sent.
MS-bit is set to 1, indicating that R1 declares itself the master
router.
When the neighbor state machine is ExStart on R2, R2 sends its
first DD packet, in which DD Sequence Number is set to 5528, to R1.
The router ID of R2 is larger than that of R1; therefore, R2
functions as the master router. After the comparison of router IDs
is complete, R1 generates a NegotiationDone event and changes its
neighbor state machine from ExStart to Exchange.
When the neighbor state machine is Exchange on R1, R1 sends a new
DD packet describing the local LSDB. In this DD packet, DD Sequence
Number is set to the sequence number of the DD packet sent by R2
(5528), M-bit is set to 0, indicating that no other DD packet is
required for describing the local LSDB, and MS-bit is set to 0,
indicating that R1 is the slave router. After receiving the DD
packet, R2 generates a NegotiationDone event and changes its
neighbor state machine to Exchange.
When the neighbor state machine is Exchange on R2, R2 sends a new
DD packet describing the local LSDB. In this DD packet, DD Sequence
Number is increased by 1 (5528 + 1 = 5529).
R1, as the slave router, needs to acknowledge each DD packet from
R2 even though R1 has no further LSDB information to describe. R1
sends an empty DD packet with a DD Sequence Number of 5529.
When the neighbor state machine is Loading on R1, R1 sends a Link
State Request (LSR) packet to request the link state information
that it learned from DD packets in the Exchange state but that is
not contained in the local LSDB.
After receiving the LSR packet, R2 sends a Link State Update (LSU)
packet containing detailed link state information to R1. When
receiving the LSU packet, R1 changes its neighbor state machine
from Loading to Full.
R1 then sends a Link State Acknowledgement (LSAck) packet to R2 to
ensure reliable transmission. LSAck packets acknowledge the receipt
of LSAs.
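
The master/slave negotiation and the DD sequence numbering in the walkthrough above can be sketched as follows. The function names and the router IDs 1.1.1.1 and 2.2.2.2 are hypothetical; only the sequence numbers 5528 and 5529 come from the text. Retransmission and the I/M bits are not modeled.

```python
import ipaddress


def elect_master(rid_a, rid_b):
    """The router with the numerically larger router ID becomes the master."""
    a = int(ipaddress.IPv4Address(rid_a))
    b = int(ipaddress.IPv4Address(rid_b))
    return rid_a if a > b else rid_b


def dd_sequence(master_initial_seq, master_packets):
    """Trace (sender, sequence number) pairs after master/slave negotiation.

    The slave first answers with the master's initial sequence number.
    The master then increments the number for every DD packet it sends,
    and the slave echoes that number in its acknowledgement.
    """
    trace = [("slave", master_initial_seq)]
    seq = master_initial_seq
    for _ in range(master_packets):
        seq += 1
        trace.append(("master", seq))
        trace.append(("slave", seq))
    return trace


# Hypothetical router IDs for R1 and R2; R2 has the larger ID.
print(elect_master("1.1.1.1", "2.2.2.2"))
for sender, seq in dd_sequence(0x5528, 1):
    print(sender, format(seq, "X"))
```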

OSPF can define areas as stub and totally stub areas. A stub area is a
special area into which ABRs do not flood the received AS external
routes. Routers in a stub area maintain fewer routing entries and
transmit less routing information. The stub area is an optional
configuration, but not all areas can be configured as stub areas.
Generally, a stub area is a non-backbone area with only one ABR and is
located at the AS boundary. To ensure the reachability of AS external
routes, the ABR in a stub area generates a Type 3 LSA carrying a
default route and advertises it within the entire stub area.
Stub area
The backbone area cannot be configured as a stub area.
If an area needs to be configured as a stub area, all the routers in
this area must be configured with stub attributes.
An ASBR cannot exist in a stub area. That is, AS external routes
are not flooded in the stub area.
A virtual link cannot pass through a stub area.
Type 5 LSAs cannot be advertised within a stub area.
A router in the stub area must reach AS external networks through
the ABR. The ABR automatically generates a Type 3 LSA carrying a
default route and advertises it within the entire stub area. The
router can then reach AS external networks through the ABR.
Totally stub area
Neither Type 3 nor Type 5 LSAs can be advertised within a totally
stub area, except for the default Type 3 LSA generated by the ABR.
A router in the totally stub area must reach AS external and
inter-area networks through an ABR.
The ABR automatically generates a default Type 3 LSA and advertises
it within the entire totally stub area.

To prevent a large number of external routes from consuming the
bandwidth and storage resources of routers in a stub area, OSPF
defines that stub areas cannot import external routes. However, stub
areas cannot meet the requirements of scenarios that need to import
external routes while still limiting the resources consumed by
external routes. Therefore, NSSA areas are introduced.
Type 7 LSA
Type 7 LSAs are defined in an NSSA area to describe AS external
routes.
Type 7 LSAs are generated by an ASBR in an NSSA area and advertised
only within the NSSA area of this ASBR.
When receiving Type 7 LSAs, an ABR in an NSSA area selectively
translates the Type 7 LSAs to Type 5 LSAs so that external routes
can be advertised in other areas of the OSPF network.
Type 7 LSAs can be used to carry default route information to
guide traffic to other ASs.
To advertise the external routes imported by an NSSA area to other
areas, an ABR in the NSSA area needs to translate Type 7 LSAs to
Type 5 LSAs so that the external routes can be advertised on the
entire OSPF network.
The P-bit informs routers whether Type 7 LSAs need to be translated.
The ABR with the largest router ID in an NSSA area translates
Type 7 LSAs to Type 5 LSAs.
A Type 7 LSA can be translated to a Type 5 LSA only when the P-bit
is set and Forwarding Address is not 0. The forwarding address
identifies the destination inside the OSPF domain to which packets
for the external route are sent.
The default Type 7 LSAs meeting the preceding conditions can also
be translated.
The Type 7 LSAs generated by ABRs are not set with the P-bit.
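
The translation conditions above can be expressed as two small checks. This is a minimal sketch with hypothetical function names; it models only the P-bit and Forwarding Address conditions and the largest-router-ID rule from the text, not the full translator election of a real implementation.

```python
import ipaddress


def should_translate(p_bit, forwarding_address):
    """A Type 7 LSA is translated only if the P-bit is set and the FA is not 0."""
    return p_bit and forwarding_address != "0.0.0.0"


def translator(abr_router_ids):
    """The ABR with the largest router ID performs the Type 7 to Type 5 translation."""
    return max(abr_router_ids, key=lambda rid: int(ipaddress.IPv4Address(rid)))


print(should_translate(True, "10.1.1.1"))   # eligible for translation
print(should_translate(True, "0.0.0.0"))    # FA is 0, not translated
print(translator(["1.1.1.1", "10.0.0.2", "9.9.9.9"]))
```

Router IDs are compared numerically rather than as strings, so that 10.0.0.2 is correctly treated as larger than 9.9.9.9.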
Precautions
Multiple ABRs may be deployed in an NSSA area. To prevent routing
loops, ABRs do not calculate the default routes advertised by each
other.
NSSA and totally NSSA
A small number of AS external routes learned from the ASBR in an
NSSA area can be imported to the NSSA area. Type 5 LSAs cannot be
advertised within the NSSA area, but routers can learn the AS
external routes from the ASBR.
Neither Type 3 nor Type 5 LSAs can be advertised within a totally
NSSA.
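
The LSA restrictions of the special area types can be summarized as a lookup table. The sketch below is an assumption-laden summary: the area-type names and the function lsa_allowed are invented for the example, the exclusion of Type 4 LSAs from stub and NSSA areas is standard OSPF behavior that the text does not state explicitly, and the default Type 3 LSA is modeled only for the totally stub area, where the text describes it.

```python
# LSA types that may be flooded inside each area type.
ALLOWED_LSA_TYPES = {
    "normal": {1, 2, 3, 4, 5},
    "stub": {1, 2, 3},          # no Type 5 (and therefore no Type 4)
    "totally_stub": {1, 2},     # additionally no ordinary Type 3
    "nssa": {1, 2, 3, 7},       # Type 7 replaces Type 5
    "totally_nssa": {1, 2, 7},  # additionally no ordinary Type 3
}


def lsa_allowed(area_type, lsa_type, is_default=False):
    """Return True if an LSA of this type may be advertised inside the area."""
    # The ABR of a totally stub area still injects one default Type 3 LSA.
    if area_type == "totally_stub" and lsa_type == 3 and is_default:
        return True
    return lsa_type in ALLOWED_LSA_TYPES[area_type]


print(lsa_allowed("stub", 5))                           # external LSAs blocked
print(lsa_allowed("totally_stub", 3))                   # ordinary summary blocked
print(lsa_allowed("totally_stub", 3, is_default=True))  # default summary allowed
print(lsa_allowed("nssa", 7))                           # NSSA external allowed
```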

Fast convergence
I-SPF improves the SPF algorithm. Except for the first calculation,
only the changed nodes, as opposed to all nodes, are involved in the
calculation. The SPT ultimately generated is the same as that
generated by the full SPF algorithm. This decreases the CPU usage
and speeds up network convergence.
Similar to I-SPF, PRC calculates only the changed routes. PRC,
however, does not calculate the shortest path. PRC updates routes
based on the SPT calculated by I-SPF. In route calculation, a leaf
represents a route, and a node represents a router. A change in the
SPT or in a leaf causes a change in routing information, but SPT
changes and leaf changes are independent of each other. PRC
processes routing information based on the SPT or leaf changes:
When the SPT is changed, PRC processes routing information on all
leaves of the changed nodes.
When the SPT is not changed, PRC does not process routing
information on nodes.
When a leaf is changed, PRC processes routing information on the
changed leaf.
When a leaf is not changed, PRC does not process routing
information on the leaf.
The OSPF intelligent timer controls route calculation, LSA
generation, and the receiving of LSAs to speed up network
convergence. The OSPF intelligent timer speeds up network
convergence in the following modes:
On a network where routes are frequently calculated, the OSPF
intelligent timer dynamically adjusts the interval for calculating
routes based on the user configuration and exponential backoff
technology. In this manner, the number of route calculations and
the CPU resource consumption are decreased. Routes are calculated
after the network topology becomes stable.
On an unstable network, if a router generates or receives LSAs due
to frequent topology changes, the OSPF intelligent timer can
dynamically adjust the interval for generating and handling LSAs.
No LSA is generated or handled within an interval, which prevents
invalid LSAs from being generated and advertised on the entire
network.
The OSPF intelligent timer helps calculate routes as follows:
Based on the local LSDB, a router that runs OSPF calculates the SPT
with itself as the root using the SPF algorithm, and determines the
next hop to the destination network according to the SPT. Changing
the interval for SPF calculation can prevent the bandwidth and
resource consumption caused by frequent LSDB changes.
On a network that requires a short route convergence time, specify
the interval for route calculation in milliseconds to increase the
route calculation frequency and speed up route convergence.
When the OSPF LSDB changes, the shortest path needs to be
recalculated. If a network changes frequently and the shortest path
is calculated continually, a large number of system resources will
be consumed, affecting router performance. You can configure an
intelligent timer and set a proper interval for SPF calculation to
prevent memory and bandwidth resources from being consumed.
After the OSPF intelligent timer is used:
1. The initial interval for SPF calculation is specified by the
parameter start-interval.
2. The interval for SPF calculation for the nth (n is larger than
or equal to 2) time is equal to hold-interval x 2^(n-1).
3. When the interval specified by hold-interval x 2^(n-1) reaches
the maximum interval specified by max-interval, OSPF performs SPF
calculation at the maximum interval for three consecutive times.
Then OSPF performs step 1 again for SPF calculation at the initial
interval specified by start-interval.
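
The backoff rule of the intelligent timer can be sketched as a function that lists successive SPF intervals. This is a minimal illustration, assuming the exponent form hold-interval x 2^(n-1) reconstructed from the text; the function name spf_intervals and the sample values (50, 200, and 1000 milliseconds) are hypothetical and are not device defaults.

```python
def spf_intervals(start, hold, maximum, count):
    """Return `count` successive SPF calculation intervals.

    n = 1 uses `start`; for n >= 2 the interval is hold * 2**(n - 1).
    Once that value reaches `maximum`, the maximum is used three times
    in a row and the sequence then restarts from `start`.
    """
    intervals = []
    n = 1
    times_at_max = 0
    while len(intervals) < count:
        if n == 1:
            intervals.append(start)
            n = 2
            continue
        candidate = hold * 2 ** (n - 1)
        if candidate >= maximum:
            intervals.append(maximum)
            times_at_max += 1
            if times_at_max == 3:
                # Three runs at the maximum interval: go back to step 1.
                n, times_at_max = 1, 0
        else:
            intervals.append(candidate)
            n += 1
    return intervals


print(spf_intervals(start=50, hold=200, maximum=1000, count=8))
```

With these sample values the interval grows from 50 to 400 and 800, stays at 1000 for three calculations, and then falls back to 50.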
Priority-based convergence
Filter routes based on the IP prefix list. Set different priorities
for the routes so that routes with the highest priority are
converged first, improving network reliability.

Setting the maximum number of non-default external routes on a router
can prevent an OSPF database overflow. You must set the same maximum
number of non-default external routes for all routers on an OSPF
network. If the number of external routes on a router reaches the
configured maximum number, the router enters the overflow state and
starts the overflow timer. The router automatically leaves the
overflow state after the overflow timer expires. The default timeout
period is 5 seconds.
The OSPF database overflow process is as follows:
When entering the overflow state, a router deletes all non-default
external routes that are generated by itself.
When staying in the overflow state, the router does not generate
non-default external routes, discards newly received non-default
external routes, and does not reply with an LSAck packet. When the
overflow timer expires, the router checks whether the number of
external routes still exceeds the maximum value. If so, the router
restarts the timer; if not, the router leaves the overflow state.
When leaving the overflow state, the router deletes the overflow
timer, generates non-default external routes, receives new
non-default external routes, replies with LSAck packets, and gets
ready to enter the overflow state again.
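
The overflow process above behaves like a two-state machine. The sketch below models only the state transitions; the class name OverflowMonitor and its methods are hypothetical, and the timer itself is represented by an explicit call rather than a real clock.

```python
class OverflowMonitor:
    """Tracks whether a router is in the OSPF database overflow state."""

    def __init__(self, limit):
        self.limit = limit          # maximum number of non-default external routes
        self.in_overflow = False

    def on_route_count(self, count):
        """Enter the overflow state when the route count reaches the limit."""
        if not self.in_overflow and count >= self.limit:
            self.in_overflow = True  # the overflow timer would be started here
        return self.in_overflow

    def on_timer_expired(self, count):
        """Stay in overflow (restart the timer) only if the limit is still exceeded."""
        if self.in_overflow and count <= self.limit:
            self.in_overflow = False
        return self.in_overflow

    def accepts(self, is_default):
        """In overflow, newly received non-default external routes are discarded."""
        return is_default or not self.in_overflow


monitor = OverflowMonitor(limit=100)
print(monitor.on_route_count(100))        # limit reached: overflow state
print(monitor.accepts(is_default=False))  # non-default external route discarded
print(monitor.on_timer_expired(80))       # below the limit: overflow state left
```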


During OSPF deployment, all non-backbone areas must be connected
to the backbone area to ensure that all areas are reachable.
Two ABRs use a virtual link to directly transmit OSPF packets. The
routers between the two ABRs only forward packets. Because these
routers are not the destination of the OSPF packets, they
transparently forward the OSPF packets as common IP packets.
If a virtual link is not properly deployed, a loop may occur.


When both area authentication and interface authentication are
configured, interface authentication takes precedence.

The OSPF default route is generally applied to the following scenarios:
An ABR in an area advertises Type 3 LSAs carrying the default
route within the area. Routers in the area use the received default
route to forward inter-area packets.
An ASBR in an area advertises Type 5 or Type 7 LSAs carrying
the default route within the AS. Routers in the AS use the
received default route to forward AS external packets.
Precautions
When no exactly matched route is discovered, a router can
forward packets through the default route. Due to hierarchical
management of OSPF routes, the priority of default Type 3 routes
is higher than the priority of default Type 5 or Type 7 routes.
If an OSPF router has advertised LSAs carrying a default route,
the router does not learn LSAs of the same type that carry a
default route and are advertised by other routers. That is, the
router uses only the LSAs advertised by itself to calculate routes.
The LSAs advertised by others are still saved in the LSDB.
If a router has to depend on a route to advertise LSAs carrying an
external default route, the route cannot be a route learned by the
local OSPF process. This is because a router in an area uses
external default routes to forward packets outside the AS,
whereas routes learned within the AS have the next hop pointing to
devices within the AS.
Principles for advertising default routes in different areas
Common area
By default, OSPF routers in a common OSPF area do not
automatically generate default routes, even if their routing
tables contain default routes.
NSSA area
To advertise a small number of AS external routes using the ASBR
in an NSSA area and reach other external routes through other
areas, configure a default Type 7 LSA on the ABR and advertise
this LSA in the entire NSSA area. In this way, a small number of
AS external routes can be learned from the ASBR in the NSSA area,
and other routes can be reached through the ABR in the NSSA area.
To advertise all the external routes using the ASBR in the NSSA
area, configure a default Type 7 LSA on the ASBR and advertise
this LSA in the entire NSSA area. In this way, all the external
routes are advertised using the ASBR in the NSSA area.
The preceding configurations are performed using the same command
in different views. The difference between these two
configurations is described as follows:
An ABR will generate a default Type 7 LSA regardless of whether
the routing table contains the default route 0.0.0.0.
An ASBR will generate a default Type 7 LSA only when the routing
table contains the default route 0.0.0.0.
An ABR does not translate Type 7 LSAs carrying a default route
into Type 5 LSAs carrying a default route or flood them to the
entire AS.
Totally NSSA area
All routers in the totally NSSA area must learn AS external
routes from the ABR.
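
The difference between the ABR and the ASBR when generating a default Type 7 LSA in an NSSA area can be captured in a few lines. The function name and the role strings are hypothetical, and a router that is both ABR and ASBR is not modeled.

```python
def originates_default_type7(role, has_default_route):
    """Decide whether an NSSA router configured to do so generates a default Type 7 LSA.

    role: "abr" or "asbr" (hypothetical labels for this sketch).
    has_default_route: True if the routing table contains 0.0.0.0.
    """
    if role == "abr":
        # The ABR generates the LSA unconditionally.
        return True
    if role == "asbr":
        # The ASBR needs a default route in its own routing table.
        return has_default_route
    raise ValueError(f"unknown role: {role}")


print(originates_default_type7("abr", has_default_route=False))
print(originates_default_type7("asbr", has_default_route=False))
print(originates_default_type7("asbr", has_default_route=True))
```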


Route filtering
LSAs are not filtered during route learning. Route filtering can
only determine whether calculated routes are added to the
routing table. The learned LSAs are complete.
Precautions
Stub areas and database overflow can also implement the
LSA filtering function.


This figure shows the process of establishing the neighbor relationship
and the neighbor status changes.
Down: This is the initial state of a neighbor session. In this
state, a router has received no message from its neighbor. On an
NBMA network, the router can still send Hello packets to a
statically configured neighbor. PollInterval specifies the interval
for sending these Hello packets, and its value is usually the same
as the value of RouterDeadInterval.
Attempt: This state exists only on an NBMA network and indicates
that the router has received no message from the neighbor. In this
state, the router periodically sends Hello packets to the neighbor
at an interval of HelloInterval. If the router receives no Hello
packets from the neighbor within RouterDeadInterval, the state
changes to Down.
Init: A router has received Hello packets from its neighbor but is
not in the neighbor list of the received Hello packets, so
bidirectional communication has not been established. In this state,
the router lists the neighbor in the Hello packets that it sends.
2-WayReceived: A router knows that bidirectional communication
with the neighbor has started, that is, the router is in the neighbor
list of Hello packets received from the neighbor. If the router
needs to establish the adjacency relationship with the neighbor,
the router enters the ExStart state and starts database
synchronization. If the router does not need to establish the
adjacency relationship with the neighbor, the router enters the
2-Way state.
2-Way: In this state, bidirectional communication has been
established but the router has not established the adjacency
relationship with the neighbor. This is the highest state before the
adjacency relationship is established.
1-WayReceived: The router knows that it is not in the neighbor list
of Hello packets received from the neighbor. This is usually caused
by the restart of the neighbor.

The state machines in the figure are described as follows:
ExStart: This is the first step for establishing the adjacency
relationship. In this state, the router starts to send DD packets to
the neighbor. The two neighbors negotiate the master/slave status
and determine the initial sequence number of DD packets. DD packets
transmitted in this state do not describe the local LSDB.
Exchange: The router exchanges DD packets describing the local
LSDB with its neighbor.
Loading: The router sends LSR packets to the neighbor to request
LSAs and receives LSU packets that carry the requested LSAs.
Full: The LSDBs on the two routers have been synchronized.

OSPF supports P2P, P2MP, NBMA, and broadcast networks. IS-IS
supports only P2P and broadcast networks.
OSPF runs directly over IP at the network layer, and its protocol
number is 89.

When an OSPF neighbor relationship is established, the two
routers check the mask, authentication mode, Hello/dead interval,
and area ID in Hello packets. The conditions for establishing an
IS-IS neighbor relationship are relatively loose.
Establishing a neighbor relationship over an OSPF P2P link
requires a three-way handshake. Establishing an IS-IS neighbor
relationship does not necessarily require a three-way handshake.
Huawei devices are enabled with the three-way handshake function
on an IS-IS P2P network by default, which ensures reliability when
the neighbor relationship is established.
An IS-IS neighbor relationship can be Level-1 or Level-2.
The election of an OSPF DR/BDR is based on the priority and router
ID. The elected DR/BDR cannot be preempted. On an OSPF network, all
DRothers establish full adjacency relationships with the DR and
BDR, and stay in the 2-Way state with each other. When the priority
of a router on the OSPF network is 0, the router does not
participate in the DR/BDR election.
The election of an IS-IS DIS is based on the priority and MAC
address. The elected DIS can be preempted. On an IS-IS network,
all routers establish adjacency relationships with each other. If the
priority of a router on the IS-IS network is 0, the router can still
participate in the DIS election and just has a lower priority.
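
The contrast between the two elections can be sketched as follows. This is a deliberately simplified model: the function names are hypothetical, the OSPF function ignores the separate BDR election and breaks priority ties by router ID, and MAC addresses are assumed to be written as hexadecimal strings with optional "-" or ":" separators.

```python
import ipaddress


def ospf_elect_dr(candidates, current_dr=None):
    """candidates maps router ID to DR priority. The election is non-preemptive."""
    # An existing DR keeps its role even if a better candidate appears.
    if current_dr in candidates and candidates[current_dr] > 0:
        return current_dr
    # Routers with priority 0 do not take part in the election.
    eligible = {rid: prio for rid, prio in candidates.items() if prio > 0}
    if not eligible:
        return None
    return max(eligible, key=lambda rid: (eligible[rid], int(ipaddress.IPv4Address(rid))))


def isis_elect_dis(candidates):
    """candidates maps MAC address to DIS priority. The election is preemptive."""
    def mac_value(mac):
        return int(mac.replace("-", "").replace(":", ""), 16)
    # Priority 0 still participates; it is simply the lowest priority.
    return max(candidates, key=lambda mac: (candidates[mac], mac_value(mac)))


routers = {"1.1.1.1": 1, "2.2.2.2": 1, "3.3.3.3": 0}
print(ospf_elect_dr(routers))
# A new router with a higher priority does not displace the current DR.
routers["4.4.4.4"] = 255
print(ospf_elect_dr(routers, current_dr="2.2.2.2"))
# In IS-IS, the best candidate always wins, so re-running the election preempts.
print(isis_elect_dis({"00e0-fc00-0001": 64, "00e0-fc00-0002": 0}))
```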

IS-IS supports only a few types of LSPs but provides good extension
capabilities through the TLV fields contained in LSPs.

OSPF costs are calculated based on bandwidth. IS-IS supports the
default cost, delay cost, expense cost, and error cost. In practice,
IS-IS implementations use the default cost.
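
As an illustration of a bandwidth-based cost, the sketch below uses the common rule that the cost is the reference bandwidth divided by the interface bandwidth, with a minimum of 1. The function name and the 100 Mbit/s reference value are assumptions for this example, not values stated in this course.

```python
def ospf_cost(bandwidth_kbps, reference_kbps=100_000):
    """Integer OSPF cost from interface bandwidth; never lower than 1.

    reference_kbps defaults to 100 Mbit/s, an assumed reference bandwidth.
    """
    return max(1, reference_kbps // bandwidth_kbps)


print(ospf_cost(10_000))     # 10 Mbit/s interface
print(ospf_cost(100_000))    # 100 Mbit/s interface
print(ospf_cost(1_000_000))  # 1 Gbit/s interface, clamped to 1
```

Because of the lower bound, all interfaces at or above the reference bandwidth get the same cost unless the reference bandwidth is raised.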


Case Description
The NBMA network topology is displayed in this case. Other
devices are connected based on the following rules:
If RX is interconnected with RY, their interconnection
addresses are XY.1.1.X and XY.1.1.Y respectively, and the
network mask is 24 bits.
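
The addressing rule can be turned into a small helper. The function name is hypothetical, and the sketch assumes single-digit router numbers with X smaller than Y, as in the R1-R2 link.

```python
import ipaddress


def interconnect_addresses(x, y):
    """Return the interface addresses of RX and RY on their shared /24 link."""
    first_octet = f"{x}{y}"
    addr_x = ipaddress.ip_interface(f"{first_octet}.1.1.{x}/24")
    addr_y = ipaddress.ip_interface(f"{first_octet}.1.1.{y}/24")
    return str(addr_x), str(addr_y)


print(interconnect_addresses(1, 2))  # addresses for the R1-R2 link
print(interconnect_addresses(3, 5))  # addresses for the R3-R5 link
```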


Command Usage
The peer command sets the IP address and DR priority of the
neighboring router on an NBMA network. On an NBMA network, a
router cannot discover neighboring routers by broadcasting Hello
packets. You must manually specify the IP addresses and DR
priorities of neighboring routers.
View
OSPF view
Parameters
peer ip-address [ dr-priority priority ]
ip-address: specifies the IP address of a neighboring router.
dr-priority priority: specifies the priority of the neighbor
for DR election.
Precautions
In the routing table on R3, a routing entry for the IP address
12.1.1.2/32 exists. This entry is caused by the PPP echo function.
When this function is disabled, the routing entry for this 32-bit
IP address does not exist.


Case Description
The network topology in this case is the same as the previous
topology. Area 3 is not directly connected to Area 0, and
therefore cannot communicate with other areas.

Command Usage
The vlink-peer command creates and configures a virtual link.
View
OSPF area view
Parameters
vlink-peer router-id
router-id: specifies the router ID of the virtual link neighbor.
Run the display ospf vlink command to view information about
the OSPF virtual link.
Remarks
A virtual link needs to be configured for R4.


Case Description
The network topology in this case is the same as the previous
topology. Company A requires control on the DR. To meet this
requirement, change the DR priorities of routers. The DR/BDR
cannot be preempted.

Command Usage
The ospf dr-priority command sets the priority of an interface
that participates in the DR election.
View
Interface view
Parameters
ospf dr-priority priority
priority: specifies the priority of an interface that
participates in the DR/BDR election. A larger value
indicates a higher priority.
Precautions
If the DR priority of an interface on a router is 0, the router
cannot be elected as a DR or a BDR. In OSPF, the DR priority
cannot be configured for null interfaces. Note that the DR/BDR
cannot be preempted even if the DR priority is changed.
Run the display ospf peer command to view information about
neighbors in OSPF areas.

Case Description
The network topology in this case is the same as the previous
topology. This is the network extension requirement. On an
OSPF FR network, the default interval for sending Hello
packets is 30 seconds, and the default dead interval is
120 seconds. When the neighbor relationship is invalid, Hello
packets are sent at the poll interval, which is 120 seconds.

ur
so
Re
ng
ni
ar
Le
re
Mo
en
m/
. co
w ei
ua
g .h
in
rn
ea
/l
:/
tp
Command Usage
The ospf timer hello command sets the interval for sending Hello
packets on an interface.
The ospf timer poll command sets the poll interval for sending
Hello packets on an NBMA network.
View
ospf timer hello: interface view
ospf timer poll: interface view
Parameters
ospf timer hello interval
    interval: specifies the interval for sending Hello packets
    on an interface.
ospf timer poll interval
    interval: specifies the poll interval for sending Hello
    packets.
Precautions
By default, the interval for sending Hello packets is 10
seconds on P2P and broadcast interfaces and 30 seconds on
P2MP and NBMA interfaces. Ensure that the Hello interval is
the same on the local interface and the remote interface of
the neighboring router.
On an NBMA network, after the neighbor relationship becomes
invalid, the router sends Hello packets periodically at the
interval specified using the ospf timer poll command. The
poll interval must be at least four times the interval for
sending Hello packets.
Remarks
Perform the same interface configuration on R4 as that on
R2 and R3.
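The timer precautions can be expressed as a small consistency check. This is a sketch under stated assumptions: the function name, the network-type labels, and the returned strings are hypothetical. Only the default Hello values and the "at least four times" rule come from the text.

```python
# Default Hello intervals in seconds, as stated in the precautions.
DEFAULT_HELLO = {"p2p": 10, "broadcast": 10, "p2mp": 30, "nbma": 30}


def check_timers(network_type, local_hello=None, remote_hello=None, poll=None):
    """Return a list of problems found in a pair of interface timer settings."""
    problems = []
    default = DEFAULT_HELLO[network_type]
    local = default if local_hello is None else local_hello
    remote = default if remote_hello is None else remote_hello
    # Both ends of the link must use the same Hello interval.
    if local != remote:
        problems.append("hello mismatch")
    # On NBMA networks the poll interval must be >= 4 x Hello interval.
    if network_type == "nbma" and poll is not None and poll < 4 * local:
        problems.append("poll interval too short")
    return problems


print(check_timers("nbma", poll=120))
print(check_timers("nbma", poll=60))
print(check_timers("broadcast", local_hello=5))
```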
Case Description
This case is an extension to the original case. Perform
configurations on the basis of the original case. Imported
routes are advertised in E2 mode by default, and the default
cost value is 1.
Command Usage
The import-route command imports routes learned by other
routing protocols.
The ospf cost command sets the cost of an OSPF-enabled
interface.
View
import-route: OSPF view
ospf cost: interface view
Parameters
import-route [ cost cost | type type ]
    cost cost: specifies the cost of a route.
    type type: specifies the external route cost type
    (Type 1 or Type 2).
ospf cost cost
    cost: specifies the cost of an OSPF-enabled interface.
Precautions
On a non-PE device, only EBGP routes are imported after the
import-route bgp command is configured. IBGP routes are also
imported after the import-route bgp permit-ibgp command is
configured. If IBGP routes are imported, routing loops may occur.
In this case, run the preference (OSPF) and preference (BGP)
commands to set the priority of OSPF ASE routes to lower than
that of IBGP routes.

Case Description
configuration on the basis of the original case. If R6 does not
want to receive routes from network 172.16.X.0/24, filter Type
3 LSAs on R5.
Command Usage
The filter-policy export command configures a filtering policy
to filter the imported routes when these routes are advertised
in Type 5 LSAs within the AS. This command can be
configured only on an ASBR to filter Type 5 LSAs.
The filter-policy import command configures a filtering policy
to filter intra-area, inter-area, and AS external routes received
by OSPF. On routers within an area, this command can be
used to filter only routes; on an ABR, this command can be
used to filter Type 3 LSAs.
View
filter-policy export: OSPF view
filter-policy import: OSPF view
Parameters
prefix-name } export [ protocol [ process-id ] ]
    acl-number: specifies the basic ACL number.
    acl-name acl-name: specifies the ACL name.
    prefix list.
    protocol: specifies the protocol for advertising routing
    information.
    process-id: specifies the process ID when RIP, IS-IS, or
    OSPF is used for advertising routing information.
prefix-name } import
    acl-number: specifies the basic ACL number.
    acl-name acl-name: specifies the ACL name.
    prefix list.
Precautions
Type 5 LSAs are generated on an ASBR to describe AS
external routes and advertised to all areas (excluding stub and
NSSA areas). The filter-policy command needs to be
configured on an ASBR. To advertise only routing information
meeting specific conditions, run the filter-policy command to
set filtering conditions.
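The effect of a prefix-based filtering condition can be sketched outside the router. The example below is only an illustration of matching routes against a denied block. It is not VRP syntax, and the function name, the 172.16.0.0/16 block, and the sample routes are assumptions standing in for the 172.16.X.0/24 networks of the case.

```python
import ipaddress


def deny_matching(routes, denied_block, prefix_len):
    """Drop routes of the given mask length that fall inside denied_block."""
    block = ipaddress.ip_network(denied_block)
    kept = []
    for route in routes:
        net = ipaddress.ip_network(route)
        if net.prefixlen == prefix_len and net.subnet_of(block):
            continue  # filtered: matches the deny condition
        kept.append(route)
    return kept


# Hypothetical routes seen by the filtering router.
routes = ["172.16.1.0/24", "172.16.2.0/24", "10.1.1.0/24"]
print(deny_matching(routes, "172.16.0.0/16", 24))
```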
Case Description
configuration on the basis of the original case. Configure Area
1 as an NSSA area.
Command Usage
The nssa command configures an OSPF area as an NSSA area.
View
OSPF area view
Parameters
nssa [ default-route-advertise | flush-waiting-timer interval-value |
no-import-route | no-summary | set-n-bit |
suppress-forwarding-address | translator-always |
translator-interval interval-value | zero-address-forwarding ] *
    default-route-advertise: generates default Type 7 LSAs
    on an ABR or ASBR and then advertises them to the
    NSSA area.
    flush-waiting-timer interval-value: specifies the interval
    for an ASBR to send aged Type 5 LSAs. This parameter
    takes effect only once.
    no-import-route: indicates that no external route is
    imported to the NSSA area.
    no-summary: indicates that an ABR is prohibited from
    sending Type 3 LSAs to the NSSA area.
    set-n-bit: sets the N-bit in DD packets.
    suppress-forwarding-address: sets the FA of the Type
    5 LSAs translated from Type 7 LSAs by the NSSA ABR
    to 0.0.0.0.
    translator-always: specifies an ABR in an NSSA area as
    an all-the-time translator. Multiple ABRs in an NSSA area
    can be configured as translators.
    translator-interval interval-value: specifies the timeout
    period of a translator.
    zero-address-forwarding: sets the FA of the generated
    NSSA LSAs to 0.0.0.0 when external routes are imported
    from an ABR in an NSSA area.
Precautions
The parameter default-route-advertise is configured to advertise
Type 7 LSAs carrying the default route. On an ABR, Type 7 LSAs
carrying the default route are generated regardless of whether
the route 0.0.0.0 exists in the routing table. On an ASBR,
Type 7 LSAs carrying the default route are generated only when
the route 0.0.0.0 exists in the routing table.
When the area to which the ASBR belongs is configured as an
NSSA area, invalid Type 5 LSAs from other routers in the area
where LSAs are flooded are retained. These LSAs are deleted
only when the aging time reaches 3600 seconds. Router
performance is affected because the large number of LSAs
consumes memory resources. The parameter flush-waiting-timer
is configured to generate Type 5 LSAs with the aging time of
3600 seconds. Invalid Type 5 LSAs on other routers are
therefore cleared in a timely manner.
The parameter flush-waiting-timer does not take effect when the
ASBR also functions as an ABR. In this way, Type 5 LSAs in
non-NSSA areas will not be deleted.
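The default-route-advertise precaution reduces to a small decision rule. The sketch below assumes the parameter is already configured on the router. The function and argument names are hypothetical and serve only to restate the ABR and ASBR cases.

```python
def originates_default_type7(is_abr, is_asbr, has_default_route):
    """Decide whether a router generates a default Type 7 LSA.

    Assumes "nssa default-route-advertise" is configured on the router.
    """
    if is_abr:
        # An ABR generates the LSA whether or not 0.0.0.0 is in its table.
        return True
    if is_asbr:
        # An ASBR needs the route 0.0.0.0 in its routing table first.
        return has_default_route
    return False


print(originates_default_type7(is_abr=True, is_asbr=False, has_default_route=False))
print(originates_default_type7(is_abr=False, is_asbr=True, has_default_route=False))
print(originates_default_type7(is_abr=False, is_asbr=True, has_default_route=True))
```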
Case Description
configuration on the basis of the original case. Note that the
virtual link belongs to Area 0.
Command Usage
The authentication-mode command sets the authentication
mode and password for an OSPF area. After this command is
executed, interfaces on all routers in an OSPF area use the same
authentication mode and password.
View
OSPF area view
Parameters
authentication-mode { md5 | hmac-md5 } [ key-id { plain plaintext |
[ cipher ] ciphertext } ]
    md5: indicates MD5 authentication using the ciphertext
    password.
    hmac-md5: indicates HMAC-MD5 authentication using
    the ciphertext password.
    key-id: specifies an authentication ID, which must be the
    same on the two ends.
    keychain: indicates keychain authentication.
    keychain-name: specifies the keychain name.
authentication-mode simple [ [ plain ] plaintext | cipher ciphertext ]
    simple: indicates simple authentication.
    plain: indicates authentication using the plaintext
    password. If this parameter is specified, the device
    allows you to set only a plaintext key, and the key is
    displayed in plaintext mode in the configuration file.
    plaintext: specifies a plaintext password.
    cipher: specifies a ciphertext password. If this parameter
    is specified, the device allows you to set only a ciphertext
    key, and the key is displayed in ciphertext mode in the
    configuration file.
    ciphertext: specifies a ciphertext password.
Precautions
The authentication modes and passwords of all the devices must
be the same in an area, but can be different in different areas.
The authentication-mode command used in the interface view
takes precedence over the authentication-mode command used
in the OSPF area view.
Case Description
If RX is interconnected with RY, their interconnection
addresses are XY.1.1.X/24 and XY.1.1.Y/24 respectively.

Run the display ospf peer brief command to check whether
the neighbor relationship is established.

Run the tracert command to trace traffic on R3. The command
output shows that traffic on R3 reaches S0/0/0 on R1 through
the Ethernet link.
Run the display ip routing-table command to view the routing
table. During route summarization, the original tags are
removed. Therefore, tags need to be added again in the next
route summarization.
Case Description
The network runs OSPF.

Analysis
To make R1 select the path through Area 2 to reach the
networks in Area 1, the path through Area 2 must behave as if
it passed through Area 0. A virtual link meets this need.
After the virtual link is established, R1 compares the costs
of the two paths and selects the path with the lower cost as
the best path.
Only the external LSA (10.0.0.0) exists in the LSDB on R2.
All neighbor relationships on R3 are correct, indicating
successful authentication.
BGP is a dynamic routing protocol used between ASs. BGP-1 (defined
in RFC 1105), BGP-2 (defined in RFC 1163), and BGP-3 (defined in
RFC 1267) are three earlier-released BGP versions. BGP exchanges
reachable inter-AS routes, establishes inter-AS paths, avoids routing
loops, and applies routing policies between ASs. The current BGP
version is BGP-4 defined in RFC 4271.
As an external routing protocol on the Internet, BGP is widely used
among Internet Service Providers (ISPs).
BGP has the following characteristics:
BGP is an Exterior Gateway Protocol (EGP). Different from
Interior Gateway Protocols (IGPs) such as Open Shortest Path
First (OSPF) and Routing Information Protocol (RIP), BGP
controls route advertisement and selects optimal routes between
ASs rather than discovering or calculating routes.
BGP uses the Transmission Control Protocol (TCP) with listening
port 179 as the transport layer protocol. TCP enhances BGP
reliability without requiring a dedicated mechanism to ensure
connectivity.
    BGP needs to select inter-AS routes, which requires
    high protocol stability. TCP, with its high reliability,
    is therefore used to enhance BGP stability.
    BGP peers must be logically connected and establish
    TCP connections. The destination port number is 179,
    and the local port number is random.
When routes are updated, BGP transmits only the updated
routes. This greatly reduces the bandwidth occupied by BGP
route advertisements. Therefore, BGP applies to the
transmission of a large number of routes on the Internet.
BGP is designed to avoid loops.
    Inter-AS: BGP routes carry information about the ASs
    along the path. The routes that carry the local AS
    number are discarded to avoid inter-AS loops.
    Intra-AS: BGP does not advertise the routes learned from
    an IBGP peer to other IBGP peers in the same AS. In this
    manner, intra-AS loops are avoided.
BGP provides rich routing policies to flexibly filter and select
routes.
BGP provides a route flapping prevention mechanism, which
effectively improves Internet stability.
BGP is easy to extend and adapts to network development. It
is mainly extended using TLVs.
An AS is a group of routers that are managed by a single technical
administration and use the same routing policy.
Each AS has a unique AS number, which is assigned by the
Internet Assigned Numbers Authority (IANA).
An AS number ranges from 1 to 65535. Values 1 to 64511 are
registered Internet numbers, while values 64512 to 65535 are
private AS numbers.
Each AS on a BGP network is assigned a unique AS number to
identify the AS. Currently, 2-byte AS and 4-byte AS numbers are
available. A 2-byte AS number ranges from 1 to 65535, while a
4-byte AS number ranges from 1 to 4294967295. Devices supporting
4-byte AS numbers are compatible with devices supporting 2-byte
AS numbers.
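The AS number ranges above can be checked with a few comparisons. This is a minimal sketch that uses only the ranges stated in the text. The function name and the category labels are hypothetical.

```python
def classify_as(asn):
    """Classify an AS number using the ranges given in the text."""
    if 1 <= asn <= 64511:
        return "registered 2-byte"
    if 64512 <= asn <= 65535:
        return "private 2-byte"
    if 65536 <= asn <= 4294967295:
        return "4-byte only"
    raise ValueError(f"AS number out of range: {asn}")


for asn in (100, 64512, 65535, 65536):
    print(asn, classify_as(asn))
```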
EBGP and IBGP
IBGP: runs within an AS. To prevent routing loops within an AS, a
BGP device does not advertise the routes learned from an IBGP
peer to other IBGP peers, and establishes full-mesh connections
with all the IBGP peers.
EBGP: runs between ASs. To prevent routing loops between ASs, a
BGP device discards routes containing the local AS number when
receiving routes from EBGP peers.
Device roles in BGP message exchange
Speaker: The device that sends BGP messages is called a BGP
speaker. The speaker receives and generates new routes, and
advertises the routes to other BGP speakers.
Peer: The speakers that exchange messages with each other are
called BGP peers. A group of peers sharing the same policies can
form a peer group.
BGP peers exchange five types of messages: Open, Update, Keepalive,
Notification, and Route-Refresh messages.
Open message: is used to establish BGP peer relationships. It is
the first message sent after a TCP connection is set up. After a
BGP peer receives an Open message and the peer negotiation
succeeds, the BGP peer sends a Keepalive message to confirm
and maintain the peer relationship. Subsequently, BGP peers can
exchange Update, Notification, Keepalive, and Route-Refresh
messages.
Update message: is used to exchange routes between BGP peers.
Update messages can be used to advertise multiple reachable
routes with the same attributes or to withdraw multiple unreachable
routes.
    An Update message can be used to advertise multiple
    reachable routes with the same attributes. These
    routes can share a group of route attributes. The route
    attributes in an Update message apply to all the
    destination addresses (expressed by IP prefixes) in the
    Network Layer Reachability Information (NLRI) field of
    the Update message.
    An Update message can be used to withdraw multiple
    unreachable routes. Each route is identified by its
    destination address (expressed by an IP prefix), which
    identifies the routes previously advertised between
    BGP speakers.
    An Update message can be used only to withdraw
    routes. In this case, it does not need to carry route
    attributes or NLRI. Similarly, an Update message can
    be used only to advertise reachable routes, in which
    case it does not need to carry information about
    withdrawn routes.
Keepalive message: is periodically sent to the BGP peer to
maintain the peer relationship.
Notification message: is sent to the BGP peer when an error is
detected. The BGP connection is then terminated immediately.
Route-Refresh message: is used to request that the BGP peer resend
routes when the BGP inbound routing policy changes. If all BGP
routers have the Route-Refresh capability, the local BGP router
sends a Route-Refresh message to BGP peers when the BGP
inbound routing policy changes. After receiving the Route-Refresh
message, the BGP peers resend their routing information to the
local BGP router. In this manner, the BGP routing table can be
dynamically updated, and the new routing policy can be used
without terminating BGP connections. A BGP peer notifies its peer
of its Route-Refresh capability by sending an Open message.
BGP message applications
BGP uses TCP port 179 to set up a connection. BGP connection
setup requires a series of dialogues and handshakes. During
handshake negotiation, the BGP peers advertise parameters such
as the BGP version, BGP connection holdtime, local router ID,
and authentication information in Open messages.
After a BGP connection is set up, a BGP router sends the BGP
peer an Update message that carries the attributes of a route to be
advertised. This helps the BGP peer select the optimal route. When
local BGP routes change, a BGP router sends an Update message
to notify the BGP peer of the changes.
After two BGP peers exchange routes for a period of time, they do
not have new routes to be advertised and need to periodically send
Keepalive messages to maintain the validity of the BGP connection.
If the local BGP router does not receive any BGP message from the
BGP peer within the holdtime, the local BGP router considers that
the BGP connection has been terminated, tears down the BGP
connection, and deletes all the BGP routes learned from the peer.
When the local BGP router detects an error during the operation, for
example, it does not support the peer BGP version or receives an
invalid Update message, it sends the BGP peer a Notification
message to report the error. Before terminating a BGP connection
with the peer, the local BGP router also needs to send a Notification
message to the peer.
BGP message header
Marker: A 16-byte field in which all bits are set to 1.
Length: A 2-byte unsigned integer that indicates the total length of a
message, including the header.
Type: A 1-byte field that specifies the type of a message:
    1: Open
    2: Update
    3: Notification
    4: Keepalive
    5: Route-Refresh
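The header layout can be illustrated with Python's struct module. This is a sketch for study purposes rather than a BGP implementation. The helper names are hypothetical. The type codes follow RFC 4271 (Open 1, Update 2, Notification 3, Keepalive 4) and RFC 2918 (Route-Refresh 5).

```python
import struct

MARKER = b"\xff" * 16  # 16-byte Marker, all bits set to 1
HEADER_LEN = 19        # 16 (Marker) + 2 (Length) + 1 (Type)
TYPES = {1: "Open", 2: "Update", 3: "Notification",
         4: "Keepalive", 5: "Route-Refresh"}


def build_message(msg_type, body=b""):
    """Prepend the common header; Length covers header plus body."""
    return MARKER + struct.pack("!HB", HEADER_LEN + len(body), msg_type) + body


def parse_header(data):
    """Return (total_length, type_name) from the first 19 bytes."""
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != MARKER:
        raise ValueError("bad marker")
    return length, TYPES[msg_type]


# A Keepalive message consists of the header only.
keepalive = build_message(4)
print(len(keepalive), parse_header(keepalive))
```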
Open message format
Version: Indicates the BGP version number. For BGPv4, the value
is 4.
My Autonomous System: Indicates the local AS number.
Comparing the AS numbers on both ends, you can determine
whether a BGP connection is an IBGP or EBGP connection.
Hold Time: Indicates the time during which two BGP peers maintain
a BGP connection between them. During the peer relationship
setup, two BGP peers need to negotiate the holdtime and keep the
holdtime consistent. If two BGP peers have different holdtime
periods configured, the shorter holdtime is used. If the local BGP
router does not receive a Keepalive message from the peer within
the holdtime, it considers that the BGP connection is terminated. If
the holdtime is 0, no Keepalive message is sent.
BGP Identifier: Indicates the router ID of a BGP router. It is
expressed as an IP address to identify a BGP router.
Opt Parm Len (Optional Parameters Length): Indicates the optional
parameter length. The value 0 indicates that no optional parameters
are available.
Optional Parameters: These are used for BGP authentication or
Multiprotocol Extensions. Each parameter is a 3-tuple (Parameter
Type-Parameter Length-Parameter Value).
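The holdtime negotiation described for the Hold Time field amounts to taking the smaller of the two configured values. In the sketch below, deriving the Keepalive interval as one third of the negotiated holdtime is an assumption based on common practice. The function name is hypothetical.

```python
def negotiate_timers(local_hold, peer_hold):
    """Return (holdtime, keepalive_interval) in seconds.

    The shorter configured holdtime wins. A holdtime of 0 means
    that no Keepalive messages are sent (interval reported as 0).
    """
    hold = min(local_hold, peer_hold)
    keepalive = hold // 3  # assumption: one third of the holdtime
    return hold, keepalive


print(negotiate_timers(180, 180))
print(negotiate_timers(180, 90))
print(negotiate_timers(0, 0))
```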

Update message format
Withdrawn Routes Length: A 2-byte unsigned integer that indicates
the total length of the Withdrawn Routes field. The value 0 indicates
that the Withdrawn Routes field is not present in this Update
message.
Withdrawn Routes: A variable-length field that contains a list of IP
address prefixes for the routes to be withdrawn. Each IP address
prefix is in <length, prefix> format. For example, <19,198.18.160.0>
indicates a network at 198.18.160.0 255.255.224.0.
Path Attribute Length: A 2-byte unsigned integer that indicates the
total length of the Path Attribute field. The value 0 indicates that the
Path Attribute field is not present in an Update message.
Network Layer Reachability Information: Contains a list of IP
address prefixes. This variable-length field is in the same format as
the Withdrawn Routes: <length, prefix>.
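The <length, prefix> format can be tried out with the ipaddress module. The sketch below encodes the prefix length in one byte followed by only as many prefix bytes as the length requires, as RFC 4271 specifies, and then decodes it back. The helper names are hypothetical, and only IPv4 is handled.

```python
import ipaddress


def encode_prefix(cidr):
    """Encode an IPv4 prefix as <length, prefix> bytes."""
    net = ipaddress.ip_network(cidr)
    nbytes = (net.prefixlen + 7) // 8  # minimum bytes holding the prefix
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def decode_prefix(data):
    """Decode <length, prefix> bytes back into an IPv4Network."""
    length = data[0]
    nbytes = (length + 7) // 8
    padded = data[1:1 + nbytes] + b"\x00" * (4 - nbytes)
    return ipaddress.IPv4Network((ipaddress.IPv4Address(padded), length))


encoded = encode_prefix("198.18.160.0/19")
net = decode_prefix(encoded)
print(list(encoded), net, net.netmask)
```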

Keepalive message format
A Keepalive message has only the message header.
By default, the interval for sending Keepalive messages is 60
seconds, and the holdtime is 180 seconds. Each time a BGP router
receives a Keepalive message from its peer, it resets the hold timer.
If the hold timer expires, it considers the peer to be 'down'.
Notification message format
Errorcode: A 1-byte field that uniquely identifies an error. Each error
code may have one or more error subcodes. If no error subcode is
defined for an error code, the Error Subcode field is all 0s.
Errsubcode: Indicates an error subcode.
A BGP finite state machine (FSM) has six states: Idle, Connect, Active,
OpenSent, OpenConfirm, and Established.
The Idle state is the initial BGP state. In the Idle state, a BGP
device refuses all the connection requests from neighbors.
The BGP device initiates a TCP connection with its BGP peer
and changes its state to Connect only after receiving a start
event from the system.
    A start event occurs when an operator configures a
    BGP process or resets an existing BGP process, or when
    the router software resets a BGP process.
    If an error occurs in any FSM state, for example, the
    BGP device receives a Notification message or a TCP
    connection termination notification, the BGP device
    returns to the Idle state.
In the Connect state, the BGP device starts the ConnectRetry
timer and waits to establish a TCP connection. The
ConnectRetry timer defaults to 32 seconds.
    If a TCP connection is established, the BGP device
    sends an Open message to the peer and changes to
    the OpenSent state.
    If a TCP connection fails to be established, the BGP
    device moves to the Active state.
    If the BGP device does not receive a response from the
    peer before the ConnectRetry timer expires, the BGP
    device attempts to establish a TCP connection with
    another peer and stays in the Connect state.
    If another event (started by the system or operator)
    occurs, the BGP device returns to the Idle state.
In the Active state, the BGP device keeps trying to establish a
TCP connection with the peer.
    If a TCP connection is established, the BGP device
    sends an Open message to the peer, closes the
    ConnectRetry timer, and changes to the OpenSent
    state.
    If a TCP connection fails to be established, the BGP
    device stays in the Active state.
    If the BGP device does not receive a response from the
    peer before the ConnectRetry timer expires, the BGP
    device returns to the Connect state.
In the OpenSent state, the BGP device waits for an Open
message from the peer and then checks the validity of the
received Open message, including the AS number, version,
and authentication password.
    If the received Open message is valid, the BGP device
    sends a Keepalive message and changes to the
    OpenConfirm state.
    If the received Open message is invalid, the BGP
    device sends a Notification message to the peer and
    returns to the Idle state.
In the OpenConfirm state, the BGP device waits for a Keepalive or
Notification message from the peer. If the BGP device receives
a Keepalive message, it transitions to the Established state. If
it receives a Notification message, it returns to the Idle state.
In the Established state, the BGP device exchanges Update,
Keepalive, Route-Refresh, and Notification messages with the
peer.
    If the BGP device receives a valid Update or Keepalive
    message, it considers that the peer is working properly
    and maintains the BGP connection with the peer.
    If the BGP device receives an invalid Update or Keepalive
    message, it sends a Notification message to the peer
    and returns to the Idle state.
    If the BGP device receives a Route-Refresh message, it
    does not change its state.
    If the BGP device receives a Notification message, it
    returns to the Idle state.
    If the BGP device receives a TCP connection
    termination notification, it terminates the TCP
    connection with the peer and returns to the Idle state.
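The state transitions described above can be captured in a lookup table. The sketch below is a simplified model for tracing transitions, not a protocol implementation. The event names are hypothetical labels, timers are not modelled, and any event that is not listed (errors, Notification messages, TCP termination) is treated as a return to Idle.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",
    ("Connect", "tcp_failed"): "Active",
    ("Connect", "connect_retry_expired"): "Connect",
    ("Active", "tcp_established"): "OpenSent",
    ("Active", "tcp_failed"): "Active",
    ("Active", "connect_retry_expired"): "Connect",
    ("OpenSent", "valid_open"): "OpenConfirm",
    ("OpenConfirm", "keepalive"): "Established",
    ("Established", "update"): "Established",
    ("Established", "keepalive"): "Established",
    ("Established", "route_refresh"): "Established",
}


def next_state(state, event):
    """Unlisted events (errors, notifications) fall back to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state="Idle"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


session = ["start", "tcp_failed", "connect_retry_expired",
           "tcp_established", "valid_open", "keepalive"]
print(run(session))
print(run(session + ["notification"]))
```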
A BGP device adds optimal routes to the BGP routing table to generate
BGP routes. After establishing a BGP peer relationship with a neighbor,
the BGP device follows these rules to exchange routes with the
peer:
    Advertises the BGP routes received from IBGP peers
    only to its EBGP peers.
    Advertises the BGP routes received from EBGP peers
    to all its EBGP peers and IBGP peers.
    Advertises the optimal route to its peers when there
    are multiple valid routes to the same destination.
    Sends only updated BGP routes when BGP routes
    change.
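The first two advertisement rules can be sketched as a function that selects the peers a route is sent to. The peer names and the dictionary layout are hypothetical. Route reflectors, confederations, and policies are deliberately ignored, and the sketch does not exclude the peer from which the route was learned.

```python
def advertise_to(learned_from, peers):
    """Return the sorted names of peers that receive the route.

    learned_from: "ibgp" or "ebgp", the session type the route came from.
    peers: mapping of peer name to session type ("ibgp" or "ebgp").
    """
    if learned_from == "ibgp":
        # Routes from IBGP peers go only to EBGP peers.
        return sorted(n for n, kind in peers.items() if kind == "ebgp")
    # Routes from EBGP peers go to all EBGP and IBGP peers.
    return sorted(peers)


peers = {"R2": "ibgp", "R3": "ebgp", "R4": "ibgp"}
print(advertise_to("ibgp", peers))
print(advertise_to("ebgp", peers))
```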
BGP routing information processing
When receiving Update messages from peers, a BGP router
saves the routes they carry in the routing information base
(RIB), in the Adj-RIB-In of the peer from which the Update
messages are received. After these routes are filtered by
the inbound policy engine, the BGP router determines the
optimal route for each prefix according to the route
selection algorithm.
The optimal routes are saved in the local BGP RIB (Loc-RIB)
and then submitted to the local IP route selection table
(IP-RIB).
In addition to the optimal routes received from peers, Loc-RIB
also contains the BGP prefixes that are selected as the optimal
routes and injected by the current router (locally originated
routes). Before the routes in Loc-RIB are advertised to other
peers, these routes must be filtered by the outbound policy
engine. Only the routes that pass the filtering of the outbound
policy engine can be installed in the Adj-RIB-Out.
Synchronization is performed between IBGP and IGP to prevent
misleading routers in other ASs.
Topology description (when synchronization is enabled)
R4 learns the route to 10.0.0.0/24 advertised by R1 through
BGP and checks whether local IGP routing tables contain the
route. If so, R4 advertises the route to R5. If not, R4 does not
advertise the route to R5.
Precautions: By default, synchronization is disabled on the VRP
platform, and this setting cannot be changed. Synchronization
can be disabled only under the following two conditions:
    The local AS is not a transit AS.
    All the routers within the local AS set up full-mesh IBGP
    connections.
BGP route attributes are a set of parameters that further describe BGP
routes. Using BGP route attributes, BGP can filter and select routes.
Common attributes are as follows:
    Origin: A well-known mandatory attribute.
    AS_Path: A well-known mandatory attribute.
    Next_Hop: A well-known mandatory attribute.
    Local_Pref: A well-known discretionary attribute.
    Community: An optional transitive attribute.
    MED: An optional non-transitive attribute.
    Originator_ID: An optional non-transitive attribute.
    Cluster_List: An optional non-transitive attribute.
The Origin attribute defines the origin of a route and marks the path of a
BGP route. The Origin attribute is classified into the following types:
    IGP: A route with the Origin attribute IGP is an IGP route and
    has the highest priority. For example, the Origin attribute of the
    routes injected to the BGP routing table using the network
    command is IGP.
    EGP: A route with the Origin attribute EGP is an EGP route
    and has the second-highest priority.
    Incomplete: A route with the Origin attribute Incomplete is
    learned by other means and has the lowest priority. For
    example, the Origin attribute of the routes imported by BGP
    using the import-route command is Incomplete.

The AS_Path attribute records all the ASs that a route passes through
from a source to a destination in the distance-vector order. To prevent
inter-AS routing loops, a BGP device does not accept EBGP routes
whose AS_Path list contains the local AS number.
Assume that a BGP speaker advertises a local route:
    When advertising the route to other ASs, the BGP speaker
    adds the local AS number to the AS_Path list, and then
    advertises it to neighboring routers in Update messages.
    When advertising the route to the local AS, the BGP speaker
    creates an empty AS_Path list in an Update message.
Assume that a BGP speaker advertises a route learned in the Update
message sent by another BGP speaker:
    When advertising the route to other ASs, the BGP speaker
    adds the local AS number to the leftmost position of the
    AS_Path list. According to the AS_Path attribute, the BGP
    router that receives the route can determine the ASs through
    which the route has passed to the destination. The number of
    the AS that is nearest to the local AS is placed leftmost in
    the list, and the other AS numbers are listed according to
    the sequence in which the route passes through ASs.
    When advertising the route to the local AS, the BGP speaker
    does not change the AS_Path attribute of the route.
Topology description
When R4 advertises route 10.0.0.0/24 to AS 400 and AS 100,
it adds the local AS number to the AS_Path list. When R5
advertises the route to AS 100, it also adds the local AS
number to the AS_Path list. When R1 and R3 in AS 100
advertise the route to R2 in the same AS, they keep the
AS_Path attribute of the route unchanged. R2 selects the route
with the shortest AS_Path when other BGP routing rules are
the same. That is, R2 reaches 10.0.0.0/24 through R3.
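The AS_Path handling rules can be sketched with plain lists. The AS numbers below are hypothetical sample values and are not taken from the topology figure. The function names are likewise illustrative.

```python
def advertise(as_path, local_as, to_ebgp):
    """Return the AS_Path carried in the Update sent to a peer.

    The local AS number is prepended (leftmost) only when the route
    is advertised to another AS; inside the AS it stays unchanged.
    """
    return [local_as] + list(as_path) if to_ebgp else list(as_path)


def accept_from_ebgp(as_path, local_as):
    """Reject an EBGP route whose AS_Path already contains the local AS."""
    return local_as not in as_path


# A route originated in AS 300, passed on through AS 200.
path = advertise([], local_as=300, to_ebgp=True)
path = advertise(path, local_as=200, to_ebgp=True)
print(path)
print(accept_from_ebgp(path, local_as=100))  # no loop
print(accept_from_ebgp(path, local_as=300))  # loop detected
```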
The Next_Hop attribute records the next hop that a route passes
through. The Next_Hop attribute of BGP is different from that of an IGP
because it may not be the neighbor IP address. A BGP speaker
processes the Next_Hop attribute based on the following rules:
    When advertising a locally originated route to an IBGP peer,
    the BGP speaker sets the Next_Hop attribute of the route to
    the IP address of the local interface through which the BGP
    peer relationship is established.
    When advertising a route to an EBGP peer, the BGP speaker
    sets the Next_Hop attribute of the route to the IP address of
    the local interface through which the BGP peer relationship is
    established.
    When advertising a route learned from an EBGP peer to an
    IBGP peer, the BGP speaker does not change the Next_Hop
    attribute of the route.
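The three Next_Hop rules can be written as one function. This is a sketch of the default behaviour described above only. The argument names and addresses are hypothetical, and routes learned from IBGP peers are rejected for IBGP advertisement because they are not advertised to other IBGP peers.

```python
def next_hop_for(peer_type, route_source, current_next_hop, local_ip):
    """Return the Next_Hop placed in an Update sent to a peer.

    peer_type: "ibgp" or "ebgp", the peer the route is advertised to.
    route_source: "local", "ebgp", or "ibgp", where the route came from.
    local_ip: address of the local interface used for the peer session.
    """
    if peer_type == "ebgp":
        return local_ip          # always rewritten towards EBGP peers
    if route_source == "local":
        return local_ip          # locally originated, sent to IBGP
    if route_source == "ebgp":
        return current_next_hop  # unchanged when passed to IBGP
    raise ValueError("routes learned from IBGP are not sent to IBGP peers")


print(next_hop_for("ebgp", "ebgp", "10.1.1.1", "20.1.1.1"))
print(next_hop_for("ibgp", "ebgp", "10.1.1.1", "20.1.1.1"))
print(next_hop_for("ibgp", "local", None, "20.1.1.1"))
```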

Local_Pref attribute
This attribute indicates the BGP preference of a route. It is
exchanged only between IBGP peers and is not advertised to
other ASs.
This attribute helps determine the optimal route when traffic
leaves an AS. When a BGP router obtains multiple routes to
the same destination address but with different next hops from
IBGP peers, the router prefers the route with the highest
Local_Pref.
R1, R2, and R3 are IBGP peers of one another in AS 100. R2 establishes
an EBGP peer relationship with AS 200, and R3 establishes an EBGP peer
relationship with AS 300. R2 and R3 therefore learn route 10.0.0.0/24
from their EBGP peers, and R1 learns two routes to 10.0.0.0/24 from its
two IBGP peers (R2 and R3) in the local AS. To make traffic from AS 100
to 10.0.0.0/24 leave through R2, configure Local_Pref on R2 and R3 so
that the route advertised by R2 carries Local_Pref 300 and the route
advertised by R3 carries Local_Pref 200. R1 then prefers the route
learned from R2.
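A minimal sketch of the comparison R1 performs, assuming routes are represented as plain dictionaries (a representation chosen only for this example). A route without Local_Pref is treated as having the default value 100 described in the BGP routing rules.

```python
DEFAULT_LOCAL_PREF = 100  # default used when a route carries no Local_Pref


def prefer_by_local_pref(routes):
    """Return the route with the highest Local_Pref."""
    return max(routes, key=lambda r: r.get("local_pref", DEFAULT_LOCAL_PREF))


# The two routes to 10.0.0.0/24 that R1 learns from its IBGP peers.
candidates = [
    {"prefix": "10.0.0.0/24", "learned_from": "R2", "local_pref": 300},
    {"prefix": "10.0.0.0/24", "learned_from": "R3", "local_pref": 200},
]
best = prefer_by_local_pref(candidates)
```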

The MED attribute helps determine the optimal route when traffic enters
an AS. When a BGP router obtains multiple routes to the same
destination address but with different next hops from EBGP peers, the
router selects the route with the smallest MED value as the optimal
route if the other attributes of the routes are the same.
The MED attribute is exchanged only between two neighboring ASs.
The AS that receives this attribute does not advertise the attribute to
any other AS. This attribute can be manually configured. If the MED
attribute is not configured for a route, the MED attribute of the route
uses the default value 0.
R1 and R2 advertise route 10.0.0.0/24 to their respective
EBGP peers R3 and R4. When other routing rules are the
same, R3 and R4 prefer the route with a smaller MED value.
That is, R3 and R4 access network 10.0.0.0/24 through R1.

The Community attribute is a set of destination addresses with the
same characteristics. It is expressed as a 4-byte list and in the aa:nn or
community number format.
aa:nn: The value of aa or nn ranges from 0 to 65535. The
administrator can set a specific value as required. Generally,
aa indicates the AS number and nn indicates the community
identifier defined by the administrator. For example, if a route
is from AS 100 and its community identifier defined by the
administrator is 1, the Community attribute is 100:1.
Community number: An integer that ranges from 0 to
4294967295. As defined in RFC 1997, numbers from 0
(0x00000000) to 65535 (0x0000FFFF) and from 4294901760
(0xFFFF0000) to 4294967295 (0xFFFFFFFF) are reserved.
The Community attribute helps simplify application, maintenance, and
management of routing policies. With the community, a group of BGP
routers in multiple ASs can share the same routing policy. This attribute
is a route attribute and is transmitted between BGP peers without being
restricted by ASs. Before advertising a route with the Community
attribute to peers, a BGP router can change the original Community
attribute of this route.
Well-known community attributes
Internet: All routes belong to the Internet community by default.
A route with this attribute can be advertised to all BGP peers.
No_Advertise: A device does not advertise a received route
with the No_Advertise attribute to any peer.
No_Export: A BGP device does not advertise a received route
with the No_Export attribute to devices outside the local AS. If
a confederation is defined, the route with the No_Export
attribute cannot be advertised to ASs outside the
confederation but can be advertised to other sub-ASs in the
confederation.
No_Export_Subconfed: A BGP device does not advertise a
received route with the No_Export_Subconfed attribute to
devices outside the local AS or to devices outside the local
sub-AS in a confederation.
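The two notations describe the same 32-bit value: aa occupies the high 16 bits and nn the low 16 bits. The sketch below converts between them and checks the reserved ranges quoted from RFC 1997. The helper names are invented for this example; the three well-known values are those assigned in RFC 1997.

```python
# Well-known community values from RFC 1997.
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02
NO_EXPORT_SUBCONFED = 0xFFFFFF03


def encode_community(text):
    """Convert "aa:nn" into the 32-bit community number."""
    aa, nn = (int(part) for part in text.split(":"))
    if not (0 <= aa <= 65535 and 0 <= nn <= 65535):
        raise ValueError("aa and nn must be in the range 0-65535")
    return (aa << 16) | nn


def decode_community(value):
    """Convert a 32-bit community number into "aa:nn"."""
    return f"{value >> 16}:{value & 0xFFFF}"


def is_reserved(value):
    """True for values in the ranges reserved by RFC 1997."""
    return value <= 0x0000FFFF or value >= 0xFFFF0000
```

For example, 100:1 encodes to 100 x 65536 + 1 = 6553601, which lies outside both reserved ranges, while the well-known values all fall in the upper reserved range.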
BGP routing rules
The next-hop addresses of routes must be reachable.
The PrefVal attribute is a Huawei proprietary attribute and is
valid only on the device where it is configured.
If a route does not have the Local_Pref attribute, the
Local_Pref attribute of the route uses the default value 100.
You can use the default local-preference command to
change the default local preference of BGP routes.
Locally generated routes include the routes imported using the
network or import-route command, manually summarized
routes, and automatically summarized routes.
Summarized routes have a higher priority than
non-summarized routes.
Manually summarized routes generated using the
aggregate command have a higher priority than
automatically summarized routes generated using the
summary automatic command.
Routes imported using the network command have a
higher priority than routes imported using the
import-route command.
Prefers the route with the shortest AS_Path.
The AS_Path length does not include
AS_CONFED_SEQUENCE and AS_CONFED_SET.
An AS_SET counts as 1 no matter how many AS
numbers the AS_SET contains.
BGP does not compare the AS_Path attributes of
routes after the bestroute as-path-ignore command is
executed.
Prefers the route with the lowest MED.
BGP compares only the MED values of routes sent
from the same AS (excluding a confederation sub-AS).
That is, BGP compares the MED values of two routes
only when the first AS numbers in the AS_SEQUENCE
attributes (excluding the AS_CONFED_SEQUENCE)
of the two routes are the same.
If a route does not have the MED attribute, BGP
considers the MED value of the route as the default
value 0. After the bestroute med-none-as-maximum
command is executed, BGP considers the MED value
of the route as the maximum value 4294967295.
After the compare-different-as-med command is
executed, BGP compares the MEDs in the routes sent
from peers in different ASs. Do not use this command
unless the different ASs use the same IGP and route
selection mode. Otherwise, routing loops may occur.
After the bestroute med-confederation command is
executed, BGP compares the MED values of routes
only when the AS_Path does not contain external AS
numbers (numbers of ASs that do not belong to the
confederation) and the first AS number in
AS_CONFED_SEQUENCE is the same.
After the deterministic-med command is executed,
route selection no longer depends on the sequence in
which routes are received.
Load Balancing
When there are multiple equal-cost routes to the same
destination, you can perform load balancing among these
routes.
Equal-cost BGP routes can be generated for traffic load
balancing only when all attributes compared before the rule
"Prefers the route with the lowest IGP metric" are the same.
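The AS_Path length and MED notes above can be expressed as small helper functions. This is a sketch under stated assumptions: an AS_Path is modelled as a list of (segment type, AS numbers) pairs, and the function and flag names are invented for this example. The flags only mirror the effect described for the bestroute med-none-as-maximum and compare-different-as-med commands.

```python
MED_MAX = 4294967295


def as_path_length(segments):
    """Length used for route selection."""
    length = 0
    for seg_type, as_numbers in segments:
        if seg_type == "AS_SEQUENCE":
            length += len(as_numbers)
        elif seg_type == "AS_SET":
            length += 1  # an AS_SET counts as 1 regardless of its size
        # AS_CONFED_SEQUENCE and AS_CONFED_SET are not counted.
    return length


def effective_med(med, none_as_maximum=False):
    """MED value used in comparison; None means the attribute is absent."""
    if med is None:
        return MED_MAX if none_as_maximum else 0
    return med


def first_as(segments):
    """First AS number in AS_SEQUENCE, ignoring confederation segments."""
    for seg_type, as_numbers in segments:
        if seg_type == "AS_SEQUENCE" and as_numbers:
            return as_numbers[0]
    return None


def med_comparable(path_a, path_b, compare_different_as=False):
    """MEDs are compared only for routes from the same neighboring AS."""
    return compare_different_as or first_as(path_a) == first_as(path_b)
```

For instance, a path made of AS_CONFED_SEQUENCE (65001, 65002), AS_SEQUENCE (200, 300), and an AS_SET of three AS numbers has length 0 + 2 + 1 = 3.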
BGP security
MD5: BGP uses TCP as the transport layer protocol. To
ensure BGP security, you can perform MD5 authentication
during the TCP connection setup. MD5 authentication,
however, does not authenticate BGP messages. Instead, it
sets the MD5 authentication password for a TCP connection,
and the authentication is performed by TCP. If the
authentication fails, no TCP connection is set up.
After GTSM is enabled for BGP, an interface board checks the
TTL values in all BGP messages. In actual networking,
packets whose TTL values are not within the specified range
are either allowed to pass through or discarded by GTSM. To
configure GTSM to discard packets by default, you can set a
correct TTL value range according to the network topology.
Subsequently, messages whose TTL values are not within the
specified range are discarded. This function avoids attacks
from bogus BGP messages. This function is mutually
exclusive with multi-hop EBGP.
The number of routes received from peers is limited to prevent
resource exhaustion attacks.
The AS_Path lengths on the inbound and outbound interfaces
are limited. Packets that exceed the limit of the AS_Path
length are discarded.
Route dampening helps solve the problem of route instability. In most
cases, BGP is used on complex networks where route flapping occurs
frequently. To prevent frequent route flapping, BGP uses route
dampening to suppress unstable routes.
Route dampening measures the stability of a route using a penalty
value. A larger penalty value indicates a less stable route. A flap
occurs when a route changes from active to inactive. Each time route
flapping occurs, BGP increases the penalty of the route by 1000.
When the penalty value of the route exceeds the suppression threshold,
BGP suppresses this route and does not add it to the IP routing table or
advertise any Update message for it to BGP peers.
After a route is suppressed for a period of time (half life), the penalty
value is reduced by half. When the penalty value of a route decreases
to the reuse threshold, the route becomes reusable and is added to the
routing table. At the same time, BGP advertises an Update message to
peers. The penalty value, suppression threshold, and half life can be
manually configured.
Route dampening applies only to EBGP routes but not IBGP routes.
IBGP routes often include the routes from the local AS, which requires
that the forwarding tables of devices within an AS be the same. In
addition, IGP fast convergence aims to achieve information
synchronization.
If IBGP routes were dampened, forwarding tables on devices would be
inconsistent when these devices have different dampening parameters.
Route dampening therefore does not apply to IBGP routes.
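The penalty mechanism can be illustrated with a short calculation. The sketch assumes that the penalty decays continuously, so that it halves once per half life, and it uses a 15-minute half life purely as an example value. The function names are hypothetical.

```python
def decayed_penalty(penalty, elapsed, half_life):
    """Penalty remaining after `elapsed` time units of decay."""
    return penalty * 0.5 ** (elapsed / half_life)


def penalty_at(flap_times, now, half_life=15.0, per_flap=1000.0):
    """Accumulated penalty at time `now` for flaps at the given times."""
    penalty, last = 0.0, 0.0
    for t in sorted(flap_times):
        # Decay the existing penalty up to this flap, then add 1000.
        penalty = decayed_penalty(penalty, t - last, half_life) + per_flap
        last = t
    return decayed_penalty(penalty, now - last, half_life)
```

With these assumptions, three flaps in quick succession give a penalty of 3000. After two half lives (30 minutes) the penalty has dropped to 750, so whether the route is reusable at that point depends on the configured reuse threshold.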
Case description
IP addresses used to interconnect devices are designed as
follows:
If RTX connects to RTY, interconnected addresses are
XY.1.1.X and XY.1.1.Y. The mask length is 24.
Loopback interface addresses of R1, R2, R3, R6, and
R7 are shown in the figure.
Case analysis
To establish stable IBGP peer relationships, use loopback
interface addresses and static routes within an AS.
To establish EBGP peer relationships, use physical interface
addresses.
Command usage
The peer as-number command sets the AS number of a
specified peer (or peer group).
The peer connect-interface command specifies a source
interface that sends BGP messages and a source address
used to initiate a connection.
The peer next-hop-local command configures a BGP device
to set its IP address as the next hop of routes when it
advertises the routes to an IBGP peer or peer group.
View
BGP process view
Parameters
peer ipv4-address as-number as-number
ipv4-address: specifies the IPv4 address of a peer.
as-number: specifies the AS number of the peer.
peer ipv4-address connect-interface interface-type
interface-number [ ipv4-source-address ]
interface-type interface-number: specifies the interface
type and number.
ipv4-source-address: specifies the IPv4 source address
used to set up a connection.
peer ipv4-address next-hop-local
Precautions
When using a loopback interface to send BGP messages:
Ensure that the loopback interface address of the BGP
peer is reachable.
In the case of an EBGP connection, you need to run
the peer ebgp-max-hop command to enable EBGP to
establish the peer relationship in indirect mode.
The peer next-hop-local and peer next-hop-invariable
commands are mutually exclusive.
The PrefRcv field in the display bgp peer command output
indicates the number of route prefixes received from the peer.
Case description
This case is an extension to the previous case. Perform the
configuration based on the configuration in the previous case.
R1 prefers routes to 10.0.X.0/24 with next hop R2 because
BGP prefers the route advertised by the router with the
smallest router ID.

Command usage
The peer route-policy command specifies a route-policy to
control routes received from, or to be advertised to a peer or
peer group.
View
BGP view
Parameters
peer ipv4-address route-policy route-policy-name { import | export }
ipv4-address: specifies an IPv4 address of a peer.
route-policy-name: specifies a route-policy name.
import: applies a route-policy to routes to be imported
from a peer or peer group.
export: applies a route-policy to routes to be advertised
to a peer or peer group.
Run the display bgp routing-table command to view the BGP
routing table.
Case description
This case is an extension to the previous case. Company A requires
that R1 access network 10.0.1.0/24 through R7. To meet this
requirement, you can enable R4 to access network 10.0.1.0/24
through R7 using the MED attribute.
Command usage
peer group.
View
BGP view
Parameters
routing table.
Case description
This case is an extension to the previous case. To meet the
requirement, use the Community attribute.

Command usage
peer group.
View
BGP view
Parameters
Run the display bgp routing-table community command to
view the attributes in the BGP routing table.

Case description
This case is an extension to the previous case. Perform the
configuration based on the configuration in the previous case.

Command usage
The peer route-policy command specifies a route-policy to
control routes received from, or to be advertised to a peer or
peer group.
The peer default-route-advertise command configures a
BGP device to advertise a default route to its peer or peer
group.
View
peer route-policy: BGP view
peer default-route-advertise: BGP view
Parameters
peer { group-name | ipv4-address } default-route-advertise
[ route-policy route-policy-name ]
[ conditional-route-match-all { ipv4-address1 { mask1 | mask-length1 } } &<1-4> |
conditional-route-match-any { ipv4-address2 { mask2 | mask-length2 } } &<1-4> ]
route-policy route-policy-name: specifies a route-policy
name.
conditional-route-match-all ipv4-address1 { mask1 | mask-length1 }:
specifies the IPv4 address and mask/mask length for
conditional routes. The default routes are sent to the peer
or peer group only when all conditional routes are matched.
conditional-route-match-any ipv4-address2 { mask2 | mask-length2 }:
specifies the IPv4 address and mask/mask length for
conditional routes. The default routes are sent to the peer
or peer group only when any conditional route is matched.
Run the display ip routing-table command to view IP routing
table information.
Case description
Command usage
The maximum load-balancing command configures the
maximum number of equal-cost routes.
View
BGP view
Parameters
maximum load-balancing [ ebgp | ibgp ] number
ebgp: implements load balancing among EBGP routes.
ibgp: implements load balancing among IBGP routes.
number: specifies the maximum number of equal-cost
routes in the BGP routing table.
Precautions
The maximum load-balancing number command cannot be
used together with the maximum load-balancing ebgp
number or maximum load-balancing ibgp number command.
If the maximum load-balancing ebgp number or maximum
load-balancing ibgp number command is executed, the
maximum load-balancing number command does not take
effect.
Run the display ip routing-table protocol bgp command to
view the load-balanced routes learned by BGP.
Case description
After GTSM is enabled between R6 and R8, the hop count
should be 1.
Command usage
The peer valid-ttl-hops command applies the GTSM function
on the peer or peer group.
The gtsm default-action command configures the default
action to be taken on the packets that do not match the GTSM
policy.
The gtsm log drop-packet command enables the log function
on a board to log information about the packets discarded by
GTSM on the board.
View
peer valid-ttl-hops: BGP view
gtsm default-action: system view
gtsm log drop-packet: system view
Parameters
peer ipv4-address valid-ttl-hops [ hops ]
ipv4-address: specifies the IPv4 address of a peer.
hops: specifies the number of TTL hops to be checked.
The value is an integer that ranges from 1 to 255. The default
value is 255. If the value is configured as hops, the valid TTL
range of the detected packet is [255 - hops + 1, 255].
gtsm default-action { drop | pass }
drop: discards the packets that do not match the GTSM
policy.
pass: allows the packets that do not match the GTSM
policy to pass through.
Precautions
GTSM and EBGP-MAX-HOP affect the TTL values of sent
BGP packets. The two functions are mutually exclusive.
If the default action is configured but the GTSM policy is not
configured, GTSM does not take effect.
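The TTL range formula for the hops parameter can be checked with a few lines of Python. This is only a worked illustration of [255 - hops + 1, 255]; the function names are invented for this example.

```python
def valid_ttl_range(hops=255):
    """Return (lowest, highest) TTL accepted for a peer."""
    if not 1 <= hops <= 255:
        raise ValueError("hops must be in the range 1-255")
    return (255 - hops + 1, 255)


def gtsm_accepts(ttl, hops=255):
    """True if a received packet's TTL falls within the valid range."""
    low, high = valid_ttl_range(hops)
    return low <= ttl <= high
```

With hops set to 1, as for a directly connected peer, only packets that arrive with TTL 255 are within the range. With the default value 255, every TTL from 1 to 255 is within the range.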
Case description
In the topology, among the IP addresses that are not marked,
Rx and Ry connect using IP addresses XY.1.1.X/24 and
XY.1.1.Y/24.
Results
Run the display vlan command to view the results.
Results
Run the display bgp peer command to view the BGP peer
relationship.
Results
routing table. The command output shows that 2.2.2.2/32 and
3.3.3.3/32 have been advertised.
Results
The loop is the result of inconsistency between IGP route
selection and BGP route selection.

Case description
In the topology, among the IP addresses that are not marked,
Rx and Ry connect using IP addresses XY.1.1.X/24 and
XY.1.1.Y/24.
Analysis process
view the attributes.

Results
You will notice that the Community attribute of route
10.0.0.0/24 is labeled as <400:1>, no-export on R2.

Results
You can add AS numbers to the AS_Path attribute to change the
route selection of R3.
To ensure connectivity between IBGP peers, you need to establish
full-mesh connections between IBGP peers. If there are n routers in an
AS, you need to establish n(n-1)/2 IBGP connections. When there are a
large number of IBGP peers, many network resources and CPU
resources are consumed. A route reflector (RR) can be used between
IBGP peers to solve this problem.
In an AS, a router functions as an RR, and other routers function as
clients. The RR and its clients establish IBGP connections and form a
cluster. The RR reflects routes to clients, removing the need to
establish BGP connections between clients.
RR concepts
RR: a BGP device that can reflect the routes learned from an
IBGP peer to other IBGP peers.
Client: an IBGP device whose routes are reflected by an RR
to other IBGP devices. In an AS, clients only need to directly
connect to the RR.
Non-client: an IBGP device that is neither an RR nor a client.
In an AS, a non-client must establish full-mesh connections
with the RR and all the other non-clients.
Originator: a device that originates routes in an AS. The
Originator_ID attribute helps eliminate routing loops in a
cluster.
Cluster: a set of an RR and clients. The Cluster_List attribute
helps eliminate routing loops between clusters.
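The saving in IBGP connections is easy to quantify. The sketch below compares a full mesh of n routers with a single cluster in which one router is the RR and the other n - 1 routers are its clients. The function names are chosen for this example only.

```python
def full_mesh_sessions(n):
    """IBGP connections needed for a full mesh of n routers."""
    return n * (n - 1) // 2


def single_rr_sessions(n):
    """IBGP connections when one of the n routers is the RR
    and the remaining n - 1 routers are its clients."""
    return n - 1
```

For 10 routers this is 45 connections in a full mesh against 9 with one RR, and the gap widens quadratically as n grows.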
An RR advertises learned routes to IBGP peers based on the following
rules:
The RR advertises the routes learned from an EBGP peer to
all the clients and non-clients.
The RR advertises the routes learned from a non-client IBGP
peer to all the clients.
The RR advertises the routes learned from a client to all the
other clients and all the non-clients.
An RR is easy to configure because it needs to be configured only on
the device that functions as a reflector, and clients do not need to know
that they are clients.
In some networks, if clients of an RR establish full-mesh connections
among themselves, they can directly exchange routing information. In
this case, route reflection between clients is unnecessary and wastes
bandwidth. You can run the undo reflect between-clients command
on the VRP platform to prohibit an RR from reflecting the routes
received from a client to other clients.
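The three advertisement rules, together with the effect of disabling reflection between clients, can be sketched as a function that returns the IBGP peers a route is sent to. The function and its arguments are hypothetical and model only the rules stated in this section.

```python
def reflect_targets(source, learned_from, clients, non_clients,
                    reflect_between_clients=True):
    """Return the set of IBGP peers that receive the route.

    source: "ebgp", "non_client", or "client" (where the RR learned it)
    learned_from: name of the peer that sent the route to the RR
    """
    if source == "ebgp":
        targets = set(clients) | set(non_clients)
    elif source == "non_client":
        targets = set(clients)
    elif source == "client":
        targets = set(non_clients)
        if reflect_between_clients:
            targets |= set(clients)
    else:
        raise ValueError("unknown source: " + source)
    # A route is never sent back to the peer it was learned from.
    targets.discard(learned_from)
    return targets
```

Setting reflect_between_clients to False models the behavior after undo reflect between-clients is run: routes from a client then reach only the non-clients.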

The originator ID identifies the originator of a route and is generated by
an RR to prevent routing loops in a cluster.
When an RR reflects a route for the first time, the RR adds the
Originator_ID attribute to this route. The Originator_ID attribute
identifies the originator of the route. If the route already
contains the Originator_ID attribute, the RR retains this
Originator_ID attribute.
When a device receives a route, the device compares the
originator ID of the route with the local router ID. If they are the
same, the device discards the route.
An RR and its clients form a cluster, which is identified by a unique
cluster ID in an AS.
To prevent routing loops between clusters, an RR uses the Cluster_List
attribute to record the cluster IDs of all the clusters that a route
passes through.
When an RR reflects a route between clients, or between
clients and non-clients, the RR adds the local cluster ID to the
top of the cluster list. If there is no cluster list, the RR creates a
Cluster_List attribute.
When receiving an updated route, the RR checks the cluster
list of the route. If the cluster list contains the local cluster ID,
the RR discards the route. If the cluster list does not contain
the local cluster ID, the RR adds the local cluster ID to the
cluster list and then reflects the route.
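The two loop-prevention checks can be sketched as follows. Routes are modelled as dictionaries and IDs as dotted strings, and both are assumptions made for this example rather than a description of any implementation.

```python
def reflect(route, local_cluster_id, advertiser_router_id):
    """Return the reflected copy of a route, or None if it is discarded."""
    cluster_list = list(route.get("cluster_list", []))
    if local_cluster_id in cluster_list:
        return None  # the route already passed through this cluster
    reflected = dict(route)
    # Originator_ID is added on first reflection and retained afterwards.
    reflected.setdefault("originator_id", advertiser_router_id)
    # The local cluster ID goes to the top of the cluster list.
    reflected["cluster_list"] = [local_cluster_id] + cluster_list
    return reflected


def accept(route, local_router_id):
    """A device discards a route whose Originator_ID is its own router ID."""
    return route.get("originator_id") != local_router_id
```

A route reflected first by cluster 1.1.1.1 and then by cluster 2.2.2.2 carries the cluster list [2.2.2.2, 1.1.1.1]. If it returns to either RR, the cluster list check discards it.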

Backup RR prevents single-point failures.
Backup RR
On the VRP, you need to run the reflector cluster-id
command to set the same cluster ID for all the RRs in the
same cluster.
When redundant RRs exist, a client receives multiple routes to
the same destination from different RRs and then selects the
optimal route according to BGP route selection policies.
The Cluster_List attribute prevents routing loops between
different RRs in the same AS.
When Client1 receives an updated route 10.0.0.0/24 from an
external peer, it advertises the route to RR1 and RR2 through
IBGP.
After RR1 receives the updated route, it adds the local cluster
ID to the top of the cluster list and reflects the route to the
other clients (Client2 and Client3) and to RR2.
When RR2 receives the route reflected by RR1, it checks the
cluster list and finds that the list already contains its own
cluster ID. It therefore discards the route without reflecting it
to its clients.
A backbone network is divided into multiple clusters. RRs of the
clusters are non-clients and establish full-mesh connections with one
another. Although each client only establishes an IBGP connection with
its RR, all the BGP routers in the AS can receive reflected routing
information.
A level-1 RR (RR1) is deployed in Cluster1, while RRs (RR2 and RR3)
in Cluster2 and Cluster3 function as clients of RR1.
Confederation
A confederation divides an AS into sub-ASs. Full-mesh IBGP
connections are established in each sub-AS, while EBGP
connections are established between sub-ASs. ASs outside a
confederation still consider the confederation as an AS.
After a confederation divides an AS into sub-ASs, it assigns a
confederation ID (the AS number) to each router within the AS.
The original IBGP attributes are retained, including the
Local_Pref attribute, MED attribute, and Next_Hop attribute.
Confederation-related attributes are automatically deleted
when being advertised outside a confederation. The
administrator therefore does not need to configure the rules for
filtering information such as sub-AS numbers at the egress of
a confederation.
The AS_Path attribute is a well-known mandatory attribute. It consists
of AS numbers and has the following types:
AS_SET: an unordered set of AS numbers carried in an Update
message. When route summarization occurs, you can use
policies to prevent path information loss using AS_SET.
AS_SEQUENCE: an ordered list of AS numbers carried in an
Update message. Generally, the AS_Path type is
AS_SEQUENCE.
AS_CONFED_SEQUENCE: an ordered list of member AS
numbers in a confederation, carried in an Update message. It
is similar to AS_SEQUENCE but can be transmitted only within
the local confederation.
AS_CONFED_SET: an unordered set of member AS numbers in
a confederation, carried in an Update message. It is similar to
AS_SET but can be transmitted only within the local
confederation.
Member AS numbers within a confederation are invisible to ASs
outside the confederation. Therefore, when routes are advertised to
ASs outside the confederation, member AS numbers are removed.
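The removal of member AS numbers at the confederation boundary can be sketched as below. The AS_Path is modelled as a list of (segment type, AS numbers) pairs, and the function name is hypothetical. Prepending the confederation ID as the visible AS number follows the usual EBGP behavior and is included here as an assumption, since this section does not spell it out.

```python
def export_outside_confederation(segments, confederation_id):
    """Return the AS_Path advertised to an AS outside the confederation."""
    # Drop AS_CONFED_SEQUENCE and AS_CONFED_SET segments entirely.
    public = [(seg_type, list(as_numbers))
              for seg_type, as_numbers in segments
              if seg_type in ("AS_SEQUENCE", "AS_SET")]
    # The confederation appears to outside ASs as a single AS.
    if public and public[0][0] == "AS_SEQUENCE":
        public[0] = ("AS_SEQUENCE", [confederation_id] + public[0][1])
    else:
        public.insert(0, ("AS_SEQUENCE", [confederation_id]))
    return public
```

For example, a route that crossed sub-ASs 65001 and 65002 after arriving from AS 300 leaves confederation 100 with the AS_Path 100 300, and the sub-AS numbers are no longer visible.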
Comparison between a route reflector and a confederation
A confederation requires an AS to be divided into sub-ASs,
which significantly changes the network topology.
Only an RR needs to be configured, and clients do not need to
be configured. The confederation needs to be configured on all
the devices.
RRs must establish full-mesh IBGP connections.
Route reflectors are widely used, while confederations are
seldom used.
The BGP routing table of each device on a large network is large. This
burdens devices, increases the route flapping probability, and affects
network stability.
Route summarization is a mechanism that combines multiple routes
into one route. This mechanism allows a BGP device to advertise only
the summarized route but not all the specific routes to peers. It reduces
the BGP routing table size. If the specific routes flap, the rest of the
network is not affected, which improves network stability.
Route summarization uses the Aggregator attribute. This attribute is an
optional transitive attribute. It identifies the node where route
summarization occurs and carries the router ID and AS number of the
node.
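The effect of combining specific routes into one summarized route can be illustrated with Python's standard ipaddress module. The sketch only merges prefixes that exactly fill a larger block. A BGP summary route is configured by the administrator and may also cover address space for which no specific route exists. The function name is chosen for this example.

```python
import ipaddress


def summarize(prefixes):
    """Merge contiguous prefixes into the smallest set of covering blocks."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


specific = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
summary = summarize(specific)
```

Here the four /24 routes collapse into the single route 10.0.0.0/22, so a peer that receives only the summary holds one entry instead of four.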
Precautions
The summary automatic command summarizes the routes
imported by BGP, including direct routes, static routes, RIP
routes, OSPF routes, and IS-IS routes. After summarization is
configured, BGP summarizes routes according to the natural
network segment and suppresses specific routes in the BGP
routing table. This command does not take effect on the routes
imported using the network command.
BGP advertises only summarized routes to peers.
BGP does not start automatic summarization by default.
Manual summarization
Summarized routes do not carry the AS_Path attribute of the
specific routes.
Using the AS_SET attribute to carry the AS numbers can
prevent routing loops. Differences between AS_SET and
AS_SEQUENCE are as follows: AS_SET is often used in route
summarization, and its AS numbers are unordered. In
AS_SEQUENCE, AS numbers are added to the AS list in the
sequence in which a route passes through ASs.
Adding the AS_SET attribute to summarized routes may cause
route flapping.
RFC 5291 and RFC 5292 define the prefix-based BGP outbound route
filtering (ORF) capability to advertise required BGP routes. BGP ORF
allows a device to send prefix-based inbound policies in a
Route-Refresh message to BGP peers. BGP peers then construct outbound
policies based on these inbound policies to filter routes before sending
these routes. This capability has the following advantages:
Prevents the local device from receiving a large number of
unnecessary routes.
Reduces CPU usage of the local device.
Simplifies the configuration of BGP peers.
Improves link bandwidth efficiency.
Case description
Among directly-connected EBGP peers, after negotiating the
prefix-based ORF capability with R1, Client1 adds local
prefix-based inbound policies to a Route-Refresh message and
sends the message to R1. R1 then constructs outbound
policies based on the received Route-Refresh message and
sends only the required routes to Client1 in Update
messages. Client1 receives only the required routes, and R1
does not need to maintain routing policies. In this manner, the
configuration workload is reduced.
Client1 and Client2 are clients of the RR. Client1, Client2, and
the RR negotiate the prefix-based ORF capability. Client1 and
Client2 then add local prefix-based inbound policies to
Route-Refresh messages and send the messages to the RR.
The RR constructs outbound policies based on the received
inbound policies and reflects only the required routes to Client1
and Client2 in Update messages. Client1 and Client2 receive only
the required routes, and the RR does not need to maintain routing
policies. The configuration workload is thereby reduced.
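The idea behind prefix-based ORF, in which the sender filters on behalf of the receiver, can be sketched as follows. The entry format (prefix, minimum length, maximum length) and the function names are assumptions made for this illustration and do not reflect the on-the-wire encoding.

```python
import ipaddress


def build_outbound_filter(orf_entries):
    """Turn the receiver's permit entries into a test the sender applies.

    orf_entries: list of (prefix, min_length, max_length) tuples.
    """
    parsed = [(ipaddress.ip_network(prefix), low, high)
              for prefix, low, high in orf_entries]

    def permits(route):
        net = ipaddress.ip_network(route)
        return any(net.subnet_of(base) and low <= net.prefixlen <= high
                   for base, low, high in parsed)

    return permits


def routes_to_send(routes, orf_entries):
    """Routes the sender still advertises after applying the ORF entries."""
    permits = build_outbound_filter(orf_entries)
    return [r for r in routes if permits(r)]
```

If a client asks only for /24 routes inside 10.0.0.0/16, a sender holding 10.0.1.0/24, 10.0.2.0/25, and 192.168.0.0/24 advertises just 10.0.1.0/24, and the other two routes never cross the link.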
Active-Route-Advertise
Once a route is preferred by BGP, the route can be advertised
to peers by default. When Active-Route-Advertise is configured,
only the route preferred by BGP and also active at the routing
management layer is advertised to peers.
Active-Route-Advertise and the bgp-rib-only command are
mutually exclusive. The bgp-rib-only command prohibits BGP
routes from being added to the IP routing table.
BGP dynamic update peer-groups
By default, BGP groups routes separately for each peer, even when the peers have the same outbound policies.
After this feature is enabled, BGP groups each route only once and then sends the route to all the peers in the update-group, so grouping efficiency improves in proportion to the number of peers in the update-group.
RR1 has three clients and needs to reflect 100,000 routes to these clients. If RR1 groups the routes per peer, the total number of times that routes are grouped is 300,000 (100,000 x 3). After the dynamic update peer-groups feature is used, the total number of grouping times drops to 100,000 (100,000 x 1), improving grouping performance by a factor of 3.
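
The arithmetic in the RR1 example can be checked with a short sketch. The function name and the model of "one grouping pass per route per peer or per update-group" are illustrative assumptions, not a description of how the device is implemented.

```python
from typing import Optional


def grouping_operations(routes: int, peers: int,
                        update_groups: Optional[int] = None) -> int:
    """Return how many times routes are grouped before being sent.

    Without update-groups (update_groups=None) each peer is handled on its
    own. With update-groups, peers that share an outbound policy are served
    by a single grouping pass per update-group.
    """
    passes = peers if update_groups is None else update_groups
    return routes * passes


# RR1 reflects 100,000 routes to three clients with identical policies.
per_peer = grouping_operations(100_000, peers=3)
per_group = grouping_operations(100_000, peers=3, update_groups=1)
print(per_peer, per_group, per_peer // per_group)  # 300000 100000 3
```

The gain equals the number of peers that share one update-group, which is why the improvement is a factor of 3 here.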
Roles defined in 4-byte AS number
New speaker: a peer that supports 4-byte AS numbers
Old speaker: a peer that does not support 4-byte AS numbers
New session: a BGP connection between new speakers
Old session: a BGP connection between a new speaker and an old speaker, or between old speakers
Protocol extension
Two new optional transitive attributes, AS4_Path with attribute code 0x11 and AS4_Aggregator with attribute code 0x12, are defined to transmit 4-byte AS numbers in old sessions.
If a BGP connection is set up between a new speaker and an old speaker, the reserved AS number AS_TRANS (value 23456) is used for interoperability between 4-byte and 2-byte AS numbers.
New AS numbers have three formats:
asplain: represents an AS number using a decimal integer.
asdot+: represents an AS number using two integer values joined by a period character: <high order 16-bit value in decimal>.<low order 16-bit value in decimal>. For example, the 2-byte ASN 123 is represented as 0.123, and ASN 65536 is represented as 1.0. The largest value is 65535.65535.
asdot: represents a 2-byte AS number using the asplain format and a 4-byte AS number using the asdot+ format (1 to 65535; 1.0 to 65535.65535).
Huawei supports the asdot format.
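
The three notations differ only in how the same 32-bit value is written. The following is a minimal sketch of the conversions; the function names are hypothetical and chosen for illustration only.

```python
MAX_2BYTE = 0xFFFF
MAX_4BYTE = 0xFFFFFFFF


def _check(asn: int) -> int:
    """Reject values that do not fit in a 4-byte AS number."""
    if not 0 <= asn <= MAX_4BYTE:
        raise ValueError(f"AS number out of range: {asn}")
    return asn


def to_asdot_plus(asn: int) -> str:
    """asdot+: always <high 16 bits>.<low 16 bits>."""
    _check(asn)
    return f"{asn >> 16}.{asn & MAX_2BYTE}"


def to_asdot(asn: int) -> str:
    """asdot: asplain for 2-byte values, asdot+ for larger values."""
    _check(asn)
    return str(asn) if asn <= MAX_2BYTE else to_asdot_plus(asn)


def to_asplain(text: str) -> int:
    """Parse asplain, asdot, or asdot+ text into a single integer."""
    if "." not in text:
        return _check(int(text))
    high, low = (int(part) for part in text.split("."))
    if not (0 <= high <= MAX_2BYTE and 0 <= low <= MAX_2BYTE):
        raise ValueError(f"invalid AS number: {text}")
    return (high << 16) | low


print(to_asdot_plus(123), to_asdot_plus(65536), to_asdot(123), to_asplain("10.1"))
# 0.123 1.0 123 655361
```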
R2 receives a route with the 4-byte AS number 10.1 from R1.
R2 establishes a peer relationship with R3, and R3 must be configured to treat the AS number of R2 as AS_TRANS.
When advertising a route to R3, R2 records AS_TRANS in the AS_Path attribute of the route and records 10.1 and its own AS number 20.1 in the AS4_Path attribute in the sequence required by BGP.
R3 retains the unrecognized AS4_Path attribute and advertises the route to R4 according to BGP rules. R3 still considers the AS number of R2 to be AS_TRANS.
When receiving the route from R3, R4 replaces AS_TRANS with the AS numbers recorded in the AS4_Path attribute and records the AS_Path as 30 20.1 10.1.
Next-hop iteration based on routing policy
BGP needs to iterate indirect next hops. If indirect next hops are not iterated according to a routing policy, routes may be iterated to incorrect forwarding paths. Next hops should therefore be iterated only under certain conditions to control the iterated routes. If a route to which the next hop would be iterated cannot pass the routing policy, that route is ignored and route iteration fails.
IBGP peer relationships are established between R1 and R2, and between R1 and R3 through loopback interfaces. R1 receives a BGP route with prefix 10.0.0.0/24 from both R2 and R3. The original next hop of the BGP route received from R2 is 2.2.2.2. The IP address of Ethernet0/0/0 of R1 is 2.2.2.100/24. When R2 is running normally, the BGP route with prefix 10.0.0.0/24 is iterated to the IGP route 2.2.2.2/32. When the IGP on R2 becomes faulty, the IGP route 2.2.2.2/32 is withdrawn, which triggers route iteration again. R1 searches the IP routing table based on the original next hop 2.2.2.2, and the route is consequently iterated to 2.2.2.0/24. The user expects that when the route with the next hop 2.2.2.2 becomes unreachable, the route with the next hop 3.3.3.3 is preferred. In fact, the route stays iterated to 2.2.2.0/24 until BGP converges, which causes a transient routing black hole.
With the next-hop iteration policy, you can control the mask length of the route to which the original next hop can be iterated. After the next-hop iteration policy is configured, the route with the original next hop 2.2.2.2 depends only on the IGP route 2.2.2.2/32.
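
The effect of restricting the mask length can be sketched as a longest-prefix lookup with a filter. This is only a model of the behavior described above: the function name, the `min_mask_len` parameter, and the sample routing tables are assumptions made for illustration, and the device applies a routing policy rather than a bare number.

```python
import ipaddress
from typing import Iterable, Optional


def resolve_next_hop(next_hop: str, routing_table: Iterable[str],
                     min_mask_len: int = 0) -> Optional[ipaddress.IPv4Network]:
    """Longest-prefix match, restricted to routes the policy accepts.

    A route is accepted only if its mask length is at least min_mask_len.
    Returns None when iteration fails.
    """
    address = ipaddress.ip_address(next_hop)
    networks = (ipaddress.ip_network(prefix) for prefix in routing_table)
    accepted = [net for net in networks
                if address in net and net.prefixlen >= min_mask_len]
    if not accepted:
        return None
    return max(accepted, key=lambda net: net.prefixlen)


normal = ["2.2.2.2/32", "2.2.2.0/24", "3.3.3.3/32"]
after_fault = ["2.2.2.0/24", "3.3.3.3/32"]   # 2.2.2.2/32 has been withdrawn

print(resolve_next_hop("2.2.2.2", normal))            # 2.2.2.2/32
print(resolve_next_hop("2.2.2.2", after_fault))       # 2.2.2.0/24 (black hole)
print(resolve_next_hop("2.2.2.2", after_fault, 32))   # None (iteration fails)
```

When iteration fails, the route through 2.2.2.2 becomes unusable immediately, so the route with next hop 3.3.3.3 can be selected without waiting for BGP convergence.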
Session setup between peers
A session can be set up between BGP speakers through directly connected or loopback interfaces. Generally, IBGP neighbors establish peer relationships through loopback interfaces, while EBGP neighbors establish peer relationships through directly connected physical interfaces.
You can configure authentication to ensure security for sessions between peers.
Logical full-mesh connections must be set up between IBGP peers (when no RR or confederation is used).
You can disable synchronization to reduce the IGP load.
Route update origin
Routes can be imported into BGP using the import-route or network command.
Routing policy optimization
You can optimize BGP routes using inbound policies, outbound policies, and ORF.
Route filtering and attribute control
You can filter the routes to be advertised or received.
You can control BGP route attributes to affect BGP route propagation.
Route summarization
Route summarization can optimize BGP routing entries and reduce the routing table size.

Redundancy
Path redundancy ensures that a backup path is available when a network fault occurs.
Traffic symmetry
Sound network design and policy application can ensure consistent paths for incoming and outgoing traffic.
Load balancing
When multiple paths to the same destination exist, traffic can be load balanced through policies to fully utilize bandwidth.

Interaction between non-BGP routes and BGP routes
Generally, non-BGP routes can be imported into the BGP routing table using the import-route or network command.
Control of default routes
Default routes can be advertised or received according to conditions of routing policies.
Policy-based routing
Traffic paths can be optimized through PBR.
Dynamic update peer-groups: improves route-sending performance by grouping each route once per update-group instead of once per peer.
Route reflector and confederation: reduce the number of IBGP sessions and optimize large BGP networks.

Reduce unstable routes
Use stable IGPs.
Improve router performance.
Reduce manual errors.
Expand link bandwidth.
Improve BGP stability
Use BGP soft reset when applying new BGP policies.
Penalize unstable routes appropriately to reduce their impact on BGP.
Case description
IP addresses used to interconnect devices are as follows: XY.1.1.X and XY.1.1.Y. The mask length is 24.
OSPF runs normally, and the interconnection addresses and loopback interface addresses have been advertised into OSPF. However, 10.0.X.0/24, 172.15.X.0/24, and 172.16.X.0/24 are not advertised into OSPF.
Case analysis
EBGP peer relationships are established using loopback interfaces.
Command usage
The peer as-number command sets an AS number for a specified peer or peer group.
The peer connect-interface command specifies the source interface that sends BGP messages and the source address used to initiate a connection.
The peer next-hop-local command configures a BGP device to set its own IP address as the next hop of routes when it advertises the routes to an IBGP peer or peer group.
The group command creates a peer group.
View
BGP process view
Parameters
peer ipv4-address as-number as-number
as-number: specifies the AS number of the peer.
peer ipv4-address connect-interface interface-type interface-number [ ipv4-source-address ]
interface-type interface-number: specifies the interface type and number.
ipv4-source-address: specifies the IPv4 source address used to set up a connection.
peer ipv4-address next-hop-local
group group-name [ external | internal ]
group-name: specifies the name of a peer group.
external: creates an EBGP peer group.
internal: creates an IBGP peer group.
Precautions
When configuring a device to use a loopback interface as the source interface of BGP messages, note the following points:
The loopback interface of the device's BGP peer must be reachable.
In the case of an EBGP connection, the peer ebgp-max-hop command must be executed to enable the two devices to establish an indirect peer relationship.
The peer next-hop-local and peer next-hop-invariable commands are mutually exclusive.
The Rec field in the display bgp peer command output indicates the number of route prefixes received from the peer.
Case description
Perform the configurations based on the configuration in the previous case.
If all the clients of the RR have established logical full-mesh connections, the clients can transmit routes to each other without requiring the RR to reflect routes between them. In this situation, prohibit the RR from reflecting routes between clients to reduce the RR load.

Command usage
The undo reflect between-clients command prohibits an RR from reflecting routes between clients. This command is executed on an RR. After this command is executed, clients exchange BGP messages directly, and R2 does not need to reflect routes between these clients. However, R2 still reflects the routes that are advertised by non-clients.
View
BGP view
Run the display bgp peer command to view detailed BGP peer information.
To reduce the RR load, you can also prohibit BGP routes from being added to the IP routing table so that the RR does not forward packets. In the full-mesh scenario, however, disabling route reflection between clients better meets the requirement.
Case description
To meet the first requirement, use a route-policy to advertise interface routing information.
To meet the second requirement, use an IP prefix list to filter routes.
Command usage
The peer ip-prefix command configures a route filtering policy based on an IP prefix list for a peer or peer group.
View
BGP view
Parameters
peer { group-name | ipv4-address } ip-prefix ip-prefix-name { import | export }
ip-prefix-name: specifies the name of an IP prefix list.
import: applies a filtering policy to the routes received from a peer or peer group.
export: applies a filtering policy to the routes sent to a peer or peer group.
For the same node in a route-policy, the relationship between if-match clauses is AND. A route needs to meet all the matching rules before the actions defined by apply clauses are performed.
The relationship between the if-match clauses in the if-match route-type and if-match interface commands is OR, but the relationship between the if-match clauses in these two commands and those in other commands is AND.
Case description
In requirement 2, the delivery of a default route depends on route 172.16.0.0/16. If route 172.16.0.0/16 disappears, the default route also disappears.

Command usage
The peer route-policy command applies a route-policy to control routes received from or to be advertised to a peer or peer group.
The peer default-route-advertise command configures a BGP device to advertise a default route to its peer or peer group.
View
peer route-policy: BGP view
peer default-route-advertise: BGP view
Parameters
peer { group-name | ipv4-address } default-route-advertise [ route-policy route-policy-name ] [ conditional-route-match-all { ipv4-address1 { mask1 | mask-length1 } } &<1-4> | conditional-route-match-any { ipv4-address2 { mask2 | mask-length2 } } &<1-4> ]
route-policy route-policy-name: specifies a route-policy name.
conditional-route-match-all ipv4-address1 { mask1 | mask-length1 }: advertises the default route only when all conditional routes are matched.
conditional-route-match-any ipv4-address2 { mask2 | mask-length2 }: advertises the default route only when any conditional route is matched.
Run the display ip routing-table command to view information about the IP routing table.
Case description
Command usage
The aggregate command creates an aggregated route in the BGP routing table.
View
BGP view
Parameters
aggregate ipv4-address { mask | mask-length } [ as-set | attribute-policy route-policy-name1 | detail-suppressed | origin-policy route-policy-name2 | suppress-policy route-policy-name3 ] *
ipv4-address: specifies the IPv4 address of an aggregated route.
mask: specifies the network mask of an aggregated route.
mask-length: specifies the network mask length of an aggregated route.
as-set: generates a route with the AS-SET attribute.
attribute-policy route-policy-name1: specifies the name of an attribute policy for aggregated routes.
detail-suppressed: advertises only the aggregated route.
origin-policy route-policy-name2: specifies the name of a policy that allows route aggregation.
suppress-policy route-policy-name3: specifies the name of a policy for suppressing the advertisement of specified routes.
Precautions
During manual or automatic summarization, routes pointing to NULL0 are generated locally.
Run the display ip routing-table protocol bgp command to view the routes learned by BGP.
Case description
BGP on-demand route advertisement requires ORF to be
enabled on R4, R5, and R6.
Command usage
The peer capability-advertise orf command enables prefix-based ORF for a peer or peer group.
View
BGP view
Parameters
peer { group-name | ipv4-address } capability-advertise orf [ cisco-compatible ] ip-prefix { both | receive | send }
cisco-compatible: is compatible with Cisco devices.
both: allows the device to send and receive ORF packets.
receive: allows the device only to receive ORF packets.
send: allows the device only to send ORF packets.
Precautions
BGP ORF has three modes: send, receive, and both. In send mode, a BGP device can send ORF information. In receive mode, a BGP device can receive ORF information. In both mode, a BGP device can send and receive ORF information.
To enable a BGP device that advertises routes to receive ORF IP-prefix information, configure this device to work in receive or both mode and the peer device to work in send or both mode.
Run the display bgp peer 1.1.1.1 orf ip-prefix command to view prefix-based BGP ORF information received from a specified peer.
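
The mode pairing in the precautions can be expressed as a small check. This sketch only models the rule stated above; the function name and its arguments are hypothetical.

```python
VALID_MODES = {"send", "receive", "both"}


def orf_entries_can_flow(advertiser_mode: str, peer_mode: str) -> bool:
    """Return True if the peer may send ORF entries to the route advertiser.

    advertiser_mode: ORF mode on the device that advertises routes.
    peer_mode: ORF mode on the device that wants to limit what it receives.
    """
    for mode in (advertiser_mode, peer_mode):
        if mode not in VALID_MODES:
            raise ValueError(f"unknown ORF mode: {mode}")
    return advertiser_mode in {"receive", "both"} and peer_mode in {"send", "both"}


print(orf_entries_can_flow("receive", "send"))   # True
print(orf_entries_can_flow("send", "send"))      # False
```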
Case description
IP addresses used to interconnect devices are as follows: XY.1.1.X and XY.1.1.Y. The mask length is 24.
Results
The configuration is the basic OSPF configuration.
Results
Run the display bgp peer command to view the BGP peer status.
Run the display bfd session all command to view the BFD session. In the command output, D_IP_IF indicates that a BFD session is dynamically created and bound to an interface.

Results
Run the display bgp routing-table command to view BGP routing entries. The command output shows that R3 learns the route 10.0.0.0/24 from both R2 and R4. According to BGP route selection rules, R3 prefers the route 10.0.0.0/24 learned from R2.
Case description
Analysis process
You can use peer groups to reduce the RR load.
Results
view the Community attribute.

Results
view the Community attribute. The Community attribute is no-export. That is, the route is not advertised to EBGP peers.
ACL
An Access Control List (ACL) is a series of sequential rules composed of permit and deny clauses. These rules match packet information to classify packets. With the rules applied to a device, the device permits or denies packets according to the rules.
IP prefix list
An IP prefix list filters matching routes in the defined matching mode to meet requirements.
An IP prefix list filters only routing information, not packets.
AS_Path filter
Each BGP route contains an AS_Path attribute. AS_Path filters specify matching rules for the AS_Path attribute.
AS_Path filters are used exclusively in BGP.
Community filter
Community filters are used exclusively in BGP. A BGP route can carry a Community attribute that identifies a community. Community filters specify matching rules for the Community attribute.

ACL management rule
An ACL can contain multiple rules.
A rule is identified by a rule ID, which can be set by a user or automatically generated based on the ACL step. All the rules in an ACL are arranged in ascending order of rule IDs.
There is a step between automatically generated rule IDs. If no rule ID is specified, the rule ID is determined by the ACL step. Because the step leaves gaps between rule IDs, you can insert new rules between existing rules by specifying a rule ID.
ACL rule management
When a packet reaches a device, the search engine retrieves information from the packet to constitute the key value and matches the key value against the rules in an ACL. When a matching rule is found, the system stops the matching, and the packet matches the rule.
If no matching rule is found, the packet does not match any rule.
The action defined in the last rule of a Huawei ACL is permit by default.
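
Rule numbering and first-match lookup can be sketched as follows. This is a simplified model of a basic ACL that matches only the source address; the class names are hypothetical, and the numbering scheme (the next multiple of the step above the highest existing rule ID) is an assumption made for illustration.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Optional


def _to_int(addr: str) -> int:
    return int(ipaddress.IPv4Address(addr))


@dataclass
class Rule:
    rule_id: int
    action: str      # "permit" or "deny"
    source: str      # source IP address, e.g. "10.1.1.0"
    wildcard: str    # wildcard mask, e.g. "0.0.0.255" (1 bits are ignored)

    def matches(self, src_ip: str) -> bool:
        care = ~_to_int(self.wildcard) & 0xFFFFFFFF
        return (_to_int(src_ip) & care) == (_to_int(self.source) & care)


@dataclass
class Acl:
    step: int = 5
    rules: List[Rule] = field(default_factory=list)

    def add(self, action: str, source: str, wildcard: str,
            rule_id: Optional[int] = None) -> int:
        """Add a rule; when no ID is given, derive one from the step."""
        if rule_id is None:
            highest = max((r.rule_id for r in self.rules), default=0)
            rule_id = (highest // self.step + 1) * self.step
        self.rules.append(Rule(rule_id, action, source, wildcard))
        self.rules.sort(key=lambda r: r.rule_id)   # ascending rule IDs
        return rule_id

    def match(self, src_ip: str) -> Optional[Rule]:
        """Return the first matching rule, or None if no rule matches."""
        for rule in self.rules:
            if rule.matches(src_ip):
                return rule
        return None


acl = Acl()
acl.add("deny", "10.1.1.1", "0.0.0.0")                     # becomes rule 5
acl.add("permit", "10.1.1.0", "0.0.0.255")                 # becomes rule 10
acl.add("deny", "10.1.1.128", "0.0.0.127", rule_id=7)      # inserted between
print([r.rule_id for r in acl.rules])                      # [5, 7, 10]
```

Rule 7 is evaluated before rule 10, so 10.1.1.200 is denied even though rule 10 would permit it.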
Interface-based ACL
Matches packets based on the rules defined on the inbound interface of packets. You can run the traffic-filter command to reference an interface-based ACL.
Basic ACL
Defines rules based on the source IP address, VPN instance, fragment flag, and time range of packets.
Advanced ACL
Defines rules based on the source IP address, destination IP address, IP precedence, ToS, DSCP, IP protocol type, ICMP type, TCP source port/destination port, and UDP source port/destination port number of packets. An advanced ACL can define more accurate, abundant, and flexible rules than a basic ACL.
Layer 2 ACL
Defines rules based on Ethernet frame header information in a packet, including the source MAC address, destination MAC address, and Ethernet frame protocol type.

ACL matching order
An ACL is composed of a list of rules. Each rule contains a deny or permit clause. These rules may overlap or conflict. One rule can contain another rule, but the two rules must be different.
Devices support two types of matching order: configuration order and automatic order. The matching order determines the priorities of the rules in an ACL. Rule priorities resolve the conflict between overlapping rules.
Automatic order
The automatic order follows the depth-first principle.
ACL rules are arranged in sequence based on rule precision. For an ACL rule (where a protocol type, a source IP address range, or a destination IP address range is specified), the stricter the rule, the more precise it is considered. For example, an ACL rule can be configured based on the wildcard of an IP address. The smaller the wildcard, the smaller the specified address range and the stricter the ACL rule.
If rules have the same depth-first order, rules are matched in ascending order of rule IDs.
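
The depth-first idea can be sketched for the simplest case, in which precision is judged only by the source wildcard. Real devices compare further fields (such as protocol type and destination range), which this sketch does not model; the tuple layout and function names are assumptions.

```python
import ipaddress
from typing import List, Tuple

# (rule_id, action, source, wildcard)
AclRule = Tuple[int, str, str, str]


def wildcard_bits(wildcard: str) -> int:
    """Count the ignored bits; fewer ignored bits means a stricter rule."""
    return bin(int(ipaddress.IPv4Address(wildcard))).count("1")


def automatic_order(rules: List[AclRule]) -> List[AclRule]:
    """Sort by precision first, then by ascending rule ID."""
    return sorted(rules, key=lambda rule: (wildcard_bits(rule[3]), rule[0]))


rules = [
    (5, "permit", "10.1.0.0", "0.0.255.255"),
    (10, "deny", "10.1.1.0", "0.0.0.255"),
    (15, "deny", "10.1.1.1", "0.0.0.0"),
    (20, "permit", "10.2.2.0", "0.0.0.255"),
]
print([rule[0] for rule in automatic_order(rules)])   # [15, 10, 20, 5]
```

The host rule (15) is matched first, the two /24-sized rules follow in rule ID order, and the broadest rule (5) is matched last.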

Packet fragmentation supported by ACLs
In traditional packet filtering, only the first fragment of a packet is matched against rules, and the other fragments are allowed to pass through if the first fragment matches. Network attackers can exploit this behavior by constructing subsequent fragments to launch attacks.
In an ACL rule, the fragment parameter indicates that the rule is valid for all fragmented packets. The none-first-fragment parameter indicates that the rule is valid only for non-first fragments, not for non-fragmented packets or first fragments. Rules that contain neither the fragment nor the none-first-fragment parameter are valid for all packets (including fragmented packets).
ACL time range
You can make ACL rules valid only at the specified time or within a specified time range.

IP prefix list
An IP prefix list can contain multiple nodes, each identified by an index. The system matches a route against the nodes in ascending order of index. If the route matches a node, the system does not match the route against the other nodes. If the route does not match any node, the system filters out the route.
An IP prefix list can implement exact matching or matching within a specified mask length range. You can configure greater-equal and less-equal to specify the prefix mask length range. If neither keyword is configured, the IP prefix list implements exact matching. That is, only routes with the same mask length as that specified in the IP prefix list are matched. If only greater-equal is configured, the mask length range is [greater-equal-value, 32]. If only less-equal is configured, the mask length range is [specified mask length, less-equal-value].
The values must satisfy mask-length <= greater-equal-value <= less-equal-value <= 32.
Characteristics of an IP prefix list
When a route matches none of the nodes in an IP prefix list, the route is denied by default.
When the referenced IP prefix list does not exist, the default matching mode is permit.
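
The matching rules above can be sketched with the standard ipaddress module. The `Entry` type and function names are hypothetical, and the case of a referenced list that does not exist is not modeled.

```python
import ipaddress
from typing import List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    index: int
    action: str                          # "permit" or "deny"
    prefix: str                          # e.g. "10.0.0.0/16"
    greater_equal: Optional[int] = None
    less_equal: Optional[int] = None


def mask_range(entry: Entry) -> Tuple[int, int]:
    """Return the inclusive mask length range that the entry accepts."""
    length = ipaddress.ip_network(entry.prefix).prefixlen
    if entry.greater_equal is None and entry.less_equal is None:
        return length, length                        # exact match
    low = length if entry.greater_equal is None else entry.greater_equal
    high = 32 if entry.less_equal is None else entry.less_equal
    return low, high


def evaluate(entries: List[Entry], route: str) -> str:
    """Check the route against entries in ascending index order."""
    candidate = ipaddress.ip_network(route)
    for entry in sorted(entries, key=lambda e: e.index):
        low, high = mask_range(entry)
        covered = candidate.subnet_of(ipaddress.ip_network(entry.prefix))
        if covered and low <= candidate.prefixlen <= high:
            return entry.action
    return "deny"                                    # no entry matched


prefix_list = [
    Entry(10, "permit", "10.0.0.0/16", greater_equal=24, less_equal=28),
    Entry(20, "permit", "172.16.0.0/16"),
    Entry(30, "deny", "192.168.0.0/16", less_equal=24),
]
print(evaluate(prefix_list, "10.0.1.0/24"))     # permit
print(evaluate(prefix_list, "172.16.1.0/24"))   # deny (entry 20 is exact)
```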
An AS_Path filter is used only to filter BGP routes to be advertised or received based on the AS_Path attributes contained in the BGP routes.
Because the number of the most recent AS that a route passes through is added to the leftmost position of the AS_Path list, configure an AS_Path filter with caution:
If a route originating from AS 100 passes through AS 300, AS 200, and AS 500, and then reaches AS 600, the AS_Path attribute of the route is (500 200 300 100).
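
The ordering can be reproduced with a short sketch. The function name is hypothetical, and the patterns use Python regular expression syntax, which differs from the syntax of AS_Path filters on a device.

```python
import re
from typing import List


def as_path_at_receiver(origin_as: int, transit_ases: List[int]) -> List[int]:
    """Build the AS_Path seen by the AS that receives the route last.

    Each AS prepends its own number when it advertises the route onward,
    so the most recent AS ends up at the leftmost position.
    """
    path = [origin_as]
    for asn in transit_ases:
        path.insert(0, asn)
    return path


path = as_path_at_receiver(100, [300, 200, 500])
text = " ".join(str(asn) for asn in path)
print(text)                                        # 500 200 300 100

# The originating AS is rightmost; the neighboring AS is leftmost.
print(bool(re.search(r"(^| )100$", text)))         # True: originated in AS 100
print(bool(re.search(r"^500( |$)", text)))         # True: received from AS 500
print(bool(re.search(r"^100( |$)", text)))         # False
```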
A Community filter is used only to filter BGP routes to be advertised or received based on the Community attributes contained in the BGP routes.
The Community attribute includes basic and extended community attributes.
Self-defined community attributes and well-known communities are basic community attributes.
RT and SoO in MPLS VPN are extended community attributes.

A route policy is used to filter routes and set attributes for routes. By changing route attributes (including reachability), a route policy changes the path that network traffic passes through.
A route policy is often used in the following scenarios:
Control route importing.
Using a route policy, you can prevent sub-optimal routes and routing loops during the import of routes.
Control route receiving and advertising.
Using a route policy, you can receive or advertise specified routes according to network requirements.
Set attributes for routes.
Using a route policy, you can modify the attributes of routes to optimize a network.
Route policy principles
A route policy consists of multiple nodes. The system checks routes against the nodes of a route policy in ascending order of the node IDs. A node contains multiple if-match and apply clauses. The if-match clauses define the matching conditions of a node, while the apply clauses define the actions to be performed on the routes that match the if-match clauses. The relationship between the if-match clauses of a node is AND. That is, a route matches a node only when the route matches all the if-match clauses of the node, and only then are the actions defined by the apply clauses performed. The relationship between the nodes of a route policy is OR. That is, a route matches a route policy as long as the route matches one node of the route policy. If a route does not match any node, the route fails to match the route policy.
The relationship between the if-match clauses in the if-match route-type and if-match interface commands is OR, but the relationship between the if-match clauses in these two commands and those in other commands is AND.
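
The AND/OR relationships can be sketched as a small evaluator. The data model, function names, and sample conditions are hypothetical. The handling of deny nodes (a route that matches a deny node is rejected and not checked against later nodes) is an assumption of this sketch, and the route-type/interface exception is not modeled.

```python
from typing import Any, Callable, Dict, List, NamedTuple, Optional

Route = Dict[str, Any]


class Node(NamedTuple):
    node_id: int
    mode: str                                   # "permit" or "deny"
    if_match: List[Callable[[Route], bool]]     # all must hold (AND)
    apply: List[Callable[[Route], None]]        # actions for permit nodes


def run_route_policy(nodes: List[Node], route: Route) -> Optional[Route]:
    """Return the (possibly modified) route, or None if it is rejected."""
    for node in sorted(nodes, key=lambda n: n.node_id):   # ascending node IDs
        if all(condition(route) for condition in node.if_match):
            if node.mode == "deny":
                return None
            result = dict(route)
            for action in node.apply:
                action(result)
            return result
    return None                                  # matched no node


def is_ospf(route: Route) -> bool:
    return route["protocol"] == "ospf"


def low_cost(route: Route) -> bool:
    return route["cost"] < 100


def is_blocked_prefix(route: Route) -> bool:
    return route["prefix"] == "10.0.9.0/24"


def set_local_pref(route: Route) -> None:
    route["local_pref"] = 300


policy = [
    Node(20, "permit", [is_ospf, low_cost], [set_local_pref]),
    Node(10, "deny", [is_blocked_prefix], []),
]
accepted = run_route_policy(
    policy, {"prefix": "10.0.0.0/24", "protocol": "ospf", "cost": 10})
print(accepted["local_pref"])   # 300
```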
In the topology, dual-node bidirectional route advertisement is implemented.
In the topology, R1 imports route 10.0.0.0/24 into OSPF. R3 imports OSPF routes into IS-IS, and R2 learns route 10.0.0.0/24 through IS-IS. In this case, R2 learns route 10.0.0.0/24 through both OSPF and IS-IS. R2 prefers the route learned through IS-IS because this route has a higher priority than the external route learned through OSPF. Therefore, R2 reaches 10.0.0.0/24 along the path R4 -> R3 -> R1. To optimize the path, use a route policy to set the OSPF ASE priority higher than the IS-IS priority. This modification prevents R2 from using a sub-optimal route.
When the interface that connects R1 to network 10.0.0.0/24 goes Down, R2 still imports route 10.0.0.0/24 into OSPF because it has learned the route through IS-IS, even though the external LSA has been aged out in the OSPF area. R1 and R3 then learn the route 10.0.0.0/24 again. When R2 accesses network 10.0.0.0/24, traffic passes through R4 -> R3 -> R1 -> R2, causing a routing loop. In this scenario, use a tag to prevent routing loops.
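
The sub-optimal route selection on R2 comes down to comparing protocol preference values, where a smaller value means a higher priority. In the sketch below, the preference values (IS-IS 15, OSPF ASE 150) and the tuned value 14 are assumptions used for illustration; check the defaults on the device in use.

```python
from typing import Dict, Iterable

# Assumed preference values; a smaller value means a higher priority.
DEFAULT_PREFERENCE: Dict[str, int] = {"ospf": 10, "isis": 15, "ospf_ase": 150}


def preferred_protocol(candidates: Iterable[str],
                       preference: Dict[str, int]) -> str:
    """Pick the protocol whose route wins for the same prefix."""
    return min(candidates, key=lambda protocol: preference[protocol])


learned_by = ["ospf_ase", "isis"]      # R2 learns 10.0.0.0/24 from both
print(preferred_protocol(learned_by, DEFAULT_PREFERENCE))   # isis

# Give the OSPF external route a smaller preference value than IS-IS.
tuned = dict(DEFAULT_PREFERENCE, ospf_ase=14)
print(preferred_protocol(learned_by, tuned))                # ospf_ase
```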

Control route receiving and advertising
Only necessary and valid routes are received, which limits the routing table size and improves network security.
R4 imports routes 10.0.X.0/24 into OSPF. According to service requirements, R1 can receive only routes 10.0.0.0/24 and 10.0.1.0/24, while R2 can receive only routes 10.0.2.0/24 and 10.0.3.0/24. You can use a filter policy to meet this requirement.
Generally, only routing information is filtered, but link state information is not filtered.
In OSPF, incoming and outgoing Type 3, Type 5, and Type 7 LSAs can be filtered.
For incoming routes, link-state routing protocols such as OSPF and IS-IS can filter only the routes, not the LSAs that carry these routes. That is, OSPF and IS-IS do not add the filtered routes to the local routing table, but the LSAs of these routes are still transmitted in the OSPF or IS-IS area.
The routes imported from other protocols can also be filtered. For example, you can use the filter-policy export command in OSPF to filter the routes imported from RIP before they are advertised. Only the external routes that pass the filtering are converted into AS-external LSAs and advertised, so neighbors do not learn the filtered routes imported from RIP. This configuration can be performed only in the outbound direction.
You can modify the Local_Pref attribute contained in a route using a route policy to change the path of traffic. R2 learns the route 10.0.0.0/24 from an EBGP peer and sets its Local_Pref value to 300. R3 learns the route 10.0.0.0/24 from an EBGP peer and sets its Local_Pref value to 200. R1, R2, and R3 exchange routes with each other through IBGP. As a result, AS 100 prefers R2 as the exit to reach 10.0.0.0/24.

PBR is a mechanism that selects routes based on user-defined policies. It includes local PBR, interface PBR, and SPR. This course discusses only local PBR.
IP unicast PBR has the following advantages:
Allows you to define policies for route selection according to service requirements, which improves route selection flexibility and controllability.
Sends different data flows through different links, which improves link efficiency.
Uses low-cost links to transmit service data without affecting service quality, which reduces the cost of enterprise data services.
Matching process
If a device finds a matching local PBR node, the device processes packets as follows:
Step 1 Checks whether the priority of packets has been set.
  If so, the device applies the configured priority to the packets and performs step 2.
  If not, the device performs step 2.
Step 2 Checks whether an outbound interface has been configured for the local PBR.
  If so, the device sends packets from the outbound interface.
  If not, the device performs step 3.
Step 3 Checks whether next hops have been configured for the local PBR. You can configure two next hops to implement load balancing.
  If so, the device sends packets to the next hops.
  If not, the device searches the routing table for a route based on the destination addresses of packets. If no route is available, the device performs step 4.
Step 4 Checks whether the default outbound interface has been configured for the local PBR.
  If so, the device sends the packets from the default outbound interface.
  If not, the device performs step 5.
Step 5 Checks whether the default next hop has been configured for the local PBR.
  If so, the device sends the packets to the default next hop.
  If not, the device performs step 6.
Step 6 Discards the packets and generates ICMP_UNREACH messages.
If the device does not find a matching local PBR node, it searches the routing table for a route based on the destination addresses of the packets and then sends the packets.
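The six steps above can be read as a simple decision chain. The following Python sketch models that chain for a packet that has already matched a local PBR node. It is an illustration only: the class PbrNode, its field names, and the function forward_with_local_pbr are hypothetical, and the sketch does not check whether an interface is up or a next hop is reachable.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PbrNode:
    # Hypothetical fields that mirror the checks in steps 1 to 5.
    priority: Optional[int] = None
    out_interface: Optional[str] = None
    next_hops: List[str] = field(default_factory=list)
    default_out_interface: Optional[str] = None
    default_next_hop: Optional[str] = None


def forward_with_local_pbr(node, packet, route_lookup):
    """Return (action, target) for a packet that matched a local PBR node."""
    if node.priority is not None:              # Step 1: apply the priority
        packet["priority"] = node.priority
    if node.out_interface:                     # Step 2: outbound interface
        return ("interface", node.out_interface)
    if node.next_hops:                         # Step 3: configured next hops
        return ("next-hop", tuple(node.next_hops))
    route = route_lookup(packet["dst"])        # Step 3, "if not" branch
    if route is not None:
        return ("route", route)
    if node.default_out_interface:             # Step 4: default outbound interface
        return ("interface", node.default_out_interface)
    if node.default_next_hop:                  # Step 5: default next hop
        return ("next-hop", (node.default_next_hop,))
    return ("drop", "ICMP_UNREACH")            # Step 6: discard


def no_route(dst):
    """Routing table stub that never finds a route."""
    return None


pkt = {"dst": "10.1.1.1"}
r1 = forward_with_local_pbr(PbrNode(priority=5, next_hops=["12.1.1.2", "21.1.1.1"]), pkt, no_route)
r2 = forward_with_local_pbr(PbrNode(default_next_hop="12.1.1.2"), {"dst": "10.1.1.1"}, no_route)
r3 = forward_with_local_pbr(PbrNode(), {"dst": "10.1.1.1"}, no_route)
```

In this sketch r1 uses both configured next hops, r2 falls through to the default next hop because the routing table has no route, and r3 is dropped.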
Case description
Command usage
The route-policy command creates a route policy and displays the route-policy view.
View
System view
Parameters
route-policy route-policy-name { permit | deny } node node
  route-policy-name: specifies the name of a route policy.
  permit: specifies the matching mode of the route policy as permit. In permit mode, if a route matches all the if-match clauses of a node, the route matches the route policy, and the actions defined by the apply clauses of the node are performed on the route. Otherwise, the route continues to match the next node.
  deny: specifies the matching mode of the route policy as deny. In deny mode, if a route matches all the if-match clauses of a node, the route does not match the route policy and cannot match the next node.
  node node: specifies the index of a node in the route policy.
Precautions
A route policy is used to filter routes and set attributes for the routes that match the route policy. A route policy consists of one or more nodes. One node can contain multiple if-match and apply clauses.
The if-match clauses define matching conditions for this node, and the apply clauses define the actions to be performed on the routes that meet the matching conditions. The relationship between if-match clauses is AND. That is, a route must match all the if-match clauses of a node. The relationship between the nodes of a route policy is OR. That is, if a route matches a node, the route matches the route policy. If a route does not match any node, the route does not match the route policy.
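The AND relationship between if-match clauses and the OR relationship between nodes can be sketched in Python as follows. This is a simplified model under stated assumptions, not the device implementation: the class Node, the function run_route_policy, and the sample prefixes and tag value are hypothetical, and routes are represented as plain dictionaries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Route = Dict[str, object]


@dataclass
class Node:
    index: int
    mode: str                                              # "permit" or "deny"
    if_match: List[Callable[[Route], bool]] = field(default_factory=list)
    apply: List[Callable[[Route], None]] = field(default_factory=list)


def run_route_policy(nodes, route):
    """Return the (possibly modified) route if permitted, else None."""
    for node in sorted(nodes, key=lambda n: n.index):      # nodes are tried in index order
        # AND between if-match clauses; a node without clauses matches every route.
        if all(cond(route) for cond in node.if_match):
            if node.mode == "deny":
                return None                                # denied, later nodes are not tried
            result = dict(route)
            for action in node.apply:                      # apply clauses of the matched node
                action(result)
            return result
    return None                                            # no node matched


policy = [
    Node(10, "deny", [lambda r: r["prefix"] == "172.16.1.0/24"]),
    Node(20, "permit", [lambda r: r["protocol"] == "rip"], [lambda r: r.update(tag=100)]),
]
a = run_route_policy(policy, {"prefix": "172.16.1.0/24", "protocol": "rip"})
b = run_route_policy(policy, {"prefix": "172.16.2.0/24", "protocol": "rip"})
c = run_route_policy(policy, {"prefix": "10.0.0.0/8", "protocol": "static"})
```

Here a is rejected by node 10, b is permitted by node 20 and receives the tag, and c matches no node and is therefore rejected as well.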
Case description
case. Perform the configuration based on the configuration in the previous case.
In requirement 2, use the fewest commands possible to implement the optimal configuration.

Command usage
The filter-policy export command filters imported routes to be advertised according to the policy.
View
OSPF view
Parameters
prefix-name } export [ protocol [ process-id ] ]
  acl-number: specifies the number of a basic ACL.
  acl-name acl-name: specifies the name of an ACL.
  prefix list.
  protocol: specifies the protocol that advertises routing information.
  process-id: specifies a process ID when the protocol that advertises routing information is RIP, IS-IS, or OSPF.
Precautions
After external routes are imported into OSPF using the import-route command, you can run the filter-policy export command to filter the imported routes to be advertised. This configuration allows only the external routes that meet the matching conditions to be translated into Type 5 LSAs (AS-external-LSAs) and advertised, which helps prevent routing loops.
You can specify protocol or process-id to filter the routes of a specified protocol or process. If neither protocol nor process-id is specified, OSPF filters all of the imported routes.
Case description
case. After meeting the requirements, check whether sub-optimal routes and routing loops exist.
Results
After routing protocols import routes from each other, R4 reaches 172.16.X.0/24 through a sub-optimal route (OSPF route 172.16.X.0/24). This is because R4 first learns OSPF route 172.16.X.0/24 and then learns RIP route 172.16.X.0/24. In fact, the optimal route is OSPF route 172.16.X.0/24. However, the preference of OSPF external routes is 150, and the preference of RIP is 100, so R4 reaches 172.16.X.0/24 through a sub-optimal route.

Case description
To meet requirement 1, ensure that R4 accesses 172.16.X.0/24 through RIP, to avoid reaching 172.16.X.0/24 through a sub-optimal route.
To meet requirement 2, use tags to control dual-node bidirectional route importing so as to prevent routing loops.

Results
If routes are not filtered during bidirectional route importing, routing loops can occur when the network environment changes. To avoid loops, ensure that each routing protocol imports only the routes that originate in the other routing domain. Compared with the configuration in the previous case, the advantage of using tags is that you do not need to list specific routing entries. When routes are added to or removed from a routing domain, the tagged routes and the resulting restrictions change accordingly, so no manual intervention is needed and the solution scales well.
Although the configuration in the previous case avoids routing loops, the sub-optimal route still exists.
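The idea of controlling bidirectional route importing with tags can be pictured with a small Python sketch. It is an illustration under stated assumptions only: the tag values 100 and 200, the helper import_routes, and the sample prefixes are hypothetical and are not taken from the case configuration.

```python
def import_routes(routes, deny_tag, set_tag):
    """Import routes into another protocol.

    Routes that already carry deny_tag are dropped (they originated in the
    target domain); all other routes are marked with set_tag.
    """
    return [dict(r, tag=set_tag) for r in routes if r.get("tag") != deny_tag]


# R3 imports RIP routes into OSPF and marks them with tag 100.
rip_routes = [{"prefix": "172.16.1.0/24"}]
ospf_from_r3 = import_routes(rip_routes, deny_tag=200, set_tag=100)

# R4 sees the tagged route plus a native OSPF route, and imports OSPF into RIP.
ospf_seen_by_r4 = ospf_from_r3 + [{"prefix": "10.1.1.0/24"}]
rip_from_r4 = import_routes(ospf_seen_by_r4, deny_tag=100, set_tag=200)
print(rip_from_r4)
```

The route that originated in RIP is not fed back into RIP, and no prefix has to be listed explicitly, which is the scalability benefit the text describes.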
The sub-optimal route arises because, during dual-node bidirectional route importing, one of R3 and R4 learns network 172.16.X.0/24 from both OSPF and RIP. The preference value of OSPF external routes is greater than that of RIP routes, so that router (R3 or R4) reaches 172.16.X.0/24 through a sub-optimal route. To solve this problem, change the preference of OSPF external routes to a value smaller than that of RIP. Setting the preference value of OSPF external routes smaller than that of OSPF internal routes, however, is unreasonable.
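The effect of the preference values can be checked with a few lines of Python. The sketch assumes that the route with the smallest preference value is selected when the same prefix is learned from several protocols. The values 100 and 150 come from the text; the OSPF internal value 10 and the IS-IS value 15 are the usual defaults; the tuned value 90 and the function select_route are hypothetical choices made for this example.

```python
def select_route(candidates, preference):
    """Pick the protocol whose preference value is smallest."""
    return min(candidates, key=lambda proto: preference[proto])


default_pref = {"ospf": 10, "isis": 15, "rip": 100, "ospf_ase": 150}

# With default values the RIP route beats the OSPF external route.
before = select_route(["ospf_ase", "rip"], default_pref)

# Hypothetical tuning: OSPF external preference below RIP (100) but above OSPF internal (10).
tuned_pref = dict(default_pref, ospf_ase=90)
after = select_route(["ospf_ase", "rip"], tuned_pref)
print(before, after)
```

Keeping the tuned value above the OSPF internal preference respects the last sentence of the paragraph above.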

Case description
Results
When only route summarization is performed, two problems exist: R5 learns the summary route, and a routing loop occurs between R3 and R4 when R2 pings a nonexistent IP address.
The first problem occurs because, after R3 and R4 learn the summary routes generated by each other, they import the summary routes into the RIP area again.
The second problem occurs because, after R3 and R4 learn the summary routes generated by each other, they add the summary routes to their routing tables.
To address the two problems, prevent R3 and R4 from learning the summary routes generated by each other and from importing these routes into the RIP area. That is, filter the summary routes learned from each other on R3 and R4.
Configure a filter policy on R3 and R4 so that they do not accept the specified OSPF summary routes. This ensures that the summary routes are not imported into the RIP domain, which prevents routing loops.

Case description
Command usage
The policy-based-route command creates or modifies a PBR.
The ip local policy-based-route command enables local PBR.
View
policy-based-route: system view
ip local policy-based-route: system view
Parameters
policy-based-route policy-name { permit | deny } node node-id
  policy-name: specifies the PBR name.
  permit: performs PBR on the packets that meet matching conditions.
  deny: does not perform PBR on the packets that meet matching conditions.
  node-id: specifies the ID of a node.
ip local policy-based-route policy-name
  policy-name: specifies a PBR name.
Precautions
When deploying PBR, do not configure a broadcast interface such as an Ethernet interface as the outbound interface of packets.
Run the display bgp peer 1.1.1.1 orf ip-prefix command to view prefix-based BGP ORF information received from a specified peer.
Case description
follows:
Results
When R5 imports routes, exact matching must be performed.
Results
When you tracert a nonexistent IP address that belongs to 10.0.0.0/16, a routing loop occurs. This is because no route pointing to Null0 is generated when OSPF generates a summary route.
Results
You can configure a static route pointing to Null0 on R5 to prevent routing loops.
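Why a Null0 route stops the loop can be seen from a longest-prefix-match lookup. The Python sketch below is a simplified model: the function lookup, the specific route 10.0.1.0/24, the default route, and the next-hop labels are hypothetical and only illustrate the principle for the summary 10.0.0.0/16.

```python
import ipaddress


def lookup(table, dst):
    """Longest-prefix match: return the next hop of the most specific matching route."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, nh) for net, nh in table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


table = [
    (ipaddress.ip_network("10.0.1.0/24"), "to-specific-subnet"),  # hypothetical existing subnet
    (ipaddress.ip_network("0.0.0.0/0"), "to-neighbor"),           # hypothetical default route
]

# Without a Null0 route, a nonexistent address follows the default route back
# to the neighbor, which returns it along the summary route: a loop.
before = lookup(table, "10.0.200.1")

# A static route for the summary pointing to Null0 is more specific than the default route.
table.append((ipaddress.ip_network("10.0.0.0/16"), "NULL0"))
after = lookup(table, "10.0.200.1")
existing = lookup(table, "10.0.1.5")
print(before, after, existing)
```

Traffic to subnets that really exist still uses the more specific route, while traffic to nonexistent addresses inside the summary is discarded locally.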

Case description
follows:
The IP address of R1 S0/0/0 is 12.1.1.1/24, and the IP address of R2 S0/0/0 is 12.1.1.2/24. The IP address of R1 S0/0/1 is 21.1.1.2/24, and the IP address of R2 S0/0/1 is 21.1.1.1/24.
Results
Use an ACL and a route policy to import the two network segments into IS-IS. The filter-policy XXX export command is usually used to control which imported routes are advertised.
Results
Tags are used to prevent routing loops. For IS-IS routes to carry tags, the IS-IS cost style must be wide. Otherwise, IS-IS routes cannot be tagged.
To prevent the sub-optimal route, change the preference of OSPF external route 10.0.0.0/16 to a value smaller than that of IS-IS routes.
Results
The configuration in this case prevents sub-optimal routes on R3 and R4. Because R3 and R4 do not import routes at exactly the same time, one of them learns 10.0.0.0/16 from both IS-IS and OSPF. For example, if R3 imports routes first, R4 learns 10.0.0.0/16 from both IS-IS and OSPF and compares their preferences. The preference of OSPF external routes is 150 and the preference of IS-IS routes is 15, so R4 prefers the IS-IS route to reach 10.0.0.0/16, but this route is sub-optimal. Changing the preference of 10.0.0.0/16 on R4 to a value smaller than the IS-IS preference eliminates the sub-optimal route. Setting the preference value of OSPF external routes smaller than that of OSPF internal routes, however, is unreasonable.
Results
Use local PBR to meet this requirement.
VLAN technology brings the following benefits:
  Limits broadcast domains. A broadcast domain is limited to a VLAN. This saves bandwidth and improves network processing capabilities.
  Enhances network security. Packets from different VLANs are transmitted separately. Hosts in a VLAN cannot directly communicate with hosts in another VLAN.
  Improves network robustness. A fault in a VLAN does not affect hosts in other VLANs.
  Flexibly sets up virtual groups. With VLAN technology, hosts in different geographical areas can be grouped together. This facilitates network construction and maintenance.
S1 and S2 are located in different positions. Each switch connects to two computers, and the computers belong to two different VLANs. The dashed box indicates a VLAN.
By default, PCs in VLAN 2 cannot communicate with PCs in VLAN 3. That is, broadcast packets are limited to a VLAN.

IEEE 802.1Q
IEEE 802.1Q is an Ethernet networking standard that specifies an Ethernet frame format. It adds the 4-byte 802.1Q Tag field between the Source address and the Length/Type fields of the original frame.
Subfields in the 802.1Q Tag field:
  TPID: is short for Tag Protocol Identifier and indicates the frame type. It has 2 bytes. The value 0x8100 indicates an 802.1Q-tagged frame. An 802.1Q-incapable device discards the received 802.1Q frame.
  PRI: is short for priority and indicates the frame priority. It has 3 bits. The value ranges from 0 to 7. The greater the value, the higher the priority. When QoS is deployed on a switch, the switch first sends data frames with higher priority.
  CFI: is short for Canonical Format Indicator and indicates whether the MAC address is in canonical format. It has 1 bit. The value 0 indicates the MAC address in canonical format and the value 1 indicates the MAC address in non-canonical format. CFI is used to differentiate Ethernet frames, Fiber Distributed Data Interface (FDDI) frames, and token ring network frames. The value is 0 on the Ethernet.
  VID: is short for VLAN ID and indicates the VLAN to which a frame belongs. It has 12 bits.
Each frame sent by an 802.1Q-capable switch can carry a VLAN ID.
In a VLAN, Ethernet frames are classified into the following types:
  Tagged frame: frame with the 4-byte 802.1Q tag
  Untagged frame: frame without the 4-byte 802.1Q tag
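The layout of the 4-byte tag can be made concrete with a short Python sketch that packs and unpacks the four subfields. The function names build_tag and parse_tag are hypothetical; the field widths and the TPID value 0x8100 are the ones listed above.

```python
import struct

TPID_8021Q = 0x8100


def build_tag(pri, cfi, vid):
    """Pack TPID (2 bytes) plus PRI (3 bits), CFI (1 bit), and VID (12 bits) into 4 bytes."""
    if not (0 <= pri <= 7 and cfi in (0, 1) and 0 <= vid <= 0xFFF):
        raise ValueError("field out of range")
    tci = (pri << 13) | (cfi << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)     # network byte order


def parse_tag(tag):
    """Split a 4-byte 802.1Q tag into its subfields."""
    tpid, tci = struct.unpack("!HH", tag)
    return {"tpid": tpid, "pri": tci >> 13, "cfi": (tci >> 12) & 1, "vid": tci & 0x0FFF}


tag = build_tag(pri=5, cfi=0, vid=2)
fields = parse_tag(tag)
print(tag.hex(), fields)
```

A frame whose two bytes after the source MAC address are not 0x8100 is treated as untagged; otherwise the next two bytes are decoded as shown in parse_tag.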
The following link types are available:
  Access link: Usually connects a host to a switch. Generally, a host does not need to know which VLAN it belongs to, and host hardware cannot distinguish frames with VLAN tags. Hosts therefore send and receive only untagged frames along access links.
  Trunk link: Usually connects a switch to another switch or a router. Data of different VLANs is transmitted along a trunk link. The two ends of a trunk link must be able to distinguish frames using VLAN tags, and so only tagged frames are transmitted along trunk links.
A host does not need to know the VLAN to which it belongs. It sends only untagged frames.
After receiving an untagged frame from a host, a switching device determines the VLAN to which the frame belongs based on the configured VLAN assignment method such as interface information. The switching device then processes the frame accordingly.
If a frame needs to be forwarded to another switching device, the frame must be transparently transmitted along a trunk link. Frames transmitted along trunk links must carry VLAN tags to allow other switching devices to properly forward the frame based on the VLAN information.
Before sending a frame to the destination host, the switching device connected to that host removes the VLAN tag from the frame to ensure that the host receives an untagged frame.
Interface types
An access interface on a switch connects to an interface on a host. It can only connect to access links.
  The access interface allows only the VLAN whose ID is the same as the Port Default VLAN ID (PVID).
  If the access interface receives untagged frames from the remote device, the switch adds the PVID to the untagged frames.
  Ethernet frames sent by the access interface are always untagged frames.
A trunk interface on a switch connects to another switch. It can only connect to trunk links.
  The trunk interface allows frames from multiple VLANs to pass through.
  If the VLAN ID in the tag of a frame to be sent by the trunk interface is the same as the PVID, the switch removes the tag from the frame. The trunk interface sends untagged frames in this situation only.
  If the VLAN ID in the tag of a frame to be sent by the trunk interface is different from the PVID, the switch sends the frame without changing the tag.
A hybrid interface on a switch can connect to either a host or another switch. It can connect to either access or trunk links.
  The hybrid interface allows frames from multiple VLANs to pass through and can remove tags from frames of specified VLANs on the outbound interface.
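The receive and send rules for the three interface types can be summarized in a small Python model. This is a sketch under stated assumptions: the class Port, its fields, and the functions receive and send are hypothetical names, a value of None for vid stands for an untagged frame, and the hybrid behavior is modeled with an explicit set of VLANs that are sent untagged.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Port:
    link_type: str                                     # "access", "trunk", or "hybrid"
    pvid: int = 1
    allowed: Set[int] = field(default_factory=set)     # VLANs permitted on trunk/hybrid
    untagged: Set[int] = field(default_factory=set)    # hybrid only: VLANs sent untagged


def receive(port, vid):
    """Return the VLAN a received frame is placed in, or None if it is discarded."""
    vlan = port.pvid if vid is None else vid           # untagged frames get the PVID
    if port.link_type == "access":
        return vlan if vlan == port.pvid else None
    return vlan if vlan in port.allowed else None


def send(port, vlan):
    """Return "untagged", the VLAN ID kept in the tag, or None if the frame is not sent."""
    if port.link_type == "access":
        return "untagged" if vlan == port.pvid else None
    if vlan not in port.allowed:
        return None
    if port.link_type == "trunk":
        return "untagged" if vlan == port.pvid else vlan
    return "untagged" if vlan in port.untagged else vlan


access = Port("access", pvid=2)
trunk = Port("trunk", pvid=1, allowed={1, 2, 3})
hybrid = Port("hybrid", pvid=2, allowed={2, 3}, untagged={2})
print(receive(access, None), send(trunk, 2), send(trunk, 1), send(hybrid, 3))
```

The model makes the one special case of a trunk interface visible: only frames of the PVID VLAN leave it without a tag.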
Interface-based VLAN assignment
VLANs are assigned based on interface numbers.
The network administrator configures a PVID for each switch interface, that is, an interface belongs to a VLAN by default.
  When an untagged data frame reaches a switch interface that has the PVID configured, the PVID is added to the frame.
  When a data frame carries a VLAN tag, the switch does not add a VLAN tag to the data frame even if the interface is configured with a PVID.
Different types of interfaces process VLAN frames in different manners.
MAC address-based VLAN assignment
VLANs are assigned based on MAC addresses.
The network administrator needs to configure the mappings between MAC addresses and VLAN IDs. When the switch receives an untagged frame, it searches for the VLAN entry matching the source MAC address of the frame and adds the VLAN ID to the frame.
IP subnet-based VLAN assignment
When receiving an untagged frame, the switch adds a VLAN tag to the packet based on the source IP address of the packet.
Protocol-based VLAN assignment
VLAN IDs are allocated to packets received on an interface according to the protocol (suite) type and encapsulation format of the packets. The network administrator needs to configure the mappings between protocol types and VLAN IDs. When the switch receives an untagged frame, it searches the protocol-VLAN mapping table for the VLAN ID mapped to the protocol of the frame and adds it to the frame.
The protocols supported for VLAN assignment are IPv4, IPv6, IPX, and AppleTalk (AT). The supported encapsulation formats are Ethernet II, 802.3 raw, 802.2 LLC, and 802.2 SNAP.
Policy-based VLAN assignment
MAC addresses and IP addresses of terminals need to be configured and associated with VLANs on the switch. Only terminals that match the conditions can be added to a specified VLAN. After matching terminals are added to the VLAN, changes of their IP addresses or MAC addresses may cause the terminals to be removed from the VLAN.
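As an illustration of how an untagged frame is mapped to a VLAN, the following Python sketch models MAC address-based assignment. It is a simplified model: the function assign_vlan, the MAC addresses, and the VLAN IDs are hypothetical, and falling back to the interface PVID when no entry matches is an assumption of this sketch.

```python
def assign_vlan(frame_vid, src_mac, mac_vlan_table, pvid):
    """Return the VLAN ID a received frame is handled in.

    frame_vid is None for an untagged frame. Tagged frames keep their VLAN ID.
    Untagged frames are looked up by source MAC address; if no mapping exists,
    this sketch assumes the interface PVID is used.
    """
    if frame_vid is not None:
        return frame_vid
    return mac_vlan_table.get(src_mac.lower(), pvid)


# Hypothetical MAC-to-VLAN mappings configured by the administrator.
mac_vlan_table = {"00e0-fc00-0001": 10, "00e0-fc00-0002": 20}

known = assign_vlan(None, "00E0-FC00-0001", mac_vlan_table, pvid=1)
unknown = assign_vlan(None, "00e0-fc00-0099", mac_vlan_table, pvid=1)
tagged = assign_vlan(3, "00e0-fc00-0001", mac_vlan_table, pvid=1)
print(known, unknown, tagged)
```

The same lookup pattern applies to IP subnet-based and protocol-based assignment, with the source IP subnet or the protocol type as the lookup key instead of the MAC address.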
To implement communication within VLAN 2 and within VLAN 3 through the trunk link between S1 and S2, add Port 2 on S1 and Port 1 on S2 to VLAN 2 and VLAN 3.
PC1 sends a frame to PC2 as follows:
  The frame is first sent to Port 4 on S1.
  Port 4 adds a tag to the frame. The VID field of the tag is 2, that is, the ID of the VLAN to which Port 4 belongs.
  S1 sends the frame to all interfaces in VLAN 2 except Port 4 (assuming that the MAC address table is empty).
  Port 2 forwards the frame to S2.
  After receiving the frame, S2 determines that the frame belongs to VLAN 2 based on the tag. S2 sends the frame to all interfaces in VLAN 2 except Port 1.
  Port 3 sends the frame to PC2.
R1 is a Layer 3 switch supporting sub-interfaces, and S1 is a Layer 2 switching device. LANs are connected using the switched Ethernet interface on S1 and the routed Ethernet interface on R1. To implement inter-VLAN communication, perform the following operations:
  Create two sub-interfaces on the Ethernet interface of R1 that connects to S1, and configure 802.1Q encapsulation on the sub-interfaces corresponding to VLAN 2 and VLAN 3.
  Configure IP addresses for the sub-interfaces to ensure that the two sub-interfaces have reachable routes.
  Configure the Ethernet interface on S1 that connects to R1 as a trunk or hybrid interface and configure it to allow frames from VLAN 2 and VLAN 3 to pass through.
  On each host, configure the default gateway address as the IP address of the sub-interface mapped to the VLAN to which the host belongs.
PC1 communicates with PC2 as follows:
  PC1 checks the IP address of PC2 and determines that PC2 is in another VLAN.
  PC1 sends an ARP Request packet to R1 to request R1's MAC address.
  After receiving the ARP Request packet, R1 returns an ARP Reply packet in which the source MAC address is the MAC address of the sub-interface mapped to VLAN 2.
  PC1 obtains R1's MAC address.
  PC1 sends R1 a packet in which the destination MAC address is the MAC address of the sub-interface and the destination IP address is PC2's IP address.
  After receiving the packet, R1 looks up the route and finds that the route to PC2 is a direct route. The packet is to be forwarded by the sub-interface mapped to VLAN 3.
  R1, as the gateway in VLAN 3, broadcasts an ARP Request packet requesting PC2's MAC address.
  After receiving the ARP Request packet, PC2 returns an ARP Reply packet.
  After receiving the ARP Reply packet, R1 sends the packet from PC1 to PC2. All packets sent from PC1 to PC2 are sent to R1 first for Layer 3 forwarding.
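The first two steps of the communication process depend on a decision the host makes itself: whether the destination is on its own subnet. A minimal Python sketch of that decision follows. The function arp_target and all addresses in the example are hypothetical and are not taken from the case topology.

```python
import ipaddress


def arp_target(src_interface, dst_ip, gateway):
    """Return the IP address a host must resolve with ARP before sending a packet."""
    iface = ipaddress.ip_interface(src_interface)
    if ipaddress.ip_address(dst_ip) in iface.network:
        return dst_ip        # same subnet: resolve the destination directly
    return gateway           # different subnet: resolve the default gateway


# Hypothetical addressing: PC1 is in the VLAN 2 subnet, PC2 in the VLAN 3 subnet.
to_other_vlan = arp_target("192.168.2.10/24", "192.168.3.10", gateway="192.168.2.1")
to_same_vlan = arp_target("192.168.2.10/24", "192.168.2.20", gateway="192.168.2.1")
print(to_other_vlan, to_same_vlan)
```

When the destination is in another subnet, the frame carries the gateway's MAC address and the destination host's IP address, which is why all inter-VLAN traffic passes through the Layer 3 device.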
A routing table must have correct routing entries so that new data flows can be correctly forwarded. You can deploy VLANIF interfaces and routing protocols on Layer 3 switches to implement Layer 3 connectivity.
VLAN 2 and VLAN 3 are assigned. To implement inter-VLAN communication, perform the following operations:
  Create two VLANIF interfaces on S1 and configure IP addresses for them to ensure that the two VLANIF interfaces have reachable routes.
  On each user host, configure the default gateway address as the IP address of the VLANIF interface mapped to the VLAN to which the host belongs.
PC1 communicates with PC2 as follows:
  PC1 checks the IP address of PC2 and determines that PC2 is in another VLAN.
  PC1 sends an ARP Request packet to S1 to request S1's MAC address.
  After receiving the ARP Request packet, S1 returns an ARP Reply packet in which the source MAC address is the MAC address of VLANIF 2.
  PC1 obtains S1's MAC address.
  PC1 sends S1 a packet in which the destination MAC address is the MAC address of the VLANIF interface and the destination IP address is PC2's IP address.
  After receiving the packet, S1 looks up the route and finds that the route to PC2 is a direct route. The packet is to be forwarded by VLANIF 3.
  S1, as the gateway in VLAN 3, broadcasts an ARP Request packet requesting PC2's MAC address.
  After receiving the ARP Request packet, PC2 returns an ARP Reply packet.
  After receiving the ARP Reply packet, S1 sends the packet from PC1 to PC2. All packets sent from PC1 to PC2 are sent to S1 first for Layer 3 forwarding.
VLAN aggregation, also known as the super-VLAN, partitions a broadcast domain using multiple VLANs on a physical network so that different VLANs can belong to the same subnet.
  Super-VLAN: is a set of multiple sub-VLANs. In a super-VLAN, only Layer 3 interfaces are created, and no physical interface exists.
  Sub-VLAN: is used to isolate broadcast domains. In the sub-VLAN, only physical interfaces exist and Layer 3 VLAN interfaces cannot be created. The super-VLAN is used to implement Layer 3 switching.
A super-VLAN can contain one or more sub-VLANs. IP addresses of hosts in sub-VLANs of the super-VLAN belong to the subnet of the super-VLAN.

The super-VLAN (VLAN 10) contains the sub-VLANs (VLAN 2 and VLAN 3).
Proxy ARP between sub-VLANs is enabled on S1. The communication process is as follows:
  After comparing PC2's IP address (1.1.1.20) with its own IP address, PC1 finds that both IP addresses are on the same network segment. The ARP table of PC1, however, has no corresponding entry for PC2.
  PC1 broadcasts an ARP Request packet to request PC2's MAC address.
  PC2 is not in VLAN 2, and so PC2 cannot receive the ARP Request packet.
  Because the gateway is enabled with proxy ARP between sub-VLANs, after receiving the ARP Request packet from PC1, the gateway finds that PC2's IP address (1.1.1.20) belongs to a directly connected network segment. The gateway then broadcasts an ARP Request packet to all the other sub-VLAN interfaces to request PC2's MAC address.
  After receiving the ARP Request packet, PC2 sends an ARP Reply packet.
  After receiving the ARP Reply packet from PC2, the gateway replies to PC1 with its own MAC address.
  The ARP tables of both S1 and PC1 have corresponding entries of PC2.
  To send packets to PC2, PC1 first sends packets to the gateway, and then the gateway performs Layer 3 forwarding.
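The gateway's part in this exchange can be sketched as a single decision in Python. The sketch is a simplification under stated assumptions: it skips the ARP Request that the gateway itself sends into the other sub-VLANs and assumes the gateway already knows which sub-VLAN each host is in. The function proxy_arp_reply, the host table, PC1's address 1.1.1.10, and the gateway MAC address are hypothetical; only PC2's address 1.1.1.20 comes from the text.

```python
def proxy_arp_reply(requester_vlan, target_ip, host_table, gateway_mac):
    """Return the MAC address the gateway answers with, or None if it stays silent.

    host_table maps a host IP address to the sub-VLAN the host is in.
    """
    target_vlan = host_table.get(target_ip)
    if target_vlan is None:
        return None              # unknown host: nothing to proxy for
    if target_vlan == requester_vlan:
        return None              # same sub-VLAN: the target answers the broadcast itself
    return gateway_mac           # other sub-VLAN: answer on behalf of the target


host_table = {"1.1.1.10": 2, "1.1.1.11": 2, "1.1.1.20": 3}
gateway_mac = "00e0-fc00-0004"   # hypothetical MAC address of the super-VLAN gateway

cross_vlan = proxy_arp_reply(2, "1.1.1.20", host_table, gateway_mac)
same_vlan = proxy_arp_reply(2, "1.1.1.11", host_table, gateway_mac)
missing = proxy_arp_reply(2, "1.1.1.99", host_table, gateway_mac)
print(cross_vlan, same_vlan, missing)
```

Because PC1 stores the gateway's MAC address for PC2's IP address, later packets to PC2 are sent to the gateway and forwarded at Layer 3, as the last step above describes.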
The frame from PC1 that enters S1 through Port 1 is tagged with the ID of VLAN 2. The VLAN ID, however, is not changed to the ID of VLAN 10 on S1 even though VLAN 2 is a sub-VLAN of VLAN 10. After passing through Port 3, which is a trunk interface, this frame still carries the ID of VLAN 2. S1 discards the frames of VLAN 10 that are sent to S1 by other devices because S1 has no physical interface corresponding to VLAN 10.
A super-VLAN has no physical interface:
  If you configure a super-VLAN and then a trunk interface, the frames of the super-VLAN are filtered automatically according to the VLAN range configured on the trunk interface.
  If you first configure a trunk interface and configure the trunk interface to allow all VLANs to pass through, you cannot configure the super-VLAN on the device. The root cause is that any VLAN with physical interfaces cannot be configured as a super-VLAN. The trunk interface allows frames from all VLANs to pass through, so no VLAN can be configured as a super-VLAN.
On S1, only VLAN 2 and VLAN 3 are valid, and all frames are forwarded in these VLANs.
S2 is configured with super-VLAN 4, sub-VLAN 2, sub-VLAN 3, and common VLAN 10. S1 is configured with two common VLANs, namely, VLAN 10 and VLAN 20. S2 is configured with the route to the network segment 1.1.3.0/24, and S1 is configured with the route to the network segment 1.1.1.0/24. PC1 in sub-VLAN 2 of super-VLAN 4 then needs to communicate with PC3, which is connected to S1.
  After comparing PC3's IP address (1.1.3.2) with its own IP address, PC1 finds that the two IP addresses are on different network segments.
  PC1 broadcasts an ARP Request packet to request the MAC address of its gateway (S2).
  After receiving the ARP Request packet, S2 checks the mapping between the sub-VLAN and the super-VLAN, and sends an ARP Reply packet to PC1 through sub-VLAN 2. The source MAC address in the ARP Reply packet is the MAC address of VLANIF 4 corresponding to super-VLAN 4.
  PC1 learns S2's MAC address.
  PC1 sends the packet to S2. The destination MAC address of the packet is the MAC address of VLANIF 4 corresponding to super-VLAN 4, and the destination IP address is 1.1.3.2.
  After receiving the packet, S2 performs Layer 3 forwarding and sends the packet to S1, with the next hop address 1.1.2.2 and the outbound interface VLANIF 10.
  After receiving the packet, S1 performs Layer 3 forwarding and sends the packet to PC3 through the directly connected interface VLANIF 20.
  The reply packet from PC3 reaches S2 after Layer 3 forwarding on S1.
  After receiving the reply packet, S2 performs Layer 3 forwarding and sends the packet to PC1 through the super-VLAN.
The MUX VLAN falls into the principal VLAN and subordinate VLAN. The subordinate VLAN is classified into the separate VLAN and group VLAN.
Principal VLAN: A principal interface can communicate with all interfaces in a MUX VLAN.
Subordinate VLAN
  Separate VLAN: A separate interface can communicate only with a principal interface and is isolated from other types of interfaces. A separate VLAN must be bound to a principal VLAN.
  Group VLAN: A group interface can communicate with a principal interface and other interfaces in the same group VLAN, but cannot communicate with interfaces in other group VLANs or a separate interface. A group VLAN must be bound to a principal VLAN.
The principal interface connects to the enterprise server; separate interfaces connect to enterprise customers; group interfaces connect to enterprise employees. In this manner, enterprise customers and enterprise employees can access the enterprise server, enterprise employees can communicate with each other, enterprise customers cannot communicate with each other, and enterprise customers and enterprise employees cannot communicate with each other.
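The communication rules of the MUX VLAN can be written down as a short Python predicate. This is an illustrative sketch: the function can_communicate and the VLAN IDs in the example are hypothetical, and both interfaces are assumed to belong to the same MUX VLAN.

```python
def can_communicate(a, b):
    """Decide whether two interfaces in one MUX VLAN can reach each other.

    a and b are (role, vlan_id) tuples; role is "principal", "separate", or "group".
    """
    role_a, vlan_a = a
    role_b, vlan_b = b
    if "principal" in (role_a, role_b):
        return True                      # a principal interface reaches every interface
    if role_a == role_b == "group":
        return vlan_a == vlan_b          # group interfaces: same group VLAN only
    return False                         # separate interfaces are isolated from all others


server = ("principal", 2)                # hypothetical VLAN IDs
customer1 = ("separate", 4)
customer2 = ("separate", 4)
employee1 = ("group", 3)
employee2 = ("group", 3)
print(can_communicate(customer1, server), can_communicate(customer1, customer2))
```

The example mirrors the enterprise scenario above: customers and employees reach the server, employees reach each other, and customers reach neither each other nor the employees.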

Case description
To meet requirement 2, configure VLAN 2 and VLAN 3 to be permitted by the trunk link.

Command usage
The port link-type command sets the link type of an interface.
The port trunk allow-pass vlan command adds a trunk interface to VLANs.
The port hybrid untagged vlan command adds a hybrid interface to VLANs. Frames of the VLANs then pass through the hybrid interface in untagged mode.
View
Interface view
Parameters
port link-type { access | dot1q-tunnel | hybrid | trunk }
  access: configures the link type of an interface as access.
  dot1q-tunnel: configures the link type of an interface as QinQ.
  hybrid: configures the link type of an interface as hybrid.
  trunk: configures the link type of an interface as trunk.
Precautions
Before changing the link type of an interface, you need to delete the VLAN configuration of the interface. That is, the interface can join only VLAN 1.
If a specified VLAN does not exist, the port trunk allow-pass vlan command does not take effect. The port trunk allow-pass vlan command cannot be used on a member interface of an Eth-Trunk.
A hybrid interface can connect to either a user host or a switch. When a hybrid interface is connected to a user host, it must be added to VLANs in untagged mode because user hosts cannot process tagged frames. The port hybrid untagged vlan command is invalid on a member interface of an Eth-Trunk. A super-VLAN cannot be specified in the port hybrid untagged vlan command.
Case description
The topology is similar to that in slide 22. The difference is
that MAC addresses are identified. Assign VLANs based on MAC
addresses to meet the requirement.
Before configuring MAC address-based VLAN assignment, ensure
that the link type of the Layer 2 interface is hybrid.

Command usage
The mac-vlan mac-address command associates a MAC address with a
VLAN.
The mac-vlan enable command enables MAC address-based VLAN
assignment on an interface.
Precautions
After a MAC address is associated with a VLAN, it cannot be
associated with other VLANs.
If MAC address-based assignment is enabled on an interface:
When receiving an untagged packet, the interface searches for
the VLAN entry matching the source MAC address of the packet. If
a matching entry is found, the interface forwards the packet
based on the VLAN ID. If no matching entry is found, the
interface uses other matching rules to forward the packet.
When receiving a tagged packet, the interface forwards the
packet based on the interface-based VLAN assignment
configuration.
MAC address-based assignment can be configured only on hybrid
interfaces.
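
The lookup order described above can be sketched in Python. This
is a simplified model under stated assumptions: the frame is a
plain dictionary, a tagged frame is classified by the tag it
carries, and the "other matching rules" are reduced to falling
back to a port default VLAN (pvid). None of these names come from
the switch software.

```python
def classify_vlan(frame, mac_vlan_table, pvid):
    """Pick the VLAN for a frame on a MAC-VLAN-enabled hybrid port.

    frame: dict with "src_mac" and an optional "vlan" tag (hypothetical).
    mac_vlan_table: dict mapping source MAC address -> VLAN ID.
    pvid: fallback VLAN, standing in for "other matching rules".
    """
    tag = frame.get("vlan")
    if tag is not None:
        # Tagged frame: handled by interface-based VLAN assignment.
        return tag
    # Untagged frame: look up the source MAC, else fall back.
    return mac_vlan_table.get(frame["src_mac"], pvid)


# Hypothetical sample table: one MAC address bound to VLAN 2.
table = {"00e0-fc00-0001": 2}
```

Because each MAC address is a dictionary key, it can map to only
one VLAN, which mirrors the first precaution.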
Case description
The topology is similar to that in slide 22.
Before configuring IP subnet-based VLAN assignment, ensure that
the link type of the Layer 2 interface is hybrid.
Command usage
The ip-subnet-vlan command associates an IP subnet with a VLAN.
The ip-subnet-vlan enable command enables IP subnet-based VLAN
assignment on an interface.
Precautions
The IP subnet or IP address that the ip-subnet-vlan command
associates with a VLAN cannot be a multicast network segment or
a multicast address.
IP subnet-based assignment can be configured only on hybrid
interfaces.
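
The multicast restriction can be illustrated with Python's
standard ipaddress module. This is a sketch, not the switch's own
validation: it covers IPv4 only, and the helper name and the
choice to reject any subnet that overlaps 224.0.0.0/4 are
assumptions made for the example.

```python
import ipaddress

# IPv4 multicast range (class D).
MULTICAST = ipaddress.ip_network("224.0.0.0/4")


def is_valid_subnet_for_vlan(subnet: str) -> bool:
    """Return True if an IPv4 subnet may be associated with a VLAN.

    Hypothetical helper: rejects any subnet or host address that
    overlaps the multicast range.
    """
    net = ipaddress.ip_network(subnet, strict=False)
    return not net.overlaps(MULTICAST)
```

A unicast subnet such as 192.168.1.0/24 passes the check, while
224.0.1.0/24 or the single address 239.1.1.1/32 does not.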
Case description
Protocol-based assignment can be configured only on hybrid
interfaces.
Command usage
The protocol-vlan command associates a protocol with a VLAN.
The protocol-vlan vlan command associates an interface with a
protocol-based VLAN.
Precautions
Protocol-based assignment can be configured only on hybrid
interfaces.
When protocol-based assignment is used on an interface, the
switch needs to parse the protocol type in the received packet
and convert it.
Case description
You can use a VLANIF interface or a sub-interface to implement
communication between VLANs.

Command usage
The interface vlanif command creates a VLANIF interface and
enters the VLANIF interface view.
The dot1q termination vid command configures the single VLAN ID
for dot1q encapsulation on a sub-interface.
The arp broadcast enable command enables ARP broadcast on a
sub-interface.
Precautions
Before running the interface vlanif command, you must run the
vlan command to create the VLAN specified by vlan-id.

Case description
Configure VLAN aggregation to meet the requirements.
Command usage
The aggregate-vlan command configures a VLAN as a super-VLAN.
The access-vlan command adds one or more sub-VLANs to a
super-VLAN.
Precautions
VLAN 1 cannot be configured as a super-VLAN.
The super-VLAN must be different from all its sub-VLANs.
A VLAN can be added to only one super-VLAN.

Case description
Configure the MUX VLAN to meet the requirements.
Command usage
The mux-vlan command configures a VLAN as a principal VLAN.
The subordinate group command configures subordinate group
VLANs for a principal VLAN.
The subordinate separate command configures a subordinate
separate VLAN for a principal VLAN.

Precautions for the principal VLAN
A super-VLAN, sub-VLAN, or subordinate VLAN cannot be
configured as a principal VLAN.
A VLAN where a VLANIF interface has been created cannot be
configured as a principal VLAN.

Precautions for the subordinate group VLAN
Before configuring a subordinate group VLAN, you must configure
a principal VLAN and enter the principal VLAN view.
The VLAN to be configured as a subordinate group VLAN must have
been created.
The VLAN to be configured as a subordinate group VLAN cannot
have a VLANIF interface configured or be configured as a
super-VLAN.
Before running the undo subordinate group command to delete a
subordinate group VLAN to which interfaces have been added,
delete the interfaces from the subordinate group VLAN.
A subordinate group VLAN must be different from the principal
VLAN.
A subordinate group VLAN must be different from a subordinate
separate VLAN.

Precautions for the subordinate separate VLAN
Before configuring a subordinate separate VLAN, you must
configure a principal VLAN and enter the principal VLAN view.
The VLAN to be configured as a subordinate separate VLAN must
have been created.
The VLAN to be configured as a subordinate separate VLAN cannot
have a VLANIF interface configured or be configured as a
super-VLAN.
Before running the undo subordinate separate command to delete
a subordinate separate VLAN to which interfaces have been added,
delete the interfaces from the subordinate separate VLAN.
A subordinate separate VLAN must be different from the principal
VLAN.
A subordinate separate VLAN must be different from a subordinate
group VLAN.
Check whether MAC address entries on the switch are correct.
Run the display mac-address command on the switch to check
whether the MAC addresses, interfaces, and VLANs in the learned
MAC address entries are correct. If the learned MAC address
entries are incorrect, run the undo mac-address mac-address vlan
vlan-id command on the interface to delete the existing entries
so that the switch can learn MAC address entries again.
Case description
To implement communication between VLANs through RIPv2,
configure at least two VLANIF interfaces on the switch.

Result
Perform the ping operation. The hosts in VLAN 2 and VLAN 3 can
communicate with each other.

Result
To implement communication between VLANs through RIPv2,
configure at least two VLANIF interfaces on the switch.

Proxy ARP
Routed proxy ARP: Routed proxy ARP enables network devices on
the same network segment but on different physical networks to
communicate.
Intra-VLAN proxy ARP: If two hosts belong to the same VLAN where
user isolation is configured, enable intra-VLAN proxy ARP on an
interface associated with the VLAN to allow the hosts to
communicate.
Inter-VLAN proxy ARP: If two hosts belong to different VLANs,
enable inter-VLAN proxy ARP on interfaces associated with the
VLANs to implement Layer 3 communication between the two hosts.

Topology Description
Routed proxy ARP
The IP addresses of PC1 and PC2 are on the same network segment.
When PC1 needs to communicate with PC2, PC1 broadcasts an ARP
Request packet, requesting the MAC address of PC2. However, PC1
and PC2 are on different physical networks (in different
broadcast domains). PC2 therefore cannot receive the ARP Request
packet sent from PC1 and does not respond with an ARP Reply
packet. To solve this problem, enable proxy ARP on S1.
After receiving the ARP Request packet, S1 searches for a
routing entry corresponding to PC2. If the routing entry exists,
S1 responds to the ARP Request packet with its own MAC address.
PC1 then forwards data based on the MAC address of S1. S1
functions as the proxy of PC2.
Intra-VLAN proxy ARP
PC1 cannot communicate with PC2 in the same VLAN because
interface isolation is configured on the interfaces of S1
connected to PC1 and PC2. To solve this problem, enable
intra-VLAN proxy ARP on the interfaces of S1. After S1's
interface connected to PC1 receives an ARP Request packet
destined for another address, S1 does not discard the packet but
searches for the ARP entry corresponding to PC2. If the ARP
entry exists, S1 sends its own MAC address to PC1 and forwards
packets sent from PC1 to PC2. S1 functions as the proxy of PC2.
Inter-VLAN proxy ARP
This function is used in VLAN aggregation. Refer to the VLAN
documentation.
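
The routed proxy ARP decision on S1 can be sketched as follows.
This is a minimal model under stated assumptions: the routing
table is a plain list of IPv4 prefixes, and the function name,
addresses, and MAC value are hypothetical.

```python
import ipaddress


def routed_proxy_arp_reply(target_ip, routes, own_mac):
    """Decide how S1 answers an ARP Request for target_ip.

    routes: list of IPv4 prefixes S1 has routing entries for.
    Returns own_mac if S1 should answer on behalf of the target,
    or None if no matching route exists and S1 stays silent.
    """
    addr = ipaddress.ip_address(target_ip)
    for prefix in routes:
        if addr in ipaddress.ip_network(prefix):
            # A route to the target exists: reply with S1's own MAC.
            return own_mac
    return None


# Hypothetical values: S1 has a route toward PC2's physical network.
s1_routes = ["10.1.2.0/24"]
s1_mac = "00e0-fc00-00aa"
```

PC1 then sends traffic for PC2 to the returned MAC address, and
S1 forwards it according to the routing entry.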
Gratuitous ARP provides the following functions:
Checks for duplicate IP addresses: Normally, a host does not
receive an ARP Reply packet after sending an ARP Request packet
with the destination address set to its own IP address. If the
host receives an ARP Reply packet, another host has the same IP
address.
Advertises a new MAC address: If the MAC address of a host
changes because its network adapter is replaced, the host sends
a gratuitous ARP packet to notify all hosts of the change before
the ARP entry is aged out.
Notifies of an active/standby switchover in a VRRP group: After
an active/standby switchover is performed, the master switch
sends a gratuitous ARP packet in the VRRP group to notify other
devices of the switchover.
After the system is reset or the interface card is hot swapped
or reset, dynamic entries are lost, but static and blackhole
entries are retained.

Secure MAC addresses are classified into the following types:
Secure dynamic MAC address: learned on an interface where port
security is enabled but the sticky MAC function is disabled.
After port security is enabled on an interface, dynamic MAC
address entries that have been learned on the interface are
deleted, and MAC address entries learned subsequently become
secure dynamic MAC address entries. Secure dynamic MAC addresses
are not aged out by default. After the switch restarts, secure
dynamic MAC addresses are lost and need to be learned again.
Sticky MAC address: learned on an interface where both port
security and the sticky MAC function are enabled. Sticky MAC
addresses are not aged out. After you save the configuration and
restart the switch, sticky MAC addresses still exist.
MAC address anti-flapping
Increasing the MAC address learning priority of an interface:
When the same MAC address is learned by interfaces with
different priorities, the MAC address entry learned by the
interface with the highest priority overwrites the entries
learned by the other interfaces.
Preventing MAC address overwriting on interfaces with the same
priority: If the interface connected to a bogus device has the
same priority as the interface connected to the authorized
device, the MAC address entry of the bogus device learned later
does not overwrite the correct MAC address entry. However, if
the authorized device powers off, the switch learns the MAC
address entry of the bogus device. After the authorized device
powers on again, the switch cannot learn the correct MAC address
entry.

You can set a high MAC address learning priority on Port1 to
prevent PC3 from using the MAC address of PC1 to attack the
switch.
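
The two anti-flapping rules can be sketched as a tiny MAC table
model. This is an illustration only: it assumes that a larger
number means a higher learning priority, and the function name,
port names, and priority values are hypothetical.

```python
def learn(table, mac, port, priority, allow_same_priority_overwrite=False):
    """Try to learn mac on port; return the port the entry points to.

    table maps MAC address -> (port, priority). Assumption for this
    sketch: a larger priority number means a higher priority.
    """
    current = table.get(mac)
    if (
        current is None
        or priority > current[1]
        or (priority == current[1] and allow_same_priority_overwrite)
    ):
        table[mac] = (port, priority)
    return table[mac][0]


mac_table = {}
pc1_mac = "00e0-fc00-0001"
# PC1 is learned on Port1, which has the higher learning priority.
learn(mac_table, pc1_mac, "Port1", priority=3)
# PC3 spoofs PC1's MAC address on a lower-priority port.
spoofed_port = learn(mac_table, pc1_mac, "Port3", priority=0)
```

After the spoofing attempt, the entry for PC1's MAC address still
points to Port1.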
No loop prevention protocol is used on the switching network. If
S2 and S4 are incorrectly connected with a network cable, a loop
occurs between S2, S3, and S4. When a broadcast packet is sent,
the packet is forwarded to S3 and received by Port1 on S1. When
MAC address flapping detection is configured on Port1, S1
detects that the source MAC address of the broadcast packet
flaps between interfaces. If the MAC address flaps between
interfaces frequently, S1 considers that MAC address flapping
has occurred. The affected interface on S1 can then enter the
error-down state or be removed from the VLAN.
MAC address flapping detection
The action of removing an interface from the VLAN where MAC
address flapping occurs cannot be used together with other
dynamic VLAN technologies.
Link aggregation has the following advantages:
Increased bandwidth: The bandwidth of the link aggregation
interface is the sum of the bandwidth of its member interfaces.
Higher reliability: When the physical link of a member interface
fails, traffic can be switched to another available member link,
improving the reliability of the link aggregation interface.
Load balancing: In a Link Aggregation Group (LAG), traffic is
load balanced among active member interfaces.

Basic concepts of Ethernet link aggregation
Eth-Trunk: An Ethernet LAG is a logical link formed by bundling
multiple Ethernet links, and is called an Eth-Trunk for short.
Member interfaces and member links: The interfaces that
constitute an Eth-Trunk are member interfaces. The link
corresponding to a member interface is a member link.
Active and inactive interfaces and links:
Member interfaces are classified into active interfaces that
forward data and inactive interfaces that do not forward data.
Links connected to active interfaces are called active links,
and links connected to inactive interfaces are called inactive
links.
Upper threshold for the number of active interfaces: This
setting guarantees higher network reliability. When the number
of active member interfaces reaches the upper threshold,
additional member interfaces are set to Down and used as backup
links.
Lower threshold for the number of active interfaces: This
setting ensures the minimum bandwidth of an Eth-Trunk. When the
number of active interfaces falls below this threshold, the
Eth-Trunk goes Down.
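
The effect of the two thresholds can be sketched in Python. This
is a simplified model, not device behavior: it assumes the list of
usable member interfaces is already ordered by preference, and the
function and interface names are hypothetical.

```python
def eth_trunk_state(up_members, max_active, min_active):
    """Split usable members into active and backup, and report trunk state.

    up_members: usable member interfaces, most preferred first
    (the ordering rule is an assumption of this sketch).
    Returns (active, backup, trunk_is_up).
    """
    active = up_members[:max_active]   # upper threshold caps active links
    backup = up_members[max_active:]   # the rest wait as backup links
    trunk_is_up = len(active) >= min_active  # lower threshold
    return active, backup, trunk_is_up
```

For example, with three usable members, an upper threshold of 2,
and a lower threshold of 2, two links are active, one is a backup,
and the Eth-Trunk stays Up. If only one member remains usable, the
Eth-Trunk goes Down.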
Forwarding principle
An Eth-Trunk interface is treated as a physical interface at the
MAC sub-layer. Therefore, frames transmitted at the MAC
sub-layer only need to be delivered to the Eth-Trunk module.
Eth-Trunk forwarding entries:
HASH-KEY value: calculated by applying the hash algorithm to the
MAC address or IP address in the packet.
Interface number: Eth-Trunk forwarding entries depend on the
number of member interfaces in an Eth-Trunk. Different HASH-KEY
values are mapped to different outbound interfaces.
Figure description
For example, if three physical interfaces, 1, 2, and 3, are
bundled into an Eth-Trunk, the Eth-Trunk forwarding table maps
the HASH-KEY values to these three interfaces, as shown in the
preceding figure. In the Eth-Trunk forwarding table, the
HASH-KEY values are 0, 1, 2, 3, 4, 5, 6, 7, and the
corresponding interface numbers are 1, 2, 3, 1, 2, 3, 1, 2.

Forwarding process
The Eth-Trunk module receives a frame from the MAC sub-layer,
and then extracts its source MAC address/IP address or
destination MAC address/IP address according to the load
balancing mode.
The Eth-Trunk module calculates the HASH-KEY value using the
hash algorithm.
Based on the HASH-KEY value, the Eth-Trunk module searches the
Eth-Trunk forwarding table for the interface number, and then
sends the frame from the corresponding interface.
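
The forwarding table and lookup can be sketched in Python. The
table layout follows the example above (eight HASH-KEY values
spread over the member interfaces in turn). The hash function
itself is not specified in this material, so the byte-wise XOR
used here is purely a stand-in, and all function names are
hypothetical.

```python
def build_forwarding_table(member_interfaces, hash_keys=8):
    """Map HASH-KEY values 0..hash_keys-1 onto member interfaces in turn."""
    count = len(member_interfaces)
    return [member_interfaces[key % count] for key in range(hash_keys)]


def hash_key(address: bytes, hash_keys=8):
    """Stand-in hash: XOR all address bytes, then reduce to a HASH-KEY."""
    value = 0
    for byte in address:
        value ^= byte
    return value % hash_keys


def select_interface(address: bytes, table):
    """Look up the outbound interface for an address."""
    return table[hash_key(address, len(table))]


table = build_forwarding_table([1, 2, 3])
```

Because the same address always yields the same HASH-KEY value,
all frames that share that address leave through the same member
interface.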
Mis-sequencing in common load balancing mode
Because there are multiple physical links between the devices at
both ends of an Eth-Trunk, the first data frame of a data flow
may be transmitted on one physical link, and the second data
frame may be transmitted on another physical link. In this case,
the second data frame may arrive at the peer device earlier than
the first data frame. As a result, packet mis-sequencing occurs.

Eth-Trunk load balancing
The Eth-Trunk uses a load balancing mechanism. This mechanism
applies the hash algorithm to the address in a data frame to
generate a HASH-KEY value. The system then searches the
Eth-Trunk forwarding table for the outbound interface based on
the generated HASH-KEY value. Each MAC or IP address corresponds
to a HASH-KEY value, so the system uses different outbound
interfaces to forward data for different addresses. This
mechanism ensures that frames of the same data flow are
forwarded on the same physical link and implements flow-based
load balancing. Flow-based load balancing ensures the sequence
of data transmission, but cannot guarantee efficient use of
bandwidth.

Manual load balancing mode
If an active link fails, the other active links load balance the
traffic evenly. If high link bandwidth between two directly
connected devices is required but one of the devices does not
support LACP, you can use the manual load balancing mode.

LACP mode
LACP provides a standard negotiation mechanism for switching
devices. LACP enables switching devices to automatically create
and enable aggregated links based on their configurations. After
aggregated links are created, LACP maintains the link status. If
an aggregated link's status changes, LACP automatically adjusts
or disables the link aggregation.

LACP concepts
LACP system priority: The LACP system priority (default value of
32768) is used to differentiate the priorities of the devices at
both ends of an Eth-Trunk. In LACP mode, the active interfaces
selected by both devices must be consistent; otherwise, the LAG
cannot be established. To keep active interfaces consistent at
both ends, set a higher priority for one end. In this manner,
the other end selects active member interfaces based on the
selection of the peer. The smaller the LACP system priority
value, the higher the LACP system priority. When LACP system
priorities are the same, the device with the smaller MAC address
functions as the Actor.
LACP interface priority: The LACP interface priority (default
value of 32768) is used to determine whether a member interface
can be selected as an active interface. The smaller the LACP
interface priority value, the higher the LACP interface
priority.
In LACP mode, LACP determines active and inactive links in an
LAG. This mode is also called M:N mode, where M refers to the
number of active links and N refers to the number of backup
links. This mode guarantees high reliability and allows load
balancing to be carried out across M active links.
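
The Actor election rule (smaller system priority value wins, then
smaller MAC address) can be sketched as follows. The device
dictionaries, field names, and sample values are assumptions made
for this illustration.

```python
def mac_to_int(mac: str) -> int:
    """Convert a MAC such as '00e0-fc00-0001' to an integer for comparison."""
    return int(mac.replace("-", "").replace(":", ""), 16)


def select_actor(dev_a, dev_b):
    """Return the device that acts as the LACP Actor.

    A smaller system priority value means a higher priority; on a
    tie, the device with the smaller MAC address is chosen.
    """
    return min(
        dev_a,
        dev_b,
        key=lambda dev: (dev["system_priority"], mac_to_int(dev["mac"])),
    )


# Hypothetical devices: S1 is configured with a smaller priority value.
s1 = {"name": "S1", "system_priority": 100, "mac": "00e0-fc00-0002"}
s2 = {"name": "S2", "system_priority": 32768, "mac": "00e0-fc00-0001"}
```

With these values S1 becomes the Actor even though S2 has the
smaller MAC address, because the system priority is compared
first.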
LACP implementation
After member interfaces are added to an Eth-Trunk in LACP mode,
each end sends LACPDUs to inform its peer of its system
priority, MAC address, interface priority, interface number, and
keys. After being informed, the peer compares this information
with its own saved information and selects the interfaces to be
aggregated. Both ends then determine the active interfaces and
links.

Negotiation process
Devices at both ends send LACPDUs to each other.
Create an Eth-Trunk in LACP mode on S1 and S2 and add member
interfaces to the Eth-Trunk. LACP is then enabled on the member
interfaces, and the devices at both ends send LACPDUs to each
other.
Determine the Actor and active links.
When S2 receives LACPDUs from S1, S2 checks and records
information about S1 and compares system priorities. If the
system priority of S1 is higher than that of S2, S1 acts as the
Actor.
After the devices at both ends select the Actor, they select
active interfaces according to the priorities of the Actor's
interfaces.

LACP preemption
E1 becomes faulty, and then recovers. When E1 fails, E3 replaces
E1 to transmit services. After E1 recovers, if LACP preemption
is not enabled on the Eth-Trunk, E1 remains in the backup state.
If LACP preemption is enabled on the Eth-Trunk, E1 becomes the
active interface and E3 becomes the backup interface because E1
has a higher priority than E3.
LACP preemption delay
When LACP preemption occurs, the backup link waits for a given
period of time before switching to the active state.
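
The preemption behavior can be sketched as a selection function.
This is a simplified model under stated assumptions: interfaces
are ranked by interface priority value (smaller is better) with
the interface number as a tie-breaker, the preemption delay is
ignored, and all names and priority values are hypothetical.

```python
def select_active(interfaces, max_active, current_active=None, preempt=True):
    """Choose active interfaces among those that are up.

    interfaces: list of dicts with "name", "priority", "number", "up".
    current_active: names that were active before this reselection.
    Without preemption, interfaces already active keep their role.
    """
    up = [i for i in interfaces if i["up"]]
    ranked = sorted(up, key=lambda i: (i["priority"], i["number"]))
    names = [i["name"] for i in ranked]
    if preempt or current_active is None:
        return names[:max_active]
    kept = [n for n in names if n in current_active]
    fill = [n for n in names if n not in current_active]
    return (kept + fill)[:max_active]


def members(e1_up):
    """Hypothetical Eth-Trunk with E1 as the highest-priority member."""
    return [
        {"name": "E1", "priority": 100, "number": 1, "up": e1_up},
        {"name": "E2", "priority": 200, "number": 2, "up": True},
        {"name": "E3", "priority": 300, "number": 3, "up": True},
    ]


after_failure = select_active(members(False), 2, ["E1", "E2"], preempt=False)
no_preempt = select_active(members(True), 2, after_failure, preempt=False)
with_preempt = select_active(members(True), 2, after_failure, preempt=True)
```

In this scenario E3 takes over when E1 fails. After E1 recovers,
E1 stays a backup without preemption and returns to the active
set with preemption.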
GVRP
GVRP is based on GARP and is used to maintain VLAN attributes
dynamically on devices. Through GVRP, VLAN attributes of one
device can be propagated throughout the entire switching
network. GVRP enables network devices to dynamically deliver,
register, and propagate VLAN attributes, reducing the workload
of network administrators and ensuring correct configuration.
GVRP applies only to trunk links.
GVRP uses the multicast MAC address 01-80-C2-00-00-21.

Participant
On a device running GVRP, each GVRP-enabled port is considered a
GVRP participant.

VLAN registration and deregistration
GVRP implements automatic registration and deregistration of
VLAN attributes.
VLAN registration: adds an interface to a VLAN.
VLAN deregistration: removes an interface from a VLAN.
GVRP registers and deregisters VLAN attributes through attribute
declarations and reclaim declarations:
When an interface receives a VLAN attribute declaration, it
registers the VLAN specified in the declaration. That is, the
interface is added to the VLAN.
When an interface receives a VLAN attribute reclaim declaration,
it deregisters the VLAN specified in the declaration. That is,
the interface is removed from the VLAN.
GARP participants exchange attribute information by sending
messages. GARP messages fall into Join, Leave, and LeaveAll
messages.
Join message: A GARP participant sends Join messages when it
requires that other devices register its attributes, when it
receives Join messages from other GARP participants, or when it
has attributes configured statically.
Leave message: A GARP participant sends Leave messages to have
its attributes deregistered from other devices. The GARP
participant also sends Leave messages when it receives Leave
messages from other GARP participants or when attributes are
manually deregistered.
LeaveAll message: A GARP participant sends LeaveAll messages to
deregister all its attributes from all the other GARP
participants. LeaveAll messages are used to periodically delete
garbage attributes. For example, a garbage attribute may be left
behind when a device removes an attribute but, because of a
sudden loss of power, fails to send the Leave message that
would notify other devices to deregister it.

Join timer
To ensure that a Join message is reliably transmitted to other
GARP participants, a GARP participant may send the Join message
twice. When sending the first Join message, the GARP participant
starts the Join timer. If a Join message is received before the
Join timer expires, the GARP participant does not send the
second Join message. Otherwise, the GARP participant re-sends
the Join message.
The Join timer is configured on a per-port basis.

Hold timer
When you configure an attribute on a participant or when the
participant receives a request message, the participant does not
propagate the message to the other devices immediately. Instead,
it collects the request messages received within a period of
time and sends them in one GARP PDU. This period of time is
specified by the Hold timer. By making full use of the data
portion of GARP PDUs to send multiple messages in one packet,
the mechanism reduces the number of transmitted packets and
contributes to network stability.
The Hold timer value must be no greater than half of the Join
timer value.

Leave timer
Upon receiving a Leave or LeaveAll message, a GARP participant
starts its Leave timer. If it receives no Join message
containing the attribute carried in the Leave or LeaveAll
message before the Leave timer expires, it deregisters the
attribute.
The Leave timer value is twice the Join timer value.

LeaveAll timer
Upon startup, a GARP participant starts the LeaveAll timer. When
the LeaveAll timer expires, the GARP participant sends out a
LeaveAll message, and then restarts the LeaveAll timer to start
another cycle.
When receiving a LeaveAll message, a GARP participant restarts
all timers, including the LeaveAll timer.
If the LeaveAll timers of multiple devices expire at the same
time, multiple LeaveAll messages are sent at the same time,
creating unnecessary traffic. To avoid this problem, the actual
LeaveAll timer value of a participant is a random value between
the LeaveAll timer value and the LeaveAll timer value multiplied
by 1.5. A LeaveAll event is equivalent to deregistering all
attributes network wide by sending Leave messages.
The LeaveAll timer value must be greater than the Leave timer
value.
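
The timer relationships stated above, and the randomized LeaveAll
interval, can be sketched in Python. This only restates the rules
in this section; the function names and the sample values (in
arbitrary time units) are assumptions for illustration.

```python
import random


def timers_consistent(hold, join, leave, leaveall):
    """Check the relationships between the four GARP timers.

    Encodes the rules as stated in this section: Hold is at most
    half of Join, Leave is twice Join, and LeaveAll exceeds Leave.
    """
    return hold <= join / 2 and leave == 2 * join and leaveall > leave


def actual_leaveall(leaveall, rng=random):
    """Pick the effective LeaveAll interval for one cycle.

    A random value between the configured value and 1.5 times it,
    so that participants do not all send LeaveAll at the same moment.
    """
    return rng.uniform(leaveall, 1.5 * leaveall)
```

For example, timers_consistent(10, 20, 40, 1000) holds, while
raising the Hold timer to 15 breaks the first rule.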

One-way registration of VLAN attributes
Manually create static VLAN 2 on S1. In response to this action,
GVRP automatically assigns the GVRP-enabled ports on S2 and S3
to VLAN 2 through one-way registration. The process is as
follows:
After VLAN 2 is created on S1, E1 on S1 starts the Join timer
and Hold timer. When the Hold timer expires, E1 sends the first
JoinEmpty message to S2. When the Join timer expires, E1
restarts the Hold timer. When the Hold timer expires again, E1
sends the second JoinEmpty message.
After E2 on S2 receives the first JoinEmpty message, S2 creates
dynamic VLAN 2 and adds E2 to VLAN 2. In addition, S2 requests
E3 to start the Join timer and Hold timer. When the Hold timer
expires, E3 sends the first JoinEmpty message to S3. When the
Join timer expires, E3 restarts the Hold timer. When the Hold
timer expires again, E3 sends the second JoinEmpty message.
After E2 receives the second JoinEmpty message, S2 does not take
any action because E2 has been added to VLAN 2.
After E4 on S3 receives the first JoinEmpty message, S3 creates
dynamic VLAN 2 and adds E4 to VLAN 2. After E4 receives the
second JoinEmpty message, S3 does not take any action because E4
has been added to VLAN 2.
Every time the LeaveAll timer expires or a LeaveAll message is
received, each device restarts the LeaveAll timer, Join timer,
Hold timer, and Leave timer. E1 then repeats step 1 to send
JoinEmpty messages. E3 on S2 sends JoinEmpty messages to S3 in
the same way.

Two-way registration of VLAN attributes
After one-way registration is complete, E1, E2, and E4 are added
to VLAN 2 but E3 is not added to VLAN 2 because only interfaces
receiving a JoinEmpty or JoinIn message can be added to dynamic
VLANs. To transmit traffic of VLAN 2 in both directions, VLAN
registration from S3 to S1 is required. The process is as
follows:
After one-way registration is complete, static VLAN 2 is created
on S3 (the dynamic VLAN is replaced by the static VLAN). E4 on
S3 starts the Join timer and Hold timer. When the Hold timer
expires, E4 sends the first JoinIn message (because it has
registered VLAN 2) to S2. When the Join timer expires, E4
restarts the Hold timer. When the Hold timer expires again, E4
sends the second JoinIn message.
After E3 on S2 receives the first JoinIn message, S2 adds E3 to
VLAN 2 and requests E2 to start the Join timer and Hold timer.
When the Hold timer expires, E2 sends the first JoinIn message
to S1. When the Join timer expires, E2 restarts the Hold timer.
When the Hold timer expires again, E2 sends the second JoinIn
message. After E3 receives the second JoinIn message, S2 does
not take any action because E3 has been added to VLAN 2.
When S1 receives the JoinIn message, it stops sending JoinEmpty
messages to S2. Every time the LeaveAll timer expires or a
LeaveAll message is received, each device restarts the LeaveAll
timer, Join timer, Hold timer, and Leave timer. E1 on S1 sends a
JoinIn message to S2 when the Hold timer expires.
S2 sends a JoinIn message to S3.
After receiving the JoinIn message, S3 does not create dynamic
VLAN 2 because static VLAN 2 has been created.
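
The outcome of one-way and two-way registration on the S1-S2-S3
chain can be sketched with a small propagation model. It ignores
timers and message types and tracks only which ports end up in
VLAN 2. The data structures and the register_from function are
assumptions made for this illustration.

```python
from collections import deque

# Chain topology from the example: S1(E1) -- (E2)S2(E3) -- (E4)S3.
LINKS = {"E1": "E2", "E2": "E1", "E3": "E4", "E4": "E3"}
SWITCH_PORTS = {"S1": ["E1"], "S2": ["E2", "E3"], "S3": ["E4"]}
PORT_TO_SWITCH = {p: s for s, ports in SWITCH_PORTS.items() for p in ports}


def register_from(origin, members=()):
    """Propagate a VLAN declaration from the switch with the static VLAN.

    Returns the set of ports in the VLAN afterwards. Only ports that
    receive a declaration are added on downstream switches.
    """
    members = set(members)
    # The static VLAN includes the origin switch's own GVRP ports.
    members.update(SWITCH_PORTS[origin])
    queue = deque([(origin, None)])
    while queue:
        switch, ingress = queue.popleft()
        for port in SWITCH_PORTS[switch]:
            if port == ingress:
                continue  # never send the declaration back where it came from
            peer = LINKS[port]
            members.add(peer)  # the receiving port registers the VLAN
            queue.append((PORT_TO_SWITCH[peer], peer))
    return members


one_way = register_from("S1")
two_way = register_from("S3", one_way)
```

After the declaration from S1, ports E1, E2, and E4 are in VLAN 2
but E3 is not. Once S3 also declares VLAN 2, E3 joins as well.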
One-way deregistration of VLAN attributes
When VLAN 2 is not required on devices, the devices can
deregister VLAN 2. The process is as follows:
After static VLAN 2 is manually deleted from S1, E1 on
S1 starts the Hold timer. When the Hold timer expires,
E1 sends a LeaveEmpty message to S2. E1 needs to
send only one LeaveEmpty message.
After E2 on S2 receives the LeaveEmpty message, it
starts the Leave timer. When the Leave timer expires,
E2 deregisters VLAN 2. Then E2 is deleted from VLAN
2, but VLAN 2 is not deleted from S2 because E3 is still
in VLAN 2. At this time, S2 requests E3 to start the
Hold timer and Leave timer. When the Hold timer
expires, E3 sends a LeaveIn message to S3. Static
VLAN 2 is not deleted from S3, so E3 can receive the
JoinIn message sent from E4 after the Leave timer
expires. In this case, S1 and S2 can still learn dynamic
VLAN 2.
After S3 receives the LeaveIn message, E4 is not
deleted from VLAN 2 because VLAN 2 is a static VLAN
on S3.
Two-way deregistration of VLAN attributes
To delete VLAN 2 from all devices, two-way deregistration is
required. The process is as follows:
After static VLAN 2 is manually deleted from S3, E4 on
S3 starts the Hold timer. When the Hold timer expires,
E4 sends a LeaveEmpty message to S2.
E3 deregisters VLAN 2. Then E3 is deleted from
dynamic VLAN 2, and dynamic VLAN 2 is deleted from
S2. At this time, S2 requests E2 to start the Hold timer.
When the Hold timer expires, E2 sends a LeaveEmpty
message to S1.
E1 deregisters VLAN 2. Then E1 is deleted from
dynamic VLAN 2, and dynamic VLAN 2 is deleted from
S1.
Manually configured VLANs are called static VLANs, and VLANs
created using GVRP are called dynamic VLANs.
Case description
To enable PC1 and PC2 whose interfaces are isolated in
VLAN 2 to communicate with each other, enable intra-VLAN
proxy ARP on S1.
Command usage
The port-isolate enable command enables port isolation.
The arp-proxy inner-sub-vlan-proxy enable command
enables intra-VLAN proxy ARP.
View
Interface view
Parameters
port-isolate enable [ group group-id ]
group-id: specifies the ID of a port isolation group. The
default value is 1.
Precautions
You can use the display port-isolate command to view the
port isolation group configuration.

Case description
Preemption needs to be enabled to meet requirement 3.
Command usage
The mode command configures the working mode of an Eth-Trunk.
The eth-trunk command adds an interface to an Eth-Trunk.
The load-balance command sets the load balancing mode of an
Eth-Trunk.
The max active-linknumber command sets the upper
threshold for the number of active member links on an
Eth-Trunk.
The lacp priority command sets the LACP system or interface
priority.
The lacp preempt enable command enables priority
preemption in static LACP mode.
Precautions
When adding an interface to an Eth-Trunk, pay attention to the
following points:
An Eth-Trunk contains a maximum of 8 member
interfaces.
A member interface cannot be configured with any
service or static MAC address.
The link type of the member interface added to the
Eth-Trunk must be hybrid.
An Eth-Trunk cannot be nested, that is, its member
interface cannot be an Eth-Trunk.
An Ethernet interface can be added to only one
Eth-Trunk. To add the Ethernet interface to another
Eth-Trunk, delete it from the original Eth-Trunk first.
Member interfaces of an Eth-Trunk must be of the
same type. That is, FE and GE interfaces cannot join
the same Eth-Trunk.
Ethernet interfaces on different LPUs can join the same
Eth-Trunk.
The remote interface directly connected to the local
Eth-Trunk member interface must also be bundled into
an Eth-Trunk; otherwise, the two ends cannot
communicate.
When member interfaces use different rates,
congestion may occur on the low-rate interface,
causing packet loss.
After interfaces are added to an Eth-Trunk, MAC
addresses are learned on the Eth-Trunk but not on the
member interfaces.
When all member interfaces of an Eth-Trunk work in
half-duplex mode, the Eth-Trunk cannot negotiate an
Up state.
Case description
Deploy GVRP to meet requirement 2.
Command usage
The gvrp command enables GVRP globally or on an interface.
Precautions
Before enabling GVRP on an interface, you must set the link
type of the interface to trunk.
The display gvrp vlan-operation command displays the
dynamic VLANs to which an interface is added.

PPP includes three protocols:
Link Control Protocol (LCP): is used to establish, monitor, and
tear down PPP data links. LCP can automatically detect the
link environment, for example, check whether there are loops.
It also negotiates link parameters such as the maximum
packet length and the authentication protocol to be used.
Compared with other data link layer protocols, PPP has an
important feature: it provides an authentication
function. The two ends of a link can negotiate the
authentication protocol to be used and perform
authentication. The link is established only when
authentication succeeds. Due to this feature, PPP is
appropriate for carriers to provide access to distributed users.
Network Control Protocol (NCP): is used to negotiate the
format and type of packets transmitted on data links. For
example, IP Control Protocol (IPCP) and Internetwork Packet
Exchange Control Protocol (IPXCP) are used to control
parameter negotiation of IP and IPX packets respectively.
PPP extensions: provide supporting functions for PPP. For example,
PPP extensions provide the Password Authentication Protocol
(PAP) and Challenge Handshake Authentication Protocol
(CHAP) to ensure network security.
PPP packet format
Flag field
The Flag field identifies the start and end of a physical
frame and is always 0x7E.
Address field
The Address field identifies a peer. Two communicating
devices connected by using PPP do not need to know the
data link layer address of each other because PPP is used
on P2P links. This field must be filled with a broadcast
address of all 1s and is of no significance to PPP.
Control field
The Control field value defaults to 0x03, indicating
an unsequenced frame. By default, PPP does not
use sequence numbers or acknowledgement
mechanisms to ensure transmission reliability.
The Address and Control fields identify a PPP
packet, so the PPP packet header value is FF03.
Protocol field
The Protocol field identifies the datagram
encapsulated in the Information field of a PPP data
packet.
LCP packet format
Code field
The Code field is 1 byte in length and identifies the
LCP packet type.
Identifier field
The Identifier field is 1 byte long. It is used to match
request and response packets. If a device receives a
packet with an invalid Identifier field, the device
discards the packet.
The Identifier of a Configure-Request
packet usually begins with 0x01 and increases by 1
each time a Configure-Request packet is sent. After
a receiver receives a Configure-Request packet, it
must send a response packet with the same
Identifier as that of the received
Configure-Request packet.
Length field
The Length field specifies the total number of bytes
in the LCP packet, including the Code, Identifier,
Length, and Data fields.
The Length field value cannot exceed the maximum
receive unit (MRU) of the link. Bytes outside the
range of the Length field are treated as padding and
are ignored after they are received.
Data field
The Data field carries the negotiation options. Each
option consists of Type, Length, and Data subfields.
The Type field specifies the negotiation option type.
The Length field specifies the total length of the
option, including Type, Length, and Data.
The Data field contains the contents of the
negotiation option.
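The field layout above can be illustrated with a small parser. This is a minimal sketch, not a reference implementation: it assumes the Flag bytes and the frame check sequence have already been stripped, that the Address/Control bytes are not compressed, and it uses the LCP Protocol value 0xC021 and the code numbers from RFC 1661. The function name parse_lcp_frame and the sample frame are illustrative only.

```python
import struct

LCP_PROTOCOL = 0xC021  # PPP Protocol field value that identifies LCP
LCP_CODES = {
    1: "Configure-Request", 2: "Configure-Ack", 3: "Configure-Nak",
    4: "Configure-Reject", 5: "Terminate-Request", 6: "Terminate-Ack",
    7: "Code-Reject", 8: "Protocol-Reject", 9: "Echo-Request",
    10: "Echo-Reply", 11: "Discard-Request",
}


def parse_lcp_frame(frame: bytes) -> dict:
    """Parse Address, Control, Protocol, then the LCP header and options."""
    if len(frame) < 8:
        raise ValueError("frame too short")
    address, control, protocol = struct.unpack("!BBH", frame[:4])
    if (address, control) != (0xFF, 0x03):
        raise ValueError("Address/Control is not FF03")
    if protocol != LCP_PROTOCOL:
        raise ValueError("Protocol field does not indicate LCP")

    code, identifier, length = struct.unpack("!BBH", frame[4:8])
    if length < 4 or 4 + length > len(frame):
        raise ValueError("invalid LCP Length field")
    data = frame[8:4 + length]  # bytes beyond Length are padding and ignored

    options = []
    if code in (1, 2, 3, 4):  # Configure-* packets carry Type/Length/Data options
        offset = 0
        while offset < len(data):
            if len(data) - offset < 2:
                raise ValueError("truncated option header")
            opt_type, opt_len = data[offset], data[offset + 1]
            if opt_len < 2 or offset + opt_len > len(data):
                raise ValueError("malformed option")
            # The option Length counts the Type and Length bytes themselves.
            options.append((opt_type, data[offset + 2:offset + opt_len]))
            offset += opt_len
    return {
        "code": LCP_CODES.get(code, f"Unknown({code})"),
        "identifier": identifier,
        "length": length,
        "options": options,
    }


# Illustrative Configure-Request: MRU option (type 1) and Magic-Number
# option (type 5), followed by two padding bytes.
sample = bytes.fromhex("ff03c0210101000e010405dc0506123456780000")
parsed = parse_lcp_frame(sample)
```

In the sample, the Length field is 14, so the two trailing zero bytes fall outside the LCP packet and are dropped as padding.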
The PPP link establishment process is as follows:
Dead: PPP starts and ends with the Dead phase. After the physical
status of two communicating devices becomes Up (marked as UP in
the figure), PPP enters the Establish phase.
Establish: The two devices negotiate link layer parameters in the
Establish phase. If negotiation of link layer parameters fails (marked as
FAIL in the figure), a PPP connection cannot be established and PPP
returns to the Dead phase. If negotiation of link layer parameters
succeeds (marked as OPENED in the figure), PPP enters the
Authenticate phase.
Authenticate: In the Authenticate phase, the authenticating party
authenticates the authenticated party. If authentication fails (marked as
FAIL in the figure), PPP enters the Terminate phase. If authentication
succeeds (marked as SUCCESS in the figure) or no authentication is
configured, PPP enters the Network phase.
Network: In the Network phase, the two devices use NCP to negotiate
network-layer parameters. If negotiation succeeds, a PPP connection
can be established and data packets can be transmitted over the PPP
connection. When the upper-layer protocol determines that the PPP
connection (for example, an on-demand circuit) should be disconnected
or an administrator manually disconnects the PPP connection, PPP
enters the Terminate phase.
Terminate: In the Terminate phase, the two devices use LCP to
disconnect the PPP connection. After the PPP connection is
disconnected (marked as Down in the figure), PPP enters the Dead
phase.
Note: The working phases of PPP listed in this slide are not the status
of the PPP protocol because PPP is a protocol suite that does not have
a protocol status. Only specific protocols such as LCP and NCP
have a protocol status that can change from one state to another.
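The phase transitions listed above can be written as a lookup table. The sketch below is only an illustration of the description: the event names UP, FAIL, OPENED, SUCCESS, and DOWN follow the figure labels quoted in the text, while NONE (no authentication configured) and CLOSING (upper layer or administrator closes the link) are labels assumed here for transitions the text describes without naming.

```python
# (current phase, event) -> next phase
PPP_PHASE_TRANSITIONS = {
    ("Dead", "UP"): "Establish",
    ("Establish", "FAIL"): "Dead",
    ("Establish", "OPENED"): "Authenticate",
    ("Authenticate", "FAIL"): "Terminate",
    ("Authenticate", "SUCCESS"): "Network",
    ("Authenticate", "NONE"): "Network",      # assumed label: no authentication
    ("Network", "CLOSING"): "Terminate",      # assumed label: link is closed
    ("Terminate", "DOWN"): "Dead",
}


def run_phases(events, phase="Dead"):
    """Apply events in order and return the list of phases visited."""
    visited = [phase]
    for event in events:
        key = (phase, event)
        if key not in PPP_PHASE_TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in phase {phase!r}")
        phase = PPP_PHASE_TRANSITIONS[key]
        visited.append(phase)
    return visited


successful_link = run_phases(["UP", "OPENED", "SUCCESS"])
failed_auth = run_phases(["UP", "OPENED", "FAIL", "DOWN"])
```

Keeping the transitions in one table makes it easy to see that every path eventually returns to the Dead phase.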
LCP uses three types of packets:
1. Link configuration packets, used to establish and configure links:
Configure-Request, Configure-Ack, Configure-Nak, Configure-Reject.
2. Link termination packets, used to end links: Terminate-Request,
Terminate-Ack.
3. Link maintenance packets, used to manage and debug links:
Code-Reject, Protocol-Reject, Echo-Request, Echo-Reply,
Discard-Request.
LCP is used to negotiate the following parameters:
On the Versatile Routing Platform (VRP), the MRU is represented by the
maximum transmission unit (MTU) configured on an interface.
The PPP authentication protocols include PAP and CHAP. Two ends
of a PPP link can use different protocols to authenticate the peer.
However, the authenticated party must support the authentication
protocol used by the authenticating party and have authentication
information such as the user name and password correctly configured.
LCP uses the magic number to detect link loops and other exceptions.
A magic number is a randomly generated number. The two ends must be
unlikely to generate the same magic number.
After a device receives a Configure-Request packet, it compares the
magic number in the received Configure-Request packet with the
locally generated magic number. If they are different, no link loop
exists and the device sends a Configure-Ack packet (if the other
parameters are successfully negotiated) to indicate that negotiation of
the magic number succeeds. If subsequent packets contain the
Magic-Number field, the value of the field is set to the successfully negotiated
magic number and LCP does not generate a new magic number.
If the magic number in the received Configure-Request packet is the
same as the locally generated magic number, the receiver sends a Configure-Nak
packet to the sender, carrying a new magic number. The sender then sends
a new Configure-Request packet carrying a new magic number,
regardless of the magic number carried in the received Configure-Nak
packet. If a link loop exists, this process repeats indefinitely.
If no link loop exists, packet exchange is soon restored.
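The loop check can be sketched in a few lines. This is a simplified model under stated assumptions: only the magic number is compared (other options are ignored), a looped link is modeled as a device receiving its own Configure-Request, and the function names are hypothetical. The value 0 is skipped because RFC 1661 uses a zero Magic-Number to mean that no magic number has been negotiated.

```python
import random


def new_magic(rng, avoid=None):
    """Pick a random non-zero 32-bit magic number different from `avoid`."""
    while True:
        magic = rng.getrandbits(32)
        if magic != 0 and magic != avoid:
            return magic


def reply_to_request(local_magic, received_magic):
    """Different numbers: no loop, acknowledge. Same number: possible loop."""
    if received_magic != local_magic:
        return "Configure-Ack"
    return "Configure-Nak"


def looped_link(rounds, seed=1):
    """On a looped link every request comes back to its own sender."""
    rng = random.Random(seed)
    replies = []
    magic = new_magic(rng)
    for _ in range(rounds):
        # The device receives its own request, so both numbers always match.
        replies.append(reply_to_request(magic, magic))
        # After the Nak, a new magic number is chosen and the request is resent.
        magic = new_magic(rng, avoid=magic)
    return replies
```

On a looped link the Nak never stops, because each new magic number is again compared with itself; between two distinct devices the numbers differ and the first comparison already yields a Configure-Ack.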
Link negotiation success:
As shown in the figure, R1 and R2 are connected over a serial link and run
PPP. When the physical status of the link becomes Up, R1 and R2
use LCP to negotiate link layer parameters. In this example, R1
sends an LCP packet.
R1 sends a Configure-Request packet to R2, carrying link-layer
parameters configured on the sender (R1). The link-layer
parameters use the Type, Length, Value structure.
After receiving the Configure-Request packet, R2 sends a
Configure-Ack packet to R1 if it can identify all the link-layer
parameters in the packet and determines that the value of each
parameter is acceptable.
If R1 does not receive a Configure-Ack packet, it re-transmits a
Configure-Request packet once every 3 seconds. If R1 still does not
receive a Configure-Ack packet after the Configure-Request packet
has been re-transmitted 10 consecutive times, it determines that the
peer is unavailable and stops sending Configure-Request packets.
Note: After the process is complete, R2 determines that the link-layer
parameters configured on R1 are acceptable. R2 also needs to
send Configure-Request packets to R1, so that R1 can determine
whether the link-layer parameters configured on R2 are acceptable.

Link negotiation failure:
After R2 receives a Configure-Request packet from R1, R2 sends a
Configure-Nak packet to R1 if R2 can identify all the link-layer
parameters in the packet, but determines that all or some of the
parameter values are unacceptable, indicating that parameter
negotiation fails.
The Configure-Nak packet contains only the parameters whose
values are unacceptable, and the value of each parameter is changed
to a value or value range that is acceptable on R2.
After receiving the Configure-Nak packet, R1 changes the parameter
values used locally based on the values in the Configure-Nak packet,
and then sends a Configure-Request packet.
If negotiation still fails after the Configure-Request packet is sent
five consecutive times, the parameters are disabled and parameter
negotiation stops.
The link negotiation parameters cannot be identified:
After receiving a Configure-Request packet from R1, R2 sends a
Configure-Reject packet to R1 if R2 cannot identify all or some
link-layer parameters in the packet.
The Configure-Reject packet contains only the parameters that
cannot be identified.
After receiving the Configure-Reject packet, R1 sends a
Configure-Request packet to R2, carrying only parameters that can be identified
by R2.
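The three possible replies to a Configure-Request (Configure-Ack, Configure-Nak, and Configure-Reject) can be summarized as one decision function. The sketch below is a simplification: options are represented as a name-to-value dictionary rather than TLV bytes, and the receiver's policy is a hypothetical mapping from each option it recognizes to a function that returns an acceptable value (returning the requested value unchanged means the value is accepted). Checking for unrecognized options before unacceptable values follows RFC 1661.

```python
def lcp_reply(requested, policy):
    """Return (packet type, options carried in the reply)."""
    # Options the receiver cannot identify are returned in a Configure-Reject.
    unknown = {name: value for name, value in requested.items()
               if name not in policy}
    if unknown:
        return "Configure-Reject", unknown

    # Recognized options with unacceptable values are returned in a
    # Configure-Nak, each replaced by a value the receiver would accept.
    suggestions = {}
    for name, value in requested.items():
        acceptable = policy[name](value)
        if acceptable != value:
            suggestions[name] = acceptable
    if suggestions:
        return "Configure-Nak", suggestions

    # Everything is recognized and acceptable.
    return "Configure-Ack", dict(requested)


# Hypothetical policy on R2: MRU of at most 1500, CHAP authentication only.
r2_policy = {
    "mru": lambda value: min(value, 1500),
    "auth": lambda value: "chap",
}
```

For example, a request for an MRU of 9000 draws a Configure-Nak that suggests 1500, while a request containing an option that is absent from the policy draws a Configure-Reject listing only that option.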
The link state detection process is as follows:
After a connection is set up using LCP, Echo-Request and
Echo-Reply packets can be used to detect the link status. If a device
replies with an Echo-Reply packet each time it receives an
Echo-Request packet, the link status is normal.
By default, the VRP platform sends an Echo-Request packet once
every 10 seconds.
The process of tearing down a connection is as follows:
LCP can tear down an existing connection if the authentication fails or
an administrator manually shuts down the connection.
LCP uses Terminate-Request and Terminate-Ack packets to
disconnect a connection. The Terminate-Request packet is used to
request the peer to disconnect the connection. After receiving a
Terminate-Request packet, the device replies with a Terminate-Ack packet
to confirm that the connection is to be disconnected.
If a device fails to receive a Terminate-Ack packet, it re-transmits a
Terminate-Request packet once every 3 seconds. If the device still
does not receive a Terminate-Ack packet after sending the
Terminate-Request packet twice consecutively, it determines that the peer is
unavailable, and then disconnects the connection.

A PAP packet is encapsulated in the PPP packet directly.

The PAP authentication process is as follows:
The authenticated party sends an Authenticate-Request
packet carrying the user name and password in plaintext to
the authenticating party. In this example, the user name
and password are huawei and hello.
After receiving the user name and password from the
authenticated party, the authenticating party compares the
user name and password with those configured locally to
check whether they are correct. If the user name and
password are correct, the authenticating party returns an
Authenticate-Ack packet, indicating that the authentication
succeeds. If the user name and password are incorrect, the
authenticating party returns an Authenticate-Nak packet,
indicating that the authentication fails.
The Message Digest 5 (MD5) hash algorithm is applied to the
concatenation of Identifier + password + challenge to produce a
16-byte digest. The authenticated party adds the
calculated 16-byte digest to the Data field of the Response
packet and sends the packet to the authenticating party.
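The calculation can be reproduced with the standard library. This is a minimal sketch, assuming the Identifier is a single byte and the password and challenge are raw byte strings, as in RFC 1994; the function names and the sample values are illustrative and are not taken from the slides.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, password: bytes, challenge: bytes) -> bytes:
    """MD5 over Identifier + password + challenge, giving a 16-byte digest."""
    return hashlib.md5(bytes([identifier]) + password + challenge).digest()


def chap_verify(identifier: int, stored_password: bytes,
                challenge: bytes, response: bytes) -> bool:
    """The authenticating party recomputes the digest and compares."""
    expected = chap_response(identifier, stored_password, challenge)
    return hmac.compare_digest(expected, response)


# Illustrative exchange: the challenge is a random value chosen by the
# authenticating party; the password itself never crosses the link.
challenge = os.urandom(16)
response = chap_response(1, b"hello", challenge)
accepted = chap_verify(1, b"hello", challenge, response)
rejected = chap_verify(1, b"wrong", challenge, response)
```

Because the challenge changes for every authentication, a captured Response cannot simply be replayed later.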
CHAP is a three-way handshake authentication protocol. The Challenge
packet and Response packet exchanged between two communicating
devices during one CHAP process contain the same Identifier.
Unidirectional CHAP authentication is applicable to two scenarios: the
authenticating party is configured with a user name, and the
authenticating party is not configured with a user name. It is
recommended that the authenticating party be configured with a user
name.
When the authenticating party is configured with a user name (that is,
the ppp chap user username command is configured on the interface):
The authenticating party initiates an authentication request
by sending a Challenge packet that carries the local user
name to the authenticated party.
After receiving the Challenge packet on an interface, the
authenticated party checks whether the ppp chap password
command is used on the interface. If this command is used,
the authenticated party uses MD5 to hash the
concatenation of the Identifier, the password configured by the ppp
chap password command, and the random number (challenge). The
authenticated party then sends a Response packet carrying
the calculated hash value and local user name to
the authenticating party. If the ppp chap password
command is not configured, the authenticated party
searches the local user table for the password matching
the user name of the authenticating party in the received
Challenge packet, and hashes the matching password by
using MD5 in the same way. The authenticated party sends
a Response packet carrying the calculated hash value
and local user name to the authenticating party.
The authenticating party performs the same MD5 calculation
with the locally saved password of the authenticated party. The
authenticating party then compares the generated
hash value with that carried in the received
Response packet, and returns a response based on the
check result.
When the authenticating party is not configured with a user name
(that is, the ppp chap user username command is not configured on
the interface):
The authenticating party initiates an authentication
request by sending a Challenge packet.
After receiving the Challenge packet, the
authenticated party uses MD5 to hash the
concatenation of the Identifier, the password configured by
the ppp chap password command, and the random
number (challenge). It then sends a Response packet carrying
the hash value and local user name to the
authenticating party.
The authenticating party performs the same MD5 calculation
with the locally saved password of the authenticated party.
The authenticating party then compares the
generated hash value with that carried in
the received Response packet, and returns a
response based on the check result.

IPCP negotiates IP addresses of two devices to transmit IP packets
over PPP links.
IPCP and LCP have the same negotiation mechanism, packet types,
and working process.
Topology
Configure two IP addresses 12.1.1.1/24 and 12.1.1.2/24 for the two
ends. (IPCP can be used to negotiate IP addresses even if they are
not on the same network segment.)
The static IP address negotiation process is as follows:
R1 and R2 send a Configure-Request packet carrying the
local IP address to each other.
After receiving the Configure-Request packet from the peer,
R1 and R2 check the IP address in the packet. If the IP
address is a valid unicast IP address and is different from
the locally configured IP address, R1/R2 determines that the
peer can use this address and returns a Configure-Ack
packet.
IPCP uses Configure-Request and Configure-Ack packets
to allow the two ends of a PPP link to discover each other's
32-bit IP address.
As shown in the figure, R1 requests the peer to allocate an IP address
for it and R2 is configured with a static IP address 12.1.1.2/24. R2 is
enabled to allocate an IP address 12.1.1.1 to R1.
The dynamic IP address negotiation process is as follows:
R1 sends a Configure-Request packet carrying the IP address 0.0.0.0
to R2, requesting R2 to allocate an IP address for it.
After receiving the Configure-Request packet, R2 determines that the
IP address 0.0.0.0 is invalid and returns a Configure-Nak packet
carrying a new IP address 12.1.1.1 to R1.
After receiving the Configure-Nak packet, R1 updates the local IP
address, and then sends a Configure-Request packet carrying the new
IP address 12.1.1.1 to R2.
After receiving the Configure-Request packet, R2 determines that the
IP address 12.1.1.1 is valid, and returns a Configure-Ack packet to R1.
In addition, R2 also sends a Configure-Request packet carrying the
IP address 12.1.1.2 to R1. R1 determines that the IP address 12.1.1.2
is valid, and returns a Configure-Ack packet to R2.
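The R1-to-R2 direction of this exchange can be simulated as follows. It is a sketch under simplifying assumptions: packets are reduced to an IP address string and a reply type, the validity check only rejects the unspecified address, multicast addresses, and the server's own address, and the function names server_reply and negotiate are hypothetical.

```python
import ipaddress


def server_reply(requested: str, server_ip: str, assign_ip: str):
    """R2's reply to a Configure-Request carrying `requested`."""
    addr = ipaddress.ip_address(requested)
    valid = not (addr.is_unspecified or addr.is_multicast)
    if valid and requested != server_ip:
        return "Configure-Ack", requested
    # Invalid address (for example 0.0.0.0): suggest the address to assign.
    return "Configure-Nak", assign_ip


def negotiate(client_ip: str, server_ip: str, assign_ip: str, max_rounds=5):
    """R1 keeps requesting, adopting the address from each Configure-Nak."""
    log = []
    current = client_ip
    for _ in range(max_rounds):
        reply, value = server_reply(current, server_ip, assign_ip)
        log.append((current, reply))
        if reply == "Configure-Ack":
            return current, log
        current = value  # update the local address from the Configure-Nak
    raise RuntimeError("IPCP negotiation did not converge")


final_ip, exchange = negotiate("0.0.0.0", "12.1.1.2", "12.1.1.1")
```

With the addresses from the figure, the exchange takes two rounds: the request for 0.0.0.0 is answered with a Configure-Nak, and the request for 12.1.1.1 is answered with a Configure-Ack.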

Multilink PPP fragments a packet and sends the fragments to the same
destination over multiple PPP links.
PPPoE overview
PPPoE allows a large number of hosts on an Ethernet to
connect to the Internet using a remote access device and
controls each host using PPP. PPPoE features a large
application scale, high security, and convenient accounting.
Topology
A PPPoE session is set up between each PC and the
router on the carrier network. Each PC functions as a
PPPoE client and has a unique account, which facilitates
user accounting and control by the carrier. The PPPoE
client software must be installed on the PCs.
The PPPoE session establishment process includes three stages:
Discovery, Session, and Terminate.
Discovery stage:
A PPPoE client broadcasts a PPPoE Active Discovery
Initiation (PADI) packet that contains service information
required by the PPPoE client.
After receiving the PADI packet, all PPPoE servers
compare the requested service with the services they can
provide. The PPPoE servers that can provide the
requested service unicast PPPoE Active Discovery Offer
(PADO) packets to the PPPoE client.
Based on the network topology, the PPPoE client may
receive PADO packets from more than one PPPoE server.
The PPPoE client selects the PPPoE server from which the
first PADO packet is received and unicasts a PPPoE Active
Discovery Request (PADR) packet to the PPPoE server.
The PPPoE server generates a unique session ID to
identify the PPPoE session with the PPPoE client. The
PPPoE server sends a PPPoE Active Discovery
Session-confirmation (PADS) packet containing this session ID to
the PPPoE client. When the PPPoE session is established,
the PPPoE server and PPPoE client enter the PPPoE
Session stage.
When the PPPoE session is established, the PPPoE server
and PPPoE client share the unique PPPoE session ID and
have learned each other's Ethernet address.
Session stage:
PPP negotiation at the PPPoE Session stage is the same
as common PPP negotiation.
When PPP negotiation succeeds, PPP data packets can be
forwarded.
At the PPPoE Session stage, the PPPoE server and client
send all Ethernet data packets in unicast mode.
Terminate stage:
After a PPPoE session is established, the PPPoE client or
the PPPoE server can unicast a PPPoE Active Discovery
Terminate (PADT) packet to terminate
the PPPoE session at any time. When a PADT packet is
received, no further PPP traffic can be sent using this
session.
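The Discovery exchange can be traced with a short script. This is an illustrative sketch only: the code values and the 6-byte header layout (version/type 0x11, code, session ID, payload length) are taken from RFC 2516, tags such as Service-Name are omitted, and the MAC addresses, session ID, and function names are made up for the example.

```python
import struct

PPPOE_CODES = {"PADI": 0x09, "PADO": 0x07, "PADR": 0x19,
               "PADS": 0x65, "PADT": 0xA7}
BROADCAST = "ff:ff:ff:ff:ff:ff"


def pppoe_header(packet_type: str, session_id: int = 0,
                 payload_len: int = 0) -> bytes:
    """Version 1 / Type 1, code, session ID, and payload length."""
    return struct.pack("!BBHH", 0x11, PPPOE_CODES[packet_type],
                       session_id, payload_len)


def discovery(client: str, pado_arrival_order: list, session_id: int):
    """Return the chosen server and a (type, source, destination, session) trace."""
    trace = [("PADI", client, BROADCAST, 0)]        # broadcast by the client
    for server in pado_arrival_order:
        trace.append(("PADO", server, client, 0))   # unicast offers
    chosen = pado_arrival_order[0]                  # first PADO received wins
    trace.append(("PADR", client, chosen, 0))
    trace.append(("PADS", chosen, client, session_id))
    return chosen, trace


server, trace = discovery("00:00:00:00:00:01",
                          ["00:00:00:00:00:a2", "00:00:00:00:00:a1"],
                          session_id=1)
```

The session ID stays 0 until the PADS packet, which is why only the last entry in the trace carries the allocated value.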
Four types of FR interfaces are available:
A user's device is called a DTE, and the corresponding
interface type is DTE.
A network device that provides access services for DTE
devices is called a DCE, and the corresponding interface
type is DCE or NNI.
A UNI interface interconnects the DTE and DCE.
An NNI interface interconnects two FR switches.
A Virtual Circuit (VC) is a logical circuit established between two
network devices on the same network.
Based on establishment mode, VCs are classified into two
types:
PVC: refers to the manually created VC.
SVC: refers to the VC that can be created or deleted
automatically through negotiation.
The PVC status of the DTE is determined by the DCE. The
PVC status of the DCE is determined by the network.
VCs are identified by the DLCI. A DLCI takes effect only on a local
interface and its directly connected interface. On an FR network, the
same DLCI can therefore be used on different physical
interfaces to identify different VCs.
LMI: local management interface used to monitor the PVC status.
The system supports three LMI protocols: ITU-T Q.933
Annex A, ANSI T1.617 Annex D, and a non-standard
compatible protocol. The non-standard compatible protocol
is used for interconnection with a device from another
vendor.
The PVC status of the DTE is determined by the DCE. The
PVC status of the DCE is determined by the network.
When two network devices are directly connected, the PVC
status of the DCE is set by the device administrator.
The LMI negotiation process is as follows:
The DTE periodically sends Status Enquiry messages.
After receiving the Status Enquiry message, the DCE
replies with a Status message.
The DTE parses the received Status message to obtain the
link status and PVC status.
When the DTE and DCE can normally send and receive
LMI negotiation messages, the link protocol status changes
to Up, and the PVC status changes to Active.
The FR LMI negotiation succeeds.
After the FR LMI negotiation succeeds and the PVC status changes to
Active, two devices on a PVC start the InARP negotiation process:
If a protocol address is configured on the local interface,
the local device (for example, R1) sends an Inverse ARP
Request packet to the peer device (for example, R2) over
the VC. The Inverse ARP Request packet carries the
protocol address of R1.
After receiving the Inverse ARP Request packet, R2
obtains the protocol address of R1, generates an address
mapping, and sends an Inverse ARP Response packet to
R1.
After receiving the Inverse ARP Response packet, R1
parses the address of R2 in the packet and generates an
address mapping.
R1 generates the address mapping 12.1.1.2 to 100, while
R2 generates the address mapping 12.1.1.1 to 100.
If a static mapping is configured manually or a dynamic mapping is
created, the local device does not send an InARP Request packet to
the remote device over the VC regardless of whether the remote
address in the address mapping is correct. The local device sends an
InARP Request packet to the remote device only when no mapping
exists.
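The resulting mappings can be modeled with two small objects. This is a toy sketch, assuming one VC per endpoint and ignoring packet encoding and timers; the class FrEndpoint and the function inarp are hypothetical names, and the addresses and DLCI 100 are the ones used in the example above.

```python
from dataclasses import dataclass, field


@dataclass
class FrEndpoint:
    name: str
    ip: str
    dlci: int
    mappings: dict = field(default_factory=dict)  # remote IP -> local DLCI


def inarp(local: FrEndpoint, remote: FrEndpoint) -> bool:
    """Run one InARP exchange; return True if a Request was sent."""
    if local.dlci in local.mappings.values():
        # A static or dynamic mapping already exists for this VC,
        # so no InARP Request is sent.
        return False
    # The Request carries the local protocol address; the remote device
    # records it against its own DLCI and answers with a Response.
    remote.mappings[local.ip] = remote.dlci
    # The Response carries the remote protocol address.
    local.mappings[remote.ip] = local.dlci
    return True


r1 = FrEndpoint("R1", "12.1.1.1", 100)
r2 = FrEndpoint("R2", "12.1.1.2", 100)
first_attempt = inarp(r1, r2)
second_attempt = inarp(r1, r2)
```

The second call sends nothing because R1 already holds a mapping for DLCI 100, which matches the rule that a Request is sent only when no mapping exists.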
Sub-interfaces can solve the problem caused by split horizon on an FR
network. One physical interface can contain multiple logical
sub-interfaces. Each sub-interface can connect to a remote router over one
or multiple DLCIs. The routers are connected over the FR network.
You can define logical sub-interfaces on the serial line.
Every sub-interface uses one or multiple DLCIs to connect
to the remote router. After a DLCI is configured on a
sub-interface, the mapping between the destination protocol
address and this DLCI needs to be created.
As shown in the figure, R4 has only one physical serial
interface S0; however, DLCIs are defined on S0 to connect
the sub-interfaces S0.1, S0.2, and S0.3 to R1, R2, and R3
respectively.
Two types of sub-interfaces are available:
P2P sub-interface: used to connect to a single remote
device. Each P2P sub-interface can be configured with only
one PVC. In this case, the remote device can be
determined uniquely without the static address mapping.
Therefore, when the PVC is configured for the
sub-interface, the peer address is identified.
P2MP sub-interface: used to connect to multiple remote
devices. Each sub-interface can be configured with multiple
PVCs. Each PVC maps the protocol address of its
connected remote device. In this way, different PVCs can
reach different remote devices. You can manually configure
the address mapping, or use InARP to dynamically create
the address mapping.
Case description
The NCP protocol can be used to allocate an IP address to the peer.
You need to configure the ppp chap user Huawei command on R1's
interface to enable R1 to send a Challenge packet to R2 carrying the
user name Huawei.
Command usage
ppp authentication-mode: Configures the PPP authentication mode
in which the local device authenticates the remote device.
ppp chap user: Configures a user name for CHAP authentication.
ppp chap password: Configures a password for CHAP
authentication.
ip address ppp-negotiate: Configures IP address negotiation on an
interface to allow the interface to obtain an IP address from the remote
device.
remote address: Configures the local device to assign an IP address
or specify an IP address pool for the remote device.
Usage scenario
Interface view
Parameters
ppp authentication-mode { chap | pap }
chap: Indicates the CHAP authentication mode.
pap: Indicates the PAP authentication mode.
ppp chap user username
username: Specifies a user name for CHAP authentication.
ppp chap password { cipher | simple } password
cipher: Indicates a ciphertext password.
simple: Indicates a plaintext password.
password: Specifies the password for CHAP authentication.

remote address { ip-address | pool pool-name }
Le
cipher: Indicates a ciphertext password.

Simple: Indicates a plaintext password.
Password: Specifies the password for CHAP authentication.
re
Mo
en
Precautions
In CHAP authentication, the authenticated party does not send the
m/
password to the authenticating party.
The local device can use IPCP to learn the 32-bit host address from
co
the remote
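Why the password never crosses the link in CHAP can be illustrated with a minimal sketch. It assumes the MD5 response calculation defined in RFC 1994 (hash of identifier, shared secret, and challenge); the function names and the secret value are hypothetical.

```python
import hashlib
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """RFC 1994 style response: MD5(identifier || secret || challenge)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def authenticator_accepts(identifier, secret, challenge, response) -> bool:
    # The authenticator recomputes the hash with its own copy of the secret.
    return chap_response(identifier, secret, challenge) == response


# The authenticating party sends a random challenge ...
identifier = 1
challenge = os.urandom(16)
# ... and the authenticated party replies with a hash, not the password.
response = chap_response(identifier, b"example-secret", challenge)
```

Only the challenge and the 16-byte hash are exchanged; both sides must already hold the same secret for the comparison to succeed.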
Command usage
interface mp-group: Creates an MP-Group interface and enters the MP-Group interface view.
ppp mp mp-group: Binds an interface to the MP-Group interface so that the interface works in MP mode.
restart: Restarts the current interface.
Precautions
Data frames will be lost after you disable the interface. Exercise caution when you use the restart command.

Case description
You need to get familiar with the configurations of the PPPoE server and PPPoE client in this case.

Command usage
virtual-template: Creates a VT interface and enters the VT interface view.
pppoe-server bind virtual-template: Binds a specified VT interface to an Ethernet interface and enables PPPoE on the Ethernet interface.
remote address: Configures the local device to assign an IP address or specifies an IP address pool for the remote device.
dialer-rule: Enters the dialer rule view.
dialer-rule: Specifies a dialer ACL for a dialer access group and defines conditions to initiate calls.
interface dialer: Creates a dialer interface and enters the dialer interface view.
dialer user: Enables the resource-shared DCC and specifies the remote user name of the dialer interface.
dialer-group: Adds an interface to a dialer access group. That is, the number of the dialer rule is specified.
dialer bundle: Specifies a dialer bundle for a dialer interface in the resource-shared DCC.
pppoe-client dial-bundle-number: Specifies a dialer bundle for a PPPoE session.
Parameters
remote address { ip-address | pool pool-name }
ip-address: Specifies an IP address to be allocated to the remote device.
pool pool-name: Specifies the name of the IP address pool from which an IP address is allocated to the remote device.
dialer-rule dialer-rule-number { acl { acl-number | name acl-name } | ip { deny | permit } | ipv6 { deny | permit } }
dialer-rule-number: Specifies the number of a dialer access group. The number is the same as the value of group-number in the dialer-group command.
acl { acl-number | name acl-name }: Indicates the number or name of the dialer ACL.
ip { deny | permit }: Indicates whether the dialer ACL allows or forbids IPv4 packets.
Precautions
To configure the local device to allocate an IP address to the remote device, run the ppp ipcp remote-address forced command in the interface view.
Case description
On an FR network, you do not need to manually configure the mapping relationship for a P2P sub-interface.

Precautions
You do not need to manually configure the mapping relationship for a P2P sub-interface, regardless of whether InARP is enabled or disabled on it.
Broadcast storm
Assume that STP is not enabled on the switching devices. If PC1 broadcasts a request, both S1 and S2 receive it on port1 and forward it from port2. Each switch then receives on port2 the copy forwarded by the other switch and forwards it from port1. This transmission repeats indefinitely and exhausts resources on the entire network, causing the network to break down.
MAC address table flapping
Port2 on S1 can learn the MAC address of PC2. Because S2 forwards data frames sent by PC2 to its other ports, S1 may also learn the MAC address of PC2 on port1. S1 therefore continuously modifies its MAC address table, causing MAC address table flapping.
STP
STP can eliminate network loops. STP is used to build a loop-free network (tree) to ensure a unique data transmission path and prevent infinite looping of packets. STP works at the data link layer of the OSI model.
STP-capable switches exchange BPDUs and perform distributed calculation to determine which ports need to be blocked to prevent loops.
Root bridge
The root bridge is the bridge with the smallest BID, which is composed of the priority and MAC address.
Root port
The root port is the port with the smallest root path cost to the root bridge, and is responsible for forwarding data to the root bridge.
The root port is determined based on the path cost. Among all STP-capable ports on a network bridge, the port with the smallest root path cost is the root port. There is only one root port on an STP-capable device, but there is no root port on the root bridge.
Designated port and bridge
The bridge closest to the root bridge on each network segment is used as the designated bridge. The port on the designated bridge that connects to the network segment is called the designated port.
The designated port is responsible for forwarding traffic, and the designated bridge is responsible for forwarding configuration BPDUs.
After the root bridge, root port, and designated port are selected successfully, the entire tree topology is set up. When the topology is stable, only the root port and the designated port forward traffic. All the other ports are in Blocking state; they receive STP BPDUs but do not forward user traffic.
A configuration BPDU is generated in one of the three following scenarios:
When ports are enabled with STP, the designated ports send configuration BPDUs at intervals specified by the Hello timer.
When a root port receives configuration BPDUs, the device where the root port resides sends a copy of the configuration BPDUs to its designated port.
When receiving a configuration BPDU with a lower priority, the designated port immediately sends its own configuration BPDUs to the downstream device.
Root identifier
The root identifier is composed of the priority and MAC address of the root bridge. The default priority is 32768.
Root path cost
Cumulative cost of all links to the root bridge.
Bridge Identifier (BID)
BID of the device sending configuration BPDUs. On a LAN, the BID is the ID of the designated bridge.
Port Identifier (PID)
PID of the port sending configuration BPDUs. The PID consists of the port priority and port number. On a LAN, the PID is the ID of the designated port.
Hello Time
The Hello timer specifies the interval at which an STP-capable device sends configuration BPDUs to detect link faults.
Once the network topology is stable, a change to this interval takes effect only after a new root bridge takes over.
After the topology changes, TCN BPDUs are sent. The Hello interval does not apply to the transmission of TCN BPDUs.
The default value is 2 seconds.
Max Age
After a non-root bridge running STP receives a configuration BPDU, the non-root bridge compares the Message Age value with the Max Age value in the received configuration BPDU.
If the Message Age value is smaller than or equal to the Max Age value, the non-root bridge forwards the configuration BPDU.
If the Message Age value is larger than the Max Age value, the configuration BPDU ages and the non-root bridge directly discards it. In this case, the network size is considered too large and the non-root bridge disconnects from the root bridge.
In practice, each time a configuration BPDU passes through a bridge, the value of Message Age increases by 1.
The default value is 20 seconds.
Forward Delay
The Forward Delay timer specifies the delay for interface status transition. The default value is 15 seconds.
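The Max Age check described above can be sketched in a few lines. This is a simplified model, assuming Message Age is counted in whole hops as the text describes; the function name is illustrative.

```python
def relay_message_age(message_age: int, max_age: int = 20):
    """Return the Message Age to advertise downstream, or None if aged out.

    A non-root bridge forwards the configuration BPDU only while
    Message Age <= Max Age, and each bridge adds 1 to Message Age.
    """
    if message_age > max_age:
        return None  # the BPDU has aged and is discarded
    return message_age + 1


# Follow a BPDU from the root (Message Age 0) across successive bridges.
hops = 0
age = 0
while age is not None:
    age = relay_message_age(age)
    if age is not None:
        hops += 1
```

With the default Max Age of 20, the loop stops once Message Age exceeds 20, which is why an overly large network loses contact with the root bridge.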

STP Topology Calculation
After all devices on the network are enabled with STP, each device considers itself as the root bridge. Each device only transmits and receives BPDUs but does not forward user traffic. All ports are in Listening state. After exchanging configuration BPDUs, all devices participate in the selection of the root bridge, root port, and designated port.
During network initialization, every device considers itself as the root bridge and sets the root bridge ID as the device ID. Devices exchange configuration BPDUs to compare the root bridge IDs. The device with the smallest BID is elected as the root bridge.
The switch priority is configurable. The value ranges from 0 to 65535. The default priority is 32768.
Assume that the priorities of S1 and S2 are 0 and 1. Port A on S1 connects to Port B on S2. S1 sends the configuration BPDU of {0, 0, 0, Port A} and S2 sends the configuration BPDU of {1, 0, 1, Port B}. After the two switches compare the configuration BPDUs, S1 is deemed to have a higher priority than S2, so S1 becomes the root bridge.
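Root bridge election reduces to picking the smallest BID. A minimal sketch, assuming the BID is modeled as a (priority, MAC address) pair compared field by field; the MAC addresses below are made up for illustration.

```python
from typing import NamedTuple


class BridgeId(NamedTuple):
    priority: int
    mac: int  # MAC address as an integer, used only as a tie-breaker


def elect_root(bids):
    """The bridge with the numerically smallest BID becomes the root."""
    return min(bids)


s1 = BridgeId(0, 0x00E0FC000001)
s2 = BridgeId(1, 0x00E0FC000002)
# Equal priorities: the smaller MAC address decides.
s3 = BridgeId(32768, 0x00E0FC000003)
s4 = BridgeId(32768, 0x00E0FC000004)
```

Here `elect_root([s1, s2])` selects S1 because of its lower priority value, and `elect_root([s3, s4])` falls back to the MAC address comparison.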

Priorities of S1, S2, and S3 are 0, 1, and 2, and the path costs between S1 and S2, between S1 and S3, and between S2 and S3 are 5, 10, and 4 respectively.
Initial configuration BPDUs on ports of S1, S2, and S3:
S1: {0, 0, 0, Port A1} on Port A1 and {0, 0, 0, Port A2} on Port A2
S2: {1, 0, 1, Port B1} on Port B1 and {1, 0, 1, Port B2} on Port B2
S3: {2, 0, 2, Port C1} on Port C1 and {2, 0, 2, Port C2} on Port C2
First exchange of configuration BPDUs
Ports on S1, S2, and S3 send their configuration BPDUs. Each network bridge considers itself as the root bridge, so the RPC is 0.
Comparison for the first exchange of configuration BPDUs
S1
Port A1 receives the configuration BPDU {1, 0, 1, Port B1} from Port B1 and finds that its configuration BPDU {0, 0, 0, Port A1} has a higher priority than the configuration BPDU {1, 0, 1, Port B1}, so Port A1 discards the configuration BPDU {1, 0, 1, Port B1}.
Port A2 receives the configuration BPDU {2, 0, 2, Port C1} from Port C1 and finds that its configuration BPDU {0, 0, 0, Port A2} has a higher priority than the configuration BPDU {2, 0, 2, Port C1}, so Port A2 discards the configuration BPDU {2, 0, 2, Port C1}.
After finding that both the root and the designated switch IDs refer to itself in the configuration BPDU on each port, S1 considers itself as the root bridge. S1 then sends configuration BPDUs from each port periodically without modifying the configuration BPDUs.
The configuration BPDU {0, 0, 0, Port A1} on Port A1 and configuration BPDU {0, 0, 0, Port A2} on Port A2 are optimal.
Because S1 is the root bridge, all ports on S1 are designated ports.
S2
Port B1 receives the configuration BPDU {0, 0, 0, Port A1} from Port A1 and finds that the received configuration BPDU {0, 0, 0, Port A1} has a higher priority than its configuration BPDU {1, 0, 1, Port B1}, so Port B1 updates its configuration BPDU.
Port B2 receives the configuration BPDU {2, 0, 2, Port C2} from Port C2 and finds that its configuration BPDU {1, 0, 1, Port B2} has a higher priority than the configuration BPDU {2, 0, 2, Port C2}, so Port B2 discards the configuration BPDU {2, 0, 2, Port C2}.
The configuration BPDU {0, 0, 0, Port A1} on Port B1 and the configuration BPDU {1, 0, 1, Port B2} on Port B2 are optimal.
Comparison of configuration BPDUs on ports:
S2 compares the configuration BPDU on each port and finds that the configuration BPDU on Port B1 has the highest priority, so Port B1 is used as the root port and the configuration BPDU on Port B1 remains unchanged.
S2 calculates the configuration BPDU {0, 5, 1, Port B2} for Port B2 based on the configuration BPDU and path cost of the root port, and compares the configuration BPDU {0, 5, 1, Port B2} with its configuration BPDU {1, 0, 1, Port B2} on Port B2. S2 finds that the calculated configuration BPDU has a higher priority, so Port B2 is used as the designated port, its configuration BPDU is replaced by the calculated configuration BPDU, and the calculated configuration BPDU is sent periodically.
S3
Port C1 receives the configuration BPDU {0, 0, 0, Port A2} from Port A2 and finds that the configuration BPDU {0, 0, 0, Port A2} has a higher priority than its configuration BPDU {2, 0, 2, Port C1}, so Port C1 updates its configuration BPDU.
Port C2 receives the configuration BPDU {1, 0, 1, Port B2} from Port B2 and finds that the configuration BPDU {1, 0, 1, Port B2} has a higher priority than its configuration BPDU {2, 0, 2, Port C2}, so Port C2 updates its configuration BPDU.
The configuration BPDU {0, 0, 0, Port A2} on Port C1 and configuration BPDU {1, 0, 1, Port B2} on Port C2 are optimal.
Comparison of configuration BPDUs on ports:
S3 compares the configuration BPDU on each port and finds that the configuration BPDU on Port C1 has the highest priority, so Port C1 is used as the root port and the configuration BPDU on Port C1 remains unchanged.
S3 calculates the configuration BPDU {0, 10, 2, Port C2} for Port C2 based on the configuration BPDU and path cost of the root port, and compares the configuration BPDU {0, 10, 2, Port C2} with its configuration BPDU {1, 0, 1, Port B2} on Port C2. S3 finds that the calculated configuration BPDU has a higher priority, so Port C2 is used as the designated port and its configuration BPDU is replaced by the calculated configuration BPDU.
Second exchange of configuration BPDUs
S1 is the root bridge.
Configuration BPDUs sent by S1
The configuration BPDU sent by Port A1 is {0, 0, 0, Port A1}.
The configuration BPDU sent by Port A2 is {0, 0, 0, Port A2}.
Configuration BPDUs sent by S2
S1 is the root bridge, so S2 does not send configuration BPDUs to S1.
The configuration BPDU sent by Port B2 is {0, 5, 1, Port B2}.
Configuration BPDUs sent by S3
S1 is the root bridge, so S3 does not send configuration BPDUs to S1.
The configuration BPDU sent by Port C2 is {0, 10, 2, Port C2}.
Comparison for the second exchange of configuration BPDUs
S2
Port B1 receives the configuration BPDU {0, 0, 0, Port A1} from Port A1 and finds that the received configuration BPDU is the same as its own configuration BPDU, so Port B1 discards the received one.
Port B2 receives the configuration BPDU {0, 10, 2, Port C2} from Port C2 and finds that its configuration BPDU {0, 5, 1, Port B2} has a higher priority, so Port B2 discards it.
After comparison, the optimal configuration BPDUs on Port B1 and Port B2 are {0, 0, 0, Port A1} and {0, 5, 1, Port B2} respectively.
Because the optimal configuration BPDU on each port remains unchanged, the port role does not change.
S3
Port C1 receives the configuration BPDU {0, 0, 0, Port A2} from S1 and finds that the received configuration BPDU is the same as its own configuration BPDU, so Port C1 discards the received one.
Port C2 receives the configuration BPDU {0, 5, 1, Port B2} from S2 and compares it with its configuration BPDU {0, 10, 2, Port C2}.
Because the root bridge ID is the same, the root path costs are compared. Port C2 finds that the received configuration BPDU has a higher priority (its root path cost of 5 is smaller than 10), so Port C2 updates its BPDU to {0, 5, 1, Port B2}.
After comparison, the optimal configuration BPDUs on Port C1 and Port C2 are {0, 0, 0, Port A2} and {0, 5, 1, Port B2} respectively.
Comparison of configuration BPDUs on each port:
S3 compares the root path cost of Port C1 (root path cost of 0 in the received configuration BPDU + path cost 10 of the link) with the root path cost of Port C2 (root path cost of 5 in the received configuration BPDU + path cost 4 of the link). The root path cost of Port C2 is smaller (9 < 10), so the configuration BPDU of Port C2 is preferred. Port C2 is used as the root port and its configuration BPDU remains unchanged.
S3 calculates the configuration BPDU {0, 9, 2, Port C1} for Port C1 according to the configuration BPDU and path cost of the root port, and compares the calculated configuration BPDU with the configuration BPDU {0, 0, 0, Port A2} stored on Port C1. S3 finds that the stored configuration BPDU has a higher priority, so Port C1 is blocked and its configuration BPDU remains unchanged. Port C1 then does not forward data until a new spanning tree calculation is triggered, for example, when the link between S2 and S3 goes Down.
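The second-round decision on S3 can be reproduced with a short sketch. It assumes each configuration BPDU is modeled as a tuple (root ID, root path cost, BID, PID) in which a smaller tuple means a higher priority; the function names are illustrative and not part of any device software.

```python
# A BPDU is modeled as (root_id, root_path_cost, bridge_id, port_id).
# Smaller tuples have higher priority, as in the comparisons above.

def root_port(received):
    """Pick the root port from {port: (bpdu, link_cost)}.

    The cost of the receiving link is added to the advertised
    root path cost before the candidates are compared.
    """
    def candidate(item):
        port, (bpdu, link_cost) = item
        root_id, rpc, bridge_id, port_id = bpdu
        return (root_id, rpc + link_cost, bridge_id, port_id, port)
    return min(received.items(), key=candidate)[0]


def designated_bpdu(root_port_bpdu, link_cost, own_id, port):
    """BPDU a bridge would advertise on one of its non-root ports."""
    root_id, rpc, _, _ = root_port_bpdu
    return (root_id, rpc + link_cost, own_id, port)


# BPDUs stored on S3 after the second exchange, with the link costs.
s3_received = {
    "Port C1": ((0, 0, 0, "Port A2"), 10),
    "Port C2": ((0, 5, 1, "Port B2"), 4),
}
s3_root = root_port(s3_received)
root_bpdu, root_cost = s3_received[s3_root]
c1_candidate = designated_bpdu(root_bpdu, root_cost, 2, "Port C1")
# Port C1 is blocked if the BPDU it already holds beats the candidate.
c1_blocked = s3_received["Port C1"][0] < c1_candidate
```

The sketch selects Port C2 as the root port (cost 9 versus 10), derives {0, 9, 2, Port C1} for Port C1, and finds that the stored {0, 0, 0, Port A2} wins, so Port C1 is blocked.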
Topology on the Left Side
According to the root bridge selection principle of STP, S1 is the root bridge. Then determine the root port, designated port, and alternate port.
E0 and E1 on S2 receive BPDUs {0, 0, 0, E0} and {0, 0, 0, E1} from S1. In the two BPDUs, only the transmit port is different. The port with the smaller PID has a higher priority, so E0 is the root port and E1 is the alternate port.
Topology on the Right Side
According to the root bridge selection principle of STP, S1 is the root bridge. Then determine the root port, designated port, and alternate port.
E0 and E1 on S2 receive BPDUs {0, 0, 0, E0} and {0, 0, 0, E1} from S1. The two BPDUs have the same priority, so only the PIDs are compared. E0 has the smaller PID, so E0 is the root port and E1 is the alternate port.
Generally, only the root bridge generates and sends configuration BPDUs. Non-root bridges only forward the configuration BPDU received on the root port through their designated ports. The designated port on a non-root bridge sends the optimal BPDU on its own only after receiving BPDUs with a lower priority.
Topology description:
After S2 receives a BPDU with a lower priority from S4, S2 sends a configuration BPDU. This is because network bridges save the optimal configuration BPDU.

The figure on the left side shows the initial topology. The path costs are the same. S1, S2, and S3 are connected, S1 is the root bridge, and interconnected ports are in forwarding state. In the figure on the right side, a link between S1 and S2 is added.
After S2 receives BPDUs from S1 and S3, S2 considers that the port connected to S1 is the new root port and the port connected to S3 is the designated port. All ports are root ports or designated ports in forwarding state. In this case, a loop occurs. The loop can be eliminated only when configuration BPDUs are transmitted to each network bridge and S2 blocks the port connected to S3 through calculation.
There is a delay for a port (for example, port E on S2) to change from non-forwarding to forwarding so that ports that need to enter the non-forwarding state can complete spanning tree calculation first.
Forward Delay
The default interval for port status transition is 15 seconds.
The Forward Delay, Hello Time, and Max Age timers are related to one another by specific formulas; the default values are calculated for a network diameter of 7.
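The text does not give the formulas. As an illustration only, the sketch below checks the timer relationship stated in IEEE 802.1D, 2 x (Forward Delay - 1) >= Max Age >= 2 x (Hello Time + 1); this constraint is an assumption brought in from the standard, not taken from the slide, and the function name is hypothetical.

```python
def timers_consistent(hello: float = 2, max_age: float = 20,
                      forward_delay: float = 15) -> bool:
    """Check the IEEE 802.1D relationship between the three timers.

    2 * (forward_delay - 1) >= max_age >= 2 * (hello + 1), in seconds.
    """
    return 2 * (forward_delay - 1) >= max_age >= 2 * (hello + 1)


defaults_ok = timers_consistent()                  # 28 >= 20 >= 6
short_delay_ok = timers_consistent(forward_delay=4)  # 6 >= 20 fails
```

The defaults of 2, 20, and 15 seconds satisfy the relationship, while shortening Forward Delay alone to 4 seconds violates it.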
Port Status Description
After a port is enabled, the port enters the Listening state and starts the spanning tree calculation.
If the calculation determines that the port is an alternate port, the port enters the Blocking state.
If the calculation determines that the port is a root port or designated port, the port enters the Learning state from the Listening state after a Forward Delay period. The port then enters the Forwarding state from the Learning state after another Forward Delay period. A port in Forwarding state can forward data frames.
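The state sequence above can be written down as a small lookup, shown here as a sketch. The role strings and function names are illustrative; the 15-second default is the Forward Delay value given in the text.

```python
def state_sequence(role: str):
    """States a newly enabled port passes through, by calculated role."""
    if role == "alternate":
        return ["Listening", "Blocking"]
    if role in ("root", "designated"):
        return ["Listening", "Learning", "Forwarding"]
    raise ValueError(f"unknown role: {role}")


def seconds_to_forwarding(role: str, forward_delay: int = 15):
    """Time from Listening to Forwarding, or None if never forwarding."""
    states = state_sequence(role)
    if states[-1] != "Forwarding":
        return None
    # One Forward Delay period per transition.
    return (len(states) - 1) * forward_delay


root_delay = seconds_to_forwarding("root")
alternate_delay = seconds_to_forwarding("alternate")
```

A root or designated port needs two Forward Delay periods, 30 seconds by default, before it forwards data frames.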

Huawei switch port status
Huawei datacom devices use MSTP by default. After a device transitions from the MSTP mode to the STP mode, its STP-capable port supports the same port states as those supported by an MSTP-capable port, including the Forwarding, Learning, and Discarding states.

Port status transition
The port is initialized or enabled.
The port is blocked or the link fails.
The port is selected as the root port or designated port.
The port is no longer the root port or designated port.
The Forward Delay timer expires.

TCN BPDU processing:
1. After the network topology changes, the downstream device whose port has turned to the Forwarding state keeps sending TCN BPDUs to its upstream device.
2. After the upstream device receives the TCN BPDU from the downstream device, only the designated port processes it. The other ports may receive the TCN BPDU but do not process it.
3. The upstream device sets the TCA bit of the Flags field in the configuration BPDU to 1 and returns the configuration BPDU to instruct the downstream device to stop sending TCN BPDUs.
4. The upstream device sends a copy of the TCN BPDU toward the root bridge.
5. Steps 1 to 4 repeat until the root bridge receives the TCN BPDU.
6. After receiving the TCN BPDU, the root bridge sets the TCA bit in the subsequent configuration BPDU for acknowledgment and sets the TC bit of the Flags field in the configuration BPDU to 1 to notify all network bridges of the topology change.
7. For a period of Max Age plus Forward Delay, the root bridge sends configuration BPDUs with the TC bit set. A network bridge that receives such a BPDU reduces the aging time of MAC address entries to the Forward Delay period.
Topology Description:
Through STP calculation, S1 is the root bridge and port E1 on S4 is blocked.
When the link of port E1 on S3 fails, STP is calculated again. Port E1 on S4 becomes a designated port in Forwarding state, and S4 immediately sends a TCN BPDU upstream.
After S2 receives the TCN BPDU from S4, S2 sets the TCA bit in the subsequent configuration BPDU and sends it to S4 from port E3. S2 also sends the TCN BPDU to the root bridge from the root port E1.
After S1 receives the TCN BPDU from S2, S1 sets the TCA and TC bits in the subsequent configuration BPDU and sends it to S2 from the designated port E1. For a period of 35 seconds (20 seconds + 15 seconds), S1 keeps the TC bit set in the configuration BPDUs it sends. After receiving a configuration BPDU with the TC bit set, each network bridge changes its aging time of MAC address entries to 15 seconds.
When the topology changes, the MAC address table is therefore re-established quickly, which avoids wasting bandwidth.
Root bridge failure:
When S1 becomes faulty, S2 and S3 cannot receive BPDUs from the root bridge. S2 and S3 detect the root bridge failure only after a Max Age period. S2 and S3 then determine the new root bridge, root port, and designated port. The topology convergence period is 50 seconds (the Max Age period plus twice the Forward Delay period).
Link failure:
When the link between S3 and S1 fails, S3 can immediately detect this event. The blocked port on S3 immediately enters the Listening state and sends the configuration BPDU with itself as the root. After S2 receives the BPDU with lower priority from S3, S2 sends a configuration BPDU with S1 as the root. The port on S3 connected to S2 therefore becomes the root port, and the port on S2 connected to S3 acts as the designated port. The period for the S3 port status change from Listening, Learning, to Forwarding is 30 seconds.
When a link fails or is added, the network recovers after about 30 seconds.
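The two convergence figures follow directly from the timer defaults. A minimal sketch of the arithmetic, with illustrative function names:

```python
MAX_AGE = 20        # seconds, default
FORWARD_DELAY = 15  # seconds, default


def convergence_after_root_failure(max_age=MAX_AGE,
                                   forward_delay=FORWARD_DELAY):
    """Wait for the stored BPDU to age out, then Listening and Learning."""
    return max_age + 2 * forward_delay


def convergence_after_link_failure(forward_delay=FORWARD_DELAY):
    """The failure is detected at once; only the two delays remain."""
    return 2 * forward_delay


root_failure_seconds = convergence_after_root_failure()
link_failure_seconds = convergence_after_link_failure()
```

With the defaults this gives 50 seconds for a root bridge failure and 30 seconds for a directly detected link failure, matching the values quoted above.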
STP Limitation:
Port states and port roles are not finely distinguished. For example, ports in both the Listening and Blocking states neither forward user traffic nor learn MAC addresses.
The STP algorithm determines topology changes only after the time set by the timer expires, which slows down network convergence.
The STP algorithm requires a stable network topology. After the root bridge sends configuration BPDUs, other devices process the configuration BPDUs so that the configuration BPDUs are advertised to the entire network.

RSTP has all functions of STP, and RSTP-capable and STP-capable network bridges can work together.
RSTP defines four port roles: root port, designated port, alternate port, and backup port.
The functions of the root port and designated port are the same as those defined in STP. The alternate port and backup port are described as follows.
From the perspective of configuration BPDU transmission:
An alternate port is blocked after learning configuration BPDUs with a higher priority sent by other bridges.
A backup port is blocked after learning configuration BPDUs with a higher priority sent by its own bridge.
From the perspective of user traffic:
An alternate port backs up the root port and provides an alternate path from the designated bridge to the root bridge.
A backup port backs up the designated port and provides an alternate path from the root bridge to a network segment.
After all RSTP-capable ports are assigned roles, topology convergence is completed.
Port states are simplified from five types to three types. Based on whether a port forwards user traffic and learns MAC addresses, the port is in one of the following states:
If a port neither forwards user traffic nor learns MAC addresses, the port is in Discarding state.
If a port does not forward user traffic but learns MAC addresses, the port is in Learning state.
If a port forwards user traffic and learns MAC addresses, the port is in Forwarding state.
RSTP Calculation
Roles of ports in Discarding state are determined:
The root port and designated port enter the Learning state after the Forward Delay period. A port in Learning state learns MAC addresses and enters the Forwarding state after a Forward Delay period. RSTP accelerates this process using another mechanism.
An alternate port maintains the Discarding state.
Configuration BPDUs in RSTP are differently defined. Port roles are described based on the Flags field defined in STP. When compared with STP, RSTP slightly redefines the format of configuration BPDUs.
The value of the Type field is no longer set to 0 but 2. The STP-capable device therefore always discards the configuration BPDUs sent by an RSTP-capable device.
The 6 bits in the middle of the original Flags field, which are reserved in STP, are now used. Such a configuration BPDU is called an RST BPDU.
Flags field in an RST BPDU:
Bit 0 indicates the TC bit, which is the same as that in STP.
Bit 1 indicates the Proposal flag bit, indicating that the BPDU is the Proposal packet in the fast convergence mechanism.
Bit 2 and bit 3 indicate the port role. The value 00 indicates the unknown port; the value 01 indicates the root port; the value 10 indicates the alternate or backup port; the value 11 indicates the designated port.
Bit 4 indicates that the port is in Learning state.
Bit 5 indicates that the port is in Forwarding state.
Bit 6 indicates the Agreement packet in the fast convergence mechanism.
Bit 7 indicates the TCA bit, which is the same as that in STP.
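Decoding the Flags octet is a matter of masking individual bits. The sketch below assumes the IEEE 802.1D-2004 numbering, in which the two-bit role field read as a number with bit 3 as the high bit is 1 for alternate or backup, 2 for root, and 3 for designated; the values 01 and 10 in the list above appear to be written with bit 2 first, which is an interpretation rather than something the text states. The function name is illustrative.

```python
# Role field read as (bit 3, bit 2); assumed IEEE 802.1D-2004 values.
ROLE_NAMES = {
    0: "unknown",
    1: "alternate or backup",
    2: "root",
    3: "designated",
}


def decode_rst_flags(flags: int) -> dict:
    """Split the one-octet Flags field of an RST BPDU into its parts."""
    return {
        "tc": bool(flags & 0x01),          # bit 0
        "proposal": bool(flags & 0x02),    # bit 1
        "role": ROLE_NAMES[(flags >> 2) & 0x03],  # bits 2 and 3
        "learning": bool(flags & 0x10),    # bit 4
        "forwarding": bool(flags & 0x20),  # bit 5
        "agreement": bool(flags & 0x40),   # bit 6
        "tca": bool(flags & 0x80),         # bit 7
    }


proposal = decode_rst_flags(0x0E)   # designated port sending a Proposal
agreement = decode_rst_flags(0x78)  # root port replying with an Agreement
```

Under this assumption, 0x0E decodes as a Proposal from a designated port that is not yet learning or forwarding, and 0x78 as an Agreement from a root port that is learning and forwarding.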
Configuration BPDUs are processed in a different manner.
Transmission of configuration BPDUs after the topology becomes stable
In STP, after the topology becomes stable, the root bridge sends configuration BPDUs at an interval set by the Hello timer. A non-root bridge does not send configuration BPDUs until it receives configuration BPDUs sent from the upstream device. This renders the STP calculation complicated and time-consuming.
In RSTP, after the topology becomes stable, a non-root bridge sends configuration BPDUs at an interval set by the Hello timer, regardless of whether it has received the configuration BPDUs sent from the root bridge. Such operations are implemented on each device independently.
Shorter timeout interval of BPDUs
In STP, a device has to wait for the Max Age period before determining a negotiation failure. In RSTP, if a port does not receive configuration BPDUs sent from the upstream device for three consecutive intervals set by the Hello timer, the negotiation between the local device and its peer fails.
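The difference in failure detection time can be made concrete with the default timers. A small sketch, with an illustrative function name and the protocol passed as a plain string:

```python
def bpdu_timeout(protocol: str, hello: int = 2, max_age: int = 20) -> int:
    """Seconds without BPDUs before the upstream is considered lost."""
    if protocol == "stp":
        return max_age       # wait for the stored BPDU to age out
    if protocol == "rstp":
        return 3 * hello     # three consecutive missed Hello intervals
    raise ValueError(f"unknown protocol: {protocol}")


stp_timeout = bpdu_timeout("stp")
rstp_timeout = bpdu_timeout("rstp")
```

With the defaults, STP needs 20 seconds to notice the loss, while RSTP needs 6 seconds.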
Processing of RST BPDUs with lower priority
In RSTP, when a port receives an RST BPDU from the upstream designated bridge, the port compares the received RST BPDU with its own RST BPDU. If its own RST BPDU has higher priority than the received one, the port discards the received RST BPDU and immediately responds to the upstream device with its own RST BPDU. After receiving the RST BPDU, the upstream device updates its own RST BPDU based on the corresponding fields in the received RST BPDU. In this manner, RSTP processes BPDUs with lower priority more rapidly, independent of any timer that is used in STP.
STP convergence
To eliminate loops, STP uses timers to complete convergence. By default, a port takes 30 seconds (two Forward Delay periods of 15 seconds each) from the time it is enabled to the time it enters the Forwarding state. Shortening the timer values may make the network unstable.
RSTP fast convergence

Edge port
In RSTP, a designated port on the network edge is called an edge port. An edge port directly connects to a terminal and does not connect to any other switching device. Because an edge port normally receives no configuration BPDUs, it does not participate in the RSTP calculation. It can change directly from the Disabled state to the Forwarding state without any delay, just like an STP-incapable port. If an edge port receives bogus configuration BPDUs from an attacker, it loses its edge-port attribute and becomes a common STP port. The spanning tree is then recalculated, causing network flapping.

Fast switching of the root port
If the root port fails, the optimal alternate port on the device becomes the new root port and enters the Forwarding state. This is safe because the network segment connected to the alternate port must contain a designated port that has a path to the root bridge.
Proposal/Agreement mechanism
When a port is selected as a designated port, in STP, the port does not enter the Forwarding state until a Forward Delay period expires; in RSTP, the port first enters the Discarding state, and the Proposal/Agreement (P/A) mechanism then allows it to enter the Forwarding state immediately. The P/A mechanism can be used only on P2P links working in full-duplex mode.
Edge port
An edge port directly connects to a terminal. When the network topology changes, loops do not occur on the edge port. The edge port can therefore enter the Forwarding state directly, without waiting for two Forward Delay periods.
Because an edge port normally receives no configuration BPDUs, it does not participate in the RSTP calculation. It can change directly from the Disabled state to the Forwarding state without any delay, just like an STP-incapable port. If an edge port receives bogus configuration BPDUs from an attacker, it loses its edge-port attribute and becomes a common STP port. The spanning tree is then recalculated, causing network flapping.
Fast switching of the root port
In RSTP, an alternate port is the backup of the root port. When the root port of a network bridge enters the Discarding state, the optimal alternate port is used as the new root port and enters the Forwarding state. This is safe because the network segment connected to this alternate port must contain a designated port that can reach the root bridge.

P/A mechanism
The Proposal/Agreement (P/A) mechanism enables a designated port to rapidly enter the Forwarding state.
The P/A mechanism requires the link between the two switching devices to be a P2P link working in full-duplex mode. When P/A negotiation fails, the designated port is selected after two Forward Delay periods, and the negotiation process is the same as that in STP.
After a new link is established, the negotiation process of the P/A mechanism is as follows:
p0 and p1 become designated ports and send RST BPDUs.
After receiving an RST BPDU with higher priority, p1 on S2 determines that it will become a root port rather than a designated port, and stops sending RST BPDUs.
p0 on S1 enters the Discarding state and sends RST BPDUs with the Proposal field set to 1.
After receiving an RST BPDU with the Proposal field set to 1, S2 sets the sync variable to 1 for all its ports.
p2 is already blocked, so its status remains unchanged; p4 is an edge port and does not participate in the calculation. Only the non-edge designated port p3 therefore needs to be blocked.
Once p3 is in the Discarding state (p2 is already blocked and the edge port p4 needs no blocking), the synced variables of p2, p3, and p4 are set to 1. The synced variable of the root port p1 is then set to 1, and p1 sends an RST BPDU with the Agreement field set to 1 to S1. Except that the Agreement field is 1 and the Proposal field is 0, this RST BPDU is the same as the one received.
After receiving this RST BPDU, S1 identifies it as the response to the Proposal that it just sent, and p0 immediately enters the Forwarding state.
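The synchronization step on S2 can be sketched in a few lines. This is a simplified model, not the state machine from the standard: the Port class, the handle_proposal function, and the initial port states are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str
    role: str            # "root", "designated", or "alternate"
    state: str           # "forwarding" or "discarding"
    edge: bool = False
    synced: bool = False


def handle_proposal(ports):
    """Run the sync step on the device that received a Proposal on its root port.

    Returns True when every other port is synced, that is, when the root
    port may answer with an Agreement.
    """
    root = next(p for p in ports if p.role == "root")
    for port in ports:
        if port is root:
            continue
        # Only a non-edge designated port that is still forwarding must be blocked.
        if port.role == "designated" and not port.edge and port.state == "forwarding":
            port.state = "discarding"
        # Blocked ports and edge ports count as synchronized.
        port.synced = True
    root.synced = all(p.synced for p in ports if p is not root)
    return root.synced


# Ports of S2 from the walkthrough above (initial states are assumed).
s2_ports = [
    Port("p1", "root", "discarding"),
    Port("p2", "alternate", "discarding"),
    Port("p3", "designated", "forwarding"),
    Port("p4", "designated", "forwarding", edge=True),
]
send_agreement = handle_proposal(s2_ports)
states = {p.name: p.state for p in s2_ports}
```

In this run only p3 changes state; the edge port p4 keeps forwarding, which is why traffic from terminals is not interrupted during the handshake.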
The P/A negotiation with downstream devices proceeds as follows.

When a link between S1 and S2 is added, the P/A mechanism works as follows:
S1 sends an RST BPDU with the Proposal field set to 1 to S2.
After receiving the RST BPDU, S2 determines that E2 is the root port. S2 blocks its designated ports E1 and E3, sets the root port to the Forwarding state, and sends an Agreement packet to S1.
After S1 receives the Agreement packet, its designated port E1 immediately enters the Forwarding state.
The non-edge designated ports E1 and E3 on S2 send Proposal packets.
After S3 receives the Proposal packet from S2, S3 determines that E1 is the root port and starts synchronization. Because the downstream port of S3 is an edge port, S3 directly sends an Agreement packet.
After S2 receives the Agreement packet from S3, its port E1 immediately enters the Forwarding state.
The process on S4 is similar to that on S3. After S2 receives the Agreement packet from S4, its port E3 immediately enters the Forwarding state.
The P/A process is completed.
In RSTP, the topology is considered to have changed only when a non-edge port changes to the Forwarding state.
After a switching device detects a topology change (TC), it performs the following operations:
It starts a TC While timer for every non-edge port. The TC While timer value is twice the Hello timer value. Before the timer expires, all MAC address entries learned by the ports whose status changes are cleared, and these ports send RST BPDUs with the TC field set to 1. Once the TC While timer expires, the ports stop sending these RST BPDUs.
After another switching device receives such an RST BPDU, it clears the MAC addresses learned by all ports except the one that received the RST BPDU. The switching device then starts a TC While timer for all non-edge ports and the root port, and the process is repeated.
In this manner, RST BPDUs flood the network.
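A minimal sketch of what a switch does when an RST BPDU with the TC field set arrives, assuming the MAC table is a plain dictionary from MAC address to port name. The function names and the 2-second Hello default are assumptions for illustration.

```python
HELLO_TIME = 2  # seconds; common default, assumed here


def tc_while_duration(hello_time: int = HELLO_TIME) -> int:
    """The TC While timer runs for twice the Hello timer value."""
    return 2 * hello_time


def flush_on_tc(mac_table: dict, receiving_port: str) -> dict:
    """Clear entries learned on every port except the one that received the TC BPDU."""
    return {mac: port for mac, port in mac_table.items() if port == receiving_port}


# Example table: two entries learned on p1, one on p2.
table = {"00e0-fc00-0001": "p1", "00e0-fc00-0002": "p2", "00e0-fc00-0003": "p1"}
remaining = flush_on_tc(table, receiving_port="p1")
```

Entries on the receiving port are kept because the path toward the device that reported the change is still valid.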
When a port switches from RSTP to STP, the port loses RSTP features such as fast convergence.
On a network where both STP-capable and RSTP-capable devices are deployed, STP-capable devices ignore RST BPDUs. If a port on an RSTP-capable device receives a configuration BPDU from an STP-capable device, the port switches to the STP mode after two Hello timer intervals and starts to send configuration BPDUs. In this manner, RSTP and STP are interoperable.
After the STP-capable devices are removed, Huawei RSTP-capable datacom devices can switch back to the RSTP mode.
RSTP, an enhancement to STP, implements fast convergence of the network topology. However, RSTP and STP share a defect: all VLANs on a LAN use one spanning tree, so VLAN-based load balancing cannot be performed. Once a link is blocked, it no longer transmits any traffic, which wastes bandwidth and may prevent packets of certain VLANs from being forwarded.

STP or RSTP is deployed on the LAN. The broken line shows the spanning tree; S6 is the root switching device; the links between S1 and S4 and between S2 and S5 are blocked. VLAN packets are transmitted only over the links marked "VLAN2" or "VLAN3". PC2 and PC3 belong to VLAN 2, but they cannot communicate with each other because the link between S2 and S5 is blocked and the link between S3 and S6 rejects packets from VLAN 2.
MSTP can be used to address this issue. MSTP implements fast convergence and provides multiple paths to load balance VLAN traffic.
A Multiple Spanning Tree (MST) region contains multiple switching devices and the network segments between them. The switching devices in one MST region have the following characteristics in common:
MSTP is enabled
Same region name
Same VLAN-MSTI mappings
Same MSTP revision level

An instance is a collection of VLANs. Binding multiple VLANs to one instance saves communication costs and reduces resource usage. The topology of each MSTI is calculated independently of the others, and traffic can be balanced among MSTIs. Multiple VLANs that have the same topology can be mapped to one instance. The forwarding status of these VLANs on a port is determined by the port status in the MSTI.
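The region-membership rule and the VLAN-to-instance mapping can be modeled as below. This is a sketch under stated assumptions: the MstConfig class and its method names are hypothetical, and the digest uses plain MD5 over a 4096-entry table purely for illustration, whereas the standard computes a keyed HMAC-MD5 over the same table.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class MstConfig:
    region_name: str
    revision_level: int = 0
    vlan_to_msti: dict = field(default_factory=dict)  # VLAN ID -> MSTI ID

    def msti_of(self, vlan: int) -> int:
        """A VLAN maps to exactly one MSTI; unmapped VLANs fall into MSTI 0."""
        return self.vlan_to_msti.get(vlan, 0)

    def digest(self) -> str:
        """Digest of the mapping table: 4096 two-byte entries, 8192 bytes in total.

        Simplification: plain MD5 instead of the keyed HMAC-MD5 of the standard.
        """
        table = b"".join(self.msti_of(v).to_bytes(2, "big") for v in range(4096))
        return hashlib.md5(table).hexdigest()

    def same_region(self, other: "MstConfig") -> bool:
        """Two devices share a region only if name, revision, and mapping all match."""
        return (self.region_name, self.revision_level, self.digest()) == (
            other.region_name, other.revision_level, other.digest())


a = MstConfig("RG1", 0, {2: 2, 4: 4})
b = MstConfig("RG1", 0, {4: 4, 2: 2})   # same mapping, entered in another order
c = MstConfig("RG1", 0, {2: 2})         # VLAN 4 left in MSTI 0
```

Comparing a digest rather than the full table is what lets neighbors check the mapping without exchanging all 8192 bytes.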
The Common and Internal Spanning Tree (CIST), calculated using STP or RSTP, connects all switching devices on a switching network.
The CIST root is the network bridge with the highest priority on the entire network, that is, the root bridge of the CIST.
In the preceding topology, the red lines within the MST regions and the blue lines between the MST regions form the CIST. The root bridge of the CIST is S1 in MST region 1.

A Common Spanning Tree (CST) connects all the MST regions on a switching network.
If each MST region is regarded as a node, the CST is the spanning tree calculated among these nodes using STP or RSTP.
In the preceding topology, the blue lines form the CST. The CST root is MST region 1.
An Internal Spanning Tree (IST) resides within an MST region.
Each spanning tree in an MST region has an MSTI ID. The IST is a special MSTI with the MSTI ID of 0, called MSTI 0. VLANs that are not mapped to other MSTIs are mapped to MSTI 0.
An IST is the segment of the CIST within an MST region.
In the preceding topology, the red lines form the ISTs.
The master bridge is the IST master, which is the switching device closest to the CIST root in a region.
If the CIST root is in an MST region, the CIST root is the master bridge of that region.
In the preceding topology, S1, S4, and S7 are master bridges.
A Single Spanning Tree (SST) is formed in either of the following situations:
A switching device running STP or RSTP belongs to only one spanning tree.
An MST region has only one switching device.
There is no SST in the preceding topology.
MSTI
An MST region can contain multiple spanning trees, each called an MSTI. An MSTI regional root is the root of the MSTI. Each MSTI has its own regional root.
MSTIs are independent of each other. An MSTI can map to one or more VLANs, but one VLAN can map to only one MSTI.
Each MSTI has an MSTI ID. MSTI IDs start from 1, which distinguishes them from the IST (MSTI 0).
In the preceding topology, VLAN 2 maps to MSTI 2 and VLAN 4 maps to MSTI 4.
MSTI regional root
The MSTI regional root is the network bridge with the highest priority in each MSTI. You can specify different roots in different MSTIs.
In the preceding topology, assuming that S9 has the highest priority in MSTI 2, S9 is the regional root in MSTI 2. Assuming that S8 has the highest priority in MSTI 4, S8 is the regional root in MSTI 4.
Compared with RSTP, MSTP has two additional port types. MSTP ports include the root port, designated port, alternate port, backup port, edge port, master port, and regional edge port.

Master port
A master port is on the shortest path connecting an MST region to the CIST root.
BPDUs of an MST region are sent to the CIST root through the master port.
Master ports are special regional edge ports, functioning as root ports in the CIST and as master ports in instances.
In the preceding topology, the port on S7 connected to MST region 1 is the master port.

Regional edge port
A regional edge port connects a network bridge in an MST region to another MST region or to an STP- or RSTP-enabled network bridge.
In the preceding topology, the port on S8 connected to MST region 2 is a regional edge port.

Network bridges may have different roles in different MSTIs, so their ports, except the master port, may also have different roles in different MSTIs. The master port retains its role in all MSTIs.
Currently, there are two MST BPDU formats:
dot1s: BPDU format defined in IEEE 802.1s
legacy: private BPDU format
By using the stp compliance command, you can configure a port on a Huawei datacom device to automatically adjust the MST BPDU format.

Except for the MSTP-specific fields, the fields in an intra-region or inter-region MST BPDU are the same as those in an RST BPDU.
The Root ID field of an RST BPDU carries the CIST root ID in an MST BPDU.
The EPC (external path cost) field in an MST BPDU indicates the total path cost from the MST region where the network bridge sending the BPDU resides to the MST region where the CIST root resides.
The Bridge ID field in an MST BPDU indicates the regional root ID in the CIST.
The Port ID field in an MST BPDU indicates the ID of the designated port in the CIST.
MSTP-specific fields:
Version 3 Length: indicates the BPDUv3 length, which is used to check received MST BPDUs.
MST Configuration Identifier: identifies the MST region where a network bridge is located. It has four fields. Neighboring switches are in the same MST region only when the following fields on the switches are the same:
Format Selector: indicates the 802.1s-defined protocol selector. It has a fixed value of 0.
Name: indicates the configuration name, that is, the MST region name of a switch. The value has 32 bytes. Each switch has an MST region name configured. The default value is the switch's MAC address.
Config Digest: indicates the configuration digest, which has 16 bytes. Switches in an MST region should maintain the same mapping between VLANs and MSTIs. However, the MST configuration table is too large (8192 bytes) to be easily transmitted between switches. This field is the digest calculated from the MST configuration table using the MD5 algorithm.
Revision Level: indicates the revision level of an MST region, which has two bytes. The default value is all 0s. Because the Config Digest field is only a digest of the MST configuration table, there is a low probability that two different MST configuration tables produce the same digest. In this case, switches in different MST regions may be incorrectly considered to be in the same MST region. It is recommended that different MST regions use different revision levels to prevent this problem.
CIST Internal Root Path Cost: indicates the total path cost from the local port to the IST master. This value is calculated based on link bandwidth.
CIST Bridge Identifier: indicates the ID of the designated switching device on the CIST.
CIST Remaining Hops: indicates the remaining hops of a BPDU in the CIST. This field is used to limit the MST scale. A BPDU has the maximum hop count when it is sent by the CIST regional root. The hop count decreases by 1 every time the BPDU passes a network bridge. A network bridge discards a BPDU whose hop count is 0.
MSTI Configuration Messages (may be absent): each indicates an MSTI configuration message, which contains the following fields:
MSTI Flag: has eight bits. Bits 1 to 7 are the same as those in RSTP. Bit 8 indicates whether the network bridge is the master bridge, and replaces the TCA bit in RSTP.
MSTI Regional Root ID: indicates the regional root ID of the MSTI.
MSTI IRPC: indicates the path cost from the network bridge sending the BPDU to the MSTI regional root.
MSTI Bridge Priority: indicates the priority of the network bridge that sends the BPDU.
MSTI Port Priority: indicates the priority of the port that sends the BPDU.
MSTI Remaining Hops: indicates the remaining number of hops in an MSTI.
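The Remaining Hops rule can be illustrated with a short sketch. The function names are hypothetical, and the maximum of 20 hops is a commonly used default that is assumed here, since the text above does not give a value.

```python
MAX_HOPS = 20  # assumed default maximum hop count


def relay_bpdu(remaining_hops: int):
    """Return the hop count for the relayed BPDU, or None if it must be discarded."""
    if remaining_hops <= 0:
        return None          # a BPDU with a hop count of 0 is discarded
    return remaining_hops - 1


def bridges_reached(max_hops: int = MAX_HOPS) -> int:
    """Count how many bridges in a chain accept a BPDU sent by the regional root."""
    hops = max_hops
    reached = 0
    while True:
        next_hops = relay_bpdu(hops)
        if next_hops is None:
            return reached
        reached += 1
        hops = next_hops
```

Under these assumptions, the hop limit bounds how deep a region can be below its regional root.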
MSTP Topology Calculation
In MSTP, the entire Layer 2 network is divided into multiple MST regions, which are interconnected by a single CST. In an MST region, multiple spanning trees are calculated, each of which is called an MSTI. Among these MSTIs, MSTI 0 is also known as the internal spanning tree (IST). Like STP, MSTP uses configuration BPDUs to calculate spanning trees, but the configuration BPDUs are MSTP-specific.
Vectors
Root switching device ID: identifies the root switching device in the CIST. The root switching device ID consists of the priority value (16 bits) and MAC address (48 bits). The priority value is the priority of MSTI 0.
External root path cost (ERPC): indicates the external root path cost from the CIST regional root to the CIST root. ERPCs saved on all switching devices in an MST region are the same. If the CIST root is in an MST region, ERPCs saved on all switching devices in the MST region are 0s.
Regional root ID: identifies the MSTI regional root. The regional root ID consists of the priority value (16 bits) and MAC address (48 bits).
Internal root path cost (IRPC): indicates the path cost from the local bridge to the regional root.
Designated switching device ID: identifies the network bridge that sends the BPDU.
Designated port ID: identifies the port on the designated switching device connected to the root port on the local device. The port ID consists of the priority value (4 bits) and port number (12 bits). The priority value must be a multiple of 16.
Receiving port ID: identifies the port that receives the BPDU. The port ID consists of the priority value (4 bits) and port number (12 bits). The priority value must be a multiple of 16.
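The bit layouts of the two ID types can be checked with a small sketch. The helper names are hypothetical, and the MAC address is an arbitrary example value.

```python
def bridge_id(priority: int, mac: str) -> int:
    """Pack a 16-bit priority and a 48-bit MAC address into one 64-bit ID."""
    if not 0 <= priority <= 0xFFFF:
        raise ValueError("priority must fit in 16 bits")
    mac_value = int(mac.replace("-", "").replace(":", ""), 16)
    if mac_value > 0xFFFFFFFFFFFF:
        raise ValueError("MAC address must fit in 48 bits")
    return (priority << 48) | mac_value


def port_id(priority: int, port_number: int) -> int:
    """Pack a port priority (a multiple of 16, stored in 4 bits) and a 12-bit port number."""
    if priority % 16 != 0 or not 0 <= priority <= 240:
        raise ValueError("port priority must be a multiple of 16 between 0 and 240")
    if not 0 <= port_number <= 0xFFF:
        raise ValueError("port number must fit in 12 bits")
    # Only the upper four bits of the priority byte are carried in the ID.
    return ((priority // 16) << 12) | port_number
```

Because the priority occupies the most significant bits, comparing two IDs as plain integers compares the priority first and the MAC address or port number second.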
If the priority of the vector carried in the configuration message of a BPDU received by a port is higher than the priority of the vector in the configuration message saved on the port, the port replaces the saved configuration message with the received one. In addition, the port updates the global configuration message saved on the device. If the priority of the received vector is equal to or lower than the priority of the saved vector, the port discards the BPDU.
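The comparison rule can be sketched with tuples, relying on the spanning tree convention that a numerically smaller value means a higher priority and that fields are compared in order. The PriorityVector type and process_bpdu function are hypothetical names, and the field values in the example are arbitrary.

```python
from typing import NamedTuple


class PriorityVector(NamedTuple):
    root_id: int
    erpc: int
    regional_root_id: int
    irpc: int
    designated_bridge_id: int
    designated_port_id: int
    receiving_port_id: int


def process_bpdu(stored: PriorityVector, received: PriorityVector) -> PriorityVector:
    """Return the vector the port keeps after receiving a BPDU.

    Tuples compare field by field from left to right, and a smaller
    value wins, so "received < stored" means the received vector has
    higher priority.
    """
    if received < stored:
        return received      # superior BPDU: replace the saved message
    return stored            # equal or inferior BPDU: discard it


stored = PriorityVector(10, 0, 10, 0, 10, 0x8001, 0x8001)
better = PriorityVector(5, 200, 5, 0, 7, 0x8002, 0x8001)    # lower root ID wins
worse = PriorityVector(10, 0, 10, 20, 12, 0x8001, 0x8001)   # same root, higher IRPC
```

Note that the root ID decides the outcome in the first example even though its ERPC is larger, because earlier fields take precedence.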
CST Calculation
CST and IST calculation is similar to the calculation in RSTP. During CST calculation, an MST region is considered as a network bridge and the ID of the network bridge is the IST regional root ID.
CIST uses the following vectors: {root switching device ID, ERPC, regional root ID, IRPC, designated switching device ID, designated port ID, receiving port ID}. CST uses the following vectors: {CIST root, ERPC, regional root ID, designated port ID, receiving port ID}.
Assume that S1, S4, and S7 are regional roots in Region1, Region2, and Region3 respectively. S1 has the highest priority, S4 has the lowest priority, and the cost of each path is the same.
Each MST region is considered as a network bridge, and the ID of the network bridge is the regional root ID. Each MST region sends a BPDU with itself as the CIST root and an external cost of 0 to other MST regions.
Through RSTP calculation, S1 is elected as the CIST root.
Through ERPC comparison, the port of each regional root connected to Region1 is determined to be the master port.
Through comparison of priorities in regional root IDs, the regional edge ports are determined.

IST Calculation
CST and IST calculation is similar to the calculation in RSTP. MSTP calculates an IST for each MST region, and computes a CST to interconnect MST regions. The CST and ISTs constitute a CIST for the entire network.
CIST uses the following vectors: {root switching device ID, ERPC, regional root ID, IRPC, designated switching device ID, designated port ID, receiving port ID}. IST uses the following vectors: {CIST root, IRPC, designated bridge ID, designated port ID, receiving port ID}.
After CST calculation is complete, S1, S4, and S7 are regional roots in Region1, Region2, and Region3 respectively. In this situation, the regional root is the network bridge closest to the CIST root, not necessarily the network bridge with the highest priority.
The role of each port on each network bridge is determined by taking the regional root as the root bridge and comparing IRPCs, and the IST is then obtained.
Network bridges in an MST region compare IRPCs to determine the IST root port.
Port roles in the IST are determined based on priorities in BPDUs.
Region1 Calculation
In an MST region, MSTP calculates MSTIs for VLANs based on the mappings between VLANs and MSTIs. Each MSTI is calculated independently. The calculation process is similar to the process for STP to calculate a spanning tree.
In Region1, VLAN 2 maps to MSTI 2, VLAN 4 maps to MSTI 4, and other VLANs map to MSTI 0.
Different priorities are specified for network bridges in different MSTIs. Assume that S2 is the root bridge in MSTI 2 and S3 is the root bridge in MSTI 4.
In MSTI 2, S2, S1, and S3 are in descending order of priority. Through calculation, the port on S3 connected to S1 is blocked.
to S1 is blocked.
MSTIs have the following characteristics:
The spanning tree is calculated independently for each MSTI, and spanning trees of MSTIs are independent of each other.
MSTP calculates the spanning tree for an MSTI in a manner similar to STP.
Spanning trees of MSTIs can have different roots and topologies.
Each MSTI sends BPDUs in its spanning tree.
The topology of each MSTI is configured by using commands.
A port can be configured with different parameters for different MSTIs.
A port can play different roles or have different statuses in different MSTIs.
Region2 Calculation
to S4 is blocked.
to S4 is blocked.
Region3 Calculation
In MSTI 2, S9, S10, S8, and S7 are in descending order of priority. Through calculation, the port on S7 connected to S8 and the port on S8 connected to S10 are blocked.
In MSTI 4, S8, S7, S10, and S9 are in descending order of priority. Through calculation, the port on S9 connected to S7 and the port on S10 connected to S7 are blocked.
MSTI Calculation
After CIST and MSTI calculations are complete, the mapping between VLANs and MSTIs in each MST region is independent of that in other regions.
On an MSTP-aware network, a VLAN packet is forwarded along the following paths:
MSTI (including the IST) within an MST region
CST among MST regions

Interoperability between MSTP and RSTP
An RSTP- or STP-enabled network bridge considers an MST region as a single RSTP-enabled bridge whose bridge ID is the regional root ID.
When an RSTP- or STP-enabled network bridge receives an MST BPDU, it reads the CIST root, ERPC, regional root ID, and designated port ID in the MST BPDU as the RID, RPC, BID, and PID respectively.
When an MSTP-enabled network bridge receives an STP or RST BPDU, it reads the RID, RPC, BID, and PID as the CIST root, ERPC, regional root ID, and designated port ID respectively. The BID is used as both the regional root ID and the designated switch ID, and the IRPC is 0.
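The two field mappings can be written down as a pair of small functions. The dictionary keys and function names are hypothetical and chosen only to mirror the field names used above; real BPDUs are binary frames, not dictionaries.

```python
def rstp_view_of_mst_bpdu(mst: dict) -> dict:
    """How an STP/RSTP bridge interprets the CIST part of an MST BPDU."""
    return {
        "RID": mst["cist_root"],
        "RPC": mst["erpc"],
        "BID": mst["regional_root_id"],
        "PID": mst["designated_port_id"],
    }


def mstp_view_of_rst_bpdu(rst: dict) -> dict:
    """How an MSTP bridge interprets an STP or RST BPDU."""
    return {
        "cist_root": rst["RID"],
        "erpc": rst["RPC"],
        # The sender's BID serves as both regional root and designated bridge.
        "regional_root_id": rst["BID"],
        "designated_bridge_id": rst["BID"],
        "irpc": 0,
        "designated_port_id": rst["PID"],
    }


mst_bpdu = {"cist_root": 1, "erpc": 200, "regional_root_id": 7, "designated_port_id": 0x8001}
as_rst = rstp_view_of_mst_bpdu(mst_bpdu)
back = mstp_view_of_rst_bpdu(as_rst)
```

In effect, the legacy bridge is treated as a one-device region whose internal path cost is zero.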

In MSTP, the P/A mechanism works as follows:
The upstream device sends a Proposal packet to the downstream device, requesting fast switching. After receiving the Proposal packet, the downstream device sets its port connecting to the upstream device as the root port and blocks all non-edge ports.
The upstream device then sends an Agreement packet. After receiving the Agreement packet, the root port enters the Forwarding state.
The downstream device replies with an Agreement packet. After receiving the Agreement packet, the upstream device sets its port connecting to the downstream device as the designated port, and the port enters the Forwarding state.
By default, Huawei datacom devices use the enhanced P/A mechanism. To enable a Huawei datacom device to communicate with third-party devices that use the ordinary P/A mechanism, run the stp no-agreement-check command to configure the ordinary P/A mechanism on the Huawei datacom device.

Case description
S1, S2, and S3 must be in descending order of priority to meet requirements 2 and 3.
Command usage
The stp mode command sets the working mode of a spanning tree protocol on a switching device.
The stp root command configures a switching device as the root bridge or secondary root bridge of a spanning tree.
The stp priority command sets the priority of the switching device in a spanning tree.
The stp cost command sets the path cost of a port in a spanning tree.
Parameters
stp mode { mstp | rstp | stp }
mstp: indicates the MSTP mode.
rstp: indicates the RSTP mode.
stp: indicates the STP mode.
stp [ instance instance-id ] root { primary | secondary }
instance instance-id: specifies the ID of a spanning tree instance. It needs to be specified in MSTP.
primary: indicates that the switching device functions as the primary root bridge of a spanning tree.
secondary: indicates that the switching device functions as the secondary root bridge of a spanning tree.
stp [ instance instance-id ] priority priority
priority priority: specifies the priority of the switching device in a spanning tree. The value ranges from 0 to 61440 and must be a multiple of 4096, such as 0, 4096, and 8192. The default value is 32768.
stp [ instance instance-id ] cost cost
cost: specifies the path cost of a port. When the path cost of a port changes, spanning tree recalculation is performed.
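The value rule for the priority parameter is easy to check mechanically. The sketch below is illustrative only; the function and constant names are hypothetical and are not part of any device software.

```python
DEFAULT_PRIORITY = 32768

# The sixteen values the rule allows: 0, 4096, 8192, ..., 61440.
ALLOWED_PRIORITIES = [step * 4096 for step in range(16)]


def valid_stp_priority(value: int) -> bool:
    """Check that a bridge priority is in 0..61440 and a multiple of 4096."""
    return 0 <= value <= 61440 and value % 4096 == 0
```

Only sixteen values are possible because the priority occupies the upper four bits of the 16-bit priority field.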
Precautions
On an STP/RSTP/MSTP network, each spanning tree has only one root bridge, which is responsible for sending BPDUs and connecting devices on the entire network. Because the root bridge is so important, a switching device with high performance and a high position in the network hierarchy should be selected as the root bridge. Such a device may not have a high priority by default, so you can run the stp root command to configure it as the root bridge of a spanning tree.
A switching device cannot function as both the primary and secondary root bridge in the same spanning tree.
After the stp root command is run to configure a switching device as the primary root bridge, the priority value of the switching device is 0 in the spanning tree and cannot be modified.
After the stp root command is run to configure a switching device as the secondary root bridge, the priority value of the switching device is 4096 in the spanning tree and cannot be modified.
Case description
In the preceding topology:
Requirement 1 involves interoperability between RSTP and STP.
Requirement 2 involves the stp root command usage.
Requirement 3 involves the edge port, BPDU filtering, and BPDU protection.

Command usage
The stp mcheck command configures a port to automatically switch from the STP mode back to the RSTP/MSTP mode.
The stp edged-port default command configures all ports on a switching device as edge ports.
The stp bpdu-filter default command configures all ports on a switching device as BPDU-filter ports.
The stp bpdu-protection command enables BPDU protection on a switching device.
The stp root-protection command enables root protection on a port.
Precautions
After the stp bpdu-filter default and stp edged-port default commands are run in the system view, none of the ports on the device initiates BPDUs or negotiates with the directly connected port on the remote device, and all the ports are in the Forwarding state. This may lead to a loop and cause a broadcast storm. Exercise caution when using the stp bpdu-filter default and stp edged-port default commands in the system view.
After BPDU protection is enabled on a switching device, the switching device sets an edge port to the error-down state if the edge port receives a BPDU, and the port retains its edge port attribute.
The role of a designated port enabled with root protection cannot be changed. When a designated port enabled with root protection receives a BPDU with a higher priority, the port enters the Discarding state and does not forward packets. If the port does not receive any BPDUs with a higher priority within a given period of time (generally two Forward Delay periods), the port automatically returns to the Forwarding state.
Case description
To meet requirement 3, with the alternate ports placed as shown in the preceding figure, S1 must be configured as the root bridge in MSTI 2 and S3 must be configured as the root bridge in MSTI 3. In MSTI 2, where S1 is the root bridge, S2, S3, and S4 must be in descending order of priority. In MSTI 3, where S3 is the root bridge, S1, S4, and S2 must be in descending order of priority.
Command usage
The region-name command configures the MST region name of a switching device.
The instance command maps a VLAN to an MSTI.
The revision-level command configures the revision level of an MST region of a switching device. The default value is 0.
The active region-configuration command activates the configuration of an MST region.
The stp loop-protection command enables loop protection on a port.
Precautions
Two switching devices belong to the same MST region only when they have the following identical configurations:
MST region name
Mappings between MSTIs and VLANs
MST region revision level

Loop protection
On a network running a spanning tree protocol, a switching device maintains the status of the root port and blocked ports by continuously receiving BPDUs from the upstream switching device.
If ports cannot receive BPDUs from the upstream switching device due to link congestion or a unidirectional link failure, the switching device re-selects a root port. The original root port then becomes a designated port and the original blocked port enters the Forwarding state. As a result, loops may occur on the network.
Loop protection can be deployed to prevent this problem. After loop protection is enabled, if the root port or alternate port cannot receive BPDUs from the upstream device for a long period of time, the port sends a notification message to the NMS. The root port enters the Discarding state, and the alternate port remains in the Blocking state and does not forward packets. This prevents loops on the network. The root port or alternate port returns to its normal state (Forwarding for the root port) after it receives BPDUs again.
If the topology of an MSTI changes, the forwarding paths of the VLANs
that are mapped to this MSTI change. As a result, the ARP entries
relevant to these VLANs need to be updated. Depending on how ARP
entries are processed, the convergence mode of a spanning tree
protocol is either fast or normal:
  In fast mode, the switch directly deletes the ARP entries that need
  to be updated from the ARP table.
  In normal mode, the switch ages the ARP entries that need to be
  updated in the ARP table. If the number of ARP probes for aging ARP
  entries is larger than 0, the switch probes these ARP entries before
  aging them.
In fast mode, frequent ARP entry deletion affects services and may
even drive CPU usage to 100%. Packet processing then times out,
causing network flapping.
Unicast
In unicast mode, the amount of data transmitted on a network is
proportional to the number of users that require the data. If a large
number of users require the same data, the source must send one copy
to each user, which consumes high bandwidth on the source and on the
network. Therefore, unicast mode is not suitable for batch data
transmission and is applicable only to networks with a small number
of users.
Broadcast
In broadcast mode, data is sent to all hosts on a network segment
regardless of whether they require the data. This threatens
information security and can cause broadcast storms on the network
segment. Therefore, broadcast mode is not suitable for data
transmission from a source to specified destinations. In addition,
broadcast mode wastes network bandwidth.
Multicast has the following advantages over unicast and broadcast:
  Compared with unicast mode, multicast mode copies data and
  distributes the copies at network nodes as far from the source as
  possible. Therefore, the amount of data and the level of network
  resource consumption do not increase greatly when the number of
  receivers increases.
  Compared with broadcast mode, multicast mode transmits data only to
  receivers that require the data. This saves network resources and
  enhances data transmission security.
Multicast basic concepts
Multicast group: A group of receivers identified by an IP multicast
address. User hosts (or other receiver devices) that have joined a
multicast group become members of the group and can identify and
receive the IP packets destined for the multicast group address.
Multicast source: A sender of multicast data. The server in the
topology is a multicast source. A multicast source can simultaneously
send data to multiple multicast groups. Multiple multicast sources can
simultaneously send data to the same multicast group. A multicast
source does not need to join any multicast group.
Multicast group member: A host that has joined a multicast group. PC1
and PC2 in the following topology are multicast group members.
Memberships in a multicast group change dynamically. Hosts can join or
leave a multicast group at any time. Members of a multicast group can
be located anywhere on a network.
Multicast router: A router or Layer 3 switch that supports IP
multicast. The routers in the following topology are multicast
routers. In addition to multicast routing functions, multicast routers
connected to user network segments provide multicast membership
management.
Multicast service models are classified for receiver hosts and do not
affect multicast sources. All multicast data packets sent from a
multicast source use the IP address of the multicast source as the
source IP address and a multicast group address as the destination
address. Depending on whether receiver hosts can select multicast
sources, two multicast models are defined: the any-source multicast
(ASM) model and the source-specific multicast (SSM) model. The two
models use multicast group addresses in different ranges.
  ASM model: Receiver hosts can only specify the group they want to
  join and cannot select multicast sources.
  SSM model: Receiver hosts can specify the multicast sources from
  which they want to receive multicast data when they join a group.
  After joining the group, the hosts receive only the data sent from
  the specified sources.
Multicast addresses
Addresses 224.0.0.0 to 224.0.0.255 are reserved as permanent group
addresses by the Internet Assigned Numbers Authority (IANA). In this
range, 224.0.0.0 is not allocated, and the other addresses are used by
routing protocols for topology discovery and maintenance. These
addresses are valid only on the local link. Routers do not forward
packets with these destination addresses, regardless of the
time-to-live (TTL) values in the packets.
Addresses 224.0.1.0 to 231.255.255.255 and 233.0.0.0 to
238.255.255.255 are ASM group addresses and are globally valid.
Addresses 232.0.0.0 to 232.255.255.255 are SSM group addresses
available to users and are globally valid.
Addresses 239.0.0.0 to 239.255.255.255 are local administrative
multicast addresses and are valid only in the local administrative
domain. Local administrative group addresses are private addresses,
so the same address can be reused in different administrative domains.
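The address ranges above can be expressed as a small lookup. The sketch below uses only the ranges listed in this section; the function name classify_group and the label strings are hypothetical.

```python
import ipaddress

# (first address, last address, label), taken from the ranges listed above.
RANGES = [
    ("224.0.0.0", "224.0.0.255", "permanent"),
    ("224.0.1.0", "231.255.255.255", "ASM"),
    ("232.0.0.0", "232.255.255.255", "SSM"),
    ("233.0.0.0", "238.255.255.255", "ASM"),
    ("239.0.0.0", "239.255.255.255", "local administrative"),
]


def classify_group(address):
    """Return the label of the range that contains the address."""
    value = int(ipaddress.IPv4Address(address))
    for first, last, label in RANGES:
        low = int(ipaddress.IPv4Address(first))
        high = int(ipaddress.IPv4Address(last))
        if low <= value <= high:
            return label
    return "not multicast"


for group in ("224.0.0.5", "225.1.1.1", "232.1.2.2", "239.1.1.1"):
    print(group, classify_group(group))
```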

Mapping from IPv4 multicast addresses to MAC addresses
The first four bits of an IPv4 multicast address are always 1110. The
leftmost 25 bits of a multicast MAC address are fixed (01-00-5e
followed by a 0 bit), so only the rightmost 23 of the remaining 28
bits of the IP address are mapped into the MAC address. The other 5
bits of the IP address are lost. As a result, 32 multicast IP
addresses are mapped to the same MAC address. For example, IP
multicast addresses 224.0.1.1, 224.128.1.1, 225.0.1.1, and 239.128.1.1
are all mapped to MAC multicast address 01-00-5e-00-01-01. Such
address conflicts must be considered in address assignment.
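The mapping described above can be checked with a few lines of arithmetic. This is an illustrative sketch; the function name multicast_mac is hypothetical.

```python
import ipaddress


def multicast_mac(group):
    """Map an IPv4 multicast address to its multicast MAC address."""
    ip = ipaddress.IPv4Address(group)
    if not ip.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(ip) & 0x7FFFFF          # keep the rightmost 23 bits
    mac = 0x01005E000000 | low23        # fixed 25-bit prefix plus 23 bits
    return "-".join(f"{(mac >> shift) & 0xFF:02x}"
                    for shift in range(40, -1, -8))


for group in ("224.0.1.1", "224.128.1.1", "225.0.1.1", "239.128.1.1"):
    print(group, multicast_mac(group))
```

All four addresses from the example above produce 01-00-5e-00-01-01, because they differ only in the 5 bits that the mapping discards.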
IGMP
IGMP is deployed between multicast routers and user hosts. On a
multicast router, IGMP is configured on interfaces connected to hosts.
On hosts, IGMP allows group members to dynamically join and leave
multicast groups. On routers, IGMP manages and maintains group
memberships and exchanges information with upper-layer multicast
routing protocols.
PIM
PIM has two modes: PIM-DM and PIM-SM.
PIM must be enabled on all interfaces of all multicast routers.
PIM provides multicast routing and forwarding, and maintains the
multicast routing table based on network topology changes.
IGMP snooping
IGMP snooping is deployed in VLANs on Layer 2 switches between
multicast routers and hosts.
IGMP snooping listens to the IGMP messages exchanged between routers
and hosts to create and maintain a Layer 2 multicast forwarding table.
In this manner, multicast data can be forwarded on a Layer 2 network
only to the ports that need it.
IGMP
IGMP is an IPv4 group membership management protocol in the TCP/IP
protocol suite. IP hosts use IGMP to report their group memberships to
any immediately neighboring multicast routers.
IGMP is deployed between multicast routers and hosts. On a multicast
router, IGMP is configured on interfaces connected to hosts.
On hosts, IGMP allows group members to dynamically join and leave
multicast groups. On routers, IGMP manages and maintains group
memberships and exchanges information with upper-layer multicast
routing protocols.
The IGMP versions are backward compatible. Therefore, a multicast
router running a later IGMP version can identify Membership Report
messages sent from hosts running an earlier IGMP version, although the
IGMP messages in different versions use different formats.
All IGMP versions support the any-source multicast (ASM) model.
IGMPv3 can be used on its own in the source-specific multicast (SSM)
model, whereas IGMPv1 and IGMPv2 must be used with SSM mapping.

IGMP messages are encapsulated in IP packets. IGMPv1 defines the
following types of messages:
  General Query: Sent by a querier to all hosts and routers on the
  shared network segment to discover which multicast groups have
  members on the network segment.
  Report: Sent by a host to request to join a multicast group or to
  respond to a General Query message.
How IGMPv1 works
IGMPv1 uses a query-report mechanism to manage multicast groups. When
multiple multicast routers exist on a network segment, one router is
elected as the IGMP querier to send Query messages. IGMPv1 has no
querier election of its own: a unique Assert winner or designated
router (DR) elected by Protocol Independent Multicast (PIM) works as
the querier. (The election mechanism is described later.) The querier
is the only device that sends Membership Query messages on the local
network segment.
General query and report
In the multicast network, R1 and R2 connect to a user network segment
with three receivers: PC1, PC2, and PC3. R1 is the querier on the
network segment. PC1 and PC2 want to receive data sent to group G1,
and PC3 wants to receive data sent to group G2. The general query and
report process is as follows:
  1. The IGMP querier (R1) sends a General Query message with the
     destination address 224.0.0.1 (indicating all hosts and routers
     on the same network segment). The IGMP querier sends General
     Query messages at intervals. The interval can be configured using
     a command, and the default interval is 60 seconds.
  2. All hosts on the network segment receive the General Query
     message. PC1 and PC2 then start a timer for G1 (Timer-G1), and
     PC3 starts a timer for G2 (Timer-G2). The timer length is a
     random value between 0 and 10 seconds.
  3. The host whose timer expires first sends a Report message for the
     multicast group. In this example, Timer-G1 on PC1 expires first,
     and PC1 sends a Report message with the destination address G1.
     When PC2 detects the Report message sent by PC1, PC2 stops
     Timer-G1 and does not send a Report message for G1. This
     mechanism reduces the number of Report messages transmitted on
     the network segment, lowering the load on multicast routers.
  4. When Timer-G2 on PC3 expires, PC3 sends a Report message with the
     destination address G2 to the network segment.
  5. After the routers receive the Report messages, they know that
     multicast groups G1 and G2 have members on the local network
     segment. The routers use the multicast routing protocol to create
     (*, G1) and (*, G2) entries, in which * stands for any multicast
     source. Once the routers receive data sent to G1 and G2, they
     forward the data to this network segment.
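The report suppression in the steps above can be illustrated with a small deterministic model. This is a sketch under stated assumptions: the function name run_query_round is hypothetical, and the timer values are fixed by hand here, whereas real hosts pick them at random between 0 and 10 seconds.

```python
def run_query_round(timers):
    """Return the (host, group) Reports sent after one General Query.

    timers maps each host to (group, delay in seconds). A host stays
    silent if another host has already reported the same group.
    """
    reported_groups = set()
    sent = []
    # Process hosts in the order in which their timers expire.
    for host, (group, _delay) in sorted(timers.items(),
                                        key=lambda item: item[1][1]):
        if group in reported_groups:
            continue  # suppressed by an earlier Report for this group
        reported_groups.add(group)
        sent.append((host, group))
    return sent


# Hand-picked delays: Timer-G1 on PC1 expires before Timer-G1 on PC2.
timers = {"PC1": ("G1", 2.0), "PC2": ("G1", 5.5), "PC3": ("G2", 7.0)}
print(run_query_round(timers))
```

With these delays, only PC1 and PC3 send Reports, which matches the example above.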

A member joins a group
A new host, PC4, connects to the network segment. PC4 wants to join
multicast group G3 but detects no multicast data for G3. In this case,
PC4 immediately sends a Report message for G3 without waiting for a
General Query message. After receiving the Report message, the routers
know that a member of G3 has connected to the network segment, and
they create a (*, G3) entry. When the routers receive data sent to G3,
they forward the data to this network segment.
A member leaves a group
IGMPv1 does not define a Leave message. After a host leaves a
multicast group, it no longer responds to General Query messages.
Assume that PC4 has left group G3. It does not send Report messages
for G3 when it receives General Query messages.
Because G3 has no other member, the routers no longer receive Report
messages for G3. After the membership timeout interval (130 seconds by
default; Membership timeout interval = IGMP general query interval x
Robustness variable + Maximum response time), the routers delete the
multicast forwarding entry of G3.
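The 130-second figure follows directly from the formula. In the sketch below, the query interval (60 seconds) and the maximum response time (10 seconds) are the values given in this section; the robustness variable of 2 is an assumption inferred from the 130-second result, and the function name membership_timeout is hypothetical.

```python
def membership_timeout(query_interval=60, robustness=2, max_response_time=10):
    """Membership timeout = query interval x robustness + max response time."""
    return query_interval * robustness + max_response_time


print(membership_timeout())  # 60 x 2 + 10
```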
IGMPv2 defines two new types of messages in addition to General Query
and Report messages:
  Group-Specific Query: Sent by a querier to a specified group on the
  local network segment to check whether the group has members.
  Leave: Sent by a host to notify routers on the local network segment
  that it has left a group.
IGMPv2 modifies the General Query message format by adding the Max
Response Time field. The field value controls how quickly group
members respond and is configurable.
Querier election
IGMPv2 defines an independent querier election mechanism. When
multiple multicast routers are available on a shared network segment,
the router with the smallest IP address is elected as the querier.
IGMPv1, in contrast, depends on upper-layer multicast protocols such
as PIM for querier election.
  1. Each IGMPv2 router considers itself a querier when it starts and
     sends a General Query message to all hosts and routers on the
     local network segment.
  2. When other routers receive the General Query message, they
     compare the source IP address of the message with their own
     interface IP addresses. The router with the smallest IP address
     becomes the querier, and the other routers become non-queriers.
     In this network, R1 has a smaller interface IP address than R2,
     so R1 becomes the querier.
  3. All non-querier routers start the Other Querier Present Timer
     (Timer length = Robustness variable x IGMP general query interval
     + (1/2) x Maximum response time). If the robustness variable,
     IGMP general query interval, and maximum response time all use
     their default values, the timer length is 125 seconds. If
     non-querier routers receive a Query message from the querier
     before the timer expires, they reset the timer. If they receive
     no Query message from the querier before the timer expires, they
     trigger election of a new querier.
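The election rule and the timer formula above can be sketched as follows. The function names and the interface addresses are hypothetical; the default robustness variable of 2 and maximum response time of 10 seconds are assumptions consistent with the 125-second result stated above.

```python
import ipaddress


def elect_querier(interface_ips):
    """The router with the numerically smallest IP address wins."""
    return min(interface_ips, key=lambda ip: int(ipaddress.IPv4Address(ip)))


def other_querier_present_timer(robustness=2, query_interval=60,
                                max_response_time=10):
    """Robustness x query interval + (1/2) x maximum response time."""
    return robustness * query_interval + max_response_time / 2


# Addresses are compared numerically, not as strings.
print(elect_querier(["10.0.0.10", "10.0.0.9"]))
print(other_querier_present_timer())
```

Comparing the addresses as integers matters: a plain string comparison would wrongly rank 10.0.0.10 below 10.0.0.9.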
Leave mechanism
In IGMPv2, the following process occurs when PC3 wants to leave
multicast group G2 and PC3 was the last member to respond to a query
for the group:
  1. PC3 sends a Leave message for G2 to all multicast routers on the
     local network segment. The destination address of the Leave
     message is 224.0.0.2.
  2. When the querier receives the Leave message, it sends
     Group-Specific Query messages for G2 at intervals to check
     whether G2 has other members on the network segment. The sending
     interval and the number of Group-Specific Query messages are
     configurable. By default, the querier sends two Group-Specific
     Query messages at an interval of 1 second. In addition, the
     querier starts the membership timer (Timer-Membership, Timer
     length = Interval for sending Group-Specific Query messages x
     Number of messages sent).
  3. If G2 has no other member on the network segment, the routers
     receive no Report message for G2. After Timer-Membership expires,
     the routers delete the downstream interface connected to the
     network segment from the (*, G2) entry and no longer forward data
     of G2 to the network segment.
  4. If G2 has other members on the network segment, the members send
     a Report message for G2 within the maximum response time, and the
     routers continue to maintain the membership of G2.
IGMPv3 was developed to support the source-specific multicast (SSM)
model. IGMPv3 messages can contain multicast source information so
that hosts can receive data sent from a specific source to a specific
group.
IGMPv3 defines two types of messages: Query and Report. Compared with
IGMPv2, IGMPv3 has the following changes:
  In addition to General Query and Group-Specific Query messages,
  IGMPv3 defines a new Query message type: Group-and-Source-Specific
  Query. A querier sends a Group-and-Source-Specific Query message to
  members of a specific group on the shared network segment to check
  whether the group members want data from specific sources. A
  Group-and-Source-Specific Query message carries one or more
  multicast source addresses.
  A host can send a Report message to notify a multicast router that
  it wants to join a multicast group and receive data from specified
  multicast sources. IGMPv3 supports source filtering and defines two
  filter modes: INCLUDE and EXCLUDE. Group-source mappings are
  represented as (G, INCLUDE, (S1, S2...)) or (G, EXCLUDE, (S1,
  S2...)). The (G, INCLUDE, (S1, S2...)) entry indicates that a host
  wants to receive only data sent from the listed multicast sources to
  group G. The (G, EXCLUDE, (S1, S2...)) entry indicates that a host
  wants to receive data sent to group G from all multicast sources
  except the listed ones.
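The two filter modes reduce to a membership test on the source list. The sketch below is illustrative only; the function name wants_source and the source labels are hypothetical.

```python
def wants_source(source, mode, source_list):
    """Decide whether a host with this filter wants data from the source."""
    if mode == "INCLUDE":
        return source in source_list      # only the listed sources
    if mode == "EXCLUDE":
        return source not in source_list  # every source except those listed
    raise ValueError(f"unknown filter mode: {mode}")


print(wants_source("S1", "INCLUDE", {"S1", "S2"}))
print(wants_source("S3", "INCLUDE", {"S1", "S2"}))
print(wants_source("S3", "EXCLUDE", {"S1", "S2"}))
```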
Group Record types in IGMPv3 Report messages
IS_IN
  Indicates that the source filter mode is INCLUDE for a multicast
  group. That is, members of the group want to receive only data sent
  from the specified sources.
IS_EX
  Indicates that the source filter mode is EXCLUDE for a multicast
  group. That is, members of the group want to receive data sent from
  all multicast sources except the specified sources.
TO_IN
  Indicates that the source filter mode for a multicast group has
  changed from EXCLUDE to INCLUDE. If the source list is empty, the
  members have left the multicast group.
TO_EX
  Indicates that the source filter mode for a multicast group has
  changed from INCLUDE to EXCLUDE.
ALLOW
  Indicates that members of a multicast group want to receive data
  from the specified multicast sources in addition to the current
  sources. If the source filter mode for the multicast group is
  INCLUDE, the specified sources are added to the source list. If the
  source filter mode is EXCLUDE, the specified sources are deleted
  from the source list.
BLOCK
  Indicates that members of a multicast group no longer want to
  receive data from the specified multicast sources. If the source
  filter mode for the multicast group is INCLUDE, the specified
  sources are deleted from the source list. If the source filter mode
  is EXCLUDE, the specified sources are added to the source list.
An IGMPv3 Report message can carry multiple groups, whereas an IGMPv1
or IGMPv2 Report message can carry only one group. IGMPv3 therefore
greatly reduces the number of messages transmitted on a network.
Unlike IGMPv2, IGMPv3 does not define a Leave message. Group members
send Report messages of a specified type to notify multicast routers
that they have left a group. For example, if a member of group
225.1.1.1 wants to leave the group, it sends a Report message with
(225.1.1.1, TO_IN, (0)).
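The list updates described for each record type can be sketched as a small state function. This models only the per-group (mode, source list) updates described above for a single reporter; it is not the full router-side state machine of the IGMPv3 specification. The names apply_record and has_left are hypothetical.

```python
def apply_record(state, record_type, sources):
    """Return the new (mode, source set) after one Group Record."""
    mode, current = state
    sources = frozenset(sources)
    if record_type in ("IS_IN", "TO_IN"):
        return ("INCLUDE", sources)
    if record_type in ("IS_EX", "TO_EX"):
        return ("EXCLUDE", sources)
    if record_type == "ALLOW":
        # INCLUDE: add to the list; EXCLUDE: stop excluding these sources.
        updated = current | sources if mode == "INCLUDE" else current - sources
        return (mode, updated)
    if record_type == "BLOCK":
        # INCLUDE: remove from the list; EXCLUDE: exclude these sources too.
        updated = current - sources if mode == "INCLUDE" else current | sources
        return (mode, updated)
    raise ValueError(f"unknown record type: {record_type}")


def has_left(state):
    """INCLUDE mode with an empty source list means no membership."""
    mode, current = state
    return mode == "INCLUDE" and not current


state = apply_record(("INCLUDE", frozenset()), "IS_IN", {"S1"})
state = apply_record(state, "ALLOW", {"S2"})
state = apply_record(state, "BLOCK", {"S1"})
print(state)
print(has_left(apply_record(state, "TO_IN", set())))
```

The final call mirrors the leave example above: a TO_IN record with an empty source list leaves the group with no wanted sources.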
If IGMPv1 or IGMPv2 is running between a host and its upstream router,
the host cannot select multicast sources when it joins group G. The
host receives data from both S1 and S2, regardless of whether it
requires the data. If IGMPv3 is running between the host and its
upstream router, the host can choose to receive only data from S1
using either of the following methods:
  Method 1: Send an IGMPv3 Report (G, IS_IN, (S1)), requesting to
  receive only the data sent from S1 to G.
  Method 2: Send an IGMPv3 Report (G, IS_EX, (S2)), notifying the
  upstream router that it does not want to receive data from S2. Only
  data sent from S1 is then forwarded to the host.

Compatibility with IGMPv1 routers
When IGMPv2 hosts discover an IGMPv1 router, they must send IGMPv1
Report messages to the router and must not send Leave messages.
If there are both IGMPv1 and IGMPv2 routers on a network segment, the
querier must send IGMPv1 messages.
Compatibility with IGMPv1 hosts
IGMPv2 hosts must allow their Report messages to be suppressed by
IGMPv1 Report messages. Otherwise, the querier will not know that
IGMPv1 hosts exist on the shared network segment. If an IGMPv2 querier
acted on a Leave message for a group that still has IGMPv1 members,
the IGMPv1 hosts would stop receiving traffic for the group.
Therefore, if an IGMPv2 router detects IGMPv1 hosts on the local
network segment, the router ignores any subsequent Leave messages
received.
SSM mapping is implemented based on static SSM mapping entries. A
multicast router converts (*, G) information in IGMPv1 and IGMPv2
Report messages to (S, G) information according to static SSM mapping
entries, so as to provide the SSM service for IGMPv1 and IGMPv2 hosts.
By default, SSM group addresses range from 232.0.0.0 to
232.255.255.255.
IGMP SSM mapping does not apply to IGMPv3 Report messages. To enable
hosts running any IGMP version on a network segment to obtain the SSM
service, IGMPv3 must run on the interfaces of multicast routers on the
network segment.
With SSM mapping entries configured, a router checks the group address
G in each IGMPv1 or IGMPv2 Report message received, and processes the
message based on the check result:
  If G is in the range of any-source multicast (ASM) group addresses,
  the router provides the ASM service for the host.
  If G is in the range of SSM group addresses:
    If the router has no SSM mapping entry matching G, it does not
    provide the SSM service and drops the Report message.
    If the router has an SSM mapping entry matching G, it converts
    (*, G) information in the Report message into (S, G) information
    and provides the SSM service for the host.
On an SSM network, PC1 runs IGMPv3, PC2 runs IGMPv2, and PC3 runs
IGMPv1. PC2 and PC3 cannot run IGMPv3. To provide the SSM service for
all the hosts on the network segment, IGMP SSM mapping must be
configured on R1.
Before SSM mapping is enabled, the group-source mappings on R1 are as
follows:
  Group 232.0.0.0/8 mapped to source 10.10.1.1
After SSM mapping is enabled on R1, R1 checks the group addresses of
received packets to see whether they are in the SSM group address
range. If they are, R1 generates multicast entries according to the
configured SSM mapping entries. If a group address is mapped to
multiple sources, R1 generates multiple (S, G) entries. The following
entries are generated according to information in the Report messages
sent from PC2 and PC3:
  (10.10.1.1, 232.1.2.2)
  (10.10.2.2, 232.1.2.2)
  (10.10.1.1, 232.1.3.3)
  (10.10.2.2, 232.1.3.3)
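The conversion from (*, G) to (S, G) can be sketched as a lookup against the configured mappings. This is an illustration, not router code: the function name map_report is hypothetical, and the second mapping entry (to source 10.10.2.2) is an assumption inferred from the generated entries listed above, because the extracted text shows only the mapping to 10.10.1.1.

```python
import ipaddress

SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")  # default SSM group range


def map_report(group, mappings):
    """Convert a (*, G) Report into the entries a router would create.

    mappings is a list of (group prefix, source) pairs. An empty result
    means the Report is dropped (SSM group with no matching entry).
    """
    address = ipaddress.ip_address(group)
    if address not in SSM_RANGE:
        return [("*", group)]  # ASM service
    return [(source, group)
            for prefix, source in mappings
            if address in ipaddress.ip_network(prefix)]


mappings = [
    ("232.0.0.0/8", "10.10.1.1"),
    ("232.0.0.0/8", "10.10.2.2"),  # assumed second entry, see lead-in
]
print(map_report("232.1.2.2", mappings))
print(map_report("225.1.1.1", mappings))
print(map_report("232.1.2.2", []))
```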
Report message to the upstream device. The upstream device can

send multicast packets to the host after receiving the Report message.
IGMP messages are encapsulated in IP packets (Layer 3 packets). Layer
2 devices between hosts and multicast routers, however, cannot process
the Layer 3 information carried in IP packets. In addition, Layer 2
devices cannot learn any multicast MAC address, because the source MAC
address of a link layer data frame is never a multicast MAC address.
When a Layer 2 device receives a data frame with a multicast
destination MAC address, the device cannot find a matching entry in
its MAC address table. Consequently, the device floods the frame to
all ports in the broadcast domain. This wastes bandwidth resources and
poses threats to network security.
Concepts
A router port is a port on a link layer device that faces a multicast
router. The link layer multicast device receives multicast packets
from the router through the router port. Router ports are classified
into two types:
  Dynamic router port: A port that receives IGMP Query messages or PIM
  Hello messages whose source addresses are not 0.0.0.0. Dynamic
  router ports are dynamically maintained based on protocol packets
  exchanged between multicast devices and hosts. Each dynamic router
  port has a timer. When the timer expires, the router port ages out.
  Static router port: Manually specified using a command. Static
  router ports do not age out.
A group member port is a port that faces user hosts. A link layer
multicast device sends multicast packets to receiver hosts through
group member ports. Group member ports are classified into two types:
  Dynamic member port: A port that receives IGMP Report messages.
  Dynamic member ports are dynamically maintained based on protocol
  packets exchanged between multicast devices and hosts. Each dynamic
  member port has a timer. When the timer expires, the member port
  ages out.
  Static member port: Manually specified using a command. Static
  member ports do not age out.
The outbound port list is the key information in a Layer 2 multicast
forwarding entry. It includes both router ports and member ports.
Working mechanisms
IGMP General Query message: When a router port on an Ethernet switch
receives an IGMP General Query message, the switch resets the aging
timer of the router port. If the port that receives the General Query
message is not yet a router port, the switch makes it a router port
and starts the aging timer for the port. (The aging time is 180
seconds for General Query messages, or the Holdtime value carried in
PIM Hello messages received by the switch. The default Holdtime value
is 105 seconds.)
IGMP Report message: When an Ethernet switch receives an IGMP Report
message, it checks whether there is a MAC multicast group matching the
IP multicast group that the user wants to join.
  If the MAC multicast group does not exist, the switch creates the
  MAC multicast group, adds the port that receives the Report message
  to the MAC multicast group, and starts the aging timer on the port
  (Timer length = Robustness variable x General query interval +
  Maximum response time). In addition, the switch adds all router
  ports in the same VLAN as the member port to the MAC multicast
  forwarding entry. It then creates an IP multicast group and adds the
  port that receives the Report message to the IP multicast group.
  If the MAC multicast group exists but the port that receives the
  IGMP Report message is not in the group, the switch adds the port to
  the MAC multicast group and starts the aging timer on the port. The
  switch then checks whether the IP multicast group exists. If the IP
  multicast group does not exist, the switch creates the IP multicast
  group and adds the port to it. If the IP multicast group exists, the
  switch adds the port to the group directly.
  If the MAC multicast group exists and the port that receives the
  IGMP Report message is already in the group, the switch resets the
  aging timer on the port.
IGMP Leave message: When an Ethernet switch receives an IGMP Leave
message for a group on a port, it sends an IGMP Group-Specific Query
message to the port to check whether the group has other members on
the port. At the same time, the switch starts the query response timer
(Timer length = Group-specific query interval x Robustness variable).
If the switch does not receive any IGMP Report message for the group
before the query response timer expires, it deletes the port from the
matching MAC multicast group. If the MAC multicast group then has no
member port, the switch requests the upstream multicast router to
delete this branch from the multicast tree.
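The Report and Leave handling above can be sketched as a small table model. This is a simplification under stated assumptions: the MAC and IP multicast groups are collapsed into one table keyed by group address, the class and method names are hypothetical, the port names are made up, and the default timer inputs (robustness variable 2, query interval 60 seconds, maximum response time 10 seconds) are assumed values.

```python
class SnoopingTable:
    """Toy model of an IGMP snooping forwarding table for one VLAN."""

    def __init__(self, robustness=2, query_interval=60, max_response_time=10):
        # Member port aging time, per the formula in the text.
        self.member_aging = robustness * query_interval + max_response_time
        self.router_ports = set()
        self.groups = {}  # group -> {member port: expiry time}

    def on_report(self, group, port, now):
        # Creates the group and the member port if needed; otherwise this
        # simply resets the aging timer of the existing member port.
        self.groups.setdefault(group, {})[port] = now + self.member_aging

    def on_leave_timeout(self, group, port):
        """Called when no Report arrived before the query response timer
        expired. Returns True if the group has no member port left, in
        which case the branch should be pruned upstream."""
        members = self.groups.get(group, {})
        members.pop(port, None)
        if not members:
            self.groups.pop(group, None)
            return True
        return False

    def outbound_ports(self, group):
        if group not in self.groups:
            return set()
        return set(self.groups[group]) | self.router_ports


table = SnoopingTable()
table.router_ports.add("GE0/0/1")
table.on_report("225.1.1.1", "GE0/0/2", now=0)
print(sorted(table.outbound_ports("225.1.1.1")))
print(table.on_leave_timeout("225.1.1.1", "GE0/0/2"))
```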
Layer 2 multicast
If users in different VLANs require the same multicast data, the
upstream router still has to send a separate copy of the identical
multicast data to each VLAN.
Users in VLAN 2 and VLAN 3 need to receive the same multicast data
flow. Multicast router R1 replicates the multicast data in each VLAN
and sends two copies of the data to downstream switch S1. This wastes
bandwidth between the router and the Layer 2 device and increases the
load on the router.
Multicast VLAN
The multicast VLAN feature allows Layer 2 network devices to replicate
multicast data across VLANs.
After the multicast VLAN function is configured on S1, R1 sends only
one copy of the multicast data to S1, in the multicast VLAN (VLAN 4).
Because the router does not need to replicate multicast data in VLAN 2
and VLAN 3, network bandwidth is conserved and the load on the router
is reduced.
Concepts
Multicast VLAN: The VLAN to which a network-side interface belongs. A
multicast VLAN is used to aggregate multicast data flows. One
multicast VLAN can be bound to multiple user VLANs.
User VLAN: The VLAN to which a user-side interface belongs. A user
VLAN is used to receive multicast data flows from the multicast VLAN.
A user VLAN can be bound to only one multicast VLAN.
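The binding rules above (one multicast VLAN to many user VLANs, one user VLAN to a single multicast VLAN) can be sketched as follows. The class MulticastVlanBindings and its methods are hypothetical names used only for illustration; the VLAN numbers follow the example above.

```python
class MulticastVlanBindings:
    """Tracks which multicast VLAN each user VLAN is bound to."""

    def __init__(self):
        self.user_to_multicast = {}

    def bind(self, multicast_vlan, user_vlan):
        bound = self.user_to_multicast.get(user_vlan)
        if bound is not None and bound != multicast_vlan:
            # A user VLAN can be bound to only one multicast VLAN.
            raise ValueError(
                f"user VLAN {user_vlan} is already bound to "
                f"multicast VLAN {bound}")
        self.user_to_multicast[user_vlan] = multicast_vlan

    def user_vlans(self, multicast_vlan):
        """User VLANs that receive copies of this multicast VLAN's data."""
        return sorted(user for user, mvlan in self.user_to_multicast.items()
                      if mvlan == multicast_vlan)


bindings = MulticastVlanBindings()
bindings.bind(4, 2)
bindings.bind(4, 3)
print(bindings.user_vlans(4))
```

Attempting to bind VLAN 2 to a second multicast VLAN raises an error, which reflects the one-to-one constraint on the user side.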
We have learned about the Internet Group Management Protocol (IGMP).
IGMP runs between receiver hosts and multicast routers, whereas a
multicast routing protocol needs to run between routers.
A multicast routing protocol is used to create and maintain multicast
routes, and to forward multicast data packets correctly and
efficiently. Multicast routes construct a unidirectional, loop-free
data transmission path from a data source to multiple receivers. This
transmission path is a multicast distribution tree. Multicast routing
protocols can be intra-domain or inter-domain protocols. This course
introduces PIM, a typical intra-domain multicast routing protocol.

PIM router
Routers with PIM enabled on interfaces are called PIM routers. A
multicast distribution tree contains the following types of PIM
routers:
  Leaf router: A PIM router directly connected to user hosts, which
  are not necessarily multicast group members.
  First-hop router: The PIM router directly connected to a multicast
  source on the multicast forwarding path and responsible for
  forwarding multicast data from the multicast source.
  Last-hop router: The PIM router directly connected to a multicast
  group member on the multicast forwarding path and responsible for
  forwarding multicast data to the member.
Multicast distribution tree
On a PIM network, a point-to-multipoint multicast forwarding
path is set up for each multicast group on routers. The
multicast forwarding path is in a tree topology, so it is also
called a multicast distribution tree.
There are two types of multicast distribution trees: source tree
and shared tree.

Source tree
A source tree is rooted at a multicast source and combines the
shortest paths from the source to receivers.
Therefore, a source tree is also called a shortest path tree
(SPT). For a multicast group, routers need to establish an SPT
from each multicast source that sends packets to the group.
In this example, there are two multicast sources (S1 and S2)
and two receivers (PC1 and PC2). Therefore, two source trees
are established on the network.

PIM routing entry
PIM routing entries are created by PIM to guide
multicast forwarding.
An (S, G) entry contains a known multicast source for a group,
and is used to establish an SPT on PIM routers. (S, G) entries
apply to both PIM-DM and PIM-SM networks.
If an (S, G) entry exists on a PIM router, the router forwards
multicast packets according to the (S, G) entry.

Shared tree
A shared tree is rooted at a rendezvous point (RP) and
combines the shortest paths from the RP to all receivers. It is
therefore also called a rendezvous point tree (RPT). Each
multicast group has only one shared tree. All multicast sources
and receivers of a group send and receive multicast data packets
along the shared tree. A multicast source first sends data
packets to the RP, which then forwards the packets to all
receivers.
In this example, multicast sources S1 and S2 share one RPT.

PIM routing entry
PIM routing entries are created by PIM to guide
multicast forwarding.
A (*, G) entry contains a known multicast group, with the
multicast source unknown. It is used to establish an RPT on
PIM routers. (*, G) entries apply only to PIM-SM networks.
If no (S, G) entry is available and only a (*, G) entry exists on a
router, the router creates an (S, G) entry based on this (*, G)
entry, and then forwards multicast packets according to the (S,
G) entry.
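
The entry lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entry` class, the `lookup` function, the "*" wildcard key, and the interface names are all hypothetical and do not come from any router software.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A simplified multicast routing entry: only the downstream interface list."""
    downstream: set = field(default_factory=set)


def lookup(table, source, group):
    """Return the entry used to forward a packet from `source` to `group`.

    `table` maps (source, group) keys to Entry objects; the string "*"
    stands for the wildcard source of a (*, G) entry.
    """
    if (source, group) in table:
        return table[(source, group)]
    if ("*", group) in table:
        # No (S, G) entry yet: derive one from the (*, G) entry.
        derived = Entry(downstream=set(table[("*", group)].downstream))
        table[(source, group)] = derived
        return derived
    return None  # no matching entry, so the packet is not forwarded


# Hypothetical table holding a single (*, G) entry.
table = {("*", "225.1.1.1"): Entry({"GE0/0/1", "GE0/0/2"})}
entry = lookup(table, "10.1.1.1", "225.1.1.1")
print(sorted(entry.downstream))  # ['GE0/0/1', 'GE0/0/2']
```

After the call, the table also holds a ("10.1.1.1", "225.1.1.1") entry, which mirrors the behavior of creating an (S, G) entry from the (*, G) entry.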
PIM-DM overview
PIM-DM uses the push mode to forward multicast packets and
is often used on small-scale networks with densely distributed
multicast group members. PIM-DM assumes that each
network segment has multicast group members. When a
multicast source sends multicast packets, PIM-DM floods the
multicast packets to all PIM routers on the network and prunes
the branches with no members. PIM-DM establishes and
maintains a unidirectional loop-free SPT (source-specific
shortest path tree) through periodic flood-and-prune
processes. If a new group member connects to a leaf router on
a pruned branch, the router can initiate a grafting process to
restore multicast forwarding before the next flood-and-prune
process.
PIM-DM uses the following mechanisms: neighbor discovery, flooding,
pruning, grafting, assert, and state refresh. The flooding, pruning, and
grafting mechanisms are used to establish an SPT.

PIM routers send Hello messages through all PIM-enabled interfaces.
The multicast packet encapsulating a Hello message has a destination
IP address of 224.0.0.13 (indicating all PIM routers on a network
segment), and the source IP address is the IP address of the interface
sending the multicast packet. The TTL value of the multicast packet is 1.
Hello messages are used to discover PIM neighbors, adjust PIM
protocol parameters, and maintain neighbor relationships.

Discovering PIM neighbors
PIM routers on the same network segment must
receive multicast packets with the destination address
224.0.0.13. By exchanging Hello messages, directly
connected PIM routers learn neighbor information and
establish neighbor relationships.
A PIM router can receive other PIM messages to
create multicast routing entries only after it establishes
neighbor relationships with other PIM routers.

Adjusting PIM protocol parameters
A Hello message carries the following PIM protocol
parameters to control PIM message exchange between
PIM neighbors:
DR_Priority: indicates the priority used by an
interface in DR election. The interface with the
highest priority becomes the DR. This
parameter is used for DR election only on PIM-SM
networks.
Holdtime: indicates the timeout interval of a
neighbor relationship. A PIM router considers its
neighbor reachable within the Holdtime interval.
LAN_Delay: indicates the delay in transmitting
Prune messages on a shared network segment.
Neighbor-Tracking: indicates the neighbor
tracking function.
Override-Interval: indicates the interval for
overriding a pruning operation.

Maintaining neighbor relationships
PIM routers periodically send Hello messages to each
other. If a PIM router does not receive any Hello
message from a PIM neighbor within the Holdtime
interval, the router considers the neighbor unreachable
and deletes the neighbor from the neighbor list.
Changes of PIM neighbors lead to changes in the
multicast network topology. If an upstream or
downstream neighbor in the multicast distribution tree
is unreachable, multicast routes need to re-converge,
and the multicast distribution tree will change.
IGMPv1 querier election
Routers on a PIM-DM network compare the priorities and IP
addresses carried in Hello messages to elect a DR for each
network segment. The DR functions as the IGMPv1 querier on
the network segment.
If the DR fails, neighboring routers trigger a new DR election
process when the Hello timeout timer expires.

Hello timers
The default Hello interval is 30 seconds.
The default Hello timeout interval is 105 seconds.
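
As a rough illustration of how the Hello timeout drives neighbor maintenance, the following Python sketch ages out neighbors whose last Hello is older than the timeout. The function name, the dictionary layout, and the sample addresses are assumptions made for this example; only the 30-second and 105-second defaults come from the text.

```python
HELLO_INTERVAL = 30      # seconds, default from the text
HELLO_HOLDTIME = 105     # seconds, default from the text (3.5 x 30)


def live_neighbors(last_hello, now, holdtime=HELLO_HOLDTIME):
    """Return neighbors whose most recent Hello is still within the holdtime.

    `last_hello` maps a neighbor address to the time (in seconds) at which
    its last Hello message was received.
    """
    return {addr for addr, seen in last_hello.items() if now - seen < holdtime}


# Hypothetical neighbor table: one neighbor last heard at t=0, one at t=60.
last_hello = {"12.1.1.1": 0, "12.1.1.2": 60}
print(sorted(live_neighbors(last_hello, now=100)))  # ['12.1.1.1', '12.1.1.2']
print(sorted(live_neighbors(last_hello, now=120)))  # ['12.1.1.2']
```

With the defaults, a neighbor survives three missed Hello messages before it is removed, because 105 seconds is three and a half Hello intervals.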
On a PIM-DM network, multicast packets sent from a multicast source
are flooded throughout the entire network. When a PIM router receives
a multicast packet, the router performs an RPF check on the packet
against the unicast routing table. If the packet passes the RPF check,
the router creates an (S, G) entry, in which the downstream interface
list contains all the interfaces connected to downstream PIM neighbors.
The router then forwards subsequent multicast packets through each
downstream interface.
When multicast packets reach a leaf router, the leaf router processes
the packets as follows:
If the network segment connected to the leaf router has group
members, the leaf router adds its interface connected to the
network segment to the downstream interface list of the (S, G)
entry, and forwards subsequent multicast packets to the group
members.
If the network segment connected to the leaf router has no
group member and the leaf router does not need to forward
multicast packets to downstream PIM neighbors, the leaf
router initiates a pruning process.

Multicast source S sends a multicast packet to multicast group
G.
When R1 receives the multicast packet, it performs an RPF
check on the packet against the unicast routing table. After the
packet passes the RPF check, R1 creates an (S, G) entry, in
which the downstream interface list contains interfaces
connected to R2 and R5. R1 then forwards subsequent
packets to R2 and R5.
R2 receives the multicast packet from R1. After the packet
passes the RPF check, R2 creates an (S, G) entry, in which
the downstream interface list contains the interfaces
connected to R3 and R4. R2 then forwards subsequent
packets to R3 and R4.
R5 receives the multicast packet from R1. Because the
downstream network segment does not have group members
or PIM neighbors, R5 triggers a pruning process.
R3 receives the multicast packet from R2. After the packet
passes the RPF check, R3 creates an (S, G) entry, in which
the downstream interface list contains the interface connected
to PC1. R3 then forwards subsequent packets to PC1.
R4 receives the multicast packet from R2. Because the
downstream network segment does not have group members
or PIM neighbors, R4 triggers a pruning process.
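
The walkthrough above can be mimicked with a small Python sketch that floods from R1 and reports which routers start a pruning process. It is a simplified model, not a protocol implementation: the `DOWNSTREAM` and `MEMBERS` structures are hypothetical, RPF checks are assumed to pass, and shared network segments are not modeled.

```python
# Hypothetical model of the example topology: each router lists the
# downstream PIM neighbors it floods to, and MEMBERS lists the routers
# that have locally attached group members (R3 connects to PC1).
DOWNSTREAM = {"R1": ["R2", "R5"], "R2": ["R3", "R4"], "R3": [], "R4": [], "R5": []}
MEMBERS = {"R3"}


def needs_traffic(router):
    """A router stays on the tree if it has members or a downstream router does."""
    return router in MEMBERS or any(needs_traffic(n) for n in DOWNSTREAM[router])


def pruning_routers(root):
    """Return the routers that would start a pruning process after flooding."""
    pruned = []
    stack = [root]
    while stack:
        router = stack.pop()
        if not needs_traffic(router):
            pruned.append(router)
            continue  # nothing below a pruned router receives traffic
        stack.extend(DOWNSTREAM[router])
    return sorted(pruned)


print(pruning_routers("R1"))  # ['R4', 'R5']
```

The result matches the text: R4 and R5 have neither group members nor downstream PIM neighbors, so they are the routers that trigger pruning.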
When a PIM router receives a multicast packet, it performs an RPF
check on the packet. If the packet passes the RPF check but the
downstream network segment does not have any group member, the
PIM router sends a Prune message to the upstream router. After
receiving the Prune message from the downstream interface, the
upstream router deletes the downstream interface from the downstream
interface list of the (S, G) entry. Multicast packets are no longer
forwarded to this downstream interface. A pruning operation is initiated
by a leaf router. The Prune message is sent upstream hop by hop, and
each PIM router receiving the Prune message deletes the downstream
interface from the (S, G) entry. Finally, the multicast distribution tree
contains only branches with group members.
A PIM router starts a prune timer (210 seconds by default) for the
pruned downstream interface and resumes multicast forwarding on the
interface after the timer expires. Multicast packets are then flooded on
the entire network, and new group members can receive multicast
packets. Subsequently, leaf routers without group members attached
trigger pruning processes. PIM-DM updates the SPT through periodic
flood-and-prune processes.
After a downstream interface of a leaf router is pruned:
If new members join the multicast group on the interface and
want to receive multicast packets before the next
flood-and-prune process, the leaf router initiates a grafting process.
If no member joins the multicast group and multicast
forwarding still needs to be suppressed on the interface, the
state refresh mechanism keeps the interface pruned.

R5 sends a Prune message to R1 to notify R1 that the
downstream network segment no longer needs to receive
multicast data.
After receiving the Prune message, R1 stops forwarding data
through its downstream interface connecting to R5, and
deletes this downstream interface from the (S, G) entry. R1
has another downstream interface in forwarding state, so the
pruning process ends. Subsequent multicast packets are only
forwarded to R2.
R4 sends a Prune message to R2 to notify R2 that the
downstream network segment no longer needs to receive
multicast data.
After receiving the Prune message, R2 waits for 3 seconds
(LAN-delay + override-interval). R3 also receives the Prune
message sent by R4. Because R3 connects to a downstream
receiver, R3 sends a Join message to override the Prune
message.
After R2 receives the Join message, it ignores the Prune
message sent from R4 and continues forwarding multicast
traffic to the downstream interface.

The LAN-delay and override-interval are explained as follows:
Hello messages carry the LAN-delay and override-interval
parameters. The LAN-delay parameter specifies the packet
transmission delay (500 milliseconds by default), and the
override-interval specifies the interval during which
downstream routers can override a pruning operation (2500
milliseconds by default).
If a router sends a Prune message upstream but other routers
on the same network segment still need to receive multicast
data, they must send a Join message to override the pruning
operation within the override-interval.
If routers on a link have different override-interval values, the
largest of these values is used on the link.
The sum of LAN-delay and override-interval is the prune-pending
timer (PPT). After a router receives a Prune message
from a downstream interface, it waits until the PPT expires,
and then prunes the downstream interface. If the router receives
a Join message from the downstream interface before the PPT
expires, it cancels the pruning operation.
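
The timer arithmetic can be checked with a short Python sketch. The function names and the millisecond bookkeeping are assumptions for illustration; the default values and the "largest override-interval wins" rule are taken from the text.

```python
LAN_DELAY_MS = 500           # default from the text
OVERRIDE_INTERVAL_MS = 2500  # default from the text


def prune_pending_ms(lan_delay_ms, override_intervals_ms):
    """Prune-pending timer for a link, in milliseconds.

    `override_intervals_ms` holds the override-interval advertised by each
    router on the link; the largest value is the one used on the link.
    """
    return lan_delay_ms + max(override_intervals_ms)


def prune_takes_effect(prune_time_ms, join_times_ms, ppt_ms):
    """True if no Join arrives before the prune-pending timer expires."""
    deadline = prune_time_ms + ppt_ms
    return not any(prune_time_ms <= t < deadline for t in join_times_ms)


ppt = prune_pending_ms(LAN_DELAY_MS, [OVERRIDE_INTERVAL_MS, OVERRIDE_INTERVAL_MS])
print(ppt)                                  # 3000
print(prune_takes_effect(0, [1200], ppt))   # False: a Join overrides the Prune
print(prune_takes_effect(0, [], ppt))       # True: the interface is pruned
```

The 3000 ms result corresponds to the 3 seconds that R2 waits in the example before acting on the Prune message from R4.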
Multicast routers prune branches without group members to establish a
new SPT according to received Prune messages. Although routers no
longer forward multicast packets to pruned branches, the
corresponding (S, G) entry still exists on each router. Once new
members join the group on the pruned branches, the downstream
interfaces can be quickly added to the entry to resume multicast
forwarding.
PIM-DM uses the grafting mechanism to enable new group members
on a pruned network segment to rapidly obtain multicast data. A leaf
router can determine that a multicast group G has new members on a
network segment according to IGMP messages. The leaf router then
sends a Graft message to notify the upstream router that the
downstream network segment needs multicast data. After receiving the
Graft message, the upstream router adds the downstream interface to
the downstream interface list of the (S, G) entry.
A grafting process is initiated by a leaf router and ends on the router
that can receive multicast packets.

Without grafting, a pruned branch resumes multicast forwarding
only when the prune timer expires, which can take up to 210
seconds with the default timer. This is a long time for new
group members to wait. To reduce the waiting time, a
pruned downstream router can send a Graft message to notify
the upstream router.
When the network segment connected to R5 has a new group
member, R5 sends a Graft message towards the multicast
source S. When R1 receives the Graft message, it replies with
a Graft ACK message. After that, multicast data can be
forwarded to the previously pruned branch.
To prevent pruned interfaces from resuming multicast forwarding after
the prune timer expires, the first-hop router nearest to the multicast
source periodically sends a State-Refresh message throughout the
entire PIM-DM network. Other PIM routers reset the prune timer after
receiving the State-Refresh message. In this way, pruned downstream
interfaces remain suppressed if leaf routers connected to the interfaces
have no new group members attached.

R1 sends a State-Refresh message to R2 and R5 to initiate a
state refresh process.
R5 has a pruned interface and resets the prune timer on the
interface. If R5 still has no group member on the connected
network segment when the next flood-and-prune process
starts, the pruned interface is still suppressed.

If multiple PIM routers forward multicast packets to the same network
segment after the multicast packets pass the RPF check, the assert
mechanism selects a single PIM router to forward
multicast packets to the network segment. When a PIM router receives,
on one of its downstream interfaces, a copy of a multicast packet that
it itself forwards to the network segment, the PIM router sends an
Assert message with the
destination address 224.0.0.13 to all other PIM routers on the same
network segment. When the other PIM routers receive the Assert
message, they compare local parameters with those carried in the
Assert message for assert election. The assert election is performed
according to the following rules:
The router with the highest priority of the unicast routing
protocol wins.
If these routers have the same priority, the router with the
smallest route cost to the multicast source wins.
If these routers have the same priority and the same route cost
to the multicast source, the router with the largest downstream
interface IP address wins.
The PIM routers perform the following operations based on assert
election results:
The downstream interface of the router that wins the election is
the assert winner and forwards multicast packets to the shared
network segment.
The downstream interfaces of the PIM routers that lose the
election are assert losers and no longer forward multicast
packets to the shared network segment. The PIM routers
delete the downstream interfaces from the downstream
interface list of their (S, G) entries.
After the assert election is complete, only one downstream
interface is active on the network segment, so only one copy of
multicast packets is transmitted to the network segment. All
assert losers can resume multicast packet forwarding after a
specified interval (180 seconds by default), triggering periodic
assert elections.

In this example, R2 has a smaller cost to the multicast source
than R3.
R2 and R3 receive a multicast packet from each other through
their downstream interfaces, but both packets fail the RPF
check and are dropped. R2 and R3 then send an Assert
message to the network segment.
R2 compares its routing information with that carried in the
Assert message sent by R3 and finds that its own route cost to
the multicast source is smaller. Therefore, R2 wins the election.
R2 continues to forward multicast packets to the network
segment, whereas R3 drops subsequent multicast packets
because these packets fail the RPF check.
R3 compares its routing information with that carried in the
Assert message sent by R2 and finds that its own route cost
to the multicast source is larger. Therefore, R3 loses the
election. R3 then blocks multicast forwarding on its
downstream interface and deletes the interface from the
downstream interface list of the (S, G) entry.
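
The three assert election rules can be expressed as a single ordered comparison. The Python sketch below is illustrative only: the `AssertInfo` class and the sample values are hypothetical, and it assumes that a numerically smaller routing protocol preference means a higher priority.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class AssertInfo:
    router: str
    preference: int   # unicast routing protocol preference; smaller = higher priority (assumed)
    cost: int         # route cost to the multicast source
    address: str      # IP address of the downstream interface


def assert_winner(candidates):
    """Pick the assert winner by applying the three rules in order."""
    return min(
        candidates,
        key=lambda c: (c.preference, c.cost, -int(ipaddress.ip_address(c.address))),
    )


# Hypothetical values: same preference, R2 has the smaller cost.
r2 = AssertInfo("R2", preference=10, cost=2, address="23.1.1.2")
r3 = AssertInfo("R3", preference=10, cost=3, address="23.1.1.3")
print(assert_winner([r2, r3]).router)  # R2
```

If both routers had the same preference and cost, the third key would decide, and the router with the larger interface address would win.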

PIM-SM applies to the any-source multicast (ASM) and source-specific
multicast (SSM) models. In the ASM model, PIM-SM uses the pull
mode to forward multicast packets. This mode is used in networks with
a lot of sparsely distributed group members. PIM-SM is implemented as
follows:
A PIM router works as the rendezvous point (RP) to serve
group members or multicast sources that appear on the
network. All PIM routers on the network know the RP's position.
When a new group member appears on the network (a host
sends an IGMP message to request to join a multicast group
G), the last-hop router sends a Join message to the RP. The
Join message is transmitted hop by hop, and all the routers
receiving the message create a (*, G) entry. Finally, an RPT
rooted at the RP is set up.
When an active multicast source appears on the network (the
multicast source sends the first multicast packet to a multicast
group G), the first-hop router encapsulates the multicast data
in a Register message and sends the Register message to the
RP in unicast mode. The RP then creates an (S, G) entry, and
the multicast source is registered on the RP.
PIM-SM uses the following mechanisms in the ASM model: neighbor
discovery, DR election, RP discovery, RPT setup, multicast source
registration, SPT switchover, pruning, and assert. You can also
configure a bootstrap router (BSR) to implement fine-grained
management in a PIM-SM domain.
The network segment of a multicast source or receivers may connect to
multiple PIM routers. The PIM routers exchange Hello messages to set
up PIM neighbor relationships. The Hello message sent by a router
carries the DR priority of the router and the IP address of the interface
connected to the network segment. Each PIM router compares its own
information with the information carried in the Hello messages received
from its neighbors. The DR elected among the PIM routers is
responsible for forwarding multicast packets for the multicast source or
receivers. The DR is elected according to the following rules:
The PIM router with the highest DR priority wins (provided that
all routers on the network segment support the DR priority).
If PIM routers have the same DR priority or at least one PIM
router does not support the DR priority field in Hello messages,
the PIM router with the largest IP address wins.
If the current DR fails, other PIM routers trigger a new DR
election when the PIM neighbor timeout timer expires (105
seconds by default).
In the ASM model, the DR provides the following functions:
The DR on the shared network segment connected to a
multicast source sends Register messages to the RP. This DR
is called the source DR.
The DR connected to the shared network segment of group
members sends Join messages to the RP. This DR is called
the receiver DR.
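
The DR election rules can be sketched as follows. This is a hedged illustration: the `HelloInfo` class, the use of `None` to mark a Hello message without a DR priority, and the sample addresses are assumptions made for this example.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class HelloInfo:
    address: str                 # IP address of the interface on the segment
    dr_priority: Optional[int]   # None if the router's Hello carries no DR priority


def elect_dr(hellos):
    """Return the interface address of the DR for one network segment."""
    def ip(h):
        return int(ipaddress.ip_address(h.address))

    if any(h.dr_priority is None for h in hellos):
        # At least one router does not support DR priority: compare addresses only.
        return max(hellos, key=ip).address
    # The highest priority wins; the largest IP address breaks ties.
    return max(hellos, key=lambda h: (h.dr_priority, ip(h))).address


print(elect_dr([HelloInfo("10.1.1.1", 5), HelloInfo("10.1.1.2", 1)]))     # 10.1.1.1
print(elect_dr([HelloInfo("10.1.1.1", 5), HelloInfo("10.1.1.2", None)]))  # 10.1.1.2
```

Addresses are compared as integers rather than strings so that, for example, 10.1.1.10 correctly ranks above 10.1.1.9.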

On a PIM-SM network, an RPT is a multicast distribution tree with the
RP as the root and PIM routers that have group memberships as
leaves. In the topology shown in the figure, when a group member
appears on the network (a user sends an IGMP message to join a
multicast group G), the receiver DR sends a Join message to the RP.
The Join message is transmitted hop by hop, and routers receiving the
message create a (*, G) entry. Finally, an RPT rooted at the RP is set
up.
On a PIM-SM network, any new multicast source must register on the
RP so that the RP can forward multicast data from the multicast source
to group members. The multicast source registration process is as
follows:
A multicast source sends a multicast packet to the source DR
(R1).
After receiving the multicast packet, the source DR
encapsulates the multicast packet into a Register message
and sends the Register message to the RP (R2).
The RP decapsulates the received Register message, creates
an (S, G) entry, and forwards the multicast packet to group
members along the RPT.
The RP no longer needs any Register message sent from R1,
so it sends a Register-Stop message to R1. R1 then stops
sending Register messages to the RP.

On a PIM-SM network, each multicast group can have only one RP and one
RPT. Before an SPT switchover, all multicast packets destined for a
multicast group must be encapsulated in Register messages and then
sent to the RP. The RP decapsulates Register messages and forwards
multicast packets along the RPT. All multicast packets pass through the
RP. As the rate of multicast packets increases, the RP faces heavy
loads. To resolve this problem, PIM-SM allows the RP or the receiver
DR to trigger an SPT switchover.

SPT switchover conditions
When the multicast traffic rate exceeds the specified threshold,
PIM-SM triggers an RPT-to-SPT switchover.
With the default VRP configuration, routers
connected to receivers join the SPT immediately after
receiving the first multicast data packet from a multicast source.

The receiver DR periodically checks the rate of multicast packets for an
(S, G) and triggers an SPT switchover when the rate exceeds the
specified threshold.
The receiver DR sends a Join message to the source DR. The
Join message is transmitted hop by hop, and routers receiving
the message create an (S, G) entry. Finally, an SPT is set up
from the source DR to the receiver DR.
After the SPT is set up, the receiver DR sends a Prune
message to the RP. The Prune message is transmitted hop by
hop along the RPT, and routers receiving the message delete
their downstream interfaces from the (S, G) entry. After the
pruning process is complete, the RP no longer forwards
multicast packets along the RPT.
If the SPT does not pass through the RP, the RP continues to
send a Prune message to the source DR, so that routers along
the path between the RP and source DR delete their
downstream interfaces from the (S, G) entry. After the pruning
process is complete, the source DR no longer forwards
multicast packets along the SPT to the RP.
On a PIM-SM network, the root of a shared tree is an RP.
An RP provides the following functions:
Forwards all multicast packets transmitted in the shared tree to
receivers.
Forwards multicast data of several or all multicast groups. A
network can have one or multiple RPs. You can configure an
RP to serve multicast groups in a specified range. An RP can
serve multiple multicast groups, but each multicast group can
have only one RP. Multicast packets sent from a multicast
source to all receivers of a group are aggregated on the RP.

RP discovery:
Static RP: A static RP address is specified on all PIM routers
in the PIM domain using the static-rp rp-address command.
Dynamic RP: Several PIM routers in a PIM domain are
configured as candidate-RPs (C-RPs), among which an RP is
elected. Candidate bootstrap routers (C-BSRs) also need to
be configured. A BSR is elected among the C-BSRs.

An RP is the core router in a PIM-SM domain. If a small and simple
network needs to transmit light multicast traffic and one RP is enough,
you can specify the RP address statically on all routers in the PIM-SM
domain. In most cases, PIM-SM networks have a large scale and need
to transmit heavy multicast traffic. To reduce loads on each RP and
optimize shared tree topology, different multicast groups should have
different RPs. Dynamic RP election is required in this condition, and a
BSR is required for RP election.
During a BSR election, each C-BSR considers itself as the BSR and
sends a Bootstrap message to the entire network. The Bootstrap
message carries the C-BSR address and priority. Each PIM router
receives Bootstrap messages from all C-BSRs and compares C-BSR
information to elect a BSR. The BSR is elected according to the
following rules:
The C-BSR with the highest priority wins (larger priority value,
higher priority).
If C-BSRs have the same priority, the C-BSR with the largest
IP address wins.
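
The two BSR election rules reduce to one ordered comparison, as the following Python sketch suggests. The function name, the tuple layout, and the sample addresses are assumptions for illustration.

```python
import ipaddress


def elect_bsr(c_bsrs):
    """Return the (address, priority) pair of the elected BSR.

    `c_bsrs` is a list of (address, priority) pairs, one per C-BSR.
    A larger priority value wins; the largest IP address breaks ties.
    """
    return max(c_bsrs, key=lambda c: (c[1], int(ipaddress.ip_address(c[0]))))


print(elect_bsr([("10.0.0.1", 0), ("10.0.0.2", 0), ("10.0.0.3", 0)]))  # ('10.0.0.3', 0)
print(elect_bsr([("10.0.0.1", 10), ("10.0.0.3", 0)]))                  # ('10.0.0.1', 10)
```

Because every PIM router applies the same comparison to the same set of Bootstrap messages, all routers arrive at the same BSR.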

An RP election process is as follows:
Each C-RP sends an Advertisement message to the BSR. An
Advertisement message carries the C-RP address, the range
of multicast groups the C-RP serves, and the C-RP priority.
The BSR summarizes the C-RP information in an RP-Set,
encapsulates the RP-Set in a Bootstrap message, and
advertises the message to all PIM-SM routers on the network.
PIM routers follow the same rules to compare RP information
in the RP-Set and elect an RP from multiple C-RPs for the
same group. The RP election rules are as follows:
The C-RP whose served group address range matches the
group with the longest mask wins.
The C-RP with the highest priority wins (larger priority
value, lower priority).
If C-RPs have the same priority, routers use a hash
algorithm, and the C-RP with the largest hash value
wins.
If all the preceding parameters are the same, the C-RP
with the largest IP address wins.
All PIM routers use the same RP-Set and election rules, so
they obtain the same mappings between RPs and multicast groups.
The PIM routers save the mappings for subsequent multicast
forwarding.
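
The RP election rules can be sketched as an ordered comparison over the RP-Set. The Python below is a simplified illustration with several assumptions: the `CandidateRP` class is hypothetical, the "longest mask" rule is interpreted as the mask of the group range that the C-RP serves, and the hash value is supplied by the caller instead of being computed.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class CandidateRP:
    address: str
    group_range: str   # multicast group range the C-RP serves, e.g. "224.0.0.0/4"
    priority: int      # larger value = lower priority
    hash_value: int    # result of the hash algorithm, supplied by the caller


def elect_rp(group, rp_set):
    """Elect the RP for `group` from the C-RPs in `rp_set`, or None if none serves it."""
    g = ipaddress.ip_address(group)
    serving = [c for c in rp_set if g in ipaddress.ip_network(c.group_range)]
    if not serving:
        return None
    return max(
        serving,
        key=lambda c: (
            ipaddress.ip_network(c.group_range).prefixlen,  # longest mask first
            -c.priority,                                    # then smallest priority value
            c.hash_value,                                   # then largest hash value
            int(ipaddress.ip_address(c.address)),           # then largest address
        ),
    ).address


# Hypothetical RP-Set with three C-RPs.
rp_set = [
    CandidateRP("1.1.1.1", "224.0.0.0/4", 0, 7),
    CandidateRP("2.2.2.2", "225.1.0.0/16", 192, 3),
    CandidateRP("3.3.3.3", "224.0.0.0/4", 0, 9),
]
print(elect_rp("225.1.1.1", rp_set))  # 2.2.2.2 (longest matching mask)
print(elect_rp("226.1.1.1", rp_set))  # 3.3.3.3 (same mask and priority, larger hash)
```

Since the function is deterministic, every router that holds the same RP-Set computes the same group-to-RP mapping.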
The SSM model is implemented based on PIM-SM and
IGMPv3/MLDv2. In this model, an SPT can be established from a
multicast source to group members without the need to maintain an RP,
establish an RPT, or register the multicast source.
In the SSM model, hosts know the locations of multicast
sources in advance. Therefore, they can specify the multicast sources
from which they want to receive multicast data when joining a multicast
group. After the receiver DR receives the request from a host, it sends
a Join message to the source DR. The Join message is then transmitted
upstream hop by hop. An SPT is then set up from the multicast source
to the host.
In the SSM model, PIM-SM uses the following mechanisms: neighbor
discovery, DR election, and SPT setup.
An SPT setup process is as follows:
Through IGMPv3, R3 and R5 learn that hosts in the same multicast
group request data from different multicast sources.
Therefore, R3 and R5 send Join messages toward the sources.
PIM routers that receive the Join messages create (S1, G) and
(S2, G) entries accordingly. In this way,
they set up an SPT from S1 to PC1 and an SPT from S2 to PC2.
Multicast packets from the two multicast sources are then
forwarded to the respective receivers along the SPTs.
RPF check
When a router receives a multicast packet, it searches the
unicast routing table for the route to the source address of the
packet. After finding the route, the router checks whether the
outbound interface of the route is the same as the inbound
interface of the multicast packet. If they are the same, the
router considers that the multicast packet is received from a
correct interface. This process is called an RPF check, which
ensures correct forwarding paths for multicast packets.
If multiple equal-cost routes are available, the route with the
largest next-hop address is used as the RPF route.
RPF checks can be performed based on unicast routes,
Multiprotocol Border Gateway Protocol (MBGP) routes, or
static multicast routes. The priority order of these routes is
static multicast routes > MBGP routes > unicast routes.

A multicast stream sent from the source 152.10.2.2 arrives at
interface S1 of the router. The router checks the routing table
and finds that the multicast stream from this source should
arrive at interface S0. Therefore, the RPF check fails and the
multicast stream is dropped by the router.
A multicast stream sent from the source 152.10.2.2 arrives at
interface S0 of the router. The router checks the routing table
and finds that the RPF interface is also S0. The RPF check
succeeds, and the multicast stream is correctly forwarded.
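
The RPF check in this example can be reproduced with a short Python sketch. The routing table contents (prefixes and next hops) are hypothetical; only the source 152.10.2.2 and the interfaces S0 and S1 come from the example. The sketch assumes the table already holds only best (equal-cost) unicast routes and does not model static multicast or MBGP routes.

```python
import ipaddress

# Hypothetical unicast routing table: (prefix, outbound interface, next hop).
ROUTES = [
    ("152.10.0.0/16", "S0", "10.0.0.2"),
    ("198.14.32.0/24", "S1", "10.0.1.2"),
]


def rpf_interface(source, routes=ROUTES):
    """Return the RPF interface for `source`, or None if no route matches."""
    src = ipaddress.ip_address(source)
    matches = [
        (ipaddress.ip_network(prefix), ifname, ipaddress.ip_address(next_hop))
        for prefix, ifname, next_hop in routes
        if src in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # Longest prefix first; among remaining ties, the largest next-hop address.
    best = max(matches, key=lambda m: (m[0].prefixlen, int(m[2])))
    return best[1]


def rpf_check(source, inbound_interface, routes=ROUTES):
    """A packet passes only if it arrived on the RPF interface for its source."""
    return rpf_interface(source, routes) == inbound_interface


print(rpf_check("152.10.2.2", "S1"))  # False: the stream is dropped
print(rpf_check("152.10.2.2", "S0"))  # True: the stream is forwarded
```

A source with no matching route has no RPF interface, so packets from it fail the check on every interface.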

Static multicast routing
For R3, the RPF neighbor towards the multicast source
(Source) is R1. Therefore, multicast packets sent from Source
are forwarded along the path Source -> R1 -> R3. If you
configure a multicast static route on R3 and specify R2 as the
RPF neighbor, the transmission path of multicast packets sent
from Source changes to Source -> R1 -> R2 -> R3. The
multicast path then diverges from the unicast path.
Case description
In this case, interconnection IP addresses are configured
according to the following rule:
If RTX connects to RTY, the IP addresses of their
interconnecting interfaces are XY.1.1.X and
XY.1.1.Y, with a 24-bit network mask.
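
The addressing rule can be written as a small helper. This Python sketch is only an aid for reading the case topology: the function name is hypothetical, and it assumes single-digit router numbers and that the smaller router number comes first in "XY".

```python
def link_addresses(x, y, prefix_len=24):
    """Return the interface addresses of RTx and RTy on their shared link."""
    low, high = sorted((x, y))  # assumption: the smaller router number comes first in "XY"
    first_octet = f"{low}{high}"
    return (
        f"{first_octet}.1.1.{x}/{prefix_len}",
        f"{first_octet}.1.1.{y}/{prefix_len}",
    )


print(link_addresses(1, 2))  # ('12.1.1.1/24', '12.1.1.2/24')
print(link_addresses(3, 2))  # ('23.1.1.3/24', '23.1.1.2/24')
```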

Command usage
The multicast routing-enable command enables the
multicast routing function.
The pim dm command enables PIM-DM on an interface.
The pim hello-option dr-priority command sets the DR priority
for a PIM interface.
The igmp enable command enables IGMP on an interface.
The igmp version command specifies the IGMP version
running on an interface.

Precautions
In this network topology, R2 is the IGMP querier, and R3
forwards multicast packets to downstream receivers because
R3 is the assert winner.
The display pim routing-table command displays entries in
the PIM routing table.
The display pim routing-table fsm command displays
detailed information about the finite state machine (FSM) in the
PIM routing table.
Case description
The network topology is the same as that in the PIM-DM
configuration. The network runs PIM-SM, and the transmission
scope of Bootstrap messages needs to be limited.
Command usage
The pim sm command enables PIM-SM on an interface.
The c-rp command configures a router to notify the BSR that it
is a C-RP.
The c-bsr command configures a C-BSR.
The pim bsr-boundary command configures the BSR
boundary of the PIM-SM domain on an interface.

Precautions
In this network topology, R2 is the IGMP querier, and R3
forwards multicast packets to downstream receivers because
R3 is the assert winner.

The display pim routing-table command displays entries in
the PIM routing table.

The display pim routing-table fsm command displays
detailed information about the FSM in the PIM routing table.

The method for checking the SPT in a PIM-SM network is similar to the
method for checking the RPT.
Case description
In this case, interconnection IP addresses are configured
according to the following rules:

If RTX connects to RTY, the IP addresses of their
interconnecting interfaces are XY.1.1.X and XY.1.1.Y,
with a 24-bit network mask.

The loopback interface address of RTX is X.X.X.X/32.
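As a quick illustration of the addressing rules above, the following
Python sketch derives the addresses for a pair of routers. The function
names are hypothetical, and two details are assumptions not stated in
the rules: router numbers are single digits, and the smaller number is
written first when forming XY.

```python
def link_addresses(x, y):
    """Return the interconnection addresses of RTx and RTy (XY.1.1.X/Y, /24)."""
    if not (1 <= x <= 9 and 1 <= y <= 9) or x == y:
        raise ValueError("router numbers must be two different digits 1-9")
    low, high = sorted((x, y))          # assumption: smaller number first
    first_octet = f"{low}{high}"
    return f"{first_octet}.1.1.{x}/24", f"{first_octet}.1.1.{y}/24"


def loopback_address(x):
    """Return the loopback address of RTx (X.X.X.X/32)."""
    return f"{x}.{x}.{x}.{x}/32"


print(link_addresses(1, 2))   # ('12.1.1.1/24', '12.1.1.2/24')
print(loopback_address(3))    # 3.3.3.3/32
```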
Pre-configuration
This page provides the basic OSPF configuration. In this case,
R1 is the DR in the FR network.

Results:
A Bootstrap message is transmitted from R1 to R2 and fails
the RPF check on R2, so R2 drops the message. To enable
Bootstrap messages to be forwarded by R2, configure a static
multicast route on R2 to change the RPF path.
Results:
The ACL restricts the multicast address range.
IPv6 characteristics are as follows:

Address space: An IPv6 address is 128 bits long. A 128-bit
address structure allows for 2^128 (4.3 billion x 4.3 billion x 4.3
billion x 4.3 billion) possible addresses. The biggest advantage
of IPv6 is its almost infinite address space.
Packet format: IPv6 uses a new protocol header format rather
than simply widening the address fields of an IPv4 packet
to 128 bits. An IPv6 packet header consists of an IPv6 basic
header and extension headers. Some optional fields are moved
to the extension headers that follow the basic header. This
enables intermediate routers on the network to process IPv6
packet headers more efficiently.
Autoconfiguration and readdressing: IPv6 provides address
autoconfiguration, which allows hosts to automatically discover
networks and obtain IPv6 addresses. This significantly
improves network manageability.
Hierarchical network structure: A huge address space allows
for hierarchical network design in IPv6. Hierarchical
network design facilitates route summarization and improves
forwarding efficiency.
End-to-end security support: IPv6 supports IP Security (IPSec)
authentication and encryption at the network layer, so it
provides end-to-end security.
Quality of Service (QoS) support: IPv6 defines the Flow Label
field in the packet header. This field enables routers to
differentiate data flows and provide special processing for the
identified data flows. With this field, routers can identify
data flows without inspecting the payload being
transmitted. In this way, QoS can be implemented even if the
payloads of data packets are encrypted.
Mobility: With the support for the Routing header and the
Destination Options header, IPv6 provides built-in mobility.
Note that an IPv6 address can contain only one double
colon (::). Otherwise, a computer cannot determine how many zero
groups each double colon stands for when restoring the compressed
address to the original 128-bit address.
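The compression rule can be checked with Python's standard ipaddress
module. The sample addresses below are assumptions chosen for the
illustration; the sketch shows one address in compressed and full form
and confirms that a string with two double colons is rejected.

```python
import ipaddress

# Hypothetical sample address written out in full.
full = "2001:0db8:0000:0000:0008:0800:200c:417a"
addr = ipaddress.IPv6Address(full)

print(addr.compressed)  # 2001:db8::8:800:200c:417a
print(addr.exploded)    # 2001:0db8:0000:0000:0008:0800:200c:417a


def is_valid_ipv6(text):
    """Return True if `text` parses as an IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
        return True
    except ipaddress.AddressValueError:
        return False


print(is_valid_ipv6("2001:db8::8:800:200c:417a"))  # True
print(is_valid_ipv6("2001::25de::cade"))           # False: two "::" are ambiguous
```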
If the first 3 bits of an IPv6 unicast address are not 000, the interface ID
must be 64 bits long. If the first 3 bits are 000, there is no such limitation.
IEEE EUI-64 standards

The length of an interface ID is 64 bits. IEEE EUI-64 defines a
method to convert a 48-bit MAC address into a 64-bit IPv6
interface ID. In the MAC address, the c bits indicate the vendor ID,
the d bits indicate the ID assigned by the vendor, and the bit shown
as 0 is the global/local bit. The g bit specifies whether the
address identifies a single host or a host group. The conversion
algorithm is as follows: change the global/local bit from 0 to 1
and insert two bytes (FFFE) between the c bits and the d bits.
The method for converting MAC addresses into IPv6 interface
IDs reduces the configuration workload. When stateless
address autoconfiguration (explained in the following pages) is
used, a host only needs to obtain an IPv6 network prefix to
form an IPv6 address.
The defect of this method is that an IPv6 address can be easily
calculated based on a MAC address.
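The conversion can be sketched in Python as follows. The function name
and the sample MAC address are assumptions made for the illustration.
The sketch inverts the global/local bit, which turns the 0 of a
universally administered MAC address into 1 as described above.

```python
def mac_to_eui64_interface_id(mac):
    """Convert a 48-bit MAC address string into a 64-bit interface ID."""
    octets = bytearray(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly 6 octets")
    octets[0] ^= 0x02                                 # flip the global/local bit
    eui64 = octets[:3] + b"\xff\xfe" + octets[3:]     # insert FFFE in the middle
    # Format the 8 octets as four 16-bit hexadecimal groups.
    return ":".join(
        f"{(eui64[i] << 8) | eui64[i + 1]:04x}" for i in range(0, 8, 2)
    )


# Hypothetical MAC address.
print(mac_to_eui64_interface_id("00:1e:10:2f:3a:4b"))  # 021e:10ff:fe2f:3a4b
```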

IPv4 addresses are classified into unicast, multicast, and broadcast
addresses. Compared to IPv4, IPv6 has no broadcast address and
introduces a new address type: anycast address. IPv6 addresses are
classified into unicast, multicast, and anycast addresses.
An IPv6 unicast address identifies an interface. Packets sent
to an IPv6 unicast address are delivered to the interface
identified by the unicast address.

An IPv6 multicast address identifies a group of interfaces.
Packets sent to an IPv6 multicast address are delivered to all
the interfaces identified by the multicast address.
An IPv6 anycast address identifies multiple interfaces. Packets
sent to an anycast address are delivered to the nearest
interface that is identified by the anycast address, depending
on the routing protocols. In fact, anycast addresses and
unicast addresses use the same address space. The router
determines whether to send a packet in unicast mode or
anycast mode.
Global unicast address

An IPv6 global unicast address is an IPv6 address with a
global unicast prefix, which is similar to an IPv4 public address.
IPv6 global unicast addresses support route prefix
summarization, helping limit the number of global routing
entries.
A global unicast address consists of a global routing prefix,
subnet ID, and interface ID.

Global routing prefix: is assigned by a service provider
to an organization. A global routing prefix is at least
48 bits long. Currently, the first 3 bits of all the assigned
global routing prefixes are 001.
Subnet ID: is used by an organization to build its local
network (site). It is similar to an IPv4 subnet number.
The global routing prefix and subnet ID together occupy
a maximum of 64 bits.
Interface ID: refers to the interface identifier. It can be
used to identify a device (host).
Link-local address
Link-local addresses have a limited application scope. An IPv6
link-local address can be used only for communication
between nodes on the same link. A link-local address uses a
link-local prefix FE80::/10 as the first 10 bits (1111111010 in
binary) and an interface ID as the last 64 bits.

When IPv6 runs on a node, each interface of the node is
automatically assigned a link-local address that consists of a
fixed prefix and an interface ID in EUI-64 format. This
mechanism enables two IPv6 nodes on the same link to
communicate without any additional configuration. Therefore,
link-local addresses are widely used in neighbor discovery and
stateless address autoconfiguration.

Routing devices do not forward IPv6 packets with a link-local
address as the source or destination address to devices on
non-local links.
Unique local address

Unique local addresses are used only within a site. Site-local
addresses are deprecated in RFC 3879 and replaced by
unique local addresses in RFC 4193.
Unique local addresses are similar to IPv4 private addresses.
Any organization that does not obtain a global unicast address
from a service provider can use a unique local address.
Unique local addresses are routable only within a local
network but not on the Internet.

Fields in a unique local address can be described as follows:
Prefix: is fixed as FC00::/7.
L: is set to 1 if the address is valid within a local
network. The value 0 is reserved for future expansion.
Global ID: indicates a globally unique prefix, which is
pseudo-randomly allocated (for details, see RFC 4193).
Subnet ID: identifies a subnet within the site.
Interface ID: identifies an interface.

A unique local address has the following characteristics:
Has a globally unique prefix. The prefix is pseudo-randomly
allocated and has a high probability of uniqueness.
Allows private connections between sites without
creating address conflicts.
Has a well-known prefix (FC00::/7) that allows for easy
route filtering by edge routers.
Does not conflict with any other addresses or cause
Internet route conflicts if it is leaked outside of the site
through routing.
Functions as a global unicast address to upper-layer
applications.
Is independent of the Internet Service Provider (ISP).
Unspecified address
The IPv6 unspecified address is 0:0:0:0:0:0:0:0/128 or ::/128,
indicating that an interface or a node does not have an IP
address. It can be used as the source IP address of some
packets, such as the Neighbor Solicitation (NS) messages used in
duplicate address detection. Devices do not forward
packets whose source IP address is the unspecified address.

Loopback address
The IPv6 loopback address is 0:0:0:0:0:0:0:1/128 or ::1/128.
Similar to the IPv4 loopback address 127.0.0.1, the IPv6 loopback
address is used when a node needs to send IPv6 packets to
itself. This IPv6 loopback address is usually used as the IP
address of a virtual interface (a loopback interface, for
example). The loopback address cannot be used as the
source or destination IP address of packets that need to be
forwarded.
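The unicast address types described so far can be told apart by their
prefixes. The following Python sketch is one way to do this with the
standard ipaddress module; the function name and the sample addresses
are assumptions, and the global unicast test relies on the statement
above that currently assigned global prefixes start with 001 (2000::/3).

```python
import ipaddress

# Prefixes taken from the descriptions above.
LINK_LOCAL = ipaddress.IPv6Network("fe80::/10")
UNIQUE_LOCAL = ipaddress.IPv6Network("fc00::/7")
GLOBAL_UNICAST = ipaddress.IPv6Network("2000::/3")


def classify_ipv6(text):
    """Return a coarse address type for an IPv6 address string."""
    addr = ipaddress.IPv6Address(text)
    if addr == ipaddress.IPv6Address("::"):
        return "unspecified"
    if addr == ipaddress.IPv6Address("::1"):
        return "loopback"
    if addr in LINK_LOCAL:
        return "link-local"
    if addr in UNIQUE_LOCAL:
        return "unique local"
    if addr in GLOBAL_UNICAST:
        return "global unicast"
    return "other"


for sample in ("::", "::1", "fe80::1", "fd00:1234::1", "2001:db8::1"):
    print(sample, "->", classify_ipv6(sample))
```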
IPv6 multicast address

Like an IPv4 multicast address, an IPv6 multicast address
identifies a group of interfaces, which usually belong to
different nodes. A node may belong to any number of multicast
groups. Packets sent to an IPv6 multicast address are
delivered to all the interfaces identified by the multicast
address.
An IPv6 multicast address is composed of a prefix, flag, scope,
and group ID (global ID):

Prefix: is fixed as FF00::/8 (1111 1111).
Flag: is 4 bits long. Currently, only the last bit is used.
The high-order 3 bits are reserved and must be set to
0s. The last bit 0 indicates a permanently-assigned
multicast address allocated by the Internet Assigned
Numbers Authority (IANA). The last bit 1 indicates a
non-permanently-assigned (transient) multicast
address.
Scope: is 4 bits long. It limits the scope where multicast
data flows are sent on the network.
Group ID (global ID): is 112 bits long. It identifies a
multicast group. RFC 2373 does not define all the 112
bits as a group ID but recommends using the low-order
32 bits as the group ID and setting all the remaining 80
bits to 0s.
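The field layout above (8-bit prefix, 4-bit flag, 4-bit scope, 112-bit
group ID) can be decoded with simple bit operations. The Python sketch
below is an illustration only; the function name and the dictionary
keys are assumptions.

```python
import ipaddress


def parse_multicast(text):
    """Split an IPv6 multicast address into flag, scope, and group ID."""
    value = int(ipaddress.IPv6Address(text))
    if value >> 120 != 0xFF:
        raise ValueError("not a multicast address (prefix is not FF00::/8)")
    flags = (value >> 116) & 0xF        # 4 bits after the prefix
    scope = (value >> 112) & 0xF        # next 4 bits
    group_id = value & ((1 << 112) - 1) # remaining 112 bits
    return {
        "transient": bool(flags & 0x1), # last flag bit: 0 permanent, 1 transient
        "scope": scope,
        "group_id": group_id,
    }


print(parse_multicast("ff02::1"))
# {'transient': False, 'scope': 2, 'group_id': 1}
```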
IPv6 anycast address

Anycast addresses are exclusive to IPv6. An anycast address
identifies a group of interfaces, which often belong to
different nodes. Packets sent to an anycast
address are delivered to the nearest interface that is identified
by the anycast address, depending on the routing protocols.

IPv6 anycast addresses can be used in One-to-One-of-Many
communications. The receiver can be any one interface of a
group. For example, a mobile subscriber needs to connect to
the nearest receiving station. Using anycast addresses, the
mobile subscriber is not limited by physical location.

Anycast addresses are allocated from the unicast address
space, using any of the unicast address formats. Thus,
anycast addresses are syntactically indistinguishable from
unicast addresses. The nodes to which an anycast address is
assigned must be explicitly configured to know that it is an
anycast address. Currently, anycast addresses are used only
as destination addresses, and are assigned only to routers.
A subnet-router anycast address is predefined in RFC 3513.
The interface ID of a subnet-router anycast address is all 0s.
Packets addressed to a subnet-router anycast address are
delivered to one router (the nearest router that is
identified by the address) in the subnet specified by the prefix
of the address. The nearest router is defined as being closest
in terms of routing distance.

An IPv6 packet has three parts: an IPv6 basic header, zero or more
IPv6 extension headers, and an upper-layer protocol data unit (PDU).
IPv6 basic header

Each IPv6 packet must have an IPv6 basic header,
which is fixed at 40 bytes.
The IPv6 basic header provides basic packet
forwarding information and is parsed by all routers
on the forwarding path.
Extension headers
An IPv6 extension header is an optional header that
may follow the IPv6 basic header. An IPv6 packet may
carry zero, one, or more extension headers. The
extension headers may differ in length. The
IPv6 header and IPv6 extension headers replace the
IPv4 header and its options. Extension headers
enhance IPv6 functions and offer great
extensibility. Unlike the Options of an IPv4 header, the
maximum length of an IPv6 extension header is not
limited. Therefore, an IPv6 extension header can
contain all the extension data required by IPv6
communications.
The forwarding-related information in
an IPv6 extension header is not parsed by all the
routers on the path, and is generally parsed only by the
destination router.
Upper-layer protocol data unit
An upper-layer PDU is composed of the upper-layer
protocol header and its payload, such as an ICMPv6
packet, a TCP packet, or a UDP packet.
Fields in an IPv6 packet header are described as follows:
Version: is 4 bits long. In IPv6, the Version field value is 6.
Traffic Class: is 8 bits long. It indicates the class or priority of
an IPv6 packet. The Traffic Class field is similar to the TOS
field in an IPv4 packet and is mainly used in QoS control.
Flow Label: is 20 bits long. This field is added in IPv6 to
differentiate traffic. A flow label and source IP address identify
a data flow. Intermediate network devices can effectively
differentiate data flows based on this field.
Payload Length: is 16 bits long, which indicates the length of
the IPv6 payload. The payload is the rest of the IPv6 packet
following the basic header, including the extension headers and
upper-layer PDU. This field can express a payload length of at
most 65535 bytes. If the payload length exceeds
65535 bytes, the field is set to 0 and the actual payload length is
carried in the Jumbo Payload option of the Hop-by-Hop
Options header.
Next Header: is 8 bits long. This field identifies the type of the
first extension header that follows the IPv6 basic header or the
protocol type in the upper-layer PDU.
Hop Limit: is 8 bits long. This field is similar to the Time to Live
field in an IPv4 packet, defining the maximum number of hops
that an IP packet can pass through. The field value is
decremented by 1 by each router that forwards the IP packet.
When the field value becomes 0, the packet is discarded.
Source Address: is 128 bits long, which indicates the address
of the packet originator.
Destination Address: is 128 bits long, which indicates the
address of the packet recipient.
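The field sizes listed above add up to the fixed 40-byte basic header:
4 + 8 + 20 bits in the first 32-bit word, then 16 + 8 + 8 bits, then two
128-bit addresses. The Python sketch below decodes such a header with
the standard struct module. The function name, the dictionary keys, and
the sample values are assumptions made for the illustration.

```python
import ipaddress
import struct


def parse_ipv6_basic_header(data):
    """Decode the 40-byte IPv6 basic header at the start of `data`."""
    if len(data) < 40:
        raise ValueError("an IPv6 basic header is 40 bytes long")
    # First word: Version (4) | Traffic Class (8) | Flow Label (20).
    first_word, payload_length, next_header, hop_limit = struct.unpack(
        "!IHBB", data[:8]
    )
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": ipaddress.IPv6Address(data[8:24]),
        "destination": ipaddress.IPv6Address(data[24:40]),
    }


# Hypothetical header: ICMPv6 payload (Next Header 58) of 8 bytes, Hop Limit 64.
sample = (
    struct.pack("!IHBB", (6 << 28) | (0xB8 << 20) | 0x12345, 8, 58, 64)
    + ipaddress.IPv6Address("2001:db8::1").packed
    + ipaddress.IPv6Address("2001:db8::2").packed
)
fields = parse_ipv6_basic_header(sample)
print(fields["version"], fields["next_header"], fields["hop_limit"])  # 6 58 64
```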

IPv6 extension header

An IPv4 packet header has an optional field (Options), which
includes security, timestamp, and record route options. The
variable length of the Options field makes the IPv4 packet
header length range from 20 bytes to 60 bytes. Routers
consume many resources when they forward IPv4 packets
that carry the Options field. Therefore, such IPv4 packets
are rarely used in practice.
IPv6 uses extension headers to replace the Options field in the
IPv4 header. Extension headers are placed between the IPv6
basic header and upper-layer PDU. An IPv6 packet may carry
zero, one, or more extension headers. The sender of a packet
adds one or more extension headers to the packet only when
the sender requests other routers or the destination device to
perform special handling. Unlike IPv4 options, IPv6 extension
headers are of variable length and are not limited to 40 bytes. This
facilitates further extension. To improve extension header
processing efficiency and transport protocol performance, IPv6
requires that the extension header length be an integer
multiple of 8 bytes.
When multiple extension headers are used, the Next Header
field of an extension header indicates the type of the next
header following this extension header.
An IPv6 extension header contains the following fields:
Next Header: is 8 bits long. It is similar to the Next Header field
in the IPv6 basic header, indicating the type of the next
extension header (if any) or the upper-layer protocol type.
Extension Header Len: is 8 bits long. It indicates the
extension header length in 8-byte units, excluding the first
8 bytes (which contain the Next Header field).
Extension Header Data: is of variable length. It includes a
series of options and the padding field.
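The Next Header chaining can be sketched as a short loop in Python. This
is an illustration under stated assumptions: only the Hop-by-Hop Options
(0), Routing (43), and Destination Options (60) headers are handled, the
function name is hypothetical, and the length field is interpreted as in
the IPv6 specification, that is, in 8-byte units not counting the first
8 bytes.

```python
# Next Header values of the extension headers handled by this sketch.
EXTENSION_HEADERS = {
    0: "Hop-by-Hop Options",
    43: "Routing",
    60: "Destination Options",
}


def walk_extension_headers(first_next_header, payload):
    """Follow the Next Header chain through `payload`.

    Returns (chain, upper_layer_protocol, offset), where chain lists each
    extension header as (name, length_in_bytes).
    """
    chain = []
    next_header, offset = first_next_header, 0
    while next_header in EXTENSION_HEADERS:
        if offset + 2 > len(payload):
            raise ValueError("truncated extension header")
        following, hdr_ext_len = payload[offset], payload[offset + 1]
        length = (hdr_ext_len + 1) * 8     # always a multiple of 8 bytes
        chain.append((EXTENSION_HEADERS[next_header], length))
        next_header = following
        offset += length
    return chain, next_header, offset


# Hypothetical packet: an 8-byte Hop-by-Hop header, then a 16-byte
# Destination Options header, then ICMPv6 (58).
payload = bytes([60, 0] + [0] * 6) + bytes([58, 1] + [0] * 14)
print(walk_extension_headers(0, payload))
# ([('Hop-by-Hop Options', 8), ('Destination Options', 16)], 58, 24)
```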
Each extension header can occur only once in an IPv6 packet, except
for the Destination Options header. The Destination Options header
may occur at most twice (once before a Routing header and once
before the upper-layer header).
The Internet Control Message Protocol version 6 (ICMPv6) is one of the
basic IPv6 protocols.
In IPv4, ICMP reports IP packet forwarding information and
errors to the source node. ICMP defines certain messages
such as Destination Unreachable, Packet Too Big, Time
Exceeded, and Echo Request or Echo Reply to facilitate fault
diagnosis and information management. In addition to the
common functions provided by ICMPv4, ICMPv6 provides
mechanisms such as Neighbor Discovery (ND), stateless
address configuration including duplicate address detection,
and Path Maximum Transmission Unit (PMTU) discovery.

The protocol number of ICMPv6, namely, the value of the Next
Header field in an IPv6 packet, is 58.

Some fields in the packet are described as follows:
Type: specifies the message type. Values 0 to 127
indicate error messages, and values 128 to 255
indicate informational messages.
Code: further identifies the message within a given type.
Checksum: indicates the checksum of an ICMPv6
packet.
Destination Unreachable message

When a data packet fails to be sent to the destination node
or the upper-layer protocol, the router or destination node
sends an ICMPv6 Destination Unreachable message to the
source node. In an ICMPv6 Destination Unreachable message,
the value of the Type field is 1. The value of the Code field can
be 0, 1, 2, 3, or 4. Each value has a specific meaning
(defined in RFC 2463):
Code=0: No route to the destination device.
Code=1: Communication with the destination device is
administratively prohibited.
Code=2: Not assigned.
Code=3: Destination IP address is unreachable.
Code=4: Destination port is unreachable.
Packet Too Big message
If a data packet cannot be sent to the destination node
because the size of the packet exceeds the link MTU of the
outbound interface, the router sends an ICMPv6 Packet Too
Big message to the source node. The link MTU of the
outbound interface is carried in the message. PMTU discovery
is implemented based on Packet Too Big messages. In a
Packet Too Big message, the value of the Type field is 2 and
the value of the Code field is 0.
Time Exceeded message
If a router receives a packet with a hop limit of 0, or
decrements the hop limit of a packet to 0, it
discards the data packet and sends an ICMPv6 Time
Exceeded message to the source node. In a Time Exceeded
message, the value of the Type field is 3. The value of the
Code field can be 0 or 1.
Code=0: Hop limit exceeded in packet transmission
Code=1: Fragment reassembly timeout
Parameter Problem message
If an IPv6 node detects an error in the IPv6 packet header or
extension header, the IPv6 node discards the data packet and
sends an ICMPv6 Parameter Problem message to the source
node, specifying the location and type of the error. In a
Parameter Problem message, the value of the Type field is 4.
The value of the Code field can be 0, 1, or 2. The 32-bit Pointer
field indicates the location of the error. The Code field is
defined as follows:
Code=0: A field in the IPv6 basic header or extension
header is incorrect.
Code=1: The Next Header field in the IPv6 basic
header or extension header cannot be identified.
Code=2: Unknown options exist in the extension
header.
Echo Request messages

Echo Request messages are sent to destination nodes. After
receiving an Echo Request message, the destination node
responds with an Echo Reply message. In an Echo Request
message, the value of the Type field is 128 and the value of
the Code field is 0. The Identifier and Sequence Number fields
are set by the source host so that Echo Reply
messages can be matched to Echo Request messages.
Echo Reply messages

After receiving an Echo Request message, the destination
ICMPv6 node responds with an Echo Reply message. In an
Echo Reply message, the value of the Type field is 129 and
the value of the Code field is 0. The Identifier and Sequence
Number fields in the Echo Reply message are assigned the
same values as those in the Echo Request message.
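The Type and Code values described above lend themselves to a small
lookup table. The Python sketch below is an illustration only: the table
name, the function name, and the short code descriptions are assumptions
that paraphrase the text, and only the six message types covered above
are listed.

```python
# Type -> (message name, {code: meaning}); None means the code carries no
# further meaning for that message.
ICMPV6_MESSAGES = {
    1: ("Destination Unreachable", {
        0: "No route to the destination",
        1: "Communication administratively prohibited",
        2: "Not assigned",
        3: "Destination address unreachable",
        4: "Destination port unreachable",
    }),
    2: ("Packet Too Big", {0: None}),
    3: ("Time Exceeded", {
        0: "Hop limit exceeded",
        1: "Fragment reassembly timeout",
    }),
    4: ("Parameter Problem", {
        0: "Incorrect header field",
        1: "Unidentified Next Header",
        2: "Unknown option",
    }),
    128: ("Echo Request", {0: None}),
    129: ("Echo Reply", {0: None}),
}


def describe_icmpv6(msg_type, code):
    """Return (category, message name, code meaning) for an ICMPv6 message."""
    if not (0 <= msg_type <= 255 and 0 <= code <= 255):
        raise ValueError("Type and Code are 8-bit fields")
    # Types 0-127 are error messages, 128-255 are informational messages.
    category = "error" if msg_type < 128 else "informational"
    name, codes = ICMPV6_MESSAGES.get(msg_type, ("unknown", {}))
    return category, name, codes.get(code)


print(describe_icmpv6(1, 4))
# ('error', 'Destination Unreachable', 'Destination port unreachable')
print(describe_icmpv6(129, 0))
# ('informational', 'Echo Reply', None)
```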

IPv6 address resolution is completed at Layer 3. Layer 3 address
resolution brings the following advantages:
Layer 3 address resolution enables different Layer 2 media to
use the same address resolution protocol.
Layer 3 security mechanisms, for example, IPSec, can be used to
prevent address resolution attacks.
Request packets are sent in multicast mode, reducing
performance requirements on Layer 2 networks.

Neighbor Solicitation (NS) packets and Neighbor Advertisement (NA)
packets are used during address resolution.

In an NS packet, the value of the Type field is 135 and the
value of the Code field is 0. An NS packet is similar to the ARP
Request packet in IPv4.
In an NA packet, the value of the Type field is 136 and the
value of the Code field is 0. An NA packet is similar to the ARP
Reply packet in IPv4.

The address resolution process is as follows:

PC1 needs to resolve the link-layer address of PC2 before
sending packets to PC2. Therefore, PC1 sends an NS
message on the network.
In the NS message, the source IP address is the IPv6 address
of PC1, and the destination IP address is the multicast address
of PC2 (this multicast address is called a solicited-node
multicast address and is composed of the prefix FF02::1:FF00:0/104
and the last 24 bits of the corresponding unicast address).
The target address to be resolved is the IPv6 address of
PC2. This indicates that PC1 wants to know the link-layer
address of PC2. The Options field in the NS message carries
the link-layer address of PC1.
After receiving the NS message, PC2 replies with an NA
message. In the NA message, the source address is the
IPv6 address of PC2, and the destination address is the IPv6
address of PC1 (the NA message is sent to PC1 in unicast
mode using the link-layer address of PC1). The Options field
carries the link-layer address of PC2. This completes the
address resolution process.
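The solicited-node multicast address used as the NS destination can be
computed directly from the unicast address. The Python sketch below
combines the prefix FF02::1:FF00:0/104 with the last 24 bits of the
unicast address; the function name and the sample addresses are
assumptions.

```python
import ipaddress

SOLICITED_NODE_PREFIX = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(unicast):
    """Return the solicited-node multicast address for a unicast address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF  # last 24 bits
    return ipaddress.IPv6Address(SOLICITED_NODE_PREFIX | low24)


print(solicited_node_multicast("2000::1"))                   # ff02::1:ff00:1
print(solicited_node_multicast("2001:db8::abcd:1234:5678"))  # ff02::1:ff34:5678
```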
An IPv6 unicast address that is assigned to an interface but has not
been verified by DAD is called a tentative address. An interface cannot
use the tentative address for unicast communication but will join two
multicast groups: the All-nodes multicast group and the Solicited-node
multicast group.
IPv6 DAD is similar to IPv4 gratuitous ARP. A node sends an NS
message whose target address is the tentative address
to the Solicited-node multicast group of that address. If the node
receives an NA message in reply, the tentative address is being used
by another node. This
node will not use this tentative address for communication.

DAD process
An IPv6 address 2000::1 is assigned to PC1 as a tentative
IPv6 address. To check the validity of 2000::1, PC1 sends an
NS message to the Solicited-node multicast group to which
2000::1 belongs. The NS message contains the requested
address 2000::1. Since 2000::1 has not yet been verified, the source
address of the NS message is the unspecified address. After
receiving the NS message, PC2 processes the message in the
following ways:
If 2000::1 is a tentative address of PC2, PC2 does not
use this address as an interface address and does not send
an NA message.
If 2000::1 is being used on PC2, PC2 sends an NA
message to the All-nodes multicast group. The NA message
carries IP address
2000::1. In this way, PC1 can find that the tentative
address is duplicate after receiving the message and
will not use the address.
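PC2's handling of a DAD NS message can be summarized as a small decision
function. The Python sketch below is a simplified model, not an
implementation of Neighbor Discovery: the function name, the state
labels, and the dictionary-based message are assumptions, and addresses
are compared as plain strings.

```python
ALL_NODES = "ff02::1"   # All-nodes multicast group
NA_TYPE = 136           # ICMPv6 Type of a Neighbor Advertisement


def handle_dad_ns(addresses, target):
    """Process a DAD NS for `target` on a node.

    `addresses` maps each local address to "tentative" or "assigned".
    Returns the NA message to send, or None if no reply is sent.
    """
    state = addresses.get(target)
    if state == "tentative":
        # Both nodes are probing the same address: give it up, stay silent.
        del addresses[target]
        return None
    if state == "assigned":
        # The address is in use here: tell everyone on the link.
        return {"type": NA_TYPE, "destination": ALL_NODES, "target": target}
    return None             # The address is not ours: nothing to do.


pc2 = {"2000::1": "assigned"}
print(handle_dad_ns(pc2, "2000::1"))
# {'type': 136, 'destination': 'ff02::1', 'target': '2000::1'}
```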
IPv6 supports stateless address autoconfiguration. Hosts obtain IPv6
prefixes and automatically generate interface IDs. Router Discovery is
the basis for IPv6 address autoconfiguration and is implemented
through the following two messages:
Router Advertisement (RA) message: Each router periodically
sends multicast RA messages that carry network prefixes and
flags to announce its presence to the
hosts and routers on the link. An RA message has a value of 134 in the
Type field.
Router Solicitation (RS) message: After being connected to the
network, a host immediately sends an RS message to obtain
network prefixes. Routers on the network reply with an RA
message. An RS message has a value of 133 in the Type field.
Address autoconfiguration
The process of IPv6 stateless autoconfiguration is as follows:

A host automatically configures the link-local address
based on the interface ID.
The host sends an NS message for duplicate address
detection.
If an address conflict occurs, the host stops address
autoconfiguration. Then, the host address needs to be
configured manually.
If no address conflict occurs, the link-local address takes
effect. The host is connected to the network and can
communicate with nodes on the local link.
The host sends an RS message or waits for the RA
messages that routers periodically send.
The host obtains the IPv6 address based on the prefix
carried in the RA message and the interface ID
generated in EUI-64 format.
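The last step, combining an advertised prefix with an EUI-64 interface
ID, can be sketched in Python as follows. The function name, the sample
prefix, and the sample MAC address are assumptions, and the sketch
covers only address formation, not the RS/RA exchange or DAD.

```python
import ipaddress


def slaac_address(prefix, mac):
    """Form an IPv6 address from a /64 prefix and a MAC address (EUI-64)."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("an EUI-64 interface ID needs a 64-bit prefix")
    octets = bytearray.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly 6 octets")
    octets[0] ^= 0x02                                   # flip the global/local bit
    interface_id = int.from_bytes(octets[:3] + b"\xff\xfe" + octets[3:], "big")
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)


mac = "00:1e:10:2f:3a:4b"                       # hypothetical MAC address
print(slaac_address("fe80::/64", mac))          # fe80::21e:10ff:fe2f:3a4b
print(slaac_address("2001:db8:0:1::/64", mac))  # 2001:db8:0:1:21e:10ff:fe2f:3a4b
```

The first call models the link-local address formed at the start of the
process; the second models the address formed from a prefix carried in
an RA message.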
When a better gateway router is available, the current gateway router
sends a Redirection message to notify the sender that packets can be
sent through another gateway router. A Redirection message is carried
in an ICMPv6 message. A Redirection message has the value of 137 in the
Type field and carries a better next hop address and the destination
address of packets that need to be redirected.

The process of redirecting packets is as follows:

PC1 needs to communicate with PC2. By default, packets sent
from PC1 to PC2 are sent through R1. After receiving packets
from PC1, R1 finds that sending the packets through R2 is better.
R1 sends a Redirection message to PC1 to notify PC1 that R2
is a better next hop. The destination address of PC2
is carried in the ICMPv6 Redirection message. After receiving
the Redirection message, PC1 adds a host route to its
routing table. Packets sent to PC2 will be directly sent to R2.
A router sends a Redirection message only when all of the
following conditions are met:

The destination address of the packet is not a multicast
address.
The packet is not forwarded to the router through routing.
After route calculation, the outbound interface of the next hop
is the interface that receives the packets.
The router finds that a better next hop IP address of the packet
is on the same network segment as the source IP address of
the packet.
After checking the source address of the packet, the router
finds a neighboring device in the neighbor entries that uses
this address as the global unicast address or the link-local
address.
In IPv6, packets are fragmented only on the source node, which reduces the processing load on transit devices.
Path MTU (PMTU) discovery is implemented through ICMPv6 Packet Too Big messages. A source node first uses the MTU of its outbound interface as the PMTU and sends a probe packet. If a link with a smaller MTU exists on the transmission path, the transit device sends a Packet Too Big message to the source node. The Packet Too Big message contains the MTU value of the outbound interface on the transit device. After receiving the message, the source node changes the PMTU value to the received MTU value and sends packets based on the new MTU. This process is repeated until packets reach the destination address. The source node then knows the PMTU to the destination address.
The process of PMTU discovery is as follows:
Packets are transmitted through four links. The MTU values of the four links are 1500, 1500, 1400, and 1300 bytes respectively. Before sending a packet, the source node fragments the packet based on PMTU 1500. When the packet reaches the outbound interface with MTU 1400, the router returns a Packet Too Big message that carries MTU 1400. After receiving the message, the source node fragments the packet based on MTU 1400 and sends the fragmented packet again.
When the packet reaches the outbound interface with MTU 1300, the router returns another Packet Too Big message that carries MTU 1300. The source node receives the message and fragments the packet based on MTU 1300. In this way, the source node sends the packet to the destination address and discovers the PMTU of the transmission path.
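The four-link example can be replayed with a short Python simulation. This is only a sketch of the discovery loop, not of a real IPv6 stack: the function name and the list-of-MTUs model are assumptions, and each Packet Too Big message is represented simply by the MTU value it would carry.

```python
def discover_pmtu(link_mtus):
    """Simulate PMTU discovery over a path given as a list of link MTUs.

    Returns the discovered PMTU and the MTU values reported by the
    Packet Too Big messages that the source node would receive.
    """
    if not link_mtus:
        raise ValueError("the path must contain at least one link")
    pmtu = link_mtus[0]          # start with the MTU of the outbound interface
    reported = []
    while True:
        for mtu in link_mtus:
            if pmtu > mtu:
                # The transit device cannot forward the packet and reports
                # the MTU of its outbound interface to the source node.
                reported.append(mtu)
                pmtu = mtu
                break            # the source resends using the new PMTU
        else:
            return pmtu, reported  # the packet reached the destination


print(discover_pmtu([1500, 1500, 1400, 1300]))
```

For the path in the example, the simulated source receives two Packet Too Big messages (1400, then 1300) before a packet gets through.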
RIPng makes the following modifications to RIP:
RIPng uses UDP port 521 (RIP uses UDP port 520) to send and receive routing information.
RIPng uses 128-bit destination addresses with a prefix length (mask length).
RIPng uses 128-bit IPv6 addresses as next hop addresses.
RIPng uses a link-local address (FE80::/10) as the source address to send RIPng Update packets.
RIPng periodically sends routing information in multicast mode and uses FF02::9 as the multicast address.
A RIPng packet consists of a header and multiple route table entries (RTEs). In a RIPng packet, the maximum number of RTEs depends on the MTU of the interface.
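As a rough illustration of how the MTU bounds the number of RTEs, the sketch below subtracts the header sizes from the MTU and divides by the RTE size. It assumes a 40-byte IPv6 header with no extension headers, an 8-byte UDP header, a 4-byte RIPng header, and 20-byte RTEs; the function name is hypothetical.

```python
IPV6_HEADER_LEN = 40   # fixed IPv6 header, assuming no extension headers
UDP_HEADER_LEN = 8
RIPNG_HEADER_LEN = 4   # command, version, and a reserved field
RTE_LEN = 20           # 16-byte prefix, route tag, prefix length, metric


def max_rtes(mtu: int) -> int:
    """Return how many RTEs fit in one RIPng packet for the given interface MTU."""
    payload = mtu - IPV6_HEADER_LEN - UDP_HEADER_LEN - RIPNG_HEADER_LEN
    if payload < RTE_LEN:
        raise ValueError("MTU too small to carry a single RTE")
    return payload // RTE_LEN


print(max_rtes(1500))  # Ethernet-sized MTU
print(max_rtes(1280))  # IPv6 minimum link MTU
```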

OSPFv3 is based on links rather than network segments.
OSPFv3 runs on IPv6, which is based on links rather than network segments. Therefore, the interfaces on which OSPFv3 is configured do not need to be on the same network segment. It is only required that the interfaces enabled with OSPFv3 are on the same link. In addition, the interfaces can set up OSPFv3 sessions without global IPv6 addresses.
OSPFv3 does not depend on IP addresses.
This separates topology calculation from IP addresses. That is, OSPFv3 can calculate the topology without knowing global IPv6 addresses, which are needed only on virtual link interfaces and for packet forwarding.
OSPFv3 packet and LSA formats change.
OSPFv3 packets do not contain IP addresses.
OSPFv3 router LSAs and network LSAs do not contain IP addresses. Addresses are advertised by Link LSAs and Intra Area Prefix LSAs instead.
In OSPFv3, router IDs, area IDs, and LSA link state IDs no longer indicate IP addresses, but the IPv4 address format is still retained.
Neighbors are identified by router IDs instead of IP addresses on broadcast, NBMA, and P2MP networks.
Information about the flooding scope is added to OSPFv3 LSAs.
Information about the flooding scope is carried in the LSA Type field of OSPFv3 LSAs. Thus, OSPFv3 routers can process LSAs of unknown types, which makes processing more flexible.
OSPFv3 can store or flood LSAs of unknown types, whereas OSPFv2 simply discards them.
OSPFv3 floods LSAs within an area or on a link. The U bit in the LSA Type field tells a router how to handle an LSA of unknown type: if the U bit is 0, the LSA is treated as if it had link-local flooding scope; if the U bit is 1, the LSA is stored and flooded as if its type were understood.
OSPFv3 supports multiple instances on a link.
Only one OSPFv2 process can be configured on an OSPFv2 physical interface. In OSPFv3, one physical interface can be configured with multiple processes that are identified by different instance IDs.
OSPFv3 uses IPv6 link-local addresses.
As a routing protocol running on IPv6, OSPFv3 uses link-local addresses to maintain neighbor relationships and update LSDBs. Except for virtual link (Vlink) interfaces, all OSPFv3 interfaces use link-local addresses as the source address and the next hop address when transmitting OSPFv3 packets. The advantages are as follows:
OSPFv3 can calculate the topology without knowing global IPv6 addresses, so topology calculation is independent of IP addresses.
Packets flooded on a link are not transmitted to other links, which prevents unnecessary flooding and saves bandwidth.
OSPFv3 packets do not contain authentication fields.
OSPFv3 relies directly on IPv6 authentication and security measures. Thus, OSPFv3 does not need to perform authentication itself and focuses only on processing packets.
OSPFv3 supports two new LSAs.
Link LSA: A router floods a Link LSA on the link where it resides to advertise its link-local address and the IPv6 prefixes (global addresses) configured on the link.
Intra Area Prefix LSA: A router advertises an Intra Area Prefix LSA in the local OSPF area to inform the other routers in the area of the IPv6 prefixes associated with the router itself or with a network, which can be a broadcast network or an NBMA network.
OSPFv3 identifies neighbors based on router IDs only.
On broadcast, NBMA, and P2MP networks, OSPFv2 identifies neighbors based on the IPv4 addresses of interfaces.
OSPFv3 identifies neighbors based on router IDs only. Thus, even if global IPv6 addresses are not configured or are configured on different network segments, OSPFv3 can still establish and maintain neighbor relationships, so topology calculation is not based on IP addresses.
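The flooding-scope information in the LSA Type field can be made concrete with a small decoder. The sketch assumes the standard 16-bit OSPFv3 LS type layout (U bit, two scope bits, 13-bit function code); the function and type names are hypothetical.

```python
from typing import NamedTuple

# Meaning of the two scope bits (S2 S1) in the OSPFv3 LS type.
SCOPES = {0: "link-local", 1: "area", 2: "AS", 3: "reserved"}


class LsType(NamedTuple):
    u_bit: bool          # how to handle the LSA if its function code is unknown
    scope: str           # flooding scope taken from the S2/S1 bits
    function_code: int   # identifies the LSA function (1 = router LSA, and so on)


def decode_ls_type(value: int) -> LsType:
    """Split a 16-bit OSPFv3 LS type into U bit, flooding scope, and function code."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("LS type is a 16-bit field")
    return LsType(
        u_bit=bool(value & 0x8000),
        scope=SCOPES[(value >> 13) & 0x3],
        function_code=value & 0x1FFF,
    )


for ls_type in (0x2001, 0x0008, 0x2009, 0x4005):
    print(hex(ls_type), decode_ls_type(ls_type))
```

Decoding 0x0008 (Link LSA) yields link-local scope, while 0x2009 (Intra Area Prefix LSA) yields area scope, which matches the flooding behaviour of the two new LSAs.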
Extended IS-IS for IPv6 is defined in the IETF draft draft-ietf-isis-ipv6-05 (later published as RFC 5308). To process and calculate IPv6 routes, IS-IS uses two new TLVs and one network layer protocol identifier (NLPID).
The two TLVs are as follows:
TLV 236 (IPv6 Reachability): describes network reachability by defining the route prefix and metric.
TLV 232 (IPv6 Interface Address): is similar to the IP Interface Address TLV of IPv4, except that it carries a 128-bit IPv6 address instead of a 32-bit IPv4 address.
The NLPID is an 8-bit field that identifies the network layer protocol. The NLPID of IPv6 is 142 (0x8E). If an IS-IS router supports IPv6, it advertises this NLPID value to indicate that support.
To support multiple network layer protocols, BGP requires NLRI and Next_Hop attributes to carry information about network layer protocols. Therefore, MP-BGP uses the following new optional non-transitive attributes:
MP_REACH_NLRI: indicates the multiprotocol reachable NLRI. It is used to advertise reachable routes and next hop information.
MP_UNREACH_NLRI: indicates the multiprotocol unreachable NLRI. It is used to withdraw unreachable routes.

Multicast Listener Discovery (MLD) is a protocol that manages IPv6 multicast members. Its principles and functions are similar to those of IGMP.
MLD enables each IPv6 router to discover its directly connected multicast listeners (nodes that expect to receive multicast data) and learn the multicast addresses that the neighbor nodes are interested in. MLD then delivers the learned information to the multicast routing protocols used by the routers to ensure that multicast data can be sent to all links where receivers reside.
Querier election mechanism
The working mechanism is similar to that of IGMPv2:
Each MLD router considers itself the querier when it starts and sends a General Query message with destination address FF02::1 to all hosts and routers on the local network segment.
When a router receives a General Query message, it compares the source IPv6 address of the message with the IPv6 address of its own interface. The router with the smallest IPv6 address becomes the querier, and the other routers become non-queriers.
All non-queriers start a timer (Other Querier Present Timer). If a non-querier receives a Query message from the querier before the timer expires, it resets the timer. If the timer expires before a Query message is received from the querier, the non-queriers trigger the election of a new querier.
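The comparison rule used in the election can be sketched in a few lines of Python. The sketch ignores timers and message exchange and only shows the outcome: among the candidate interface addresses, the numerically smallest one wins. The function name and the link-local addresses are hypothetical.

```python
import ipaddress


def elect_querier(interface_addresses):
    """Return the address that wins the MLD querier election (the smallest one)."""
    if not interface_addresses:
        raise ValueError("at least one MLD router is required")
    # Compare addresses numerically, not as text.
    return min(interface_addresses, key=lambda addr: int(ipaddress.IPv6Address(addr)))


# Hypothetical link-local addresses of three MLD routers on one segment.
print(elect_querier(["fe80::2", "fe80::1", "fe80::a"]))
```

Comparing the integer value avoids mistakes that a plain string comparison would make, for example treating fe80::10 as smaller than fe80::2.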
Member join mechanism
PC2 and PC3 need to receive IPv6 multicast data destined for IPv6 multicast group G1, and PC1 needs to receive IPv6 multicast data destined for IPv6 multicast group G2. The hosts need to join their respective multicast groups, and the MLD querier (R1) needs to maintain IPv6 group memberships.
The query and report process is as follows:
Hosts send Multicast Listener Report messages to the IPv6 multicast groups that they want to join without waiting for a Query message from the MLD querier.
The MLD querier (R1) periodically multicasts General Query messages with destination address FF02::1 to all hosts and routers on the local network segment.
After PC2 and PC3 receive the Query message, the host whose delay timer expires first sends a Report message to G1. If the delay timer of PC2 expires first, PC2 multicasts a Report message to G1, declaring that it belongs to G1. All hosts on the local network segment can receive the Report message sent from PC2 to G1. When PC3 receives this Report message, it does not send the same Report message to G1 because the MLD routers (R1 and R2) already know that G1 has members on the local network segment. This mechanism suppresses duplicate Report messages, reducing traffic on the local network segment.
PC1 still needs to multicast a Report message to G2, declaring that it belongs to G2.
After receiving the Report messages, the MLD routers know that multicast groups G1 and G2 have members on the local network segment. The routers then use IPv6 multicast routing protocols (such as IPv6 PIM) to create (*, G1) and (*, G2) entries for multicast data forwarding, in which * stands for any multicast source.
When IPv6 multicast data sent from an IPv6 multicast source reaches the MLD routers through multicast routes, the MLD routers forward the received multicast data to the local network segment because they have (*, G1) and (*, G2) entries. The receiver hosts can then receive the IPv6 multicast data.
Member leave mechanism
The host sends a Done message with destination address FF02::2 to all IPv6 multicast routers on the local network segment.
When the MLD querier receives the Done message, it sends a Multicast-Address-Specific Query message to the IPv6 multicast group that the host wants to leave. The destination address and group address of the Query message are the address of this IPv6 multicast group.
If the IPv6 multicast group has other members on the network segment, the members send a Report message within the maximum response time.
If the querier receives Report messages from other members within the maximum response time, the querier continues to maintain memberships of the IPv6 multicast group. Otherwise, the querier considers that the IPv6 multicast group has no members on the local network segment and stops maintaining memberships of the IPv6 multicast group.
IPv6 multicast source filtering
MLDv2 supports IPv6 multicast source filtering and defines two filter modes: INCLUDE and EXCLUDE. When a host joins an IPv6 multicast group G, the host can choose to accept or reject IPv6 multicast data from a specific source S. When a host joins an IPv6 multicast group:
If the host only needs to receive data sent from sources S1, S2, and so on, the host can send a Report message with an INCLUDE Sources (S1, S2, ...) entry.
If the host wants to reject data sent from sources S1, S2, and so on, the host can send a Report message with an EXCLUDE Sources (S1, S2, ...) entry.
IPv6 multicast group status tracking
Multicast routers running MLDv2 keep IPv6 multicast group state per multicast address per attached link. The IPv6 multicast group state includes:
Filter mode: The MLD querier tracks the INCLUDE or EXCLUDE state.
Source list: The MLD querier tracks the sources that are added or deleted.
Timers: a filter timer, which determines when the MLD querier switches to INCLUDE mode after the timer of an IPv6 multicast address expires, and a source timer for each source record.
Receiver host status listening
Multicast routers running MLDv2 listen to the receiver host status to record and maintain information about hosts that join IPv6 multicast groups on the local network segment.
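The effect of the two filter modes can be expressed as a simple membership test. The following Python sketch is an illustration of the INCLUDE and EXCLUDE semantics only, not of MLDv2 message processing; the function name and the source addresses are hypothetical.

```python
import ipaddress


def source_accepted(filter_mode: str, source_list, source: str) -> bool:
    """Decide whether data from `source` is wanted under the given filter mode."""
    sources = {ipaddress.IPv6Address(s) for s in source_list}
    candidate = ipaddress.IPv6Address(source)
    if filter_mode == "INCLUDE":
        return candidate in sources        # only listed sources are accepted
    if filter_mode == "EXCLUDE":
        return candidate not in sources    # listed sources are rejected
    raise ValueError("filter mode must be INCLUDE or EXCLUDE")


# Hypothetical sources S1 and S2.
s1, s2 = "2001:db8::1", "2001:db8::2"
print(source_accepted("INCLUDE", [s1], s1))  # listed source in INCLUDE mode
print(source_accepted("EXCLUDE", [s1], s2))  # unlisted source in EXCLUDE mode
```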
IPv4/IPv6 dual stack is an efficient technology that implements IPv4-to-IPv6 transition. In IPv4/IPv6 dual stack, network devices support both the IPv4 protocol stack and the IPv6 protocol stack. The source device selects a protocol stack according to the IP address of the destination device. Network devices between the source and destination devices select a protocol stack to process and forward packets according to the packet protocol type. IPv4/IPv6 dual stack can be implemented on a single device or on a dual-stack backbone network. On a dual-stack backbone network, all devices must support the IPv4/IPv6 dual stack, and interfaces connected to the dual-stack network must have both IPv4 and IPv6 addresses configured.
The topology is described as follows:
The host sends a DNS request to the DNS server for the IP address of domain name www.huawei.com. The DNS server replies with the requested IP address of the domain name. The IP address may be 10.1.1.1 or 3ffe:yyyy::1. If the host sends a type A query, the DNS server replies with the IPv4 address of the domain name. If the host sends a type AAAA query, the DNS server replies with the IPv6 address of the domain name.
R1 in the figure supports IPv4/IPv6 dual stack. If the host needs to access the network server at IPv4 address 10.1.1.1, the host can access the network server through the IPv4 protocol stack of R1. If the host needs to access the network server at IPv6 address 3ffe:yyyy::1, the host can access the network server through the IPv6 protocol stack of R1.
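The stack selection described above depends only on the address family of the destination. A minimal Python sketch follows; the function name and the record table are hypothetical, and 2001:db8::1 is a documentation address standing in for the placeholder 3ffe:yyyy::1, which is not a parsable address.

```python
import ipaddress

# Hypothetical answers for a type A and a type AAAA query.
RECORDS = {"A": "10.1.1.1", "AAAA": "2001:db8::1"}


def select_stack(destination: str) -> str:
    """Pick the protocol stack from the address family of the destination."""
    version = ipaddress.ip_address(destination).version
    return "IPv4" if version == 4 else "IPv6"


for query_type, address in RECORDS.items():
    print(query_type, address, select_stack(address))
```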
During the early stage of transition, IPv4 networks are widely deployed, while IPv6 networks are isolated islands. IPv6 over IPv4 tunneling allows IPv6 packets to be transmitted on an IPv4 network, interconnecting the IPv6 islands.
The principles are as follows:
IPv4/IPv6 dual stack is enabled and an IPv6 over IPv4 tunnel is deployed on edge routing devices.
After an edge routing device receives a packet from the IPv6 network, the device prepends an IPv4 header to the IPv6 packet to encapsulate it as an IPv4 packet if the destination address of the packet is not the device itself and the outbound interface of the packet is a tunnel interface.
On the IPv4 network, the encapsulated packet is transmitted to the remote edge routing device.
The remote edge routing device decapsulates the packet by removing the IPv4 header, and then sends the decapsulated IPv6 packet to the connected IPv6 network.
The IPv4 address of the source end of an IPv6 over IPv4 tunnel must be manually configured, but the IPv4 address of the destination end can be manually configured or automatically obtained. An IPv6 over IPv4 tunnel is a manual tunnel or an automatic tunnel depending on how the IPv4 address of the destination end is obtained.
Manual tunnel: The edge routing device cannot automatically obtain the IPv4 address of the destination end, which must be manually configured so that packets can be correctly forwarded to the tunnel end.
Automatic tunnel: The edge routing device can automatically obtain the IPv4 address of the destination end, so you do not need to manually configure an IPv4 address for the destination end. In most cases, the interfaces on both ends of an automatic tunnel use IPv6 addresses that contain embedded IPv4 addresses so that the destination IPv4 address can be extracted from the destination IPv6 address of IPv6 packets.
If an edge routing device needs to set up manual tunnels with multiple devices, multiple tunnels must be configured on the edge routing device. Such configuration is complex. Therefore, a manual tunnel is typically set up between two edge routing devices to connect two IPv6 networks.
The manual tunnel has advantages and disadvantages:
Advantage: applies to any environment in which IPv6 traverses IPv4.
Disadvantage: must be manually configured.
Packets are transmitted in an IPv6 over IPv4 manual tunnel as follows:
When an edge device of the tunnel receives an IPv6 packet from an IPv6 network, the device searches the IPv6 routing table according to the destination address of the IPv6 packet. If the packet is to be forwarded from the virtual tunnel interface, the device encapsulates the packet according to the tunnel source and destination IPv4 addresses configured for the tunnel interface. The encapsulated packet becomes an IPv4 packet, which is then processed by the IPv4 protocol stack. The IPv4 packet is forwarded to the destination end of the tunnel over an IPv4 network. After the destination end of the tunnel receives the IPv4 packet, it decapsulates the packet and sends the decapsulated packet to the IPv6 protocol stack.
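The encapsulation and decapsulation steps can be sketched in Python as plain byte manipulation. This is an illustration under simplifying assumptions: a 20-byte IPv4 header without options, no fragmentation handling, and IPv4 protocol number 41 to mark the IPv6 payload. The function names are hypothetical, and the addresses 1.1.1.1 and 2.1.1.1 are sample tunnel endpoints.

```python
import ipaddress
import struct

IPPROTO_IPV6 = 41  # IPv4 protocol number for an encapsulated IPv6 packet


def ipv4_checksum(header: bytes) -> int:
    """Compute the one's-complement checksum over an IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate(ipv6_packet: bytes, tunnel_src: str, tunnel_dst: str, ttl: int = 255) -> bytes:
    """Prepend an IPv4 header built from the tunnel source and destination."""
    src = ipaddress.IPv4Address(tunnel_src).packed
    dst = ipaddress.IPv4Address(tunnel_dst).packed
    total_length = 20 + len(ipv6_packet)
    # Version/IHL, DSCP, total length, ID, flags/offset, TTL, protocol, checksum, src, dst.
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_length, 0, 0,
                         ttl, IPPROTO_IPV6, 0, src, dst)
    checksum = ipv4_checksum(header)
    header = header[:10] + struct.pack("!H", checksum) + header[12:]
    return header + ipv6_packet


def decapsulate(ipv4_packet: bytes) -> bytes:
    """Remove the IPv4 header and return the inner IPv6 packet."""
    header_len = (ipv4_packet[0] & 0x0F) * 4
    if ipv4_packet[9] != IPPROTO_IPV6:
        raise ValueError("payload is not an IPv6 packet")
    return ipv4_packet[header_len:]


inner = bytes([0x60]) + bytes(39)          # dummy 40-byte IPv6 header
outer = encapsulate(inner, "1.1.1.1", "2.1.1.1")
print(len(outer), decapsulate(outer) == inner)
```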

An IPv6 over IPv4 GRE tunnel uses standard GRE tunneling technology to provide a point-to-point connection and requires tunnel endpoint addresses to be manually configured. GRE tunnels place no limitations on the encapsulated (passenger) protocol or the transport protocol, which can be any protocol such as IPv4, IPv6, OSI, or Multiprotocol Label Switching (MPLS).
Packet forwarding on an IPv6 over IPv4 GRE tunnel is similar to that on an IPv6 over IPv4 manual tunnel.

The destination address of IPv6 packets transmitted over an automatic IPv4-compatible IPv6 tunnel is an IPv4-compatible IPv6 address (the special address used by the automatic tunnel). An IPv4-compatible IPv6 address is an IPv6 unicast address that has zeros in the high-order 96 bits and an IPv4 address in the low-order 32 bits.
Disadvantages of an automatic IPv4-compatible IPv6 tunnel:
An automatic IPv4-compatible IPv6 tunnel requires that each host on both ends have a valid IPv4 address and support IPv4/IPv6 dual stack and automatic IPv4-compatible IPv6 tunnels. Therefore, automatic IPv4-compatible IPv6 tunnels cannot be deployed on a large scale. Automatic IPv4-compatible IPv6 tunnels have now been replaced by automatic 6to4 tunnels.
The packet forwarding process is as follows:
After R1 receives an IPv6 packet destined for R2, R1 searches for an IPv6 route according to destination address ::2.1.1.1 and finds that the next hop is a tunnel interface. The tunnel configured on R1 is an automatic IPv4-compatible IPv6 tunnel. Therefore, R1 encapsulates the IPv6 packet into an IPv4 packet. In the IPv4 packet, the source address is the tunnel source address 1.1.1.1, and the destination address is the low-order 32 bits of IPv4-compatible IPv6 address ::2.1.1.1, namely, 2.1.1.1. The IPv4 packet is forwarded by the tunnel interface on R1 over the IPv4 network to R2 at 2.1.1.1.
After R2 receives the IPv4 packet, it decapsulates the IPv4 packet to obtain the IPv6 packet and sends the IPv6 packet to the IPv6 protocol stack for processing. An IPv6 packet is sent from R2 to R1 following a similar process.
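The way R1 derives the tunnel destination from ::2.1.1.1 can be shown with Python's ipaddress module. The function name is hypothetical; the logic simply checks that the high-order 96 bits are zero and returns the low-order 32 bits as an IPv4 address.

```python
import ipaddress


def tunnel_destination(ipv6_destination: str) -> ipaddress.IPv4Address:
    """Extract the IPv4 tunnel destination from an IPv4-compatible IPv6 address."""
    value = int(ipaddress.IPv6Address(ipv6_destination))
    if value >> 32 != 0:
        # The high-order 96 bits must be zero for an IPv4-compatible address.
        raise ValueError("not an IPv4-compatible IPv6 address")
    return ipaddress.IPv4Address(value & 0xFFFFFFFF)


print(tunnel_destination("::2.1.1.1"))
```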
An automatic 6to4 tunnel is another kind of automatic tunnel and is set up using the IPv4 address embedded in an IPv6 address. Unlike an automatic IPv4-compatible IPv6 tunnel, an automatic 6to4 tunnel can be set up from a router to a router, from a host to a router, from a router to a host, and from a host to a host.
The address format is as follows:
FP: is the format prefix of aggregatable global unicast addresses and is fixed as 001.
TLA: is short for top level aggregator and is fixed as 0x0002.
SLA: is short for site level aggregator.
A 6to4 address starts with the prefix 2002::/16 in the format of 2002:IPv4-address::/48. A 6to4 address has a 64-bit network prefix, in which the first 48 bits (2002:a.b.c.d) are determined by the IPv4 address assigned to a router interface and cannot be changed, and the last 16 bits (SLA) can be configured by the user.
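The 2002:IPv4-address::/48 format can be worked through in Python. The sketch below derives the 48-bit 6to4 prefix from an IPv4 address and recovers the IPv4 address from a 6to4 address; the function names are hypothetical and 192.0.2.1 is a documentation address chosen for illustration.

```python
import ipaddress


def sixto4_prefix(ipv4_address: str) -> ipaddress.IPv6Network:
    """Build the 2002:IPv4-address::/48 prefix for the given IPv4 address."""
    v4 = int(ipaddress.IPv4Address(ipv4_address))
    value = (0x2002 << 112) | (v4 << 80)   # 16-bit 2002 prefix, then 32-bit IPv4
    return ipaddress.IPv6Network(f"{ipaddress.IPv6Address(value)}/48")


def sixto4_tunnel_destination(ipv6_address: str) -> ipaddress.IPv4Address:
    """Recover the embedded IPv4 address (bits 16 to 47) from a 6to4 address."""
    value = int(ipaddress.IPv6Address(ipv6_address))
    if value >> 112 != 0x2002:
        raise ValueError("not a 6to4 address")
    return ipaddress.IPv4Address((value >> 80) & 0xFFFFFFFF)


prefix = sixto4_prefix("192.0.2.1")
print(prefix)
# An address in SLA subnet 1 of that prefix maps back to the same IPv4 address.
print(sixto4_tunnel_destination("2002:c000:201:1::1"))
```

The 16 bits that follow the /48 prefix are the SLA field, so one IPv4 address yields up to 65536 subnets behind the same tunnel endpoint.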

An IPv4 address can be used as the source address of only one 6to4 tunnel. If one edge router connects to multiple 6to4 networks and uses the same IPv4 address as the tunnel source address, SLA IDs in 6to4 addresses are used to differentiate the 6to4 networks. These 6to4 networks, however, share the same 6to4 tunnel.
Common IPv6 networks need to communicate with 6to4 networks over IPv4 networks. This requirement can be met through 6to4 relays. A 6to4 relay is a next-hop device that forwards IPv6 packets whose destination address is not a 6to4 address but whose next-hop address is a 6to4 address. The tunnel destination IPv4 address is obtained from the next-hop 6to4 address.
If a host on 6to4 network 2 needs to communicate with devices on the IPv6 network, a route must be configured on the edge router, and the next-hop address of the route to the IPv6 network is specified as the 6to4 address of the 6to4 relay. The 6to4 address of the relay matches the source address of the 6to4 tunnel. Packets to be sent from 6to4 network 2 to the IPv6 network are first sent to the 6to4 relay according to the next hop specified in the routing table. The 6to4 relay then forwards the packets to the IPv6 network. When a packet needs to be sent from the IPv6 network to 6to4 network 2, the 6to4 relay encapsulates the packet as an IPv4 packet according to the destination address (a 6to4 address) of the packet so that the packet can be successfully sent to 6to4 network 2.

Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) is another automatic tunneling mechanism. An ISATAP tunnel uses an IPv6 address with an embedded IPv4 address. An ISATAP address uses an IPv4 address as the interface identifier, while a 6to4 address uses an IPv4 address as the network prefix.
The address is described as follows:
If the IPv4 address is globally unique, the u bit is 1. Otherwise, the u bit is 0. The g bit indicates whether the IPv4 address is unicast or multicast. An ISATAP address can be a global unicast address, link-local address, unique local address, or multicast address. The first 64 bits of an ISATAP address are obtained through a request sent to an ISATAP router and can be automatically configured. The Neighbor Discovery (ND) protocol can run between edge devices on both ends of an ISATAP tunnel. An ISATAP tunnel regards an IPv4 network as a non-broadcast multi-access (NBMA) network.
The forwarding process is described as follows:
The IPv4 network has two dual-stack hosts PC2 and PC3, each of which has a private IPv4 address. To implement the ISATAP function, perform the following operations:
Configure ISATAP tunnel interfaces. The hosts generate ISATAP interface IDs according to their IPv4 addresses.
The hosts then generate link-local IPv6 addresses according to their ISATAP interface IDs. The two hosts then have IPv6 communication capabilities on the local link.
The hosts perform address autoconfiguration and obtain IPv6 global unicast addresses and unique local addresses (ULAs).
To communicate with another IPv6 host, a host extracts the IPv4 address embedded in the next hop IPv6 address, uses it as the tunnel destination address, and forwards packets through the tunnel interface. If the destination host is within the local site, the next hop is the destination host. If the destination host is in a different site, the next hop address is the address of the ISATAP router.
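How an ISATAP host derives its interface ID and link-local address from its IPv4 address can be sketched as follows. The sketch assumes the usual ISATAP identifier layout, in which the upper 32 bits are 0000:5EFE when the u bit is 0 and 0200:5EFE when the u bit is 1, followed by the 32-bit IPv4 address. The function names and the addresses are hypothetical.

```python
import ipaddress


def isatap_interface_id(ipv4_address: str, globally_unique: bool = False) -> int:
    """Return the 64-bit ISATAP interface ID for an IPv4 address."""
    v4 = int(ipaddress.IPv4Address(ipv4_address))
    # The u bit is 1 only when the IPv4 address is globally unique.
    high = 0x02005EFE if globally_unique else 0x00005EFE
    return (high << 32) | v4


def isatap_link_local(ipv4_address: str, globally_unique: bool = False) -> ipaddress.IPv6Address:
    """Combine FE80::/64 with the ISATAP interface ID."""
    return ipaddress.IPv6Address((0xFE80 << 112) | isatap_interface_id(ipv4_address, globally_unique))


def embedded_ipv4(isatap_address: str) -> ipaddress.IPv4Address:
    """Extract the IPv4 address from the low-order 32 bits of an ISATAP address."""
    return ipaddress.IPv4Address(int(ipaddress.IPv6Address(isatap_address)) & 0xFFFFFFFF)


# A host with a private IPv4 address, so the u bit is 0.
address = isatap_link_local("10.1.1.1")
print(address, embedded_ipv4(str(address)))
```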
During a later stage of IPv4-to-IPv6 transition, IPv6 networks are widely deployed, while IPv4 networks are isolated islands. You can create a tunnel on an IPv6 network to connect isolated IPv4 sites so that they can access other IPv4 networks through the IPv6 public network.
The forwarding process is described as follows:
IPv4/IPv6 dual stack is enabled and an IPv4 over IPv6 tunnel is deployed on edge routing devices.
After an edge routing device receives a packet from the connected IPv4 network, it adds an IPv6 header to the IPv4 packet to encapsulate it as an IPv6 packet if the destination address of the packet is not the routing device itself.
On the IPv6 network, the encapsulated packet is transmitted to the remote edge routing device.
The remote edge routing device decapsulates the packet by removing the IPv6 header, and then sends the decapsulated IPv4 packet to the connected IPv4 network.
Example description:
The device addresses are determined as follows:
If RTX connects to RTY, the addresses of the two devices are 2001:XY::X/64 and 2001:XY::Y/64 respectively.
The commands and their functions are as follows:
ripng: creates a RIPng process.
ripng enable: enables RIPng on an interface.
ripng metricout: sets the metric that is added to the RIPng routes sent by an interface.
import-route: configures RIPng to import routes from other routing protocols. You can use the route-policy parameter to filter the routes to be imported and set route attributes.
Precautions:
The policy usage is similar to that in IPv4.

router-id: configures the ID of the router running OSPFv3.
ospfv3 area: enables an OSPFv3 process on an interface and specifies the area to which the interface belongs.
nssa: configures an OSPFv3 area as an NSSA.
undo ipv6 nd ra halt: enables the system to send RA packets.
ipv6 address auto global: enables a device to automatically generate a global IPv6 address through stateless autoconfiguration.
Precautions:
OSPFv3 has features similar to those of OSPFv2.

ipv6 enable: enables the IPv6 capability of an IS-IS process.
ipv6 nd ra prefix: configures the prefix in an RA packet.
isis ipv6 enable: enables the IS-IS IPv6 capability on an interface and specifies the ID of the IS-IS process to be associated with the interface.
ipv6 import-route isis level-2 into level-1: configures IPv6 route leaking from Level-2 areas to Level-1 areas.
Precautions:
IS-IS IPv6 has features similar to those of IS-IS IPv4.


peer { ipv6-address | group-name } as-number as-number: creates a peer or configures an AS number for a specified peer group.
ipv6-family: enters the IPv6 address family view of BGP.
peer enable: enables a BGP device to exchange routes with a specified peer or peer group in the address family view.
peer connect-interface: specifies the source interface from which BGP packets are sent and the source address used for initiating a connection.
peer password: enables a BGP device to implement MD5 authentication for BGP messages exchanged during the establishment of a TCP connection with a peer.
Precautions:
BGP4+ has features similar to those of BGP.

IPv6 and IPv4 addresses have been specified.

interface tunnel: creates a tunnel interface and enters the tunnel interface view.
tunnel-protocol ipv6-ipv4: sets the tunnel mode to IPv6 over IPv4 manual tunnel.
source { ipv4-address | interface-type interface-number }: specifies the source address or source interface of a tunnel.
destination { ipv4-address }: specifies the destination address of a tunnel.
ipv6 address { ipv6-address prefix-length }: configures an IPv6 address for a tunnel interface.

IPv6 and IPv4 addresses have been specified.

interface tunnel: creates a tunnel interface and enters the
tunnel interface view.
tunnel-protocol gre: sets the tunnel mode to IPv6 over IPv4
GRE tunnel.
source { ipv4-address | interface-type interface-number }:
specifies the source address or source interface of the tunnel.
destination { ipv4-address }: specifies the destination
address of a tunnel.
ipv6 address { ipv6-address prefix-length }: configures an IPv6
address for a tunnel interface.

MPLS VPN overview
A BGP/MPLS IP VPN is a Layer 3 Virtual Private Network
(L3VPN). It uses the Border Gateway Protocol (BGP) to
advertise VPN routes and uses Multiprotocol Label Switching
(MPLS) to forward VPN packets on the backbone network of
the Service Provider (SP). This technology is called IP VPN
because IP packets are transmitted on VPNs.
The BGP/MPLS IP VPN model consists of the following
entities:
Customer Edge (CE): a device that is deployed at the
edge of a customer network and has interfaces directly
connected to the SP network. A CE device can be a
router, switch, or host. Generally, CE devices are
unaware of VPNs and do not need to support MPLS.
Provider Edge (PE): a device that is deployed at the
edge of an SP network and directly connected to a CE
device. On an MPLS network, PE devices process all
VPN services and must have high performance.
Provider (P): a backbone device that is deployed on an
SP network and is not directly connected to CE devices.
P devices only need to provide basic MPLS forwarding
capabilities and do not maintain VPN information.
PE and P devices are managed by SPs. CE devices are
managed by customers unless customers authorize SPs to
manage their CE devices.
A PE device can connect to multiple CE devices. A CE device
can connect to multiple PE devices of the same SP or of
different SPs.
Site
A site is a group of IP systems that have IP connectivity with
each other, and this connectivity does not depend on the SP
network.
Sites are defined by the topology between devices, not by
geographic location, although the devices in a site are
geographically adjacent to each other in most situations.
The devices in a site may belong to multiple VPNs. That is, a
site may belong to more than one VPN.

Different VPN sites can use overlapping address spaces.

A PE device establishes and maintains a VPN instance for each
directly connected site. A VPN instance contains the VPN member
interfaces and routes of the corresponding site. Specifically, a
VPN instance includes the IP routing table, label forwarding table,
interfaces bound to the VPN instance, and VPN instance management
information. VPN instance management information includes the route
distinguisher (RD), route filtering policy, and member interface
list of the VPN instance.
The public routing and forwarding table and a VPN routing and
forwarding table (VRF) differ in the following aspects:
A public routing table contains IPv4 routes of all the PE and P
devices. The routes are static routes or dynamic routes
generated by routing protocols on the backbone network.
A VPN routing table contains routes of all sites that belong to a
VPN instance. The routes are obtained through the exchange
of VPN routing information between PE devices or between
CE and PE devices.
Information in the public forwarding table is extracted from the
public routing table according to route management policies,
whereas information in a VPN forwarding table is extracted
from the corresponding VPN routing table.
VPN instances on a PE device are independent of each other,
and each maintains a VRF that is independent of the public
routing and forwarding table. Each VPN instance can be regarded
as a virtual device that maintains an independent address space
and connects to VPNs through its interfaces.
PE devices use Multiprotocol Extensions for BGP-4 (MP-BGP) to
advertise VPN routes and use the VPN-IPv4 address family to solve
the problem that BGP cannot distinguish VPN routes with the same IP
address prefix.
An RD is prepended to an IPv4 prefix so that identical prefixes from
different VPNs become distinct VPN-IPv4 prefixes. The RD format
enables SPs to allocate RDs independently. When CE devices are
dual-homed to PE devices, the RD must be globally unique to
ensure correct routing.
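
The role of the RD can be illustrated with a minimal Python sketch. It is not device code: the RD values "100:1" and "100:2", the prefix 10.1.1.0/24, and the helper name to_vpn_ipv4 are all assumptions chosen for illustration. The sketch only shows that two identical IPv4 prefixes become distinct table keys once an RD is attached.

```python
from typing import NamedTuple


class VpnIpv4Prefix(NamedTuple):
    """A VPN-IPv4 prefix: an RD followed by an ordinary IPv4 prefix."""
    rd: str      # route distinguisher (illustrative values only)
    prefix: str  # customer IPv4 prefix


def to_vpn_ipv4(rd: str, prefix: str) -> VpnIpv4Prefix:
    """Hypothetical helper: attach the RD of a VPN instance to an IPv4 prefix."""
    return VpnIpv4Prefix(rd, prefix)


# Two customers use the same private address space.
route_a = to_vpn_ipv4("100:1", "10.1.1.0/24")
route_b = to_vpn_ipv4("100:2", "10.1.1.0/24")

# Keyed by VPN-IPv4 prefix, both routes can coexist in one BGP table.
bgp_table = {
    route_a: "route learned from the site of VPN A",
    route_b: "route learned from the site of VPN B",
}
```

Keying the table on the IPv4 prefix alone would let the second route overwrite the first, which is the ambiguity that the VPN-IPv4 address family removes.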
A VPN target, also called the route target (RT), is a 64-bit
(8-octet) BGP extended community attribute. BGP/MPLS IP VPN uses
VPN targets to control the advertisement of VPN routes.
A VPN instance is associated with one or more VPN target attributes.
VPN target attributes are classified into the following types:
Export target: After a PE device learns IPv4 routes from
directly connected sites, it converts the routes to VPN-IPv4
routes and sets the export target attribute for those routes. The
export target attribute is advertised with the routes as a BGP
extended community attribute.
Import target: After a PE device receives VPN-IPv4 routes
from other PE devices, it checks the export target attribute of
the routes. If the export target is the same as the import target
of a VPN instance on the local PE device, the local PE device
adds the route to the VPN routing table.
A VPN target defines which sites can receive a VPN route and which
sites' VPN routes a PE device can receive.
The reasons for using the VPN target instead of the RD as the
extended community attribute are as follows:
A VPN-IPv4 route has only one RD, but can be associated
with multiple VPN targets. With multiple extended community
attributes, BGP can greatly improve the flexibility and
scalability of a network.
VPN targets can be used to control route advertisement
between different VPNs on a PE device. With properly
configured VPN targets, different VPN instances on a PE
device can import routes from each other.
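
The import and export matching described above can be sketched in Python as follows. This is a simplified model, not the behavior of any particular device: the instance names, the RT values, and the function accepting_instances are assumptions, and a route is treated as acceptable when at least one of its export targets equals one of the import targets of a VPN instance.

```python
def accepting_instances(route_export_targets, vpn_instances):
    """Return the names of the local VPN instances that import the route.

    route_export_targets: export targets carried by a VPN-IPv4 route.
    vpn_instances: mapping of instance name -> set of import targets.
    """
    exports = set(route_export_targets)
    return sorted(
        name
        for name, import_targets in vpn_instances.items()
        if exports & set(import_targets)  # at least one common target
    )


# Hypothetical VPN instances on the local PE and their import targets.
local_instances = {
    "vpn_a": {"100:1"},
    "vpn_b": {"100:2"},
    "vpn_shared": {"100:1", "100:2"},
}

# A route exported with target 100:1 is added to vpn_a and vpn_shared only.
matches = accepting_instances({"100:1"}, local_instances)
```

Because the match is on targets and not on the RD, one route can be imported into several instances, which is the flexibility the text attributes to VPN targets.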
Traditional BGP-4 defined in RFC 1771 can manage only IPv4
routes and cannot process VPN routes that have overlapping address
spaces.
To correctly process VPN routes, VPNs use MP-BGP defined in RFC
2858 (Multiprotocol Extensions for BGP-4). MP-BGP supports multiple
network layer protocols. Network layer protocol information is
contained in the Network Layer Reachability Information (NLRI) field
and the Next Hop field of an MP-BGP Update message.
MP-BGP uses the address family to differentiate network layer
protocols. An address family can be a traditional IPv4 address family
or any other address family, such as a VPN-IPv4 address family or an
IPv6 address family. For the values of address families, see RFC 1700
(Assigned Numbers).
The PE and CE devices exchange routing information through standard
BGP, OSPF, IS-IS, RIP, or static routes. During this process, the PE
device must store the routes received from each CE device in the
corresponding VRF. Other operations are the same as those for common
route exchange. You can configure the same routing protocol for all
the CE devices. However, you must configure a separate routing
protocol instance for each VRF on a PE device. The instances do not
interfere with each other.

After PE1 receives an IPv4 route from CE1, PE1 adds the manually
configured RD of the VRF to the route to change the IPv4 route into
a VPNv4 route. PE1 then changes the Next_Hop attribute in the route
advertisement to its own loopback address and adds a VPN label
(allocated by MP-IBGP) to the route. After that, PE1 adds the
export route target attribute to the route and sends the route to all
its PE neighbors. In VRP5.3, after MPLS is enabled on PE1, PE1 uses
MP-BGP to allocate VPN labels to private network routes. PE devices
can then correctly exchange VPN routes.
When multiple CE devices in a VPN site connect to different PE
devices, VPN routes advertised from the CE devices to the PE devices
may be sent back to the VPN site after the routes traverse the
backbone network. This may cause routing loops in the VPN site. The
Site of Origin (SoO) attribute identifies the source site of a route
and prevents such routing loops.
After PE2 receives a VPNv4 route advertised by PE1, PE2 converts the
VPNv4 route into an IPv4 route and adds the IPv4 route to the
corresponding VRF by matching the route target attribute of the route
against the import targets of its VPN instances.
The VPN label of the route is retained for packet forwarding. PE2
advertises the IPv4 route to the corresponding CE device through the
routing protocol running between the PE and CE devices. The next hop
of the route is the IP address of PE2's interface.

Data exchanged between VPN sites is forwarded through the
MPLS backbone network based on MPLS labels. The process for
allocating public network labels (outer labels) is as follows:
The PE and P routers learn BGP next hop IP addresses using an IGP,
assign outer labels using LDP, and establish LSPs. A label stack is
used for packet forwarding. The outer label directs packets to the BGP
next hop. The inner label indicates the outbound interface for the
packet or the VPN instance to which the packet belongs. MPLS
forwarding on the backbone is based only on outer labels and is
independent of the inner labels.

CE2 sends an IP packet destined for CE1. After receiving the packet,
PE2 pushes the inner label 15362 and then the outer label 1024 onto
the packet and forwards the packet to the P device. After receiving
the packet, the penultimate hop P pops the outer label, retains the
inner label, and forwards the packet to PE1 based on the outer label.
PE1 determines the VPN site to which the packet belongs based on the
inner label, removes the inner label, and forwards the packet to CE1.
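
The forwarding steps above can be traced with a small Python sketch. It models a packet as a dictionary with a label stack (top of stack first). The label values 15362 and 1024 come from the example above, while the function names and the mapping from inner label to VPN instance are assumptions made for illustration.

```python
INNER_LABEL = 15362  # VPN label from the example above
OUTER_LABEL = 1024   # public network label from the example above


def pe2_ingress(ip_packet):
    """PE2 pushes the inner label first, then the outer label on top."""
    return {"labels": [OUTER_LABEL, INNER_LABEL], "payload": ip_packet}


def p_penultimate_hop(frame):
    """The penultimate hop P pops the outer label and keeps the inner label."""
    return {"labels": frame["labels"][1:], "payload": frame["payload"]}


def pe1_egress(frame, inner_label_to_vpn):
    """PE1 selects the VPN by the inner label and strips the label."""
    vpn = inner_label_to_vpn[frame["labels"][0]]
    return vpn, frame["payload"]


frame = pe2_ingress("IP packet from CE2 to CE1")
after_p = p_penultimate_hop(frame)
# The mapping below is hypothetical: it stands for PE1's own label table.
vpn, payload = pe1_egress(after_p, {INNER_LABEL: "vpn_a"})
```

Only the outer label is examined on the backbone, so P never needs the mapping that PE1 uses for the inner label.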
Case description
follows:
If RTX interconnects with RTY, the interface addresses are
XY.1.1.X and XY.1.1.Y, with a 24-bit mask.
Assume that PE1 is RT1, PE2 is RT2, and P is RT3.

Command usage
ip binding vpn-instance: binds the current AC interface to a
specified VPN instance.
ipv4-family: enters the IPv4 address family view of BGP.
Precautions
After an interface is bound to or unbound from a VPN instance,
Layer 3 configurations such as the IP address and routing protocol
are deleted from the interface. If such configurations are required,
you need to configure them again.
Case description
follows:
Command usage

Precautions
Specify a VPN instance for each RIP process on the PE
device.
Case description
follows:
Command usage

Precautions
Specify a VPN instance for each IS-IS process on the PE
device.
Deleting a VPN instance or disabling a VPN instance IPv4
address family will delete all the IS-IS processes bound to the
VPN instance or the VPN instance IPv4 address family on the
PE.
Case description
follows:
Command usage

Precautions
Specify a VPN instance for each OSPF process on the PE
device.
Deleting a VPN instance or disabling a VPN instance IPv4
address family will delete all the OSPF processes bound to the
VPN instance or the VPN instance IPv4 address family on the
PE.
Case description
follows:
Command usage
peer substitute-as: replaces the AS number of the specified peer
in the AS_Path attribute with the local AS number.
Precautions
VPN sites that use the same AS number or different private AS
numbers can communicate over the BGP/MPLS IP VPN backbone
network. If sites in the same VPN use the same AS number and a
local CE device establishes an EBGP peer relationship with a PE
device, run the peer substitute-as command on the PE device to
enable AS number substitution. If AS number substitution is
disabled, the local CE device discards VPN routes that carry its
own AS number. As a result, VPN users cannot communicate with each
other.
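
The effect of AS number substitution can be sketched in Python. The AS numbers 100 (SP) and 65001 (both customer sites) and the function names are assumptions for illustration. The sketch models only the EBGP loop check on the CE (a route is rejected if its AS_Path already contains the local AS number) and the substitution performed by the PE before it prepends its own AS number.

```python
def advertise_to_ce(as_path, ce_as, pe_as, substitute):
    """Return the AS_Path that the PE sends to the CE over EBGP."""
    if substitute:
        # Replace every occurrence of the CE's AS number with the PE's AS number.
        as_path = [pe_as if asn == ce_as else asn for asn in as_path]
    return [pe_as] + as_path  # the PE prepends its own AS number


def ce_accepts(as_path, ce_as):
    """EBGP loop prevention: reject paths that contain the local AS number."""
    return ce_as not in as_path


SP_AS, SITE_AS = 100, 65001     # hypothetical AS numbers
remote_route_path = [SITE_AS]   # route originated by the remote site

without_sub = advertise_to_ce(remote_route_path, SITE_AS, SP_AS, substitute=False)
with_sub = advertise_to_ce(remote_route_path, SITE_AS, SP_AS, substitute=True)
```

In this model the CE rejects the path without substitution because it still contains 65001, and accepts the substituted path.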
To improve the HA of a device, increase MTBF and reduce MTTR.
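
The relationship between these two metrics and availability can be made concrete with the commonly used formula availability = MTBF / (MTBF + MTTR). The Python sketch below applies it; the numeric values are illustrative assumptions, not figures from this course.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Illustrative values only.
baseline = availability(mtbf_hours=1000, mttr_hours=10)
longer_mtbf = availability(mtbf_hours=5000, mttr_hours=10)   # fails less often
shorter_mttr = availability(mtbf_hours=1000, mttr_hours=1)   # recovers faster
```

Both changes raise the result, which is why the text recommends increasing MTBF and reducing MTTR.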

Concepts
Two network devices establish a BFD session to detect the
bidirectional forwarding path between them and to serve upper-layer
applications. BFD does not provide a neighbor discovery mechanism.
Instead, BFD obtains neighbor information from the upper-layer
applications that it serves. After the BFD session is established, the
local device periodically sends BFD packets. If the local device does
not receive a response from the peer device within the detection time,
it considers the forwarding path faulty. BFD then notifies the
upper-layer application for processing.
BFD control messages are encapsulated in UDP packets. The
destination port number is 3784, and the source port number is a
value in the range 49152 to 65535.
BFD session establishment process
OSPF discovers neighbors using the Hello mechanism and sets up
neighbor relationships.
After setting up a neighbor relationship, OSPF passes the neighbor
information (including destination and source addresses) to BFD.
BFD sets up a session by using the received neighbor information.
After the BFD session is set up, BFD starts to detect link faults and
responds to them rapidly.
BFD fault detection process
A link fault occurs.
BFD detects the link fault and changes the BFD session status to
Down.
BFD notifies the local OSPF process that the BFD peer is unreachable.
The local OSPF process tears down the neighbor relationship with the
OSPF neighbor.
A BFD session has the following states: Down, Init, Up, and AdminDown.
Down: indicates that a BFD session is in the Down state or has just
been set up.
Init: indicates that the local system can communicate with the peer
system, and the local system expects the session to go Up.
Up: indicates that a session is established successfully.
AdminDown: indicates that a session has been shut down
administratively.
BFD session status transition:
R1 and R2 each start a BFD state machine. The initial state of the
BFD state machine is Down. R1 and R2 send BFD control messages
with the State field set to Down.
After receiving the BFD message with the State field set to Down from
R1, R2 switches the session status to Init and sends a BFD message
with the State field set to Init.
After the local BFD session status of R2 changes to Init, R2 no longer
processes received BFD messages with the State field set to Down.
The BFD session status change on R1 is the same as that on R2.
After receiving the BFD message with the State field set to Init, R2
changes the local BFD session status to Up.
The BFD session status change on R1 is the same as that on R2.
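
The transitions described above can be written as a small Python state function. This is a sketch that covers only the transitions named in the text (Down to Init on receiving Down, Init ignoring Down, Init to Up on receiving Init); it is not a complete BFD state machine, and the function names are assumptions.

```python
def next_state(local_state, received_state):
    """Return the new local session state after receiving a BFD control message."""
    if local_state == "Down" and received_state == "Down":
        return "Init"
    if local_state == "Init" and received_state == "Init":
        return "Up"
    # All other combinations, including Init receiving Down, leave the state unchanged here.
    return local_state


def run_handshake():
    """Simulate R1 and R2 exchanging their current state until both are Up."""
    r1 = r2 = "Down"
    history = [(r1, r2)]
    while (r1, r2) != ("Up", "Up"):
        # Each side processes the state the other side sent in this round.
        r1, r2 = next_state(r1, r2), next_state(r2, r1)
        history.append((r1, r2))
    return history


history = run_handshake()
```

With both sides starting in Down, the simulated session passes through Init on both routers and then reaches Up, matching the sequence in the text.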

Common Commands
Single-hop detection and multi-hop detection
The bfd command enables BFD globally and enters the BFD view.
The bfd bind peer-ip command creates a BFD binding and
establishes a BFD session.
The discriminator command sets the local and remote
discriminators for the current BFD session.
The commit command commits the configuration of a BFD session.
Association between BFD and interface status
The bfd command enables BFD globally and enters the BFD view.
The bfd bind peer-ip default-ip command binds the BFD session
to the physical status of a link.
The discriminator command sets the local and remote
discriminators for the current BFD session.
The process-interface-status command associates
the status of the current BFD session with the status of
the interface to which the session is bound.
The configuration is similar to the configuration of BFD and route
association, and is omitted here.
When a router restarts because of a fault, its neighbors at the routing
protocol layer detect that their neighbor relationships go Down and
then become Up again after a period of time. This is the flapping of
neighbor relationships. The flapping of neighbor relationships causes
route flapping, which leads to black hole routes on the restarted
router or causes data services from the neighbors to bypass the
restarted router. This decreases network reliability.
Non-stop forwarding (NSF) is introduced to address the route flapping
issue. The following requirements must be met:
Hardware: Two control boards (redundant route processors) must be
installed. One is the active board and the other is the standby
board. If the active board restarts, the standby board becomes the
active one. A distributed structure is used. That is, data forwarding
and control are separated, and LPUs are responsible for data
forwarding.
System software: When the active control board is running, it
synchronizes configuration and interface state information to the
standby control board. When an active/standby switchover occurs,
LPUs do not reset or withdraw forwarding entries, and the interfaces
remain Up.
Protocols: Graceful restart (GR) must be supported for related
network protocols, such as the routing protocols OSPF, IS-IS, and BGP,
and other protocols such as Label Distribution Protocol (LDP) and
Resource Reservation Protocol (RSVP).
Graceful Restart (GR) is a mechanism that ensures nonstop service
data forwarding during an active/standby switchover or a protocol
restart. When a device performs a protocol restart, it notifies
neighboring devices of its restart so that the neighbor relationships
and routes are kept stable for a certain period. After the protocol
restart is complete, the neighboring devices synchronize information
(including the topologies, routes, and sessions maintained by the
GR-related protocols) to the GR Restarter, so that the information on
the GR Restarter is quickly restored. During the protocol restart,
route flapping does not occur and the packet forwarding path does not
change. The entire system keeps working.
OSPF GR terms:
GR Restarter: indicates the GR-capable device where the protocol
restart occurs.
GR Helper: indicates a device that neighbors the GR Restarter and
helps complete the GR process.
GR Session: indicates the process of GR capability negotiation
performed during OSPF neighbor relationship establishment. The
negotiated content includes whether the two parties have the GR
capability. If the GR capability negotiation is successful, the GR
process starts when the protocol restart occurs.
Assume that R1 and R2 have a stable OSPF neighbor relationship and
GR capability is enabled on R1 and R2. When R1 restarts, the GR
process is as follows:
After R1 restarts, it sends a Grace LSA to R2.
When R2 receives the Grace LSA sent by R1, it maintains the
neighbor relationship with R1.
R1 and R2 exchange Hello and DD packets and synchronize their LSDBs
with each other. LSAs are not generated during GR; therefore, if R1
receives its own LSAs from R2 during LSDB synchronization, it stores
them and adds the Stale tag.
After LSDB synchronization is complete, R1 sends a Grace LSA to
notify R2 that GR is finished. R1 starts the OSPF process and
regenerates LSAs, and then deletes the LSAs that are tagged Stale
and have not been regenerated.
After restoring all routing entries, R1 recalculates routes and
updates the FIB table.
OSPF GR commands:
The opaque-capability enable command enables the Opaque-LSA
capability. After the Opaque-LSA capability is enabled, an OSPF
process can generate Opaque-LSAs and receive Opaque-LSAs from
neighboring devices.
The graceful-restart command enables OSPF GR.
IS-IS GR also uses the concepts of GR Restarter, GR Helper, and GR
Session, which are the same as those used in OSPF GR.
To support GR, IS-IS adds the Restart TLV to Hello packets and
defines three timers.
T1 is similar to the IIH timer used in IS-IS. When a device restarts,
it creates a T1 timer on each interface and periodically sends Hello
packets. The T1 timer on an interface is deleted only when the
interface has received the Hello acknowledgment and all CSNP packets.
T2 defines the timeout period of LSDB synchronization after a device
restarts. The T2 timer of a level is deleted when the LSDB of that
level completes synchronization. If LSDB synchronization is not
complete when the T2 timer expires, the T2 timer is deleted and GR
fails.
T3 defines the maximum time during which the GR Restarter
performs GR. If LSDB synchronization is not complete when the T3
timer expires, the T3 timer is deleted and GR fails.
Assume that R1 and R2 have a stable IS-IS neighbor relationship and
GR capability is enabled on R1 and R2. When R1 restarts, the GR
process is as follows:
The T2 and T3 timers start when IS-IS is globally enabled on R1
again. When an interface of R1 goes Up again and IS-IS is enabled on
it, the T1 timer starts on the interface and the interface sends a
Hello packet.
When R2 receives the Hello packet from R1, it maintains the neighbor
relationship with R1 and sends a Hello packet. Then R2 sends a CSNP
packet and LSP packets to R1 to help with LSDB synchronization.
When the interface of R1 has received the Hello packet and all CSNP
packets, R1 deletes the T1 timer; otherwise, R1 periodically sends
Hello packets until it has received them. If the number of times the
T1 timer expires reaches the maximum value, the T1 timer is also
deleted.
When LSDB synchronization is complete, R1 deletes the T2 timer.
After all T2 timers are deleted, R1 deletes the T3 timer. When
GR is complete, R1 starts the IS-IS process. The IIH timer is started
on all interfaces, and R1 then periodically sends Hello packets.
After restoring all routing entries, R1 recalculates routes and
updates the FIB table.
IS-IS GR command:
The graceful-restart command enables IS-IS GR.
LAND attack
A LAND attack exploits the TCP three-way handshake. The attacker
sends SYN packets in which the source address and port are the same
as the destination address and port, both being those of the target
host. After receiving such a SYN packet, the target host creates a
null TCP connection whose source and destination addresses are both
its own address. The connection is kept until it expires. The
target host creates many null TCP connections, which wastes resources
or causes the device to break down.
After defense against malformed packet attacks is enabled, the
device checks source and destination addresses in TCP SYN packets
to prevent LAND attacks. The device considers TCP SYN packets with
the same source and destination addresses as malformed packets and
discards them.
Commands for configuring defense against malformed packet attacks
The anti-attack abnormal enable command enables defense
against malformed packets. After the command is executed, the device
discards malformed packets.
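
The check described above can be sketched in a few lines of Python. The function name and the packet representation (plain arguments) are assumptions for illustration. The sketch only models the rule stated in the text: a TCP SYN packet whose source and destination addresses are the same is treated as malformed.

```python
def is_land_packet(src_ip, dst_ip, syn_flag):
    """Return True for a TCP SYN packet whose source and destination addresses match."""
    return syn_flag and src_ip == dst_ip


# Illustrative packets: (source address, destination address, SYN flag set).
packets = [
    ("10.1.1.1", "10.1.1.1", True),   # LAND-style packet, discarded
    ("10.1.1.2", "10.1.1.1", True),   # ordinary connection request, kept
    ("10.1.1.1", "10.1.1.1", False),  # not a SYN packet, not covered by this rule
]

forwarded = [p for p in packets if not is_land_packet(*p)]
```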

TCP SYN attack
The TCP SYN attack exploits the TCP three-way handshake. During the
three-way handshake, when the server receives the initial SYN packet
from the client, it sends back a SYN+ACK packet. While the server
waits for the final ACK packet from the client, the connection stays
in the half-open state. If the server fails to receive the ACK
packet, it resends the SYN+ACK packet to the client. If the server
still does not receive an ACK packet, it closes the connection and
updates the session status in memory. The interval from the sending
of the initial SYN+ACK packet to the closing of the connection is
about 30 seconds.
During this interval, the attacker may send hundreds of thousands
of SYN packets to the open ports and never respond to the
SYN+ACK packets from the server. The memory of the server becomes
overloaded and the server cannot accept new connection requests. As a
result, the server closes all active connections.
After defense against TCP SYN flood attacks is enabled, the device
limits the rate of TCP SYN packets so that system resources are not
exhausted by attacks.
Commands for configuring defense against TCP SYN flood attacks
The anti-attack tcp-syn enable command enables TCP SYN flood
attack defense.
The anti-attack tcp-syn car command configures the rate limit for
TCP SYN packets. If the rate of received TCP SYN packets
exceeds the limit, the device discards the excess packets to ensure
that the CPU keeps working properly.
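
The idea of rate-limiting SYN packets can be sketched with a token bucket in Python. This is a generic illustration, not the device's CAR implementation: the class name, the packets-per-second unit, and the burst size are assumptions. Timestamps are passed in explicitly so that the behavior is deterministic.

```python
class SynRateLimiter:
    """Token bucket: admits at most `rate_pps` SYN packets per second on average."""

    def __init__(self, rate_pps, burst):
        self.rate = rate_pps
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous packet, in seconds

    def allow(self, now):
        """Return True if a SYN packet arriving at time `now` is admitted."""
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # excess packet, discarded


limiter = SynRateLimiter(rate_pps=2, burst=2)
burst_results = [limiter.allow(0.0) for _ in range(5)]  # five SYNs at the same instant
later_result = limiter.allow(1.0)                       # one SYN a second later
```

Packets beyond the configured rate are dropped before they consume connection state, which is the protection the text describes.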
Two modes of URPF:
Strict mode
In this mode, a packet passes the check only when
the forwarding table contains an entry for the source
address of the packet and the outbound interface of that
entry matches the inbound interface of the packet.
If route symmetry is ensured, you are advised to use
the URPF strict check. For example, if there is only one
path between two network edge devices, URPF strict
check can be used to ensure network security.
Loose mode
In this mode, a packet passes the check as long as the
source IP address of the packet matches an entry
in the routing table.
If route symmetry is not ensured, you are advised to
use the URPF loose check. For example, if there are
multiple paths between two network edge devices,
URPF loose check can be used to ensure network
security.
A bogus packet with source IP address 2.1.1.1 is sent by the attacker
to S1. After receiving the bogus packet, S1 sends a response packet to
the destination device at 2.1.1.1. In this situation, both S1 and PC1
are attacked by the bogus packets. If URPF is enabled on S1, when S1
receives the bogus packet with source IP address 2.1.1.1, URPF
discards the packet because the interface corresponding to the source
address of the packet does not match the interface receiving the
packet.
URPF command
The urpf command enables URPF on an interface and sets the URPF
mode.
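
The two URPF modes can be compared with a short Python sketch that uses the standard ipaddress module. The forwarding table contents, the interface names, and the function names are assumptions for illustration; only the address 2.1.1.1 comes from the example above.

```python
import ipaddress

# Hypothetical forwarding table: (prefix, outbound interface).
FIB = [
    (ipaddress.ip_network("10.1.1.0/24"), "GE0/0/1"),
    (ipaddress.ip_network("2.1.1.0/24"), "GE0/0/2"),
]


def lookup(src):
    """Longest-prefix match on the source address; return the interface or None."""
    addr = ipaddress.ip_address(src)
    matches = [(net, ifc) for net, ifc in FIB if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def urpf_pass(src, in_interface, mode):
    """Return True if the packet passes the URPF check in 'strict' or 'loose' mode."""
    out_interface = lookup(src)
    if out_interface is None:
        return False          # no entry for the source: fails in both modes
    if mode == "loose":
        return True           # any matching entry is enough
    return out_interface == in_interface  # strict: interfaces must match


# A bogus packet claiming source 2.1.1.1 arrives on GE0/0/1.
strict_result = urpf_pass("2.1.1.1", "GE0/0/1", "strict")
loose_result = urpf_pass("2.1.1.1", "GE0/0/1", "loose")
```

In this model the spoofed packet is dropped only in strict mode, which is why strict mode is preferred when routes are symmetric.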
IPSG principles
IPSG matches IP packets against a static binding table or a dynamic
DHCP binding table. Before a network device forwards an IP packet, it
compares the source IP address, source MAC address, interface, and
VLAN information of the IP packet with entries in the binding table.
If a matching entry is found, the device considers the IP packet valid
and forwards it. Otherwise, the device considers the IP packet an
attack packet and discards it.
Working process
After IPSG is configured on S1, S1 checks incoming IP packets
against the binding table. If the packet information matches the
binding table, the packets are forwarded; otherwise, the packets are
discarded.
IPSG commands
The binding table can be generated through DHCP or configured
manually for static IP addresses (the user-bind static command
is used to configure static entries).
The ip source check user-bind enable command enables the IPSG
function on an interface to check the received IP packets.
The ip source check user-bind check-item command configures
VLAN- or interface-based IP packet check items. This command is
valid only for the dynamic binding table.
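
The binding table check can be sketched in Python as a set lookup. The entry values (IP address, MAC address, interface, VLAN) and the function name are assumptions for illustration, and the sketch checks all four items, whereas a real device can be configured to check a subset.

```python
# Hypothetical binding table: (source IP, source MAC, interface, VLAN).
BINDING_TABLE = {
    ("10.0.0.2", "00e0-fc12-3456", "GE0/0/1", 10),
    ("10.0.0.3", "00e0-fc12-789a", "GE0/0/2", 10),
}


def ipsg_permit(src_ip, src_mac, interface, vlan):
    """Forward the packet only if all four items match one binding entry."""
    return (src_ip, src_mac, interface, vlan) in BINDING_TABLE


valid = ipsg_permit("10.0.0.2", "00e0-fc12-3456", "GE0/0/1", 10)
# Same host address, but sent from a different MAC address: treated as an attack packet.
spoofed = ipsg_permit("10.0.0.2", "00e0-fc99-0000", "GE0/0/1", 10)
```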

The figure shows a scenario of the MITM attack. The attacker sends a
ht
bogus ARP packet using the PC3's address as the source address to
PC1. PC1 records incorrect address mapping relationship of PC3 in the
ARP table. The attacker thus obtains the data sent by PC1 to PC3 and
s:
sent by PC3 to PC1. Therefore, information between PC1 and PC3

ce
leaks.
To prevent MITM attacks, configure DAI on S1.
When an attacker connects to S1 and attempts to send bogus ARP
packets to S1, S1 detects the attack according to the DHCP snooping
binding table and discards the ARP packets. If the ARP discarding
alarm is enabled on S1, S1 sends an alarm to notify the administrator
when the number of discarded ARP packets exceeds the alarm threshold.
DAI uses the DHCP snooping binding table to defend against MITM
attacks. Before a device forwards an ARP packet, it compares the
source IP address, source MAC address, interface, and VLAN information
in the ARP packet with entries in the binding table. If an entry is
matched, the device considers the packet valid and forwards it;
otherwise, the device considers the packet an attack packet and
discards it.
DAI command
The arp anti-attack check user-bind enable command enables DAI
on an interface or in a VLAN. That is, the device checks ARP packets
against the binding table.
QoS provides differentiated service quality for different
applications, for example, dedicated bandwidth, a lower packet loss
ratio, and shorter packet transmission delay and jitter.
Best-effort service model

Routers and switches are packet switching devices. They select a
transmission path for each packet based on TCP/IP and use statistical
multiplexing rather than dedicated connections such as TDM.
Traditionally, IP provides only one service model (Best-Effort). In
this model, all packets transmitted on a network have the same
priority. Best-Effort means that the IP network does its best to
deliver all packets to the correct destination addresses completely,
without discarding, damaging, duplicating, or reordering them.
However, the Best-Effort model does not guarantee any transmission
indicators, such as delay and jitter.
Strictly speaking, Best-Effort is not a QoS technology, but it is the
major service model used on today's Internet, so it is worth
understanding.
The Internet has achieved a great deal with the Best-Effort model.
However, as the Internet has developed, the Best-Effort model can no
longer meet the increasing requirements of emerging applications.
Therefore, SPs have to provide more types of service on top of the
Best-Effort model to meet the requirements of each application.
IntServ model
The IntServ model, developed by IETF in 1993, supports various types
of service on IP networks. It provides both real-time service and
best-effort service on IP networks. The IntServ model reserves
resources for each information flow. The source and destination hosts
exchange RSVP messages to establish packet categories and forwarding
status on each node along the transmission path. The model maintains
a forwarding state for each flow, so it scales poorly. There are
millions of flows on the Internet, and keeping state for them consumes
a large amount of device resources. Therefore, this model is not
widely used. In recent years, IETF has modified the RSVP protocol so
that RSVP can be used together with the DiffServ model, especially in
the MPLS VPN field, which has given RSVP a new role. However, the
IntServ model is still not widely used. The DiffServ model addresses
the problems in the IntServ model, so the DiffServ model is a widely
used QoS technology.
DiffServ model
The IntServ model scales poorly. After 1995, SPs and research
organizations developed a new, highly scalable mechanism that
supports various services. In 1997, IETF recognized that the service
model in use was not suitable for network operation, and that there
should be a way to classify information flows and provide
differentiated service for users and applications. Therefore, IETF
developed the DiffServ model, which classifies flows on the Internet
and provides differentiated service for them. The DiffServ model
supports various applications and is applicable to many business
models.
Precedence field
The 8-bit Type of Service (ToS) field in an IP packet header
contains a 3-bit IP precedence field.
Bits 0 to 2 constitute the Precedence field, representing
precedence values 7, 6, 5, 4, 3, 2, 1, and 0 in descending order
of priority. The highest priorities (values 7 and 6) are reserved
for routing and network control communication updates. User-level
applications can use only priority values 0 to 5. Bits 6 and 7 are
reserved.
Apart from the Precedence field, the ToS field also contains the
D, T, and R sub-fields:
Bit D indicates the delay. The value 0 represents a normal delay and
the value 1 represents a short delay.
Bit T indicates the throughput. The value 0 represents normal
throughput and the value 1 represents high throughput.
Bit R indicates the reliability. The value 0 represents normal
reliability and the value 1 represents high reliability.
DSCP field
RFC 2474 redefines the ToS field. The left-most 6 bits (bits 0 to 5)
form the DSCP and identify the service type, and the right-most 2 bits
(bits 6 and 7) are reserved. DSCP can classify traffic into 64
categories.
Each DSCP value matches a Behavior Aggregate (BA) and each BA matches
a PHB (such as forward and discard). The PHB is then implemented
using QoS mechanisms (such as traffic policing and queuing
technologies).
A DiffServ network defines four types of PHB: Expedited Forwarding
(EF), Assured Forwarding (AF), Class Selector (CS), and Default PHB
(BE PHB). EF PHB is applicable to services that have high
requirements on delay, packet loss, jitter, and bandwidth. AF PHBs
are classified into four categories, and each AF PHB category has
three discard priorities to classify services more precisely. The
performance of AF PHB is lower than the performance of EF PHB. CS
PHBs originate from the IP ToS field and are classified into 8
categories. BE PHB is a special type of CS PHB and does not provide
any guarantee. Traffic on IP networks belongs to this category by
default.
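The field layout above can be checked with a short Python sketch. The function names parse_tos and phb_name are hypothetical, and the codepoint values (EF = 46, AF classes 1 to 4 with drop precedences 1 to 3, CS values as multiples of 8) come from the DiffServ RFCs rather than from this text. Bit 0 is taken to be the most significant bit of the byte, as in the description above.

```python
def parse_tos(tos: int) -> dict:
    """Split an 8-bit ToS/DS byte into the fields described above.
    Bit 0 is the most significant bit."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS must fit in one byte")
    return {
        "precedence": tos >> 5,          # bits 0-2
        "delay": (tos >> 4) & 1,         # bit 3 (D)
        "throughput": (tos >> 3) & 1,    # bit 4 (T)
        "reliability": (tos >> 2) & 1,   # bit 5 (R)
        "dscp": tos >> 2,                # bits 0-5 under RFC 2474
    }


def phb_name(dscp: int) -> str:
    """Map a 6-bit DSCP value to a PHB name (standard codepoints only)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    if dscp == 46:
        return "EF"
    if dscp == 0:
        return "BE"
    cls = dscp >> 3                      # upper 3 bits: class
    drop = (dscp >> 1) & 0b11            # next 2 bits: drop precedence
    if dscp & 0b111 == 0:
        return f"CS{cls}"
    if 1 <= cls <= 4 and dscp & 1 == 0 and 1 <= drop <= 3:
        return f"AF{cls}{drop}"
    return f"DSCP{dscp}"                 # no standard name


fields = parse_tos(0xB8)                 # 1011 1000
print(fields["precedence"], fields["dscp"], phb_name(fields["dscp"]))
```

Because the CS codepoints keep the lower three DSCP bits at zero, a device that reads only the old 3-bit precedence still sees the same value, which is why CS PHBs are said to originate from the IP ToS field.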
Priority mapping configuration
Configure the trusted packet priorities: Run the trust command
to specify the packet priority to be mapped.
Configure the priority mapping table: Run the qos map-table
command to enter the 802.1p or DSCP mapping table view,
and run the input command to set the priority mappings.
Token bucket
A token bucket with a certain capacity stores tokens. The system
places tokens into the token bucket at the configured rate. When the
token bucket is full, excess tokens overflow and no more tokens are
added.
The token bucket evaluates packets according to the number of tokens
it holds. If there are sufficient tokens in the token bucket to
forward a packet, the traffic rate is within the rate limit.
Otherwise, the traffic rate exceeds the rate limit.
Single-rate-single-bucket
The token bucket is called bucket C. Tc indicates the number of
tokens in the bucket. Single-rate-single-bucket has two parameters:
Committed Information Rate (CIR): indicates the rate at which tokens
are put into bucket C, that is, the average traffic rate permitted by
bucket C.
Committed Burst Size (CBS): indicates the capacity of bucket C, that
is, the maximum volume of burst traffic allowed by bucket C each time.
The system places tokens into the bucket at the CIR. If Tc is
smaller than the CBS, Tc increases; otherwise, Tc does not increase.
B indicates the size of an arriving packet:
If B is smaller than or equal to Tc, the packet is colored green, and
Tc decreases by B.
If B is greater than Tc, the packet is colored red, and Tc remains
unchanged.
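The single-rate-single-bucket rules above can be sketched as follows. The class name SingleBucket, the byte-based units, and the choice to start with a full bucket are assumptions made for illustration; the text does not specify them.

```python
class SingleBucket:
    """Single-rate-single-bucket meter (illustrative sketch)."""

    def __init__(self, cir: float, cbs: float):
        self.cir = cir      # token rate, bytes per second (assumed unit)
        self.cbs = cbs      # bucket capacity in bytes
        self.tc = cbs       # assumption: the bucket starts full

    def refill(self, seconds: float) -> None:
        """Add tokens at the CIR; tokens beyond the CBS overflow."""
        self.tc = min(self.cbs, self.tc + self.cir * seconds)

    def color(self, size: int) -> str:
        """Color a packet of `size` bytes (B in the text)."""
        if size <= self.tc:
            self.tc -= size
            return "green"
        return "red"        # Tc remains unchanged


bucket = SingleBucket(cir=1000, cbs=1500)
print(bucket.color(1000))   # enough tokens
print(bucket.color(1000))   # only 500 tokens left
```

After half a second at a CIR of 1000 bytes per second, 500 more tokens arrive, so a third 1000-byte packet would again be colored green.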
Single-Rate-Double-Bucket
Two token buckets are available: bucket C and bucket E. Tc and Te
indicate the number of tokens in each bucket. Single-rate-double-bucket
has three parameters:
Committed Information Rate (CIR): indicates the rate at which tokens
are put into bucket C, that is, the average traffic rate permitted by
bucket C.
Committed Burst Size (CBS): indicates the capacity of bucket C, that
is, the maximum volume of burst traffic allowed by bucket C each time.
Excess Burst Size (EBS): indicates the capacity of bucket E, that is,
the maximum volume of excess burst traffic allowed by bucket E each
time.
The system places tokens into the buckets at the CIR:
If Tc is smaller than the CBS, Tc increases.
If Tc is equal to the CBS and Te is smaller than the EBS, Te
increases.
If Tc is equal to the CBS and Te is equal to the EBS, Tc and Te do
not increase.
B indicates the size of an arriving packet:
If B is smaller than or equal to Tc, the packet is colored green, and
Tc decreases by B.
If B is greater than Tc and smaller than or equal to Te, the packet
is colored yellow and Te decreases by B.
If B is greater than Te, the packet is colored red, and Tc and Te
remain unchanged.
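A minimal sketch of the single-rate-double-bucket rules follows. The class name SrTcm, the byte units, and the full initial buckets are assumptions. The refill method also assumes that tokens arriving in one interval first fill bucket C and that only the remainder goes to bucket E, which is one way to discretize the rules above.

```python
class SrTcm:
    """Single-rate-double-bucket meter (illustrative sketch)."""

    def __init__(self, cir: float, cbs: float, ebs: float):
        self.cir, self.cbs, self.ebs = cir, cbs, ebs
        self.tc, self.te = cbs, ebs     # assumption: both buckets start full

    def refill(self, seconds: float) -> None:
        """Fill bucket C first; tokens that do not fit go to bucket E."""
        tokens = self.cir * seconds
        to_c = min(tokens, self.cbs - self.tc)
        self.tc += to_c
        self.te = min(self.ebs, self.te + (tokens - to_c))

    def color(self, size: int) -> str:
        """Color a packet of `size` bytes (B in the text)."""
        if size <= self.tc:
            self.tc -= size
            return "green"
        if size <= self.te:
            self.te -= size
            return "yellow"
        return "red"                    # Tc and Te remain unchanged


meter = SrTcm(cir=1000, cbs=1000, ebs=2000)
print([meter.color(800), meter.color(800), meter.color(1500)])
```

Bucket E only gains tokens while bucket C is full, so yellow capacity builds up during quiet periods and is spent on bursts that exceed the CBS.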
Double-Rate-Double-Bucket
Two token buckets are available: bucket P and bucket C. Tp and Tc
indicate the number of tokens in each bucket. Double-rate-double-bucket
has four parameters:
Peak Information Rate (PIR): indicates the rate at which tokens are
put into bucket P, that is, the maximum traffic rate permitted by
bucket P. The PIR must be greater than the CIR.
Committed Information Rate (CIR): indicates the rate at which tokens
are put into bucket C, that is, the average traffic rate permitted by
bucket C.
Peak Burst Size (PBS): indicates the capacity of bucket P, that is,
the maximum volume of burst traffic allowed by bucket P each time.
The PBS is greater than the CBS.
Committed Burst Size (CBS): indicates the capacity of bucket C, that
is, the maximum volume of burst traffic allowed by bucket C each time.
The system places tokens into bucket P at the rate of PIR and
places tokens into bucket C at the rate of CIR:
If Tp is smaller than the PBS, Tp increases. If Tp is greater than or
equal to the PBS, Tp remains unchanged.
If Tc is smaller than the CBS, Tc increases. If Tc is greater than or
equal to the CBS, Tc remains unchanged.
B indicates the size of an arriving packet:
If B is greater than Tp, the packet is colored red.
If B is greater than Tc and smaller than or equal to Tp, the packet
is colored yellow and Tp decreases by B.
If B is smaller than or equal to Tc, the packet is colored green, and
Tp and Tc decrease by B.
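The double-rate-double-bucket rules can be sketched in the same style. The class name TrTcm, the byte units, and the full initial buckets are assumptions for illustration.

```python
class TrTcm:
    """Double-rate-double-bucket meter (illustrative sketch)."""

    def __init__(self, cir: float, cbs: float, pir: float, pbs: float):
        if pir <= cir:
            raise ValueError("PIR must be greater than CIR")
        self.cir, self.cbs, self.pir, self.pbs = cir, cbs, pir, pbs
        self.tc, self.tp = cbs, pbs     # assumption: both buckets start full

    def refill(self, seconds: float) -> None:
        """Bucket P fills at the PIR and bucket C at the CIR, independently."""
        self.tp = min(self.pbs, self.tp + self.pir * seconds)
        self.tc = min(self.cbs, self.tc + self.cir * seconds)

    def color(self, size: int) -> str:
        """Color a packet of `size` bytes (B in the text)."""
        if size > self.tp:
            return "red"                # neither bucket changes
        if size > self.tc:
            self.tp -= size
            return "yellow"
        self.tp -= size
        self.tc -= size
        return "green"


meter = TrTcm(cir=1000, cbs=1000, pir=2000, pbs=3000)
print([meter.color(800), meter.color(800), meter.color(1500)])
```

Unlike the single-rate variant, the peak bucket is checked first and is charged for green packets as well, so the PIR bounds the combined green and yellow traffic.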
Traffic policing discards excess traffic to limit traffic within a
proper range and to protect network resources and enterprises'
interests.
Traffic policing consists of:
Meter: measures the network traffic using the token bucket mechanism
and sends the measurement result to the marker.
Marker: colors packets green, yellow, or red based on the measurement
result received from the meter.
Action: takes actions based on the packet coloring results received
from the marker (by default, green and yellow packets are forwarded
and red packets are discarded). The following actions are defined:
Pass: forwards the packets that meet network requirements.
Remark + pass: changes the local priorities of packets and forwards
them.
Discard: discards the packets that do not meet network requirements.
If the rate of a type of traffic exceeds the threshold, the device
either lowers the packet priority and then forwards the packets, or
directly discards them. By default, these packets are discarded.

Traffic policing commands:
Configure interface-based traffic policing: Run the qos car command
to create a QoS CAR profile and configure QoS CAR parameters. The
parameters in the command vary depending on whether the command is
executed on a WAN interface or a LAN interface.
Configure rate limiting on a WAN interface: Run the qos lr command to
set the ratio of the packet rate sent by a physical interface to the
total interface bandwidth.
Traffic shaping buffers excess traffic and sends it later at an even
rate, instead of discarding it immediately as traffic policing does.
Traffic shaping process:
When packets arrive, the device classifies packets into different
types and places them into different queues.
If the queue that packets enter is not configured with traffic
shaping, the packets are immediately sent. Packets requiring queuing
proceed to the next step.
The system places tokens into the bucket at the specified rate (CIR):
If there are sufficient tokens in the bucket, the device forwards the
packets and the number of tokens decreases.
If there are insufficient tokens in the bucket, the device places the
packets into the buffer queue. When the buffer queue is full, packets
are discarded.
When there are packets in the buffer queue, the system periodically
extracts packets from the queue and sends them. Each time the system
sends a packet, it compares the number of packets with the number of
tokens, and continues until the tokens are insufficient to send
packets or all the packets are sent.
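The shaping process above can be sketched as a token bucket combined with a buffer queue. The class name Shaper, the packet-count buffer limit, and the rule that a new packet waits behind already-buffered packets (to preserve order) are assumptions made for this illustration.

```python
from collections import deque


class Shaper:
    """Token-bucket traffic shaper with a bounded buffer (sketch)."""

    def __init__(self, cir: float, cbs: float, buffer_packets: int):
        self.cir, self.cbs = cir, cbs
        self.tokens = cbs                   # assumption: bucket starts full
        self.buffer = deque()
        self.buffer_packets = buffer_packets

    def enqueue(self, size: int) -> str:
        """Handle an arriving packet of `size` bytes."""
        # Send at once only if nothing is waiting and tokens suffice.
        if not self.buffer and size <= self.tokens:
            self.tokens -= size
            return "sent"
        if len(self.buffer) < self.buffer_packets:
            self.buffer.append(size)
            return "buffered"
        return "dropped"                    # buffer queue is full

    def tick(self, seconds: float) -> list:
        """Add tokens for `seconds`, then drain the buffer while tokens last."""
        self.tokens = min(self.cbs, self.tokens + self.cir * seconds)
        sent = []
        while self.buffer and self.buffer[0] <= self.tokens:
            size = self.buffer.popleft()
            self.tokens -= size
            sent.append(size)
        return sent


shaper = Shaper(cir=1000, cbs=1000, buffer_packets=2)
print([shaper.enqueue(s) for s in (600, 600, 300, 100)])
print(shaper.tick(0.5))
```

The difference from policing shows in the second and third packets: they exceed the available tokens but are delayed rather than colored red and discarded.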
Traffic shaping commands:
Configure interface-based traffic shaping: Run the qos gts command
to configure traffic shaping on the interface.
Configure queue-based traffic shaping.
Run the qos queue-profile queue-profile-name command to create a
queue profile and enter the queue profile view.
Run the queue { start-queue-index [ to end-queue-index ] } &<1-10>
length { bytes bytes-value | packets packets-value } command to set
the length of each queue.
Run the queue { start-queue-index [ to end-queue-index ] } &<1-10>
gts cir cir-value [ cbs cbs-value ] command to configure queue-based
traffic shaping. By default, traffic shaping is not performed for
queues.
Run the qos queue-profile queue-profile-name command to apply the
queue profile to an interface.
If the rate of incoming packets on an interface is higher than the
rate of outgoing packets, the interface is congested. If there is
insufficient space for storing the packets, some packets are
discarded. When packets are discarded, hosts or routers retransmit
the packets, leading to a vicious circle.
When congestion occurs, multiple packets compete for resources. The
packets that cannot obtain resources are discarded, so the bandwidth,
delay, and jitter of key services cannot be ensured. The core of
congestion management is to decide the resource scheduling policy
that specifies the packet forwarding sequence. Generally, devices use
queue technology to cope with congestion. Queue technology involves
queue creation, traffic classification, and queue scheduling.
Initially, there was only one queue scheduling policy, First-in-
First-out (FIFO). To meet different service requirements, more
scheduling policies have been developed.
Queue scheduling mechanisms include hardware queue scheduling and
software queue scheduling. The hardware queue is also called the
transmit queue (TxQ). The interface driver uses this queue when
transmitting packets one by one. The hardware queue is a FIFO queue.
The software queue schedules data packets to the hardware queue
according to QoS requirements. It can use multiple scheduling
methods.
Data packets enter the software queue only when the hardware queue
is full.
The hardware queue length depends on the bandwidth setting on the
interface. If the interface bandwidth is high, the transmission delay
is short, so the queue can be long. An appropriate hardware queue
length is important. If the hardware queue is too long, the policy
execution performance of the software queue degrades because the
hardware queue uses the FIFO mechanism for scheduling. If the
hardware queue is too short, scheduling efficiency and link use
efficiency are low, and the CPU usage is high.
LAN ports support the FQ and WRR queues.
WAN ports support the FQ and WFQ queues.
Configuration commands:
Run the qos queue-profile queue-profile-name command to create a
queue profile and enter the queue profile view.
On the WAN-side interface, run the schedule { { pq start-queue-index
[ to end-queue-index ] } | { wfq start-queue-index [ to
end-queue-index ] } } command to set a scheduling mode for each queue
on the WAN-side interface.
On the LAN-side interface, run the schedule { { pq start-queue-index
[ to end-queue-index ] } | { drr start-queue-index [ to
end-queue-index ] } | { wrr start-queue-index [ to end-queue-index ] } }
command to set a scheduling mode for each queue on the LAN-side
interface.
Run the qos queue-profile queue-profile-name command to apply the
queue profile to an interface.
FIFO characteristics:
Advantages:
Simple
Disadvantages:
Unfair, with no separation between flows. A large flow occupies the
bandwidth of other flows, which prolongs the delay of other flows.
When congestion occurs, FIFO discards some packets. When TCP detects
packet loss, it lowers its transmission speed to avoid congestion.
However, UDP does not lower its transmission speed because it is a
connectionless protocol. As a result, the TCP and UDP packets in FIFO
are not treated equally, and the TCP packet rate becomes too low.
A flow may occupy all the buffer space and block other types of
traffic.
RR
Advantages:
Different flows are separated, and bandwidth is equally allocated to
queues.
Bandwidth left unused by a queue is equally allocated to the other
queues.
Disadvantages:
Weights cannot be configured for the queues.
When queues have different packet lengths, scheduling is inaccurate.
When the scheduling rate is low, delay and jitter deteriorate. For
example, when a packet arrives at an empty queue that has just been
scheduled, this packet can be processed only after all the other
queues have been scheduled. In this situation, jitter is serious.
However, if the scheduling rate is high, the delay is short. The RR
mode is widely used on high-speed routers.
Compared with RR, WRR can set the weights of queues. During WRR
scheduling, the scheduling chance obtained by a queue is in direct
proportion to the weight of the queue. During WRR scheduling, an
empty queue is directly skipped. Therefore, when there is a small
volume of traffic in a queue, the remaining bandwidth of the queue is
used by the other queues according to a certain proportion.
Advantages:
Bandwidth is allocated based on weights, and the remaining bandwidth
of a queue is equally allocated to other queues. Low-priority queues
are also scheduled in a timely manner.
It is easy to implement.
Applicable to DiffServ ports.
Disadvantages:
Similar to RR, WRR is inaccurate when queues have different packet
lengths.
When the scheduling rate is low, packet delay is unstable and the
delay and jitter indicators cannot be lowered to the expected values.
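A packet-based WRR round can be sketched as follows. The function name wrr_schedule and the sample queues are hypothetical; the sketch assumes that a queue with weight w may send up to w packets per round and that empty queues are skipped.

```python
from collections import deque


def wrr_schedule(queues: dict, weights: dict) -> list:
    """Return the order in which packets leave under packet-based WRR.

    queues maps a queue name to its packets; weights maps a queue name
    to the number of packets that queue may send per round.
    """
    pending = {name: deque(packets) for name, packets in queues.items()}
    order = []
    while any(pending.values()):
        for name, queue in pending.items():
            for _ in range(weights[name]):
                if not queue:       # an empty queue is skipped at once
                    break
                order.append(queue.popleft())
    return order


queues = {"q1": ["a1", "a2", "a3"], "q2": ["b1", "b2", "b3"]}
weights = {"q1": 2, "q2": 1}
print(wrr_schedule(queues, weights))
```

Because the weights count packets and not bytes, a queue that carries larger packets receives more bandwidth than its weight suggests, which is the inaccuracy listed under the disadvantages.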

PQ
PQ has four queue levels: Top, Middle, Normal, and Bottom. However,
most devices support eight queue levels. Packets in queues with a low
priority can be scheduled only after all packets in queues with a
high priority have been scheduled. Therefore, PQ has obvious
advantages and disadvantages.
PQ ensures that the packets in high-priority queues obtain high
bandwidth and low delay and jitter. However, the packets in
low-priority queues cannot be scheduled in a timely manner, or cannot
be scheduled at all. As a result, the lower-priority queues starve.
PQ has the following characteristics:
Uses ACLs to classify packets into different types and adds packets
to the corresponding queues.
Packets are discarded only by using the Tail Drop mechanism.
When the queue length is set to 0, the queue length can be infinite.
That is, the packets entering this queue are not discarded by Tail
Drop unless the memory space is exhausted.
FIFO logic is used within each queue.
The packets in low-priority queues are scheduled only after all
packets in high-priority queues are scheduled.
PQ ensures high quality for specified service traffic, but does not
care about the quality of other services.
Advantages:
Precisely controls the delay of high-priority queues.
Easy to implement, and differentiates services.
Disadvantages:
Cannot allocate bandwidth as required. When high-priority queues have
many packets, the packets in low-priority queues cannot be scheduled.
It shortens the delay of high-priority queues by compromising the
service quality of low-priority queues.
If a high-priority queue transmits TCP packets and a low-priority
queue transmits UDP packets, the TCP packets are transmitted at a
high speed, while UDP packets cannot obtain sufficient bandwidth.
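The strict-priority rule can be sketched with the four queue levels named above. The function name pq_dequeue and the sample packets are hypothetical.

```python
from collections import deque

# Queue levels from highest to lowest priority, as named in the text.
LEVELS = ("top", "middle", "normal", "bottom")


def pq_dequeue(queues: dict):
    """Return the next packet under strict priority, or None if all
    queues are empty. Each queue is FIFO internally."""
    for level in LEVELS:
        if queues[level]:
            return queues[level].popleft()
    return None


queues = {level: deque() for level in LEVELS}
queues["bottom"].append("b1")
queues["top"].extend(["t1", "t2"])
queues["normal"].append("n1")

print([pq_dequeue(queues) for _ in range(5)])
```

The packet in the bottom queue arrived first but leaves last. If packets kept arriving in the top queue, it would never leave, which is the starvation described above.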
CQ
The number of bytes to be scheduled must be specified for each
queue. In each round, packets are scheduled from a queue until the
number of bytes sent reaches or exceeds the specified byte count. If
the configured byte count is too small, the queue may be congested
and bandwidth allocation is inaccurate. For example, 500 bytes is
specified for a queue, while most packets in the queue exceed 1000
bytes. Therefore, the bandwidth actually allocated is higher than the
expected bandwidth. If the specified byte count is large, it is
difficult to control the delay. CQ can schedule multiple packets each
time. The number of packets scheduled is the number of packets that
can be accommodated by the bytes scheduled each time.
Advantages:
Allocates bandwidth according to certain percentages. When the
traffic volume of a queue is small, other queues can occupy the
bandwidth of this queue.
Easy to implement
Disadvantages:
When the specified number of bytes is small, bandwidth allocation is
inaccurate. When the specified number of bytes is large, delay and
jitter are serious.
WFQ
Weighted Fair Queuing (WFQ) classifies packets by flow. On an IP
network, the packets with the same source IP addresses, destination
IP addresses, protocol numbers, and IP precedence belong to the same
flow. On an MPLS network, the packets with the same labels and EXP
fields belong to the same flow. WFQ assigns each flow to a queue, and
tries to assign different flows to different queues. When packets
leave the queues, WFQ allocates the bandwidth on the outbound
interface to each flow according to the weights. The smaller the
weight value of a flow is, the smaller the bandwidth the flow
obtains. The greater the weight value of a flow is, the greater the
bandwidth the flow obtains. In this manner, services of the same
priority are treated equally, and services of different priorities
are assigned different weights.
For example, there are eight flows on the interface, with weights of
1, 2, 3, 4, 5, 6, 7, and 8 respectively. The total bandwidth quota is
the sum of the weights, that is, 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 = 36.
The share of bandwidth occupied by each flow is: Weight of the
flow/Total bandwidth quota. That is, the flows obtain 1/36, 2/36,
3/36, 4/36, 5/36, 6/36, 7/36, and 8/36 of the bandwidth. Thus, WFQ
assigns different scheduling weights to services of different
priorities while ensuring fairness between services of the same
priority.
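The arithmetic in the example can be reproduced with exact fractions. The function name wfq_shares is hypothetical, and the sketch computes only the bandwidth shares, not the scheduling itself.

```python
from fractions import Fraction


def wfq_shares(weights) -> list:
    """Return each flow's share of the bandwidth: weight / sum of weights."""
    weights = list(weights)
    total = sum(weights)            # the total bandwidth quota
    return [Fraction(w, total) for w in weights]


shares = wfq_shares(range(1, 9))    # weights 1 to 8, as in the example
print(sum(range(1, 9)))             # total bandwidth quota
print(shares[0], shares[-1])        # smallest and largest share
```

Fraction reduces each value to lowest terms, so 8/36 is displayed as 2/9. The shares always sum to 1, so adding a flow reduces the share of every existing flow.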

Advantages:
The queues are scheduled fairly at byte granularity.
Differentiates services and allocates weights.
Properly controls delay and reduces jitter.
Disadvantages:
Difficult to implement.
Congestion Avoidance
Tail drop is the traditional congestion avoidance method. When the
length of a queue reaches the maximum value, all newly arriving
packets are discarded. If too many TCP packets are dropped, TCP times
out. This triggers TCP slow start and the congestion avoidance
mechanism, so the sending hosts slow down the transmission of TCP
packets. When a queue drops packets of several TCP connections at the
same time, these TCP connections enter congestion avoidance and slow
start simultaneously, which is referred to as global TCP
synchronization. These TCP connections then simultaneously send fewer
packets to the queue, so the rate of incoming packets falls below the
rate of outgoing packets, reducing the bandwidth usage. Moreover, the
volume of traffic sent to the queue varies greatly from time to time.
As a result, the volume of traffic over the link fluctuates between
the bottom and the peak, and the delay and jitter of certain traffic
are affected.
The traditional packet loss policy uses the tail drop method. When
the queue length reaches the upper limit, the excess packets (those
that would be buffered at the queue tail) are discarded.
To prevent global TCP synchronization, Random Early Detection (RED)
is used. RED randomly discards packets to prevent the transmission
speed of multiple TCP connections from being reduced simultaneously.
The TCP rate and network traffic volume thus remain stable.
The device provides Weighted Random Early Detection (WRED) based on
the RED technology. WRED discards packets in queues based on the DSCP
field or IP precedence. The upper drop threshold, lower drop
threshold, and maximum drop probability can be set for each priority.
When the number of packets of a priority reaches the lower drop
threshold, the device starts to discard packets randomly. Between the
two thresholds, a longer queue means a higher drop probability, up to
the configured maximum drop probability. When the number of packets
reaches the upper drop threshold, the device discards all the
packets. WRED discards packets in queues based on the drop
probability, thereby relieving congestion.
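The relationship between the thresholds and the drop probability can be sketched as a function. The name wred_drop_probability is hypothetical, all values are percentages, and the linear increase between the two thresholds is an assumption taken from the classic RED design; this text does not specify the exact curve a device uses.

```python
def wred_drop_probability(queue_fill: float, low_limit: float,
                          high_limit: float, max_probability: float) -> float:
    """Return the drop probability (percent) for one priority.

    queue_fill, low_limit, and high_limit are percentages of the queue
    length; max_probability is the configured maximum drop probability.
    """
    if not 0 <= low_limit < high_limit <= 100:
        raise ValueError("need 0 <= low_limit < high_limit <= 100")
    if queue_fill < low_limit:
        return 0.0                  # below the lower threshold: no drops
    if queue_fill >= high_limit:
        return 100.0                # at the upper threshold: drop all
    # Assumed linear ramp between the two thresholds.
    span = high_limit - low_limit
    return max_probability * (queue_fill - low_limit) / span


for fill in (30, 60, 80):
    print(fill, wred_drop_probability(fill, 40, 80, 20))
```

Giving a low-priority class a lower low-limit than a high-priority class makes its packets the first to be discarded as the queue fills.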
WRED configuration:
Configure a drop profile.
Run the drop-profile drop-profile-name command to create a drop
profile and enter the drop profile view.
Run the dscp { dscp-value1 [ to dscp-value2 ] } &<1-10> low-limit
low-limit-percentage high-limit high-limit-percentage
discard-percentage discard-percentage command to set DSCP-based WRED
parameters.
Run the ip-precedence { ip-precedence-value1 [ to
ip-precedence-value2 ] } &<1-10> low-limit low-limit-percentage
high-limit high-limit-percentage discard-percentage
discard-percentage command to set IP precedence-based WRED
parameters.
Apply the drop profile.
Run the qos queue-profile queue-profile-name command to enter the
queue profile view.
Run the schedule wfq start-queue-index [ to end-queue-index ] command
to set the scheduling mode of a queue to WFQ.
Run the queue { start-queue-index [ to end-queue-index ] } &<1-10>
drop-profile drop-profile-name command to bind a drop profile to a
queue in a queue profile.
Run the qos queue-profile queue-profile-name command to apply the
queue profile to an interface.
Traffic classification identifies the packets with certain
characteristics according to a rule, and is the prerequisite and
basis for differentiated services. You can define rules to classify
packets and specify the relationship between the rules:
AND: Packets match a traffic classifier only when the packets match
all the rules. If a traffic classifier contains ACL rules, packets
match the traffic classifier only when the packets match one ACL rule
and all the non-ACL rules. If a traffic classifier does not contain
ACL rules, packets match the traffic classifier only when the packets
match all the non-ACL rules.
OR: Packets match a traffic classifier as long as the packets match
a rule.
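The AND and OR relationships can be sketched with rules modeled as Python predicates. The function name classifier_matches, the packet dictionary, and the sample rules are hypothetical. The operator is passed explicitly because this text does not state which relationship is the default.

```python
def classifier_matches(packet: dict, acl_rules: list, other_rules: list,
                       operator: str) -> bool:
    """Evaluate a traffic classifier against one packet.

    acl_rules and other_rules are lists of functions that take the
    packet and return True on a match. operator is "and" or "or".
    """
    if operator == "and":
        # One ACL rule must match (if any exist), plus all non-ACL rules.
        acl_ok = any(rule(packet) for rule in acl_rules) if acl_rules else True
        return acl_ok and all(rule(packet) for rule in other_rules)
    if operator == "or":
        # Any single rule is enough.
        return any(rule(packet) for rule in list(acl_rules) + list(other_rules))
    raise ValueError("operator must be 'and' or 'or'")


packet = {"src": "10.1.1.1", "dscp": 46, "vlan": 10}
acl_rules = [lambda p: p["src"].startswith("10.1."),
             lambda p: p["src"].startswith("192.168.")]
other_rules = [lambda p: p["dscp"] == 46,
               lambda p: p["vlan"] == 20]

print(classifier_matches(packet, acl_rules, other_rules, "and"))
print(classifier_matches(packet, acl_rules, other_rules, "or"))
```

With AND, the packet fails because the VLAN rule does not match, even though one ACL rule and the DSCP rule do. With OR, the first matching rule is sufficient.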
A traffic behavior refers to an action taken for packets. The purpose
of traffic classification is to provide differentiated services, so a
traffic classifier takes effect only when it is associated with a
traffic control action or a resource allocation action.
A traffic policy is configured by binding traffic classifiers to
traffic behaviors. After a traffic policy is applied to an interface,
globally, to a board, or to a VLAN, differentiated service is
provided.
Traffic policy configuration commands
Configure a traffic classifier.
Run the traffic classifier classifier-name [ operator { and | or } ]
command to create a traffic classifier and enter the traffic
classifier view.
Configure a traffic behavior.
Run the traffic behavior behavior-name command to create a traffic
behavior and enter the traffic behavior view.
Configure a traffic policy.
Run the traffic policy policy-name command to create a traffic policy
and enter the traffic policy view.
The classifier behavior command binds a traffic classifier to a
traffic behavior in a traffic policy.
Run the traffic-policy policy-name { inbound | outbound } command to
apply a traffic policy to the interface or sub-interface in the
inbound or outbound direction.
SNMP model
The NMS station is the manager in a network management system. It
uses the SNMP protocol to manage and monitor the network. The NMS
software runs on an NMS server.
The agent is a process on the managed device. The agent maintains
data on the managed device, receives and processes the request
packets from the NMS, and then sends the response packets to the NMS.
A management object is an object to be managed. A device may have
multiple management objects, including hardware components (such as
an interface board) and parameters (such as a routing protocol)
configured for the hardware or software.
The MIB is a database specifying variables that are maintained by the
managed device and can be queried or set by the agent. The MIB
defines attributes of the managed device, including the name, status,
access rights, and data type of objects.
Operations of SNMPv1 and SNMPv2c

Get: reads one or several parameter values from the MIB of the agent
process.
GetNext: reads the next parameter value from the MIB of the agent
process.
Set: sets one or several parameter values in the MIB of the agent
process.
Response: returns one or more queried values. The agent performs
this operation in response to the GetRequest, GetNextRequest,
SetRequest, and GetBulkRequest operations. Upon receiving a Get or
Set request, the agent performs the Query or Modify operation using
MIB tables and then sends the response to the NMS.
Trap: sent by an agent process to notify the NMS of a fault or event
on the managed device.
New Operation Types of SNMPv2c
GetBulk: The NMS queries managed devices in batches. It is
implemented based on the GetNext operation. A GetBulk operation is
equivalent to a series of GetNext operations. You can specify the
number of times the GetNext operation is executed on the managed
device during a GetBulk interaction.
InformRequest: sent by a managed device to notify the NMS of an
alarm on the managed device. After the managed device sends an
inform, the NMS must send an InformResponse packet to the managed
device.
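The statement that a GetBulk operation is equivalent to a series of GetNext operations can be sketched on a toy MIB. The sketch does not use a real SNMP library: the MIB is a plain dictionary keyed by OID tuples, get_next and get_bulk are hypothetical helpers, and the sample values are made up. The three OIDs are the standard sysDescr.0, sysUpTime.0, and sysName.0 objects.

```python
import bisect


def get_next(mib: dict, oid: tuple):
    """Return the (oid, value) pair that follows `oid` in lexicographic
    order, or None at the end of the MIB."""
    oids = sorted(mib)
    index = bisect.bisect_right(oids, oid)
    if index == len(oids):
        return None
    return oids[index], mib[oids[index]]


def get_bulk(mib: dict, oid: tuple, max_repetitions: int) -> list:
    """Emulate GetBulk as up to `max_repetitions` GetNext steps."""
    results = []
    for _ in range(max_repetitions):
        entry = get_next(mib, oid)
        if entry is None:
            break
        results.append(entry)
        oid = entry[0]              # continue from the OID just returned
    return results


mib = {
    (1, 3, 6, 1, 2, 1, 1, 1, 0): "Example router",   # sysDescr.0
    (1, 3, 6, 1, 2, 1, 1, 3, 0): 12345,              # sysUpTime.0
    (1, 3, 6, 1, 2, 1, 1, 5, 0): "R1",               # sysName.0
}

for oid, value in get_bulk(mib, (1, 3, 6, 1, 2, 1, 1), 2):
    print(".".join(map(str, oid)), value)
```

The benefit of the real operation is that the whole series travels in one request and one response, so walking a large table needs far fewer packets than repeated GetNext requests.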

Operations related to SNMPv3:

The NMS sends a Get request without security parameters to the
agent.
The agent responds and returns the requested parameters to the NMS.
The NMS sends a Get request carrying security parameters to the
agent.
The agent encrypts the response packet and returns the required
parameters to the NMS.
NQA Principles

Creating a test instance
NQA requires two test ends: an NQA client and an NQA server (also
called the source and the destination). The NQA client (the source)
initiates an NQA test. You can configure test instances through the
command line or the NMS. NQA then places the test instances into
test queues for scheduling.

Starting the test instance
When starting an NQA test instance, you can choose to start it
immediately, at a specified time, or after a delay. When the timer
expires, a test packet is generated based on the type of the test
instance. If the size of the generated test packet is smaller than
the minimum size of a protocol packet, the test packet is generated
and sent with the minimum size of the protocol packet.

Processing a test instance
After a test instance starts, the protocol-related running status
can be collected from the response packets. The client adds a
timestamp to a test packet based on the local system time before
sending the packet to the server. After receiving the test packet,
the server sends a response packet to the client. The client then
timestamps the received response packet based on the current local
system time. From these two timestamps, the client calculates the
round-trip time (RTT) of the test packet.
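
The two-timestamp RTT calculation and the minimum-size rule can be sketched as follows. This is a minimal illustration under stated assumptions, not how NQA is implemented on a device: MIN_PACKET_SIZE, build_test_packet, run_probe, and the send_and_wait callback are all hypothetical names, zero-byte padding is only one possible way to reach the minimum size, and a monotonic clock stands in for the local system time.

```python
import time

MIN_PACKET_SIZE = 64  # hypothetical minimum protocol packet size, in bytes


def build_test_packet(payload: bytes, min_size: int = MIN_PACKET_SIZE) -> bytes:
    """Pad the packet with zero bytes if it is below the minimum size."""
    return payload.ljust(min_size, b"\x00")


def run_probe(send_and_wait, payload: bytes, clock=time.monotonic) -> float:
    """Send one test packet and return the round-trip time in seconds.

    `send_and_wait` is a hypothetical callable that transmits the packet
    and blocks until the server's response packet has arrived.
    """
    packet = build_test_packet(payload)
    sent_at = clock()        # first timestamp: taken just before sending
    send_and_wait(packet)
    received_at = clock()    # second timestamp: taken when the response arrives
    return received_at - sent_at


# Usage with a stand-in "server" that takes about 10 ms to answer.
rtt = run_probe(lambda pkt: time.sleep(0.01), b"nqa-test")
print(f"RTT: {rtt * 1000:.1f} ms")
```

Because both timestamps are taken on the client, the result does not depend on the client and server clocks being synchronized.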

An NQA ICMP test instance checks whether the route from the NQA
client to the destination is reachable. The ICMP test works like the
ping command but provides more output information:
By default, the command output shows the results of the latest five
tests.
The output includes the average delay, the packet loss ratio, and the
time at which the last packet was correctly received.

Test Procedure
The source (R1) sends an ICMP Echo Request packet to the destination
(R2).
After receiving the ICMP Echo Request packet, the destination (R2)
responds to the source (R1) with an ICMP Echo Reply packet.
The source (R1) then calculates the communication time between R1 and
R2 by subtracting the time at which it sent the ICMP Echo Request
packet from the time at which it received the ICMP Echo Reply packet.
The result reflects the network performance and operating status.
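
As a rough illustration of how the average delay and packet loss ratio can be derived from individual probes, the following Python sketch summarizes one ICMP test. The function name summarize_icmp_test and the result layout are hypothetical and do not reproduce the actual device output. Each list entry is the measured round-trip time of one probe in milliseconds, with None marking a probe that received no Echo Reply.

```python
def summarize_icmp_test(rtts_ms):
    """Summarize one test from per-probe RTTs (None means the probe was lost)."""
    sent = len(rtts_ms)
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    lost = sent - len(replies)
    return {
        "sent": sent,
        "received": len(replies),
        # Share of probes that got no reply, as a percentage.
        "loss_ratio_percent": lost / sent * 100 if sent else 0.0,
        # Delay statistics are only defined when at least one reply arrived.
        "min_ms": min(replies) if replies else None,
        "avg_ms": sum(replies) / len(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
    }


# Four probes, one of which timed out.
print(summarize_icmp_test([2, 4, None, 6]))
```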
NTP synchronization process

R1 sends an NTP packet to R2. The packet carries a timestamp,
10:00:00 am (T1), indicating the time it leaves R1.
When the NTP packet reaches R2, R2 adds a receive timestamp,
11:00:01 am (T2), indicating the time R2 receives the packet.
When the NTP packet leaves R2, R2 adds a transmit timestamp,
11:00:02 am (T3), indicating the time it leaves R2.
When R1 receives this response packet, it adds a new receive
timestamp, 10:00:03 am (T4). R1 uses the received information to
calculate the following two important parameters:
Round-trip delay of the NTP packet: Delay = (T4 - T1) - (T3 - T2)
Clock offset of R1, taking R2 as the reference:
Offset = ((T2 - T1) + (T3 - T4))/2
After the calculation, R1 knows that the round-trip delay is 2
seconds and that its clock offset relative to R2 is 1 hour (R1 is
1 hour behind R2). R1 sets its own clock based on these two
parameters to synchronize it with the clock of R2.
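
The arithmetic can be checked with a short Python sketch that plugs the four timestamps from the example into the two formulas. The function name ntp_delay_and_offset is hypothetical, and the calendar date is arbitrary because only the time of day matters here.

```python
from datetime import datetime, timedelta


def ntp_delay_and_offset(t1, t2, t3, t4):
    """Apply the Delay and Offset formulas to the four NTP timestamps."""
    delay = (t4 - t1) - (t3 - t2)          # time spent on the network only
    offset = ((t2 - t1) + (t3 - t4)) / 2   # how far R1 is from R2's clock
    return delay, offset


day = datetime(2010, 1, 1)  # arbitrary date, assumed for illustration
t1 = day.replace(hour=10, minute=0, second=0)  # leaves R1 (R1 clock)
t2 = day.replace(hour=11, minute=0, second=1)  # reaches R2 (R2 clock)
t3 = day.replace(hour=11, minute=0, second=2)  # leaves R2 (R2 clock)
t4 = day.replace(hour=10, minute=0, second=3)  # reaches R1 (R1 clock)

delay, offset = ntp_delay_and_offset(t1, t2, t3, t4)
print("Delay :", delay)    # 0:00:02
print("Offset:", offset)   # 1:00:00
```

Step by step: T4 - T1 is 3 seconds and T3 - T2 is 1 second, so the delay is 2 seconds. T2 - T1 is 3601 seconds and T3 - T4 is 3599 seconds, so the offset is 7200 / 2 = 3600 seconds, or 1 hour.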
