
Advanced VPC Operation and Troubleshooting

BRKCRS-3146

Dmitry Goloubev
Technical Leader, Tech services

Follow us on Twitter for real time updates of the event:


@ciscoliveeurope, #CLEUR
Housekeeping

We value your feedback: don't forget to complete your online session evaluations after each session and the Overall Conference Evaluation, which will be available online from Thursday
Visit the World of Solutions and Meet the Engineer
Visit the Cisco Store to purchase your recommended readings
Please switch off your mobile phones
After the event don't forget to visit Cisco Live Virtual:
www.ciscolivevirtual.com

BRKCRS-3146 2011 Cisco and/or its affiliates. All rights reserved. Cisco Public 2
Goals

Understand general concepts of the Virtual Port Channel (VPC) feature on Nexus 7000
Review the impact of VPC on bridging and routing
Learn how to troubleshoot VPC

VPC at the network level
Enables building a PortChannel to 2 separate switches, virtualizing the network building block
[Diagram: the physical topology before VPC ("from this"), with VPC ("to this"), and the logical view]
No blocked ports, more usable bandwidth, load-sharing
Distribution switch or link failure does not mean reconvergence

VPC components at a glance
2 active control planes
2 active data planes
2 configs
2 points of management
Primary-Secondary notion for some aspects of operation
Control messages and data frames flow between active and standby via the VPC Peer-Link
Peer-Link is an 802.1Q trunk
Control messages are carried by CFS over the Peer-Link

[Diagram: VPC domain with a Primary and a Secondary peer, each with an active control plane and an active data plane, connected by the Peer-Link and the Peer Keepalive link]

Agenda

Initialization & Redundancy considerations


Spanning Tree
Traffic forwarding
1st hop redundancy
Multicast considerations

Stages of VPC initialization
1. VPC manager starts
2. Peer-keepalive comes up (receives keepalives from the peer)
3. Peer-link comes up (data is not passing through yet, just CFS)
4. Primary/Secondary Role resolved
5. Global Consistency check
6. Peer-link is up for data
7. SVIs brought up (VPC + 10 sec)
8. VPCs brought up (SVI + 30 sec)

16:34:06 %VPC-5-VPCM_ENABLED: vPC Manager enabled
16:34:07 %VPC-5-PEER_KEEP_ALIVE_STATUS: In domain 2, peer keep-alive status changed to enabled
16:34:17 %ETHPORT-5-IF_UP: Interface port-channel2 is up in Layer3   [Peer-Keepalive]
16:34:19 %VPC-3-VPC_PEER_LINK_BRINGUP_FAILED: vPC peer-link bringup failed (vPC peer is not reachable over cfs)
16:34:19 %ETHPORT-5-IF_UP: Interface port-channel1 is up in mode trunk   [Peer-Link]
16:34:23 %VPC-4-VPC_ROLE_CHANGE: In domain 2, VPC role status has changed to primary
16:34:23 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_START: vPC restore, delay interface-vlan bringup timer started
16:34:33 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_EXPIRED: vPC restore, delay interface-vlan bringup timer expired, reiniting interface-vlans
16:34:33 %INTERFACE_VLAN-5-UPDOWN: Line Protocol on Interface vlan 4, changed state to up
16:34:33 %VPC-5-VPC_RESTORE_TIMER_START: vPC restore timer started to reinit vPCs
16:34:41 %VPC-3-VPC_BRINGUP_FAILED: vPC 102 bringup failed (Peer-link state is not UP)
16:35:03 %VPC-5-VPC_RESTORE_TIMER_EXPIRED: vPC restore timer expired, reiniting vPCs
16:35:13 %VPC-5-VPC_UP: vPC 102 is up
16:35:13 %ETHPORT-5-IF_UP: Interface port-channel102 is up in mode trunk

Timers are adjustable in VPC domain configuration context:
SVI: delay restore interface-vlan
VPC: delay restore
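The two restore timers can be read straight off the syslog above. The following Python sketch is only an illustration (the function name timer_durations is made up here and is not part of NX-OS): it pairs each *_TIMER_START message with its *_TIMER_EXPIRED counterpart and reports the elapsed seconds. It assumes both messages fall on the same day, since the log lines shown carry no date.

```python
from datetime import datetime

# Timer-related lines copied from the syslog sample on this slide.
LOG = """\
16:34:23 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_START: vPC restore, delay interface-vlan bringup timer started
16:34:33 %VPC-5-VPC_DELAY_SVI_BUP_TIMER_EXPIRED: vPC restore, delay interface-vlan bringup timer expired, reiniting interface-vlans
16:34:33 %VPC-5-VPC_RESTORE_TIMER_START: vPC restore timer started to reinit vPCs
16:35:03 %VPC-5-VPC_RESTORE_TIMER_EXPIRED: vPC restore timer expired, reiniting vPCs
"""


def timer_durations(log_text):
    """Return {timer_name: seconds} for each START/EXPIRED message pair."""
    started = {}
    durations = {}
    for line in log_text.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) < 2 or not parts[1].startswith("%"):
            continue  # not a "HH:MM:SS %FACILITY-SEV-MNEMONIC:" line
        stamp = datetime.strptime(parts[0], "%H:%M:%S")
        mnemonic = parts[1].rstrip(":").split("-")[-1]
        if mnemonic.endswith("_TIMER_START"):
            started[mnemonic[: -len("_START")]] = stamp
        elif mnemonic.endswith("_TIMER_EXPIRED"):
            name = mnemonic[: -len("_EXPIRED")]
            if name in started:
                durations[name] = int((stamp - started[name]).total_seconds())
    return durations


if __name__ == "__main__":
    for name, seconds in timer_durations(LOG).items():
        print(f"{name}: {seconds} s")
```

For the sample above this gives 10 seconds for the SVI delay and 30 seconds for the vPC restore timer, matching stages 7 and 8.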
VPC consistency checking
Certain configuration mistakes could lead to loops or blackholing (for example when STP config is inconsistent). Others might cause undesirable forwarding implications for specific interfaces (inconsistent ACLs, SVIs)

Consistency checking prevents network-wide issues (type 1) and warns about possible forwarding oddities (type 2)

Inconsistency type   Action                                       Example of inconsistency
Type 1 / Global      Vlans suspended on peer-link, VPCs up with   Rapid-PVST STP on one peer, MST STP on another
                     respective vlans suspended
Type 1 / Interface   Vlans suspended on respective VPC            MTU mismatch, STP guard config mismatch
Type 2               Syslog message                               SVI is up on one peer, down on another
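The table above can be read as a lookup from (type, scope) to action. The Python sketch below is a simplified model for illustration only, not how NX-OS implements the check. The function name check_consistency and the tuple layout (name, type, local value, peer value) are assumptions chosen to mirror the columns of sh vpc consistency-parameters.

```python
# Actions taken from the table on this slide; scope applies to type 1 only.
TYPE1_ACTION = {
    "global": "vlans suspended on peer-link, VPCs up with respective vlans suspended",
    "interface": "vlans suspended on respective VPC",
}
TYPE2_ACTION = "syslog message"


def check_consistency(params, scope):
    """Return (name, type, action) for every parameter whose values differ.

    params: iterable of (name, type, local_value, peer_value) tuples.
    scope:  "global" or "interface".
    """
    findings = []
    for name, ptype, local, peer in params:
        if local == peer:
            continue  # consistent, nothing to do
        action = TYPE1_ACTION[scope] if ptype == 1 else TYPE2_ACTION
        findings.append((name, ptype, action))
    return findings


if __name__ == "__main__":
    sample = [
        ("STP Mode", 1, "Rapid-PVST", "MST"),        # type 1 mismatch
        ("STP Loopguard", 1, "Disabled", "Disabled"),  # consistent
        ("Interface-vlan admin up", 2, "101", ""),   # type 2 mismatch
    ]
    for finding in check_consistency(sample, "global"):
        print(finding)
```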

Nexus# sh vpc consistency-parameters global
Name                                      Type  Local Value         Peer Value
-------------                             ----  ------------------  ------------------
STP Mode                                  1     Rapid-PVST          Rapid-PVST
STP Disabled                              1     None                None
STP MST Region Name                       1     ""                  ""
STP MST Region Revision                   1     0                   0
STP MST Region Instance to VLAN Mapping   1
STP Loopguard                             1     Disabled            Disabled
STP Bridge Assurance                      1     Enabled             Enabled
STP Port Type, Edge BPDUFilter,           1     Normal, Disabled,   Normal, Disabled,
Edge BPDUGuard                                  Disabled            Disabled
STP MST Simulate PVST                     1     Enabled             Enabled
Interface-vlan admin up                   2     101                 101
Interface-vlan routing                    2     1,101               1,101

Nexus# sh vpc consistency-parameters interface port-channel 1
Name                    Type  Local Value  Peer Value
-------------           ----  -----------  -----------
lag-id                  1     [(7f9b,      [(7f9b,
                              ...
mode                    1     active       active
STP Port Type           1     Default      Default
STP Port Guard          1     None         None
STP MST Simulate PVST   1     Default      Default
Native Vlan             1     1            1
Port Mode               1     trunk        trunk
MTU                     1     1500         1500
Duplex                  1     full         full
Speed                   1     10 Gb/s      10 Gb/s
Allowed VLANs           -     101          101
Graceful Consistency check
VPC Type 1 inconsistency suspends all vlans on corresponding VPC on both
peers

This triggers forwarding interruption during config changes (for example while
changing MTU on VPC)

As of 4.2(8) and 5.2(1) VPC supports Graceful Consistency Check

Graceful consistency check brings down interfaces on the secondary peer upon inconsistency; the primary peer keeps forwarding traffic

Enabled by default

Nexus(config-vpc-domain)# graceful consistency-check

Nexus# show vpc brief


vPC domain id : 1
Peer status : peer adjacency formed ok
vPC keep-alive status : peer is alive
vPC role : secondary
...
Graceful Consistency Check : Enabled

vPC status
----------------------------------------------------------------------------
id Port Status Consistency Reason Active vlans
-- ---- ------ ----------- ------ ------------
1 Po1 down* failed vPC type-1 2-10
VPC behavior at initialization

Peer-Keepalives must be heard before we bring up the Peer-Link
VPC control plane must be able to communicate with the peer over the peer-link
Negotiate LACP/STP operating roles for the chassis
Wait for per-port peer parameters and handshake to bring up vPC ports
Performs peer parameters consistency check on each VPC bringup
Will not bring up VPCs if only one of two VPC peers comes up (for example after power outage)

VPC Reload Restore

Allows VPCs to be brought up after a timeout if the peer is presumed dead
Default timeout 360 sec
Assumes primary role for STP and LACP

Nexus(config)# vpc domain 1


Nexus(config-vpc-domain)# reload restore ?
<CR>
delay Duration to wait before assuming
peer dead and restoring vpcs

Nexus(config-vpc-domain)# reload restore delay ?


<240-3600> Time-out for restoring vPC links
(in seconds)

VPC auto-recovery
(replaces Reload-Restore as of NXOS 5.2.1)
Auto-recovery addresses cases of multiple failures. For example:
- Peer-link fails and after a while the primary switch (or keepalive link) fails
- Both VPC peers are reloaded and only one comes back up
How it works
- If the Peer-link is down on the secondary switch, 3 consecutive missing peer-keepalives will trigger auto-recovery
- After reload (role is "none established"), if the auto-recovery timer (240 sec) expires while peer-link and peer-keepalive are still down, auto-recovery kicks in
- Switch assumes primary role
- VPCs are brought up bypassing consistency checks

Nexus(config)# vpc domain 1


Nexus(config-vpc-domain)# auto-recovery
Nexus# sh vpc | i recovery
Auto-recovery status : Enabled (timeout = 240 seconds)

Failure type                                                       Reload restore   Auto recovery
After reload only single peer comes up                             yes              yes
Peer-link fails, then eventually primary switch fails completely   -                yes
Troubleshooting VPC: initialization
Always start with sh vpc: it gives ~90% of all information needed for initial situation assessment
vpc1# sh vpc
Legend:
(*) - local vPC is down, forwarding via vPC peer-link

vPC domain id : 1
Peer status : peer adjacency formed ok   [CFS can communicate with the peer]
vPC keep-alive status : peer is alive   [We hear peer-alives]
Configuration consistency status: success   [Configs are compatible]
Type-2 consistency reason : Consistency Check Not Performed
vPC role : primary   [Master/Slave for certain apps]
Number of vPCs configured : 1
Peer Gateway : Disabled
Dual-active excluded VLANs : -
vPC Peer-link status
---------------------------------------------------------------------
id Port Status Active vlans
-- ---- ------ --------------------------------------------------
1 Po100 up 1,101   [Peer-Link is up with expected vlans]
vPC status
----------------------------------------------------------------------
id Port Status Consistency Reason Active vlans
-- ---- ------ ----------- ------ ------------
1 Po1 up success success 101   [Vlans are active on VPCs]

Peer status issue: check if the peer-link is up, check if the remote end is also configured as peer-link, then look at CFS. Note: the peer-link will fully come up when 1) peer-keepalive is up and 2) peers can talk via CFS over the peer-link

Peer-keepalive issue: check sh vpc peer-keepalive, check that the outgoing interface is up and in the correct vrf, check the route to the destination (in the correct vrf), ping the remote and check the same on the remote peer

Role issue: check sh vpc role on both sides. Note that the peer that's been up/active the longest will remain operational primary even if the other peer has better priority. This is done to minimize traffic disruption. If the role is "none established" it means the VPC came up after reload/new config, and VPCs will not come up before the role is resolved or reload-restore/auto-recovery kicks in

Consistency issues: check sh vpc consistency global|interface

Vlans not up: check if the respective vlan is allowed on the peer-link, check syslog for other causes: sh log log | inc VLANS

Always keep track of the situation on both peers
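The first pass over sh vpc can be scripted when many switches need checking. The sketch below is a hedged illustration: the helper names parse_fields and triage are invented here, and the "healthy" strings are simply the ones shown in the sample output on this slide, so other NX-OS releases may word them differently.

```python
# Values that the healthy sample output on this slide shows for each field.
EXPECTED = {
    "Peer status": "peer adjacency formed ok",
    "vPC keep-alive status": "peer is alive",
    "Configuration consistency status": "success",
}


def parse_fields(text):
    """Parse 'key : value' lines of 'sh vpc' output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def triage(text):
    """Return a list of fields that deviate from the healthy sample."""
    fields = parse_fields(text)
    problems = [
        f"{key}: {fields.get(key, 'missing')}"
        for key, wanted in EXPECTED.items()
        if fields.get(key) != wanted
    ]
    if fields.get("vPC role", "").startswith("none"):
        problems.append("vPC role: none established")
    return problems


if __name__ == "__main__":
    broken = """\
Peer status : peer link is down
vPC keep-alive status : Suspended (Destination IP not reachable)
Configuration consistency status : failed
vPC role : none established
"""
    for problem in triage(broken):
        print(problem)
```

Run it against the output of both peers; a clean result on one side says nothing about the other.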


VPC redundancy model
Process restartability
- Processes checkpoint their runtime state
- Crashing process is restarted statefully by NXOS system manager
Supervisor redundancy
- HA-policy will trigger supervisor switchover in response to excessive process crashing, software, hardware or diagnostic failure
VPC redundancy
- Devices dual-attached to VPC domain are protected against single switch failure (power, hardware, maintenance etc)

[Diagram: VPC domain with Switch 1 and Switch 2, each with an Active supervisor running Process 1, Process 2 ... Process X and a Standby (SSO) supervisor]
VPC Keepalive link

Heartbeat between vPC peers to prevent a dual-active scenario
Keepalives are sent every second by default on UDP port 3200
3 second hold timeout on peer-link loss: how long we ignore keepalives after peer-link loss
5 second keepalive timeout (starts after the hold timeout following peer-link down): how long we wait after the hold timeout before declaring a failure
Use a dedicated link, although NXOS does not enforce this: only IP connectivity is verified
Management interface can be used as keepalive link, but do not connect the interfaces together directly (only the active supervisor management interface is up)

Handling Peer-link failure flow
1. Peer-link failure
2. Am I primary?
   primary: Done
   2ndary: go to 3
3. Ignore keepalives for hold-timeout (3 sec)
4. Start keepalive timeout timer (default 5 sec)
5. Received keepalive?
   yes: Primary is alive. Bring down all VPC ports. Done
   no: go to 6
6. Keepalive timeout expired?
   no: go back to 5
   yes: Primary is gone. Become primary. Done
Note: If primary fails completely once the VPCs are down on secondary, VPCs will stay down until primary recovers
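The decision flow on this slide can be modelled in a few lines. This Python sketch is an illustration under stated assumptions, not the NX-OS implementation: the function names are invented, the peer-link failure is placed at t = 0, and keepalives arriving during the hold timeout are simply discarded.

```python
def secondary_reaction(keepalive_times, hold_timeout=3.0, keepalive_timeout=5.0):
    """Outcome on the secondary peer after a peer-link failure at t = 0.

    keepalive_times: arrival times of peer-keepalives, in seconds after
    the peer-link failure.
    """
    window_start = hold_timeout                     # keepalives before this are ignored
    window_end = hold_timeout + keepalive_timeout   # keepalive timeout expires here
    for t in sorted(keepalive_times):
        if window_start <= t < window_end:
            return "primary is alive: bring down all VPC ports"
    return "primary is gone: become primary"


def on_peer_link_failure(role, keepalive_times):
    """Entry point: the primary takes no action, the secondary evaluates keepalives."""
    if role == "primary":
        return "no action"
    return secondary_reaction(keepalive_times)


if __name__ == "__main__":
    # Primary still sends a keepalive every second.
    print(on_peer_link_failure("secondary", [0.5, 1.5, 2.5, 3.5]))
    # Primary died together with the peer-link: nothing is heard.
    print(on_peer_link_failure("secondary", []))
```

With the defaults the secondary decides at the latest 8 seconds (3 + 5) after the peer-link failure.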
Handling Peer-link failure flow with Auto-recovery
1. Peer-link down?
   no: keep checking
   yes: go to 2
2. Am I primary?
   primary: Done
   2ndary: go to 3
3. Received keepalive?
   yes: Primary is alive. Bring down all VPC ports, keep checking
   no: go to 4
4. (NEW) Missed 3 keepalives in a row?
   no: keep checking
   yes: Primary is gone. Become primary, bypass consistency checks, bring up VPCs. Done
Note: Unlike in the previous case, the keepalive status is always checked, not only for (keepalive hold + keepalive timeout) seconds after peer-link failure
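The difference auto-recovery makes is that the secondary keeps watching keepalives for as long as the peer-link is down. The Python sketch below models only that loop; it is illustrative, the function name and state strings are invented, and each list element stands for one keepalive interval (True = keepalive heard).

```python
def secondary_with_auto_recovery(keepalive_seen, missed_limit=3):
    """State of the secondary peer while the peer-link stays down.

    keepalive_seen: iterable of booleans, one per keepalive interval.
    missed_limit:   consecutive misses that trigger auto-recovery
                    (3 according to this slide).
    """
    state = "waiting for first keepalive result"
    missed = 0
    for seen in keepalive_seen:
        if seen:
            missed = 0
            state = "primary alive: VPC ports down"
        else:
            missed += 1
            if missed >= missed_limit:
                # Become primary, bypass consistency checks, bring up VPCs.
                return "auto-recovered: primary role, VPCs up"
    return state


if __name__ == "__main__":
    # Peer-link fails, the primary survives for 10 intervals, then dies.
    print(secondary_with_auto_recovery([True] * 10 + [False] * 3))
```

Without auto-recovery the second failure in this example would leave the VPCs down until the primary recovers, as the note on the previous slide says.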
If Peer-link and Keepalive both fail
while primary peer is still alive

Dual-active situation
- There will be 2 primary switches sending independent BPDUs
- VPC Port-channels on upstream/downstream switches will be error-disabled by EtherChannel Misconfiguration Guard after ~90 seconds
  http://www.cisco.com/en/US/tech/tk389/tk213/technologies_tech_note09186a008009448d.shtml
- If a Nexus 7000/5000 is on the other end of the VPC: no errdisable, as NXOS does not support EtherChannel Guard

Depending on remote configuration (presence of VPC, peer-switch etc) there can be different outcomes, ranging from no impact to STP dispute, to STP state cycling between dispute, blocking and forwarding
Provision redundancy for the keepalive link, and make sure it doesn't share a datapath with the peer-link

What to do if only 1 peer is operational
and VPCs are down
Due to power issue, hardware failure on the 2nd peer etc
VPC(s) will be down if they had to flap or the current peer was reloaded (because the consistency check couldn't be performed without the 2nd peer)
Non-issue with auto-recovery, but what if the current NXOS version is < 5.2?
Possible actions
- Recover the 2nd peer
- or remove VPC config from port-channel(s)
  vpc(config-if)# no vpc 123
- or, in case of many VPCs, remove the VPC config
  vpc# sh run vpc > bootflash:myvpc.conf
  vpc(config)# no feature vpc

vpc# sh vpc
...
Peer status : peer link is down
vPC keep-alive status : Suspended (Destination IP not reachable)
Configuration consistency status : failed
Configuration inconsistency reason: Consistency Check Not Performed
vPC role : none established
...
vPC status
----------------------------------------------------------------------
id  Port  Status Consistency    Reason                          Active vlans
--  ----  ------ -----------    ------                          ------------
102 Po102 down   Not Applicable Consistency Check Not Performed -
Troubleshooting VPC peer-keepalives
Nexus# show vpc peer-keepalive

vPC keep-alive status : peer is alive
--Send status : Success
--Last send at : 2009.06.19 00:41:15 589 ms
--Sent on interface : Eth2/35
--Receive status : Success
--Last receive at : 2009.06.19 00:41:14 580 ms
--Received on interface : Eth2/35
--Last update from peer : (1) seconds, (9) msec

vPC Keep-alive parameters
--Destination : 7.7.7.77
--Keepalive interval : 1000 msec
--Keepalive timeout : 5 seconds
--Keepalive hold timeout : 3 seconds
--Keepalive vrf : v1
--Keepalive udp port : 3200
--Keepalive tos : 192

Nexus# show vpc statistics peer-keepalive

vPC keep-alive status : peer is alive
vPC keep-alive statistics
----------------------------------------------------
peer-keepalive tx count: 9773
peer-keepalive rx count: 8985
average interval for peer rx: 991
Count of peer state changes: 0   [Number of keepalive state transitions, closer to 0 - better]

Peer-keepalive is only essential at the time when the peer-link goes down or comes up. At any other time peer-keepalive failure will only trigger syslog
Peer-keepalives might be affected by extreme control plane load (check CPU utilization & COPP)
Only reception of keepalive packets at IP level is required
Generic routing/switching connectivity troubleshooting might be needed if packets are lost (make sure there is a route/arp in the correct VRF)
Cisco Fabric Services
Transport mechanism for control-plane messaging between VPC peers
Uses
- Consistency validation
- MAC address synchronization
- vPC member port status signalling
- IGMP snooping synchronization
- vPC status signalling
VPC CFS messages are encapsulated in Ethernet frames and delivered to the peer via the peer-link

Nexus# sh cfs application


----------------------------------------------
Application Enabled Scope
----------------------------------------------
arp Yes Physical-eth
stp Yes Physical-eth
vpc Yes Physical-eth
igmp Yes Physical-eth
l2fm Yes Physical-eth
...
VPC: CFS troubleshooting

Cisco Fabric Services: transport of control messages between VPC peers

Nexus# show cfs status
Distribution : Enabled
Distribution over IP : Disabled
IPv4 multicast address : 239.255.70.83
IPv6 multicast address : ff15::efff:4653
Distribution over Ethernet : Enabled

Nexus# show cfs peers
Physical Fabric
---------------------------------------------
Switch WWN IP Address
---------------------------------------------
20:00:00:1b:54:c2:42:41 10.48.73.222 [Local]
Nexus
20:00:00:1b:54:c2:42:44 0.0.0.0
Total number of entries = 2
[Remote peer should be seen]

Nexus# show cfs internal ethernet-peer statistics | i Trans|Rece
Number of Segments Transmitted : 218
Number of Acks Transmitted : 223
Maximum Segment Size Transmitted : 0
Number of Transmission Timeouts : 0
Number of segments in Transmit Queue : 0
Number of segments in Re-Transmit Queue : 0
Total Number of Segments Received : 441
Number of Acks Received : 217
Number of Duplicate Messages Received : 0
Number of Unexpected Segments Received : 0
Number of fragmented segments Received : 2
Number of duplicate fragments Received : 0
Number of unfragmented segments Received : 210
Number of Received Segments Dropped : 0
Number of Unreliable segments Transmitted : 1
Number of Unreliable segments Received : 1
[TX/RX counters should move when VPC is active or coming up]

Nexus# sh cfs internal notification log name vpc
Sun Nov 14 15:27:22 2010: Peer add 20:00:00:1b:54:c2:42:44
Sun Nov 14 19:05:25 2010: Peer gone 20:00:00:1b:54:c2:42:44
Sun Nov 14 19:08:03 2010: Peer add 20:00:00:1b:54:c2:42:44
[Shows timestamps for when CFS communication for VPC was interrupted (peer-reload, peer-link issues etc)]
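The notification log lends itself to a quick calculation of how long CFS communication was interrupted. The following Python sketch is an illustration only (the function name cfs_outages is invented): it pairs every "Peer gone" entry with the next "Peer add" for the same switch WWN. It assumes the timestamp format shown in the sample and an English locale for day and month names.

```python
from datetime import datetime

# Entries copied from the notification log sample on this slide.
LOG = """\
Sun Nov 14 15:27:22 2010: Peer add 20:00:00:1b:54:c2:42:44
Sun Nov 14 19:05:25 2010: Peer gone 20:00:00:1b:54:c2:42:44
Sun Nov 14 19:08:03 2010: Peer add 20:00:00:1b:54:c2:42:44
"""


def cfs_outages(log_text):
    """Return [(wwn, seconds)] for each 'Peer gone' followed by 'Peer add'."""
    gone_at = {}
    outages = []
    for line in log_text.splitlines():
        stamp_text, sep, event = line.partition(": Peer ")
        if not sep:
            continue  # not a peer add/gone entry
        stamp = datetime.strptime(stamp_text.strip(), "%a %b %d %H:%M:%S %Y")
        action, wwn = event.split()
        if action == "gone":
            gone_at[wwn] = stamp
        elif action == "add" and wwn in gone_at:
            seconds = int((stamp - gone_at.pop(wwn)).total_seconds())
            outages.append((wwn, seconds))
    return outages


if __name__ == "__main__":
    for wwn, seconds in cfs_outages(LOG):
        print(f"peer {wwn} unreachable over CFS for {seconds} s")
```

For the sample log the peer was gone from 19:05:25 to 19:08:03, i.e. 158 seconds.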
Swapping Primary Secondary roles
Sometimes it is preferred for operational reasons to have specific switch as
primary
VPCs are down for ~1 minute after primary changes to secondary
Approach
1. Change role priority
2. Bounce peer-link

vpc1(config)# vpc domain 2


vpc1(config-vpc-domain)# role priority 60
Warning:
!!:: vPCs will be flapped on current primary vPC switch while attempting role change ::!!
Note:
--------:: Change will take effect after user has re-initd the vPC peer-link ::--------
vpc1(config-vpc-domain)# int po1
vpc1(config-if)# shut
....
vpc1(config-if)# no shut
...
21:28:34 %VPC-5-ROLE_PRIORITY_CFGD: In domain 2, vPC role priority changed to 60
21:28:34 %VPC-5-SYSTEM_PRIO_CFGD: In domain 2, vPC system priority changed to 32667
21:28:36 %ETHPORT-5-IF_DOWN_NONE: Interface port-channel102 is down (None)
21:28:36 %VPC-4-VPC_ROLE_CHANGE: In domain 2, VPC role status has changed to secondary
21:35:40 %VPC-5-VPC_PEER_LINK_UP: vPC Peer-link is up

VPC operational considerations
from troubleshooting perspective
VPC troubleshooting is often part of the investigation of a larger scale event: connectivity issues following power outage, upgrade, migration, major changes etc
Datacenter connectivity being impacted usually implies lots of pressure (time and otherwise)
Always know the current situation before trying to recover
Trying to fix a non-issue risks making things worse. At minimum collect the state of the system before trying anything drastic
When traffic forwarding is concerned, basic information on interfaces, VPC states, STP states, MAC addresses and L3 routes/ARPs is essential. It takes a minute to collect, just paste this into the shell on both peers:
term len 0
sh int
sh vpc
sh port-channel summary
sh spanning-tree
sh mac address-table
sh routing vrf all
sh ip arp vrf all
sh tech detail is preferred (though it takes ~10 minutes to collect, depending on CPU load and number of linecards)
Note: if VDCs are used, best practice is to collect sh tech detail from both the main VDC and the VDC in question. sh tech brief is a faster alternative

VPC config considerations

VPC Domain # must be unique for each Layer2-adjacent VPC domain, otherwise issues with multicast forwarding and LACP negotiation of cross-VPC links may arise
Set logging level for vpc to 5: makes VPC operation easier to follow
Use LACP for the peer-link (channel-group <x> mode active): more resilient to separate link failures (fiber/sfp going bad) or switch control-plane failures
Use auto-recovery (if available, use reload-restore if not): useful for cases of multiple failures, more graceful recovery

Agenda

Initialization & Redundancy considerations


Spanning Tree
Traffic forwarding
1st hop redundancy
Multicast considerations

Spanning Tree in VPC domain

1. STP runs on both switches (2 active control planes) but only the primary switch drives STP of VPCs. Port state changes are communicated to the secondary via CFS messages. For non-VPC ports the domain appears as 2 bridges
2. Peer-link is part of STP. BPDU handling is modified such that the Peer-link will not be blocked (similar to MST implementation of IST)
Non-VPC ports are managed independently by the local STP process on each switch

[Diagram: Primary and Secondary peers, 1 STP process on each]

STP behavior upon VPC primary failure

1. Primary switch (STP root) fails
2. Secondary switch becomes operational primary and STP root
STP root port doesn't change, nor do any STP port states for VPCs; forwarding continues
Depending on control plane load it might take a few seconds for the Op-primary to start sending BPDUs. This might cause STP reconvergence on connected switches, hence increasing hello time or the peer-switch feature might be considered in large deployments

[Diagram: Primary (ROOT) fails; Secondary (backup ROOT) becomes OP-Primary and ROOT]

STP behavior upon VPC primary recovery

1. Left switch comes back up
2. Peer-Link comes back up
3. VPC role is resolved as Operational-secondary
4. Left switch has better STP priority, becomes STP root
5. STP root port of right switch will change and that will trigger SYNC: all non-edge STP ports will be temporarily blocked. Once sync is complete ports will resume forwarding

[Diagram: left switch (OP-Secondary, ROOT) and right switch (OP-Primary, backup ROOT), with SYNC between them]

VPC Peer-Switch feature
Both VPC switches originate BPDUs with preconfigured information. This allows keeping the same BPDU when the primary fails/recovers: no extra SYNC is required, so the short interruption in forwarding described on the previous slide is avoided
Both left and right switches consider themselves root
Both left and right switches send BPDUs all the time: no need to raise hello time, and STP Bridge Assurance can be enabled on VPCs

Configuration (identical on Primary and Secondary, both ROOT):
spanning-tree vlan 1-1000 priority 8192
vpc domain 1
  peer-switch

VPC Peer-Switch feature
In Peer-Switch mode bridge-ID comes from system-mac as opposed to local mac in normal mode

Primary (ROOT):
left# sh span vlan 101

VLAN0101
Spanning tree enabled protocol rstp
Root ID Priority 8293
Address 0023.04ee.be01
This bridge is the root
...
Bridge ID Priority 8293 (priority 8192)
Address 0023.04ee.be01
...
Interface Role Sts Cost Prio.Nbr Type
---------------- ---- --- --------- -------- ---------------
Po1 Desg FWD 1 128.4096 (vPC) P2p
Po100 Root FWD 2 128.4195 (vPC peer-link)

left# sh vpc role | i mac
vPC system-mac : 00:23:04:ee:be:01
vPC local system-mac : 00:1b:54:c2:42:43

Secondary (ROOT):
right# sh span vlan 101

VLAN0101
Spanning tree enabled protocol rstp
Root ID Priority 8293
Address 0023.04ee.be01
This bridge is the root
...
Bridge ID Priority 8293 (priority 8192)
Address 0023.04ee.be01
...
Interface Role Sts Cost Prio.Nbr Type
---------------- ---- --- --------- -------- ---------------
Po1 Desg FWD 1 128.4096 (vPC) P2p
Po100 Desg FWD 2 128.4195 (vPC peer-link)
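The priority 8293 in the output above is the configured priority 8192 plus the VLAN number 101 carried as the system ID extension; the same arithmetic appears later in the deck as "32772 (priority 32768 sys-id-ext 4)". The Python sketch below, with helper names invented for illustration, reproduces that arithmetic and converts the vPC system-mac to the dotted form used by sh span, so the two outputs can be compared directly.

```python
def bridge_priority(configured_priority, vlan_id):
    """Priority as displayed per VLAN: configured priority + sys-id-ext (VLAN)."""
    if configured_priority % 4096 or not 0 <= configured_priority <= 61440:
        raise ValueError("configured priority must be a multiple of 4096 up to 61440")
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN id must be in 1..4094")
    return configured_priority + vlan_id


def split_priority(displayed_priority):
    """Split a displayed priority into (configured priority, sys-id-ext)."""
    return displayed_priority & 0xF000, displayed_priority & 0x0FFF


def mac_to_dotted(mac):
    """Convert 00:23:04:ee:be:01 to the 0023.04ee.be01 form used by 'sh span'."""
    digits = mac.replace(":", "").lower()
    return ".".join(digits[i:i + 4] for i in (0, 4, 8))


if __name__ == "__main__":
    print(bridge_priority(8192, 101))          # value seen on both peers
    print(split_priority(32772))               # later slide: VLAN 4
    print(mac_to_dotted("00:23:04:ee:be:01"))  # vPC system-mac
```

If the address shown by sh span matches the converted vPC system-mac on both peers, peer-switch is in effect; if it matches the local system-mac instead, the switch is in normal mode.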

STP inconsistencies

When STP detects certain abnormal situations it will mark ports as inconsistent and block them to prevent forwarding loops
- Root: Root Guard feature detected inconsistency (unwanted bridge tries to become root)
- Loop: Loop Guard feature detected inconsistency (port becomes designated because no BPDUs are being received)
- Bridge Assurance (BA): no BPDUs are received from remote side
- VPC Peer-link: any of the above inconsistencies happened on VPC peer-link

%STP-2-VPC_PEER_LINK_INCONSIST_BLOCK: vPC peer-link detected BPDU receive timeout blocking port-channel11 VLAN0121.

Handling Peer-Link STP inconsistencies
on Primary switch

1. When peer-link STP inconsistency is detected on the primary switch, the link will be put in inconsistent STP state (effectively blocking state)
BPDUs are not sent on the peer-link when it is inconsistent. This is to allow the secondary switch to detect the inconsistency and react

[Diagram: Primary and Secondary peers, inconsistency (1) detected on the Primary side of the peer-link]

Handling Peer-Link STP inconsistencies
on Secondary switch

1. When peer-link STP inconsistency is detected on the secondary switch, the peer-link will be put in inconsistent STP state (effectively blocking state)
2. Respective vlans or MST instances are also blocked on all VPCs

This behavior depends on STP Bridge Assurance on the peer-link (default) as a way to signal the inconsistency to the secondary peer
With BA disabled on the Peer-link any inconsistency on the Primary will lead to a Peer-link flap

[Diagram: Primary and Secondary peers, inconsistency on the peer-link (1) and on the VPCs of the Secondary (2)]
STP troubleshooting: PES/SPS & BPDU redirection
Primary VPC peer controls the port states on the secondary peer by means of SPS (set-port-state) messages
Changes in STP information are synchronized between peers using PES (port-event-sync) messages
nexus# sh spanning-tree internal info vpc | exc 0$
...
======= CFSoe Statistics =========================
Total PES Msgs sent : 4
Total SPS Msgs sent : 4
Total MCS Msgs sent : 8
Total PES Response Msgs received : 4
Total SPS Response Msgs received : 4
Total Response Msgs received : 8
Constantly incrementing SPS/PES counters might indicate STP instability or constant reconvergence. Use sh spanning detail and debug spanning-tree events to find a reason for reconvergences

BPDUs are sent to VPCs out of the primary switch. If the VPC leg connected to the primary is down, BPDUs are sent over the peer-link and sent out by the secondary
nexus# sh system internal frame traffic | i BPDU
Ingress BPDUs qualified for redirection 42
Ingress BPDUs redirected to peer 42
Egress BPDUs qualified for redirection 0
Egress BPDUs dropped due to remote down 0
Egress BPDUs redirected to peer 0
STP troubleshooting
It is possible to see a situation when there are 2 root ports: the peer-link and some VPC. This happens when the STP root is behind the VPC and the BPDU is received by the peer; this does not indicate any issue
vpc1# sh spanning-tree vlan 4
VLAN0004
Spanning tree enabled protocol rstp
Root ID Priority 32772
Address 0018.ba88.4a00
Cost 2
Port 4096 (port-channel1)
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Bridge ID Priority 32772 (priority 32768 sys-id-ext 4)
Address 68bd.abd7.51c2
Hello Time 2 sec Max Age 20 sec Forward Delay 15 sec
Interface Role Sts Cost Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po1 Root FWD 1 128.4096 (vPC peer-link) Network P2p
Po102 Root FWD 1 128.4197 (vPC) P2p

Peer link is running STP
vpc1# sh spanning-tree vlan 4 detail | i "^ Port|BPDU"
Port 4096 (port-channel1, vPC Peer-link) of VLAN0004 is root forwarding
BPDU: sent 46416, received 46418
Port 4197 (port-channel102, vPC) of VLAN0004 is root forwarding
BPDU: sent 0, received 0

On the other end of the peer-link po1 is designated

STP troubleshooting
Looking at BPDUs live
vpc1# debug spanning-tree bpdu_tx tree 101
14:20:37.556707 stp: RSTP(101): transmitting RSTP BPDU on port-channel100
14:20:37.556750 stp: vb_vlan_shim_send_bpdu(1933): VDC 4 Vlan 101 port port-channel100 enc_type 1 len 42
14:20:37.556834 stp: RSTP(101): transmitting RSTP BPDU on port-channel1
14:20:37.556863 stp: vb_vlan_shim_send_bpdu(1933): VDC 4 Vlan 101 port port-channel1 enc_type 2 len 36
This output can be easily limited to the necessary Vlan/Interface, but it doesn't dump the BPDU
Very chatty: use debug logfile <file> to redirect output to a file

vpc1# debug spanning-tree all
14:22:23.560147 stp: RSTP(1): transmitting RSTP BPDU on port-channel100
14:22:23.560169 stp: vb_vlan_shim_send_bpdu(1933): VDC 4 Vlan 1 port port-channel100 enc_type 2 len 36
14:22:23.560219 stp: BPDU TX: vb 1 vlan 1 port port-channel100 len 36 ->0180c2000000
CFG P:0000 V:02 T:02 F:78 R:80:01:00:1b:54:c2:42:43 00000002
B:80:01:00:1b:54:c2:42:44 9063 A:0000 M:0014 H:0002 F:000f

Alternatively use ethanalyzer to capture and dump BPDUs. Beware: the BPDUs received by the other peer and redirected to the primary will not be seen in the expected way because of extra encapsulation
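The BPDU dump printed by debug spanning-tree all can be decoded by hand or with a short script. The Python sketch below is a hedged illustration: the meaning assigned to each letter (R = root ID followed by root path cost, B = bridge ID followed by port ID, A = message age, M = max age, H = hello time, F = forward delay) is an assumption based on the standard BPDU field order, and the timers are assumed to be printed in whole seconds because 0x14, 0x02 and 0x0f match the 20/2/15 second defaults seen in sh spanning-tree. The function names are invented.

```python
import re

# Dump copied from the 'debug spanning-tree all' sample on this slide.
DUMP = (
    "CFG P:0000 V:02 T:02 F:78 R:80:01:00:1b:54:c2:42:43 00000002 "
    "B:80:01:00:1b:54:c2:42:44 9063 A:0000 M:0014 H:0002 F:000f"
)

PATTERN = re.compile(
    r"R:(\S+) (\w+) B:(\S+) (\w+) A:(\w+) M:(\w+) H:(\w+) F:(\w+)"
)


def split_bridge_id(text):
    """Split 'pp:pp:mm:mm:mm:mm:mm:mm' into priority, sys-id-ext and MAC."""
    octets = text.split(":")
    value = int(octets[0] + octets[1], 16)
    return {
        "priority": value & 0xF000,
        "sys_id_ext": value & 0x0FFF,
        "mac": ":".join(octets[2:]),
    }


def decode_bpdu(dump):
    """Decode the ID and timer fields of the debug BPDU dump (assumed layout)."""
    match = PATTERN.search(dump)
    if match is None:
        raise ValueError("dump does not match the assumed layout")
    root, cost, bridge, port, age, max_age, hello, fwd = match.groups()
    return {
        "root": split_bridge_id(root),
        "root_path_cost": int(cost, 16),
        "bridge": split_bridge_id(bridge),
        "port_id": port,
        "message_age": int(age, 16),
        "max_age": int(max_age, 16),
        "hello_time": int(hello, 16),
        "forward_delay": int(fwd, 16),
    }


if __name__ == "__main__":
    for key, value in decode_bpdu(DUMP).items():
        print(f"{key}: {value}")
```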
Looking at past events
nexus# sh spanning-tree internal event-history tree 0 interface port-channel 50
VDC02 MST0000 <port-channel50>
0) Transition at 497772 usecs after Tue Oct 20 17:42:01 2009
State: FWD Role: Root Age: 5 Inc: no [STP_PORT_STATE_CHANGE]

1) Transition at 661395 usecs after Tue Oct 20 17:42:01 2009


State: FWD Role: Root Age: 4 Inc: no [STP_PORT_ROLE_CHANGE]

2) Transition at 17741 usecs after Tue Oct 20 17:42:03 2009


State: BLK Role: Root Age: 5 Inc: no [STP_PORT_STATE_CHANGE]
...
Layer2 stability features recap

UDLD
  Condition: Detects if a link becomes unidirectional, i.e. the link cannot carry BPDUs both ways, which causes loops
  Works on: Physical port
  Effect: Error-disables unidirectional links
  Note: Useful on port-channels to take out broken links; alternative is fast-timers PAGP/LACP

Bridge Assurance (BA)
  Condition: Expects to receive a BPDU every hello_time from the peer, i.e. covers cases of dead control plane on the remote side, also BPDU loss
  Works on: Logical port
  Effect: Blocks port at STP level (BA-inconsistent state)
  Note: Main protection mechanism where supported; alternative is Loop Guard

Dispute
  Condition: Checks the remote port role in the received BPDU; the role should not be designated in a BPDU received on a designated port. Covers cases of unidirectional communication
  Works on: Logical port
  Effect: Blocks port at STP level (Disputed state)
  Note: Complements BA, on by default. Somewhat overlaps with UDLD, but not as effective on port-channels. Only works with RSTP/MST BPDUs

Loop Guard
  Condition: Doesn't allow a port to take the designated role if it stopped receiving BPDUs. Covers unidirectional communication and control plane issues on the remote side
  Works on: Logical port
  Effect: Blocks port at STP level (Loop-inconsistent)
  Note: Superseded by BA + Dispute; use with PVST+ or when BA is not supported
Bridge assurance, Dispute & UDLD

BA is enabled by default on Peer-Link; it is not recommended for VPCs unless the Peer-Switch feature is also operational
Dispute is enabled by default (for both RSTP and MST on VPC)
UDLD [normal mode] is recommended to take bad links out of channels (otherwise LACP takes ~100 sec vs ~20 sec with UDLD)
Recommendation
Preferred: BA + UDLD + Dispute (on all interswitch links when using Peer-Switch) when all switches support this (Nexus 7000/5000 and Cat6500/VSS do)
Without Peer-Switch, keep BA only on Peer-Link (no BA or Loop Guard on VPCs) and use UDLD + Dispute
If the preferred config is not supported, use Loop Guard + UDLD (supported by all Cisco switches)
Supported features can potentially be mixed and matched per switch, but understand which cases each feature covers in which combination
Special case for forwarding

[Diagram: PC A, PC B, the two peer switches and the lower access switch, with steps 1-5 marked]
1 PC A sends a packet to PC B
2 MAC B is not known by left switch: flood
3 MAC B is not known by right switch: flood
4 B receives duplicate frames
5 MAC A will be learned on the wrong port on the lower access switch, blackholing traffic to A

Frames received on Peer-Link must not be flooded out of VPCs

Special case for forwarding: VPC way

[Diagram: PC A, PC B and the two peer switches, with steps 1-3 marked]
1 MAC B is not known by left switch: flood
2 Frames received from Peer-Link are never sent out of VPC (except those without operational ports on ingress switch). Egress port ASICs will drop the frame
3 Frame is still flooded to devices that are solely connected to egress switch

This rule (called VPC check) stands for all traffic (L2, L3, unicast, multicast, broadcast, flooded etc) on Nexus 7000 (Nexus 3000/5000 VPC have a similar rule, but a different implementation)
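The VPC check described above can be pictured with a small model. This is only a sketch of the rule as stated on the slide, not of the ASIC logic; the `Port` type, its fields and `may_egress` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical egress port description (illustration only)."""
    name: str
    is_vpc: bool = False                # port-channel is a VPC member port
    peer_has_active_links: bool = True  # same VPC has operational ports on the ingress (peer) switch


def may_egress(ingress_is_peer_link: bool, egress: Port) -> bool:
    """Return True if a frame may leave through `egress` under the VPC check."""
    if not ingress_is_peer_link:
        return True   # frames received on local ports are not subject to the check
    if not egress.is_vpc:
        return True   # devices solely connected to this switch still get the frame
    # Peer-Link -> VPC: allowed only when the VPC has no operational ports
    # on the ingress switch, otherwise the frame is dropped.
    return not egress.peer_has_active_links


print(may_egress(True, Port("Po50", is_vpc=True)))                               # False
print(may_egress(True, Port("Po51", is_vpc=True, peer_has_active_links=False)))  # True
print(may_egress(True, Port("Eth1/1")))                                          # True
```

The three calls correspond to a healthy VPC (dropped), a VPC whose links on the ingress switch are down (forwarded) and a single-attached device (forwarded).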
Summary: VPC traffic forwarding with Nexus 7000

[Diagram only; its content was not recoverable from the extracted text]
Topologies where VPC forwarding rules will have implications

[Diagram labels, left case: routed, routed, OSPF]
Packets arriving to the left switch with the destination MAC of the right switch will be dropped
With peer-gateway enabled, adjacencies may not come up
This issue is not specific to OSPF; it is the same for any routing protocol
Use routed links to connect routers

[Diagram labels, right case: left peer SVI 1 up, SVI 2 down; right peer SVI 1 up, SVI 2 up]
Configuration and operational state of SVI interfaces for vlans present on VPCs should be consistent
Otherwise a packet arriving to the left switch for a destination on a VPC in vlan 2 will have to cross Peer-Link and will be dropped by the right switch
Add a routed cross-link between peers

Frames received from Peer-Link are never sent out of VPC (except those without operational ports on ingress switch)
Verifying whether frame will be sent to peer-link

Verify where the destination MAC address of the frame points to

Nexus# show mac address-table vlan 35


Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+---------+------+------+----------------
+ 35 0007.b400.0101 dynamic 0 False False Po1
G 35 0007.b400.0102 static - False False sup-eth1(R)
G 35 001b.54c2.4241 static - False False sup-eth1(R)
* 35 001b.54c2.4244 static - False False vPC Peer-Link
+ 35 0012.da65.9ec0 dynamic 0 False False Po1

If a frame arrives at this switch in vlan 35 destined to 001b.54c2.4244, it will be sent to the peer-link
If this MAC address belongs to one of the L3 SVI interfaces of the peer switch, the IP destination of the frame is behind a VPC, and this VPC has active links on this (local) switch, then the frame will be dropped by the peer switch
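When many MAC addresses need checking, the lookup above can be scripted. The following is a minimal sketch that parses the rows of the sample `show mac address-table` output shown on this slide; the regular expression assumes exactly that column layout, and `parse_mac_table` and `egress_for` are hypothetical helper names.

```python
import re

# Rows copied from the sample output above.
SAMPLE = """\
+ 35 0007.b400.0101 dynamic 0 False False Po1
G 35 0007.b400.0102 static - False False sup-eth1(R)
G 35 001b.54c2.4241 static - False False sup-eth1(R)
* 35 001b.54c2.4244 static - False False vPC Peer-Link
+ 35 0012.da65.9ec0 dynamic 0 False False Po1
"""

# flag, vlan, mac, type, age, secure, ntfy, ports (assumed layout)
ROW = re.compile(
    r"^(?P<flag>[*+G])\s+(?P<vlan>\d+)\s+"
    r"(?P<mac>[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+"
    r"(?P<type>\S+)\s+(?P<age>\S+)\s+(?P<secure>\S+)\s+(?P<ntfy>\S+)\s+"
    r"(?P<ports>.+?)\s*$"
)


def parse_mac_table(text):
    """Return one dict per table row; header and legend lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def egress_for(rows, vlan, mac):
    """Return the Ports column for (vlan, mac), or None if the MAC is unknown."""
    for row in rows:
        if row["vlan"] == str(vlan) and row["mac"] == mac.lower():
            return row["ports"]
    return None  # unknown destination: the frame would be flooded


rows = parse_mac_table(SAMPLE)
print(egress_for(rows, 35, "001b.54c2.4244"))  # vPC Peer-Link
```

A result of `vPC Peer-Link` flags the destinations worth a closer look under the VPC check.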

MAC address learning

[Diagram: PC A, PC B and the two peer switches, with steps 1-3 marked]
1 MAC A is learned on lower VPC
2 MAC A is learned on Peer-Link
3 Frame destined to A arriving to right switch will be sent to Peer-Link

Traffic should prefer local links when available (traffic locality rule)

MAC address learning: VPC way

[Diagram: PC A, PC B and the two peer switches exchanging a CFS message, with steps 1-3 marked]
1 MAC A is learned on lower VPC
MAC addresses are never learned from traffic on Peer-Link
2 Left switch sends a CFS message to right switch telling about MAC A learned on lower VPC. Right switch updates MAC address table
3 Frame destined to A arriving to right switch will be sent to lower VPC
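The learning behaviour above can be sketched as a toy model. It only covers what the slide states (learn on a VPC, synchronise to the peer, never learn from Peer-Link traffic); the `Peer` class, its methods and the port name `vpc-lower` are hypothetical, and the CFS message is reduced to a direct method call.

```python
class Peer:
    """Toy model of one VPC peer's MAC table (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.mac_table = {}   # MAC -> port
        self.other = None     # the other VPC peer

    def frame_in(self, src_mac, ingress_port):
        """Data-plane learning for a frame received on `ingress_port`."""
        if ingress_port == "peer-link":
            return  # MAC addresses are never learned from traffic on Peer-Link
        self.mac_table[src_mac] = ingress_port
        if ingress_port.startswith("vpc") and self.other is not None:
            # Stand-in for the CFS message telling the peer about the new MAC
            self.other.mac_sync(src_mac, ingress_port)

    def mac_sync(self, mac, vpc_port):
        """Install a MAC reported by the peer on the local leg of the same VPC."""
        self.mac_table[mac] = vpc_port


left, right = Peer("left"), Peer("right")
left.other, right.other = right, left

left.frame_in("A", "vpc-lower")   # step 1: MAC A learned on lower VPC, step 2: sync
right.frame_in("A", "peer-link")  # flooded copy over Peer-Link: ignored
print(right.mac_table)            # {'A': 'vpc-lower'}
```

With this table the right switch sends frames for A out of its own leg of the lower VPC, which is step 3 on the slide.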

Troubleshooting VPC: Layer 2

[Diagram labels: Po50, Vlan 50, 91.0.0.10, 0013.1908.e246; Po22, Vlan 20, 20.1.2.3]

MAC addresses should point to expected ports in expected vlans (path towards source)
The ports should be in STP forwarding mode
Hardware MAC address table should be consistent with software table

nexus# sh mac address-table address 0013.1908.e246 vlan 50
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
* 50 0013.1908.e246 dynamic 0 F F Po50
nexus# sh spanning-tree vlan 50 interface port-channel 50
Mst Instance Role Sts Cost Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
MST0002 Desg FWD 200 128.4145 (vPC) P2p
nexus# sh hardware mac address-table 2 address 0013.1908.e246 vlan 50
Valid| PI | BD | MAC | Index | Stat| SW | Modi| Age | Tmr |
| | | | | ic | | fied| Byte| Sel |
-----+----+-------+---------------+--------+-----+----+-----+-----+-----+
1 1 161 0013.1908.e246 0x00a36 0 3 0 141 1
Finding port# for given index:
nexus# sh system internal pixm info ltl 0x00a36 | i Eth.*,
0x0a36 Eth2/36,

nexus# sh mac address-table address 0021.55e0.66c2 vlan 20
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
* 20 0021.55e0.66c2 dynamic 660 F F Po22
nexus# sh spanning-tree vlan 20 interface port-channel 22
Mst Instance Role Sts Cost Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
MST0000 Desg FWD 200 128.4117 (vPC) Network P2p
The number after "mac address-table" is the linecard slot number:
nexus# sh hardware mac address-table 1 address 0021.55e0.66c2 vlan 20
Valid| PI | BD | MAC | Index | Stat| SW | Modi| Age | Tmr |
| | | | | ic | | fied| Byte| Sel |
-----+----+-------+---------------+--------+-----+----+-----+-----+-----+
1 1 18 0021.55e0.66c2 0x00a32 0 2 0 103 1
nexus# sh system internal pixm info ltl 0x00a32 | i Eth.*,
0x0a32 Eth1/13, Eth1/14,
Troubleshooting VPC: Layer 3

[Diagram labels: Po50, Vlan 50, 91.0.0.10, 0013.1908.e246; Po22, Vlan 20, 20.1.2.3]

Is there a route to the destination?
nexus# sh routing ip 20.1.2.3
...
20.1.2.3/32, ubest/mbest: 1/0
*via 20.1.1.240, Vlan20, [1/0], 03:48:59, static
Is the next hop resolved?
nexus# sh ip arp 20.1.1.240
Address Age MAC Address Interface
20.1.1.240 00:02:17 0021.55e0.66c2 Vlan20
Looking at module 2 because this is where packets in question should be received
nexus# sh forwarding ip route 20.1.2.3 module 2
...
------------------+------------------+---------------------
Prefix | Next-hop | Interface
------------------+------------------+---------------------
20.1.2.3/32 20.1.1.240 Vlan20
Is adjacency consistent with ARP?
nexus# sh forwarding adjacency 20.1.1.240 module 2
IPv4 adjacency information
next-hop rewrite info interface
-------------- --------------- -------------
20.1.1.240 0021.55e0.66c2 Vlan20
Router MAC must have Gateway flag in order for packet to be L3 switched
nexus# sh int vl 20 | i address
Hardware is EtherSVI, address is 0023.ac66.1a42
nexus# sh mac address-table address 0023.ac66.1a42 vlan 20
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
G 20 0023.ac66.1a42 static - F F sup-eth1(R)
Where given packet will be load-balanced

For equal-cost routes
nexus# sh routing hash 91.0.0.10 20.1.2.3
Load-share parameters used for software forwarding:
load-share mode: address source-destination port source-destination
Universal-id seed: 0xcdb5769f
Hash for VRF "default"
Hashing to path *20.1.1.3 (hash: 0x2a), for route:
20.1.2.3/32, ubest/mbest: 2/0
*via 20.1.1.3, Vlan20, [1/0], 00:01:37, static
*via 20.1.1.240, Vlan20, [1/0], 16:32:42, static
Load-balancing is configurable under ip load-sharing address in default VDC and affects all VDCs

For port-channels
nexus# sh port-channel load-balance forwarding-path interface port-channel 22 dst-ip 20.1.2.3 src-ip 91.0.0.10 vlan 20 module 2
Missing params will be substituted by 0's.
Module 2: Load-balance Algorithm: source-dest-ip-vlan
RBH: 0 Outgoing port id: Ethernet1/14
Load-balancing is configurable under port-channel load-balance in default VDC and affects all VDCs
Use sh port-channel rbh-distribution to see which link sends traffic for which of 8 available load-balancing buckets
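To make the bucket idea concrete, here is an illustrative sketch of mapping a flow to one of 8 buckets and a bucket to a member link. The real hardware hash is not described in this deck: CRC32 is only a stand-in, the round-robin bucket-to-member assignment is an assumption, and `rbh_bucket` and `member_for` are hypothetical names. Use the CLI commands above for the actual result on a switch.

```python
import zlib


def rbh_bucket(src_ip: str, dst_ip: str, vlan: int, buckets: int = 8) -> int:
    """Map a flow to a bucket. CRC32 is a stand-in, NOT the hardware hash."""
    key = f"{src_ip}|{dst_ip}|{vlan}".encode()
    return zlib.crc32(key) % buckets


def member_for(bucket: int, members: list) -> str:
    """Assumed round-robin assignment of buckets to port-channel members."""
    return members[bucket % len(members)]


members = ["Ethernet1/13", "Ethernet1/14"]
bucket = rbh_bucket("91.0.0.10", "20.1.2.3", 20)
print(bucket, member_for(bucket, members))
```

The point of the sketch is only that the same flow fields always land in the same bucket, so a single flow never spreads across links.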

Datapath Drops

#1 command to look for hardware packet drops
Not every drop listed here is actual data packet drop
Run several times to see if any counters increase at rate similar to traffic loss
To clear counters, use clear statistics module-all device all

nexus# sh hardware internal errors all
----------------------------------------
Hardware errors as reported in module 1
----------------------------------------
|------------------------------------------------------------------------|
| Device:R2D2 Role:MAC |
|------------------------------------------------------------------------|
Instance:7
ID Name Value Ports
-- ---- ----- -----
28688 aric_no_port_select_error 0000000000000002 1,3,5,7 I2
...
|------------------------------------------------------------------------|
| Device:Ashburton Role:MAC Mod: 1 |
|------------------------------------------------------------------------|
Instance:0
3629 Egress Port-1 VSL Dropped Packet Count 0000000853635833 5 -
3630 Egress Port-2 VSL Dropped Packet Count 0000000857893046 3 -
...
|------------------------------------------------------------------------|
| Device:Naxos Role:MAC SECURITY |
|------------------------------------------------------------------------|
Instance:0
ID Name Value Ports
-- ---- ----- -----
106 m1_fab_p25_txq_tc0_drop_count 00000000000012af 2 -
...
|------------------------------------------------------------------------|
| Device:Metropolis Role:REWR |
|------------------------------------------------------------------------|
Instance:1
ID Name Value Ports
-- ---- ----- -----
70 Krypton input controller zero portsel cnt 0000000000000038 18,20,22,24,26,28,30,32
|------------------------------------------------------------------------|
| Device:Lamira Role:L3 |
|------------------------------------------------------------------------|
Instance:0
ID Name Value Ports
-- ---- ----- -----
93 CL2 Invalid Pkt count 00000008759cb9cb 1-32 I1
...
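Comparing snapshots by eye is error-prone, so the "run several times" advice can be scripted. The sketch below assumes the counter row layout seen in the sample output (ID, name, 16-character value, ports) and reports only which counters grew, not by how much, because the sample mixes values that look decimal with values that look hexadecimal. `parse_counters` and `increasing` are hypothetical helper names.

```python
import re

# ID, counter name, fixed-width 16-character value (assumed layout)
COUNTER = re.compile(r"^\s*(?P<id>\d+)\s+(?P<name>.+?)\s+(?P<value>[0-9a-fA-F]{16})\b")


def parse_counters(text):
    """Return {(id, name): value} for every counter row in the output."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER.match(line)
        if match:
            counters[(match["id"], match["name"])] = match["value"].lower()
    return counters


def increasing(before_text, after_text):
    """Names of counters whose value grew between two snapshots.

    Equal-length, zero-padded digit strings sort the same way as the numbers
    they encode, so a plain string comparison is enough to detect growth.
    Counters are keyed by (id, name) only; identical rows under different
    devices or instances would collide in this simple version.
    """
    before = parse_counters(before_text)
    after = parse_counters(after_text)
    return [name for (cid, name), value in after.items()
            if (cid, name) in before and value > before[(cid, name)]]


snap1 = """28688 aric_no_port_select_error 0000000000000002 1,3,5,7 I2
106 m1_fab_p25_txq_tc0_drop_count 00000000000012af 2 -"""
snap2 = """28688 aric_no_port_select_error 0000000000000002 1,3,5,7 I2
106 m1_fab_p25_txq_tc0_drop_count 00000000000012b3 2 -"""
print(increasing(snap1, snap2))  # ['m1_fab_p25_txq_tc0_drop_count']
```

The second snapshot here is made up for the example; on a real switch, feed in two captures of the command taken some time apart.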
1st hop redundancy with VPC

[Diagram: PC A and PC B attached to the two VPC peers running HSRP. Left peer: Router MAC1 0001.0002.0003; right peer: Router MAC2 0005.0006.0007; Virtual MAC 0000.0c07.ac00 on both. Frames from PC B: MAC_B -> vMAC, IP B -> IP A. Frames from PC A: MAC_A -> vMAC, IP A -> IP B]

Each of the VPC peers will L3 forward packets destined to its respective Router MAC address
HSRP/VRRP/GLBP is used for 1st hop redundancy
Both switches will L3 switch packets to the vMAC address as long as one of them is HSRP active or HSRP standby
If both switches are HSRP listening, they will not L3 switch packets to vMAC

First hop redundancy troubleshooting

HSRP configuration, standby peer:
interface Vlan1
ip address 1.1.1.252/24
hsrp 1
ip 1.1.1.254
HSRP configuration, active peer:
interface Vlan1
ip address 1.1.1.253/24
hsrp 1
ip 1.1.1.254

Both peers will L3 forward packets destined to the vMAC address as long as either peer in the VPC domain is in active or standby state for the corresponding group
Virtual MAC address (vMAC) will be installed in both peers
G (gateway) flag must be present on any MAC address for which the Nexus is expected to L3 forward packets
Only the active peer will respond to ARP for the VIP

Nexus# sh hsrp brief
Interface Grp Prio P State Active addr Standby addr Group addr
Vlan1 1 100 Standby 1.1.1.253 local 1.1.1.254
Nexus# sh mac address-table address 0000.0c07.ac01
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+-----+------+------+-----------
G 1 0000.0c07.ac01 static - False False sup-eth1(R)

Nexus2# sh hsrp brief
Interface Grp Prio P State Active addr Standby addr Group addr
Vlan1 1 100 Active local 1.1.1.252 1.1.1.254
Nexus2# sh mac address-table address 0000.0c07.ac01
VLAN MAC Address Type age Secure NTFY Ports
---------+-----------------+--------+-----+------+------+-----------
G 1 0000.0c07.ac01 static - False False sup-eth1(R)
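The MAC address to look up follows from the HSRP group number. A minimal sketch, assuming HSRP version 1 (where the virtual MAC is 0000.0c07.acXX with XX being the group number in hex, which matches group 1 and 0000.0c07.ac01 in the output above); `hsrp_v1_vmac` is a hypothetical helper name.

```python
def hsrp_v1_vmac(group: int) -> str:
    """Return the HSRP version 1 virtual MAC for a group (0-255), Cisco dotted format."""
    if not 0 <= group <= 255:
        raise ValueError("HSRP version 1 group must be in the range 0-255")
    return f"0000.0c07.ac{group:02x}"


print(hsrp_v1_vmac(1))  # 0000.0c07.ac01
```

HSRP version 2 and VRRP use different MAC ranges, so this helper does not apply to them.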

1st hop issue with some devices

[Diagram: PC A and Server B attached to the two VPC peers, steps 1-5 marked. Left peer: Router MAC1 0001.0002.0003; right peer: Router MAC2 0005.0006.0007; Virtual MAC 0000.0c07.ac00 on both. Frame labels: MAC_A -> vMAC, IP A -> IP B; Router MAC1 -> MAC_B, IP A -> IP B; MAC_B -> Router MAC1, IP B -> IP A]

1 PC A sends a packet to Server B
2 Left VPC switch will receive the packet and forward it to Server B; note the source MAC of the outgoing packet will be that of Router1
3 Server B responding to PC A will populate the destination MAC from the source MAC of the received frame (this is wrong, it should use ARP)
4 If the frame from B to A is load-balanced to the right switch, the MAC address of Router1 will point to Peer-Link and this is where the frame will be sent
5 Left switch will receive the frame from Peer-Link and drop it

Why? Frames received from Peer-Link are never sent out of VPC, except those without operational ports on the ingress switch: egress port ASICs will drop the frame (VPC check)
Peer-Gateway : the workaround

[Diagram: PC A and Server B attached to the two VPC peers, steps 1-2 marked. Each peer lists its own Router MAC, the other peer's Router MAC (0001.0002.0003 / 0005.0006.0007) and the Virtual MAC 0000.0c07.ac00. Frame label: MAC_B -> Router MAC1, IP B -> IP A]

With peer-gateway both peers will install router MACs of each other in L2 table, which will allow them to L3 forward traffic destined to either Router MAC
1 Server B responding to PC A will populate the destination MAC from the source MAC of the received frame (this is wrong, it should use ARP)
2 Right switch will forward packet towards destination

Peer-Gateway : the implications

[Diagram: top device attached to the two VPC peers, steps 1-2 marked. Each peer lists both Router MACs (0001.0002.0003, 0005.0006.0007) and the Virtual MAC 0000.0c07.ac00. Frame label: MAC_B -> Router MAC1, IP TOP -> IP LEFT, TTL 1]

1 Top device attempts to establish OSPF adjacency with the left switch
2 If peer-gateway is enabled in VPC domain and the OSPF unicast packet is load-balanced to the right switch, this packet will be dropped

Why? Right switch will try to L3-switch the unicast packet (because RouterMAC1 is marked as gateway MAC and destination IP is not local)
As packet has TTL==1 it will be dropped
Same applies to any other protocol that uses unicast packets with TTL==1 entering right switch but destined to left switch (or vice versa)

Routing protocol peering with devices attached to VPC domain via SVI interface is not supported
Routed interface should be used in this case

There is peer-gateway exclude-vlan command to turn off peer-gateway on certain vlans
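The drop described above follows from a short chain of decisions, sketched below. This models only the reasoning on the slide, not the forwarding pipeline; `handle`, its parameters and the returned labels are hypothetical.

```python
def handle(dst_mac, dst_ip, ttl, local_router_mac, peer_router_mac,
           local_ips, peer_gateway):
    """Decide what a VPC peer does with a received unicast IP packet."""
    gateway_macs = {local_router_mac}
    if peer_gateway:
        gateway_macs.add(peer_router_mac)  # peer's Router MAC is also a gateway MAC
    if dst_mac not in gateway_macs:
        return "bridge"  # plain L2 forwarding towards the MAC (e.g. over Peer-Link)
    if dst_ip in local_ips:
        return "punt"    # addressed to this switch itself
    if ttl <= 1:
        return "drop"    # would have to be routed, but TTL does not allow it
    return "route"


MAC1, MAC2 = "0001.0002.0003", "0005.0006.0007"

# OSPF unicast from the top device to the left switch, hashed to the right switch:
print(handle(MAC1, "IP_LEFT", 1, MAC2, MAC1, {"IP_RIGHT"}, peer_gateway=True))   # drop
print(handle(MAC1, "IP_LEFT", 1, MAC2, MAC1, {"IP_RIGHT"}, peer_gateway=False))  # bridge
```

Without peer-gateway the right switch simply bridges the packet over Peer-Link to the left switch, which is why the adjacency problem appears only once peer-gateway is enabled.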


IP Multicast with VPC

[Diagram: RP, Source S1 and a Receiver sending an IGMP join. Primary peer (DR): (*,G) VPC, (S1,G) VPC. Secondary peer (Proxy-DR): (*,G) VPC, (S1,G) null. CFS:IGMP message between the peers]

Receiver sends IGMP report (join)
Access switch sends join to right VPC peer
Right VPC peer creates (*,G) and adds VPC to OIF (as proxy-DR)
IGMP is encapsulated in CFS and sent to left peer
Left peer (DR) creates (*,G), adding VPC to OIF
DR (left peer) sends PIM Join to RP
Once (S1,G) traffic starts arriving, VPC peers will resolve which one will be forwarder for that (S,G): the peer with the best metric to the source, or the primary in a tie (this mechanism is specific to PIM in VPC mode; normally PIM would use assert)
Only the forwarder will have OIFs populated in (S,G); the non-forwarder won't have VPC SVIs in its OIF list
Forwarder will send a copy of the frame to the peer-link for receivers single-connected to the other peer

Goal is to allow the peer receiving source traffic to forward it to receivers behind VPC without crossing peer-link (VPC check will drop such traffic otherwise)
IP Multicast with VPC: Prebuilt-SPT

[Diagram: RP, Source S1 and a Receiver. Primary peer (DR): (*,G) VPC, (S1,G) VPC. Secondary peer (new DR): (*,G) VPC, (S1,G) null changing to VPC]

In case of DR failure, proxy-DR becomes DR and posts the OIF-list from (*,G) to (S,G), but it will also need to pull traffic from RP/source, which delays recovery
With ip pim pre-build-spt, proxy-DR will also send a PIM Join to source/RP to draw the traffic
Traffic pulled by proxy-DR will be dropped until it becomes DR; provision uplink and replication bandwidth accordingly

IP Multicast with VPC: source behind VPC

[Diagram: RP; Source S1 behind VPC1, Receiver behind VPC2. Primary peer (DR): (*,G) VPC2, (S1,G) VPC2. Secondary peer (Proxy-DR): (*,G) VPC2, (S1,G) VPC2]

When Source is behind VPC, both DR and Proxy-DR will add OIFs for the group to (S,G)
This is because either peer can receive source traffic and needs to be able to send it to receivers behind VPCs without crossing peer-link (to avoid the traffic being dropped by VPC check)

When VPC is configured on N7K-F248XP-25 linecard (F2) there is no proxy-DR function (due to hardware specifics). Packet will be bridged to DR over peer-link (VPC check is modified accordingly for L3 multicast packets on F2 linecards)
Forwarder election in VPC

Peers exchange metrics over CFS for each new source
The peer with the better metric to the source becomes forwarder; in a tie, the primary does

VPC1# sh ip pim internal vpc rpf
Source: 10.0.1.1
Pref/Metric: 110/21
Source role: primary
Forwarding state: Win (forwarding)

For sources behind VPC both peers will forward, as they have no control over which one will get the traffic

VPC1# sh ip pim internal vpc rpf
Source: 1.1.1.1
Pref/Metric: 0/0
Source role: primary
Forwarding state: Win-force (forwarding)
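The election rule can be written down in a few lines. This is a sketch under two assumptions that the slide does not spell out: that Pref/Metric pairs are compared preference first and metric second, and that lower is better, as with PIM assert. `elect_forwarder` is a hypothetical name.

```python
def elect_forwarder(primary, secondary, source_behind_vpc=False):
    """Return the set of peers that forward for a source.

    `primary` and `secondary` are (preference, metric) tuples as shown in the
    Pref/Metric field; lower is assumed to be better.
    """
    if source_behind_vpc:
        return {"primary", "secondary"}  # both forward ("Win-force")
    if secondary < primary:              # strictly better route on the secondary
        return {"secondary"}
    return {"primary"}                   # better route, or a tie, on the primary


print(elect_forwarder((110, 21), (110, 30)))  # {'primary'}
print(elect_forwarder((110, 21), (110, 21)))  # {'primary'} (tie goes to primary)
print(elect_forwarder((110, 21), (110, 10)))  # {'secondary'}
```

The secondary's metrics in these calls are invented for the example; only 110/21 comes from the output above.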

VPC multicast: following packet flow

Control plane state for this group: where the information came from, whether it is stable, and the RPF interface
Nexus# show ip mroute 239.1.2.3
(*, 239.1.2.3/32), uptime: 06:46:05, igmp pim ip static
Incoming interface: Vlan36, RPF nbr: 36.0.0.3
Outgoing interface list: (count: 2)
Ethernet2/43, uptime: 03:01:36, static
Vlan37, uptime: 06:46:05, igmp
(33.0.0.33/32, 239.1.2.3/32), uptime: 06:46:05, ip pim mrib
Incoming interface: Vlan36, RPF nbr: 36.0.0.3
Outgoing interface list: (count: 2)
Ethernet2/43, uptime: 03:01:36, mrib
Vlan37, uptime: 06:46:04, mrib

Where are receivers on this vlan?
Nexus# show ip igmp snooping groups vlan 37
Type: S - Static, D - Dynamic, R - Router port
Vlan Group Address Ver Type Port list
37 */* - R Vlan37
37 239.1.2.3 v2 D Eth2/8

Is traffic being switched for this group? Are packets being switched by this entry?
Counters are updated about once per minute; sw-pkts shows packets forwarded in software, aps is the average packet size
Nexus# show ip mroute 239.1.2.3 summary software-forwarded
Total number of routes: 3
Total number of (*,G) routes: 1
Total number of (S,G) routes: 1
Total number of (*,G-prefix) routes: 1
Group count: 1, rough average sources per group: 1.0
Group: 239.1.2.3/32, Source count: 1
Source packets bytes aps pps bit-rate oifs
(*,G) 0 0 0 0 0.000 bps 2
sw-pkts: 0
33.0.0.33 5046908 252345396 49 200 80.053 kbps 2
sw-pkts: 1
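As a quick sanity check on the summary row for source 33.0.0.33, the average packet size can be recomputed from the packet and byte counters. This assumes aps is simply bytes divided by packets with the fraction truncated, which is consistent with the sample but is not stated in the output itself.

```python
packets = 5046908        # "packets" column for 33.0.0.33
total_bytes = 252345396  # "bytes" column for 33.0.0.33

aps = total_bytes // packets  # integer division, fraction dropped (assumption)
print(aps)  # 49
```

The true average is just under 50 bytes, so a truncated value of 49 matches the aps column.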
Following the flow: forwarding information

This is platform independent forwarding information
Counters are updated once per ~1 minute
Counters between ingress/egress do not have to match, as information is not collected at the exact same time, receiver might join after the entry was created etc

Nexus# show forwarding multicast route group 239.1.2.3
Ingress linecard entry:
slot 1
=======
(*, 239.1.2.3/32), RPF Interface: Vlan36, flags: G
Received Packets: 0 Bytes: 0
Number of Outgoing Interfaces: 2
Outgoing Interface List Index: 4
Vlan37 Outgoing Packets:0 Bytes:0
Ethernet2/43 Outgoing Packets:N/A Bytes:N/A
(33.0.0.33/32, 239.1.2.3/32), RPF Interface: Vlan36, flags:
Received Packets: 5723369 Bytes: 366295616
Number of Outgoing Interfaces: 2
Outgoing Interface List Index: 4
Vlan37 Outgoing Packets:0 Bytes:0
Ethernet2/43 Outgoing Packets:N/A Bytes:N/A
Egress linecard entry:
slot 2
=======
(*, 239.1.2.3/32), RPF Interface: Vlan36, flags: G
Received Packets: 0 Bytes: 0
Number of Outgoing Interfaces: 2
Outgoing Interface List Index: 4
Vlan37 Outgoing Packets:5725816 Bytes:366452224
Ethernet2/43 Outgoing Packets:3032294 Bytes:194066816
(33.0.0.33/32, 239.1.2.3/32), RPF Interface: Vlan36, flags:
Received Packets: 0 Bytes: 0
Number of Outgoing Interfaces: 2
Outgoing Interface List Index: 4
Vlan37 Outgoing Packets:5725816 Bytes:366452224
Ethernet2/43 Outgoing Packets:3032294 Bytes:194066816

When traffic arrives via VPC

How to find which slot receives the S,G flow when ingress interface is
port-channel scattered across several modules?
show forwarding multicast route group <g> source <s>

Nexus# show forwarding multicast route group 239.1.1.1 source 1.0.1.2 | i Received|slot
slot 1
Received Packets: 0 Bytes: 0
slot 2
Received Packets: 727203 Bytes: 487290999
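On a chassis with many modules the same filter can be automated. A minimal sketch that parses the filtered output above and returns the slots with a non-zero receive counter; it assumes the `slot N` / `Received Packets:` line format shown in the sample, and `ingress_slots` is a hypothetical helper name.

```python
import re

# Filtered output copied from the sample above.
SAMPLE = """\
slot 1
Received Packets: 0 Bytes: 0
slot 2
Received Packets: 727203 Bytes: 487290999
"""


def ingress_slots(text):
    """Return the slot numbers whose entries show received packets."""
    received = {}  # slot -> total received packets
    slot = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"slot\s+(\d+)$", line)
        if match:
            slot = int(match.group(1))
            continue
        match = re.match(r"Received Packets:\s*(\d+)\s+Bytes:\s*(\d+)", line)
        if match and slot is not None:
            received[slot] = received.get(slot, 0) + int(match.group(1))
    return [s for s, pkts in received.items() if pkts > 0]


print(ingress_slots(SAMPLE))  # [2]
```

Slot 2 is the module to inspect first for this (S,G) flow.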

Are there drops in forwarding path?

Start looking from Ingress module

Nexus# show hardware internal errors module 1


----------------------------------------
Hardware errors as reported in module 1
----------------------------------------
...
|------------------------------------------------------------------------|
| Device:Lamira Role:L3 Mod: 1 |
| Last cleared @ Thu Apr 8 12:57:37 2010
| Device Statistics Category :: ERROR
|------------------------------------------------------------------------|
Instance:0
ID Name Value Ports
-- ---- ----- -----
259 L3 Fib Miss Pkt ctr 0000000000000007 1-32 I1
262 L3 Non-Rpf Drop Pkt ctr 0000000000125617 1-32 I1
319 NF2 V4 IPMAC Lkup Error 0000000000272277 1-32 I1
455 Exception cause: DROP (Unicast) 0000000000025510 1-32 I1
465 Exception cause: DROP (Multicast) 0000000000226148 1-32 I1

Always take several snapshots and look for drops that grow in step with the [suspected] multicast traffic loss
There are always some drops shown by the above command; this doesn't always mean that actual network packets are dropped. Some of these are diag packets, some are packets dropped on blocked ports, extra floods etc
Review & Summary

Infrastructure
Redundancy at process, supervisor, port-channel, chassis, VPC level
Both peers are needed to bring up VPCs; auto-recovery/reload-restore can change this
Peer-Keepalive + Role defines behavior during VPC failovers
Forwarding
Traffic locality (VPC check) + No learning on Peer-Link
No blocking ports (generally), but common L2 stability mechanisms are still important (LACP active, UDLD, BA, Dispute)
Interfacing with L3 requires separate links + cross link
Troubleshooting
Layered: always take basic info, narrow down to a layer/issue type before trying to recover
Data plane: troubleshoot each peer like a normal switch, paying attention to nuances like VPC check, dual-DR and Router-MACs

Thank you.
