The Case for Migrating CMU's Distribution Layer to VSS
Gabriel Somlo, June 2009
This document describes why and how CMU's distribution routing layer should
be switched from HSRP-based redundancy to VSS. To make that case, I begin by
covering the current design, along with the behavior of several technologies
of interest to us (URPF, SLB, and FWSM). Next, I review the steps required to
convert a distribution-layer router pair (Pod) from HSRP redundancy to a VSS
stack. Finally, I revisit the behavior of URPF, SLB, and FWSM under the new
model. Please feel free to append to this document as more experience and
information is gathered on the "quirks" of VSS. This information should then
be used as we consider a migration of our production distribution routers to
VSS.
1. Our Current State: HSRP-Redundant Distribution Layer
Here's a diagram that illustrates CMU's current distribution layer design:
--------
| core | L-3 Core
-+----+-
1/4 / \ 2/4
/ \
/ \
/ v903 \ v901
/ \
/ \
/ \
/ 2/1 \ 2/1
------+------ ------+------
| |5/4 5/4| |
| pod-t-a84 +--------+ pod-t-233 | L-3 Distribution Layer
| | v973 | |
------+------ ------+------
.2 | 2/8 .1 .3 | 2/8
| |
| |
| v4 | v4
| | L-2 Access Layer
| ---------------- |
| | | |
+--+ 233-bag-a +--+
4/0/1 | | 4/0/2
---+-------+----
v4 | v4 |
... ...
Without loss of generality, we only show one core router and one access-layer
switch, with one access-layer vlan/subnet (Vlan4). The two "halves" of our
distribution router pair are each uplinked to each core using a dedicated
point-to-point vlan/subnet, over which they speak OSPF. They each have an
address (.2 and .3 respectively) on the subnet associated with the user-facing
vlan (vlan4) and use HSRP to share the advertised default gateway IP, .1.
To avoid spanning-tree loops and the network outages typically associated with
recalculating STP-based layer-2 paths, vlan4 is not trunked on the interlink
between the two "pod halves". Instead, the interlink is a point-to-point
Layer-3 routed connection, and the two halves speak OSPF to each other over
this link as well.
During normal operation, one "half" of pod-t is HSRP-active, and thus "owns"
the .1 default gateway IP for the subnet on vlan4. All edge devices on vlan 4
will thus send it their outbound network traffic. For return traffic (traffic
sent from the core and headed toward the edge devices on vlan4), the load is
shared 50-50 by the two pod-t halves, since both advertise equal OSPF routes
for vlan4's subnet to the core. Traffic received by any device on vlan4 has a
50% chance of being relayed by either pod-t-a84 or pod-t-233.
This design can survive several different failure scenarios related to one
member of a distribution router pair (e.g. pod-t-a84):
- total chassis failure:
- pod-t-233 takes over .1 HSRP interface
- pod-t-233 remains the only one announcing the subnet to the core routers
- link failure:
- 2/8 fails:
- HSRP migrates .1 to the other chassis
- OSPF announcements for the subnet are withdrawn by the affected chassis; meanwhile, packets for the subnet still received from the core can be forwarded over the v973 interlink
- 2/1 fails:
- packets originating from the subnet are forwarded over the v973 interlink
The two "halves" are geographically separate, so the chance of both going down
simultaneously is small enough to simply ignore.
2. Interaction of HSRP Redundancy with Other Technologies
Our HSRP-based distribution-layer redundancy scheme interacts with several
other aspects of our network design, among which we count the topology of the
L-2 access layer, and technologies we rely on such as URPF, SLB, and FWSM
blades. We describe each of these interactions in this section, with the
expectation that a migration to VSS would mitigate or eliminate the negative
effects of these interactions.
2.1. HSRP vs. Access Layer Topology
To avoid STP loops, we have so far avoided making available an access Vlan
(such as vlan4) on more than one dually-uplinked access switch. In the figure
above, vlan4 exists on the 233-bag-a switch and its uplinks to each "half" of
pod-t, but not on the interlink between the two pod-t halves. If we made vlan4
available on another access switch, an STP loop would exist, and layer-2
paths would need to be recalculated in the event of certain links failing:
----------------
| |
+--+ 233-bag-b +--+
| | | |
| ---------------- |
| v4 | v4
| |
------+------ ------+------
| | | |
| pod-t-a84 +--------+ pod-t-233 |
| | v973 | |
------+------ ------+------
| |
| v4 | v4
| ---------------- |
| | | |
+--+ 233-bag-a +--+
| |
----------------
We are therefore reluctant to deploy any edge/access vlan to more than one
dually uplinked aggregator access switch connected to a pair of distribution
routers.
2.2. HSRP vs. URPF
URPF (Unicast Reverse Path Forwarding) checks are supported on routed
interfaces, mainly as a source-address anti-spoofing mechanism. When URPF
checks are enabled on an interface, incoming packets are allowed only if the
receiving interface is a valid way to route traffic back to the IP address from
which the incoming packets originated. The straightforward application is to
prevent users on a subnet served by our router from sending out packets with
spoofed source IP addresses outside the range dictated by the subnet.
In our HSRP-based redundancy design, URPF interferes with external traffic
to the router interfaces (.1, .2, and .3). This mainly impacts monitoring
software that attempts to ping the router interfaces and alert in case one or
more of them become unreachable.
Assume pod-t-a84 is currently HSRP-active for the subnet on vlan4, and thus
the owner of both the .1 address and its own .2 one. Assume a monitoring
station sends a ping to the .1 address from somewhere beyond the core. At the
core, the ping has a 50% chance of being sent to either pod-t-a84 or pod-t-233,
as both advertise the entire subnet to the core via OSPF with equal metrics.
If the core sends the packet directly to pod-t-a84, it will be received and
answered without incident. If, on the other hand, the packet is sent to
pod-t-233, it will be forwarded out via the .3 interface into the L-2 access
layer, and reach the .1 interface on pod-t-a84 from within the subnet. When
this happens, URPF will discard the packet because, based on its off-subnet
source IP address, it was received over the wrong interface.
In effect, roughly half the time, external packets to our .1, .2, and .3
interfaces will not reach their destination. When URPF is enabled on an
interface, we have the option to add an ACL to exempt given source IPs from
being subject to the checks. We currently have such a list containing the
known addresses of our monitoring machines, and other trusted IPs we need to
be able to reliably contact our router interfaces. The downside is that such
access lists must constantly be managed and updated, to facilitate the
continued monitoring of the network.
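As an illustration, the exception mechanism amounts to referencing an ACL from
the URPF statement on the interface. The following is a sketch only; the
monitoring station address is hypothetical, while ACL number 195 matches our
existing local convention:

access-list 195 permit ip host 128.2.100.10 any
interface Vlan4
 ip verify unicast source reachable-via rx allow-self-ping 195

A packet failing the reverse-path check is still accepted if it matches a
permit entry in ACL 195.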
2.3. HSRP vs. SLB
We currently use SLB (Server Load Balancing) to make available well-known
caching DNS server IP addresses from several different locations (at least one
physical server from each distribution router pair). Without delving into too
much SLB-specific detail, we configure a "server farm" containing the real IP
addresses of one or more DNS servers, and then we create "vservers" which
advertise the anycast virtual server IPs into the OSPF cloud.
SLB interacts with our HSRP design in two ways. First, SLB itself is HSRP
aware, and can be configured to only advertise the virtual IPs from the
HSRP-active side of a redundant pair. Should HSRP fail over to the other
member, SLB virtual IP advertisements would then be originated from that
member instead. In addition, SLB allows its internal connection database to be
replicated to another router. We use both measures to ensure that in the event
of a Pod member failing, the other member will take over not just the job of
advertising and dispatching connections to the virtual server IPs, but also
the management of existing connections.
Two extra configuration lines per SLB "vserver" are necessary: one of them
ties the "inservice" status of the vserver to the "active" HSRP status of the
router, and the other one establishes a connection to the "standby" HSRP
router over which current connection status information is replicated.
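As a sketch of such a configuration (the farm name, server and virtual IPs,
and the replication port below are hypothetical; SIXNET is the HSRP group name
from our existing config), the two HSRP-related lines are the 'replicate casa'
statement and the 'standby' argument to 'inservice':

ip slb serverfarm DNS-FARM
 real 128.2.4.10
  inservice
ip slb vserver DNS-VIP
 virtual 128.2.32.33 udp 53
 serverfarm DNS-FARM
 replicate casa 128.2.1.132 128.2.1.133 4231
 inservice standby SIXNET

'inservice standby SIXNET' ties the vserver's advertisements to HSRP-active
status, and 'replicate casa' copies connection state to the standby peer.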
2.4. HSRP and FWSM blades
FWSM (FireWall Service Modules) provide VPN/Crypto/Firewall handling
acceleration to a 6500 series Cisco router. They are functionally similar to
ASA5500-series security appliances, but instead of having their own physical
connections, they access the 6500's Layer-2 infrastructure via the chassis
backplane.
Layer-2 connections between the 6500 chassis and the FWSM cards are
established by issuing the following commands on the chassis:
firewall multiple-vlan-interfaces
firewall vlan-group G V1,V2,V3-Vn
firewall module M vlan-group G
Then, on the FWSM itself, Vlans V1, V2, etc. are available as Layer-3
configurable interfaces. The 6500 chassis itself may or may not have a
layer-3 interface on these vlans. We typically configure an IP on the vlan
serving as the FWSM's OUTSIDE interface, and simply switch the other firewall
vlans at layer-2, allowing the FWSM alone to act as the default gateway on the
subnets associated with those vlans.
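On the FWSM side (a 4.x, ASA-style sketch; the vlan numbers and addresses are
hypothetical), the corresponding layer-3 interface configuration might look
like:

interface Vlan100
 nameif OUTSIDE
 security-level 0
 ip address 128.2.200.2 255.255.255.0
interface Vlan101
 nameif INSIDE
 security-level 100
 ip address 10.1.1.1 255.255.255.0

Of these, only the OUTSIDE vlan would also carry a layer-3 interface on the
6500 itself.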
Just like ASA5500's, two FWSM cards can be configured as an active-standby
pair. In our HSRP-based distribution router design, we host an FWSM card in
each Pod "half". For the FWSM's own active-standby redundancy, both cards need
shared access to the same set of VLANS, which introduces similar issues to the
topology of access/edge vlans hosted by the 6500 routers, described in
Subsection 2.1 above. We'd have to trunk at least the OUTSIDE and dedicated
failover Vlans across the pod interlink, but any production vlans
serviced by the FWSMs would be subject to the same topology limitations
w.r.t. STP loops as Vlan4 in Subsection 2.1. Such vlans could only be
made available to one access-layer switch before STP loops would be created
on them.
3. Using VSS at the Distribution Layer
In essence, VSS is a mechanism that allows two 6500 routers to be "stacked"
similarly to how other Cisco L2 switches can be stacked. Another way to think
about it is "backplane over etherchannel" across two 6500 chassis. Rather than
using a dedicated short-distance stacking connector, the VSL can span long
distances, allowing the same geographic redundancy and survivability as our
current HSRP based design.
VSS requires that the two ten-gig ports available on the SUP cards (Te5/4 and
Te5/5 in our case) be used for the VSL (virtual switch link). Extra ten-gig
links from additional ten-gig line cards may optionally be used for additional
"backplane" capacity. Interfaces now have three numbers (the old slot/port
numbering scheme is now prefixed by the switch number, which can be either 1
or 2). Hence, Te2/1 on pod-t-a84 in the HSRP design with standalone routers
now becomes Te1/2/1. Similarly, Te2/1 on pod-t-233 becomes Te2/2/1 in the new
unified VSS switch.
----------
| core | L-3 Core
-+------+-
1/4 | | 2/4
| |
| v903 | etherchannel
| |
1/2/1 | | 2/2/1
---------+------+---------
| pod-t sw1 || pod-t sw2 |
| || | L-3 Distribution Layer
| a84 VSL 233 |
| */5/* |
---------+------+---------
1/2/8 | | 2/2/8
| |
| v4 | etherchannel
| |
4/0/1 | | 4/0/2
----+------+----
| |
| 233-bag-a | L-2 Access Layer
| |
----+------+----
v4 | v4 |
... ...
Independent links between the two HSRP members and each core router or access
switch now become etherchannels. This reduces VLAN utilization (we no longer
require a dedicated Vlan901 for pod-t-233's point-to-point core uplink) as
well as IP address utilization (pod-t-233 no longer requires its own loopback
IP or dedicated .3 IPs on each access subnet, and pod-t-a84 no longer requires
its own .2 addresses on each access subnet). The resulting unified VSS system
requires only one address per serviced subnet.
A switch to VSS eliminates the topology and STP related issues described in
the previous section, and greatly simplifies the operation of URPF and SLB.
3.1. Topology and FWSM improvements
The access layer switch uplinks to the two physical chassis with a single
(multi-chassis) etherchannel rather than with two separate layer-2
connections. As such, we can trunk any given access vlan (e.g. vlan4) to as
many access switches as we desire, without the potential for introducing STP
loops into the topology.
The same holds true for vlans serviced by FWSM blades. The layer-2 link
between the VSS switch and the two FWSM cards (one per physical chassis) is
accomplished with only a slight difference from the standalone configuration:
firewall multiple-vlan-interfaces
firewall vlan-group G V1,V2,V3-Vn
firewall switch 1 module M vlan-group G
firewall switch 2 module M vlan-group G
The vlan-group is made available to both FWSM cards. From this point on, the
two cards share access to all these vlans over the backplane (and when I say
"backplane" I include the VSL). The VSS switch only needs an IP address on the
vlan supporting the FWSM OUTSIDE interface. The rest of the vlans are handed
out via layer-2 etherchannels just like regular access vlans, without the
potential for creating STP loops.
3.2. URPF improvements
The potential for an externally sourced packet to hit the router from within
an access vlan/subnet is eliminated. URPF can be enabled on any access subnet
without the requirement to manage an exception ACL containing the source IPs
of network monitoring gear.
3.3. SLB simplification
We can operate SLB without the need to explicitly consider and configure
replication and failover. Replication/failover for SLB is built into the
underlying IOS when running in VSS mode.
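In other words (a sketch; names and addresses are hypothetical, as before), a
vserver definition under VSS would need nothing beyond the basics:

ip slb vserver DNS-VIP
 virtual 128.2.32.33 udp 53
 serverfarm DNS-FARM
 inservice

with no HSRP-tracking argument to 'inservice' and no 'replicate casa'
statement.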
4. Converting an HSRP Pair to VSS
This section describes how an HSRP-redundant pair of distribution routers (or
"Pod" in CMU-speak) can be converted into a VSS stack without causing any
user-visible outage. The Cisco document used to devise this migration plan may
be found online here.
4.1. Preparing the two Pod halves for VSS
First, make sure both routers (and FWSM blades) are running the appropriate
software versions. If FWSM support is required, that means SXI on the IOS side
and 4.0.X on the FWSM side. At the time of this writing, we're running the
"s72033-advipservicesk9_wan-mz.122-33.SXI1.bin" image on the routers, and
"c6svc-fwm-k9.4-0-5.bin" on the FWSM cards.
Next, ensure that the two ten-gig interfaces on the sup cards (Te5/4 and
Te5/5) are available for configuring the VSL. If they are currently in use,
migrate their traffic to other interfaces.
Make backup copies of the running configs on each router, just in case.
Plan to keep around the loopback IP of the HSRP active router. The loopback(s)
configured on the other box (as well as dedicated point-to-point vlans and
subnets) will cease to be in use after the conversion is complete. As a
convention, the active HSRP router will become "switch 1" in the VSS stack,
and the other one will become "switch 2". We need to pick a number to reflect
our "switch virtual domain", and, by local convention, we use the last octet
of the router loopback IP. So, if pod-t-a84's IP address is 128.2.1.132, we'll
end up using "switch virtual domain 132".
We begin the preparations by configuring SSO (Stateful SwitchOver) and NSF
(NonStop Forwarding) on both routers. Note that SSO may already be enabled by
default:
redundancy
mode sso
router ospf 1
nsf
Next, we assign the switch number within the virtual domain. On pod-t-a84, we
enter:
switch virtual domain 132
switch 1
Similarly, on pod-t-233 we enter:
switch virtual domain 132
switch 2
Next, we configure the VSL portchannels on (at least) the sup ten-gig
interfaces (in our case, Te5/4 and Te5/5). We pick portchannel numbers 10 for
switch1 and 20 for switch2. Note that both port channel numbers must be
available on both switches before this configuration step is performed. On
pod-t-a84, we enter:
interface port-channel 10
switch virtual link 1
no shut
interface range Te5/4-5
channel-group 10 mode on
no shut
Similarly, on pod-t-233:
interface port-channel 20
switch virtual link 2
no shut
interface range Te5/4-5
channel-group 20 mode on
no shut
Next, we must ensure that both switches have the same PFC mode, set to
"PFC3c":
platform hardware vsl pfc mode pfc3c
We are now ready to convert the two routers to a unified stack. To do this
seamlessly, we perform the next step on pod-t-a84 first. This will cause the
chassis to reload, and come back in VSS mode. During the reload, pod-t-233
will still perform its HSRP standby duties and continue passing user traffic
without interruptions in service:
switch convert mode virtual
After pod-t-a84 reloads, it will reclaim HSRP primary status from pod-t-233.
The only difference is that now all its interfaces are prefixed by the switch
ID (1 in this case): interfaces such as Ten2/1 are now numbered Ten1/2/1, to
reflect that they're part of "switch 1" in the stack. At this moment,
pod-t-a84 is a VSS stack with only one member. We can now issue the conversion
command on pod-t-233, again without causing interruption in user traffic, since
now pod-t-a84 is up and running. Note: run 'term mon' on the already converted
pod-t-a84, to monitor for messages that let us know when the second stack
member becomes available. On pod-t-233, issue the conversion command:
switch convert mode virtual
After this step, pod-t-233 no longer officially exists. The chassis will
reload, and once the reload is complete, we'll notice that pod-t-a84 has extra
ports and modules available. There will be new interfaces such as Te2/2/1,
reflecting the availability of a second stack member. Watch the logging
messages on pod-t-a84, and specifically wait for something like "VSL_UP: Ready
for control traffic", and "perform exec command switch accept mode virtual".
Once the latter message is logged, we can issue the final conversion command
on the new stack:
switch accept mode virtual
This last command merges the configs across the two chassis, and finalizes the
"stacking" process.
Ports on "switch 2" of the stack will become available in unconfigured,
shutdown mode (except for interfaces Ten2/5/4 and Ten2/5/5, which are part of
the VSL).
4.2. Configuring a freshly converted VSS stack
4.2.1. Optional stack member priority and preemption
We may optionally wish to configure switch priority and preemption, to ensure
that e.g. switch 1 is always active when available (and switch 2 is always
standby when 1 is available). I have not configured this on my test setup, and
would like a better reason to configure it than simple aesthetics. If both
switches have equal priority, a failover switches the active status to the
*other* switch, where it will remain until another failover. In either event,
the following commands *would* make switch 1 the preferred-active, higher
priority member of the stack:
switch virtual domain 132
switch 1 priority 105
switch 1 preempt
4.2.2. Converting access-layer downlinks to etherchannels
At this point, the new VSS stack uses only connections on switch 1, vlan
numbers, subnets, and IP addresses inherited from pod-t-a84, and is still
configured to act as the HSRP primary (but with an HSRP standby which no
longer exists).
The fact that all non-VSL ports on switch 2 are in shutdown mode presents
us with an opportunity to convert our existing uplinks to multi-chassis
etherchannels in a seamless manner without causing user-visible outages. As an
example, our 233-bag-a access switch is now connected to the stack from its
Ten4/0/1 interface to Ten1/2/8. Its other port, Ten4/0/2, physically wired to
Ten2/2/8 on the stack, is currently down (due to Ten2/2/8 itself being
shutdown). We begin by configuring an etherchannel link between the access
switch and the stack using this latter physical connection. On both the access
switch and the stack, we pick an available portchannel number X, and configure
it to use the available link that is currently down. On the access switch, we
enter:
interface Port-channelX
switchport
switchport trunk encapsulation dot1q
switchport trunk native vlan 4
switchport trunk allowed vlan 4,A,B,C
switchport mode trunk
no shut
interface Ten4/0/2
switchport
switchport trunk encapsulation dot1q
switchport trunk native vlan 4
switchport trunk allowed vlan 4,A,B,C
switchport mode trunk
channel-group X mode on
no shut
We now have two links between the access switch and the stack: the original
link which stayed up after the conversion (Te4/0/1 -> Te1/2/8), and the new
etherchannel (containing Te4/0/2 -> Te2/2/8). One of these links will be
pruned by STP for a brief period, until we proceed to shut down the former.
This may result in a few seconds of production traffic interruption (in case
the STP-pruned link was the etherchannel, which would now have to go through
the listening/learning/forwarding cycle). Once things stabilize, the newly
available ports (Te4/0/1 and Te1/2/8) may be added to the port-channel
interface on the access switch and stack, respectively. The resulting two-link
etherchannel is now an STP-free, physically redundant uplink between the access
switch and the distribution-layer stacked router.
4.2.3. Dual-Active detection, PAGP, and Etherchannel-ing the core uplinks
Before converting the single uplink to (each) core to an etherchannel, a
discussion of Dual-Active detection is in order. Each VSS stack member
monitors the VSL, and, should the VSL fail in its entirety (i.e., all
etherchanneled ten-gig interfaces in the VSL), assumes that it is now the "sole
survivor" and becomes active. In very rare cases (e.g. a backhoe cutting an
entire non-physically-diverse VSL), both stack members might end up thinking
their mate is down and attempt to become active at the same time, with all the
undesirable side-effects that entails.
Several options exist to implement a dual-active detection mechanism which
would allow the stack members to realize what just happened, and allow one of
them to shut itself down and wait for the VSL to come back up. All but one of
these options require a separate dedicated link between the two chassis, which
feels clunky, requires extra hardware, and, unless we use ports on modules
*other* than the ones already supporting the VSL, *and* unless we ensure this
dedicated link is geographically diverse from the VSL, is only of marginal
value as it might go down along with the rest of the VSL member links. For
that reason, I've picked the one remaining method which does not rely on a
special link dedicated to dual-active detection.
The Port AGgregation Protocol (PAgP) is a Cisco-proprietary Etherchannel
management protocol which allows a VSS stack to perform dual-active detection
with the help of one or more adjacent Cisco devices which also have PAgP
enabled on the etherchannel connecting them to the VSS stack. Since we only
have two core routers, and they are also Cisco 6500 devices, I decided to use
the etherchannels connecting our VSS stack to them for dual-active detection.
This allows us to stay away from requiring proprietary protocols to be
operated on our access-layer switches, which have a much higher chance of
being non-Cisco devices and thus not even supporting PAgP. We convert the
(currently down) link between the stack's Ten2/2/1 and the core's Ten2/4 into
a PAgP-enabled etherchannel on vlan 903 (same as the existing active uplink
between the stack and our core). We discard vlan901 as it is no longer
necessary. Note also that we use 'channel-group X mode desirable' instead of
'mode on', which enables use of PAgP.
interface Port-channelX
switchport
switchport access vlan 903
switchport mode access
no shut
interface Ten2/2/1
switchport
switchport access vlan 903
switchport mode access
channel-group X mode desirable
no shut
We are now free to shut down the old active link (between Te1/2/1 on the stack
and Te1/4 on the core), and add it to the portchannel.
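The commands for this step, on the stack side, would be along these lines (a
sketch; X is the same port-channel number chosen above, and the matching
change is also needed on the core's end of the link):

interface Ten1/2/1
 shut
 switchport
 switchport access vlan 903
 switchport mode access
 channel-group X mode desirable
 no shut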
Once PAgP-capable etherchannels have been established to (all of) the core
router(s), we may enable PAgP-based dual-active detection on the stack:
int poX
shut
switch virtual domain 132
dual-active detection pagp trust channel-group X
int poX
no shut
where X iterates over the port-channel numbers connecting us to each core
router. Note that the port-channel interface must be in shutdown mode while it
is being added to the dual-active detection setup. This brief shutdown is not
a problem as long as multiple cores are available and the process is conducted
one port-channel at a time.
4.2.4. Cleaning up the stack config
At this point, the conversion to VSS is complete. We might consider renaming
the resulting stack to some more location-neutral name such as "pod-t".
We should, at this point, remove all the HSRP-related configuration entries
from each interface, as well as the ACL containing the URPF check exceptions.
Before, the configuration of an interface might look something like this (note
that all non-relevant entries, which remain unchanged across the conversion,
have been removed for simplicity):
interface Vlan4
ip address 128.2.6.2 255.255.255.0
ip verify unicast source reachable-via rx allow-self-ping 195
standby 1 ip 128.2.6.1
standby 1 priority 105
standby 1 preempt
standby 1 authentication md5 key-string 7 XXXXXXXXXXXXXXXXXXXX
standby 1 name SIXNET
standby 1 track Loopback2 30
standby 1 track Vlan903 20
standby 1 track Vlan904 20
We may begin by replacing the URPF statement with simply:
interface Vlan4
ip verify unicast source reachable-via rx allow-self-ping
Note the absence of an ACL reference. We can remove ACL 195 from the stack as
soon as all interfaces have been cleaned up, with the long-term benefit of one
less place for bitrot to accumulate over time.
Next, we may remove most of the 'standby' hsrp config statements:
interface Vlan4
no standby 1 priority 105
no standby 1 preempt
no standby 1 authentication md5 key-string 7 XXXXXXXXXXXXXXXXXXXX
no standby 1 name SIXNET
no standby 1 track Loopback2 30
no standby 1 track Vlan903 20
no standby 1 track Vlan904 20
The next step might cause a brief (seconds or less) interruption in traffic
over the interface being cleaned up. Also, it is very important to be logged
into the stack from *somewhere other* than the subnet serviced by the
interface being cleaned up. In rapid sequence, we type:
interface Vlan4
no standby 1 ip 128.2.6.1
ip address 128.2.6.1 255.255.255.0
This finalizes the removal of HSRP configuration on our routed interfaces.
5. Managing and Monitoring the VSS Stack
Previously, each HSRP Pod "half" could be independently monitored via pings to
its respective loopback IP. With VSS, this is no longer an option.
5.1. SNMP Monitoring
To monitor the health of a VSS stack we may use SNMP. VSS-specific SNMP
traps may be sent to a listener in the event of a VSL failover, by adding
the following configuration to the stack:
snmp-server enable traps vswitch vsl
During failover, the following traps are sent to snmptrapd:
SNMPv2-SMI::enterprises.9.9.388 Enterprise Specific Trap (1) Uptime: 0:07:26.86
SNMPv2-SMI::enterprises.9.9.388.1.3.1.1.3.168 = INTEGER: 2
The INTEGER value is 2 for failure, and 1 for recovery. The MIB used to
translate these into human-readable form is available here.
In addition, we may configure our monitoring station to use SNMP polling to
monitor specific interfaces (such as Te1/2/1 and Te2/2/1) to allow detection
of a failed stack member.
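As an illustration of the monitoring-station side, here is a small Python
sketch (our own, not part of any Cisco tooling; it assumes trap lines in the
snmptrapd varbind format shown above) that classifies VSL trap lines:

```python
import re

# Matches the VSS trap varbind shown above, e.g.:
#   SNMPv2-SMI::enterprises.9.9.388.1.3.1.1.3.168 = INTEGER: 2
# (enterprises.9.9.388 is the Cisco virtual-switch subtree)
TRAP_RE = re.compile(r"enterprises\.9\.9\.388\S*\s*=\s*INTEGER:\s*(\d+)")

def classify_vsl_trap(line):
    """Return 'failure', 'recovery', or None for non-matching lines."""
    m = TRAP_RE.search(line)
    if not m:
        return None
    # Per the trap description above: 2 means failure, 1 means recovery.
    return {"2": "failure", "1": "recovery"}.get(m.group(1), "unknown")
```

A monitoring script could tail the snmptrapd log and alert whenever this
function returns 'failure'.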
5.2. IOS commands for VSL troubleshooting and management
Troubleshooting commands:
show switch virtual link [detail]
Displays information on the VSL
show switch virtual link port-channel
Displays port-channel-specific information on the VSL
show switch virtual link port
Displays port-specific information on VSL member ports
show switch virtual role
Lists role and state information on each stack member
show module switch { 1 | 2 | all }
Lists modules present in each stack member
show switch virtual dual-active pagp
Shows operational status information on the dual-active detection setup.
Administrative commands:
redundancy reload peer
Causes the peer to reload. Useful during software upgrade.
redundancy force-switchover
Causes a switchover of the active role to the other stack member. Also useful
during software upgrade. Currently, ssh login sessions are lost during a
switchover.
redundancy config-sync {ignore | validate} mismatched-commands
Synchronizes configurations across the two stack members. Can be useful in
tracking down mismatches.
6. Conclusion
VSS greatly simplifies management of distribution-layer routing, and reduces
the opportunity for outages to occur either due to configuration mismatches
across two separately managed HSRP-redundant routers, or due to the almost
unavoidable spanning-tree loops that are introduced in the absence of VSS.
The conversion to VSS can be performed without any significant downtime, as
outlined throughout this document. CMU's Pod-T (our distribution router
dedicated to testing) has been switched over with great results, and will be
used for long-term stress testing, before a migration to VSS is considered for
our other (production) distribution-layer routers.