Smart System Upgrade

Smart System Upgrade (SSU) dramatically speeds up system upgrades and minimizes network disruptions with:

  • Faster Reloads: SSU streamlines and optimizes the upgrade process, significantly reducing the reload time.
  • Uninterrupted Connections: Even while the system reboots, SSU keeps your network connections active by sending LACP PDUs. These PDUs are essential signals that maintain port channels, ensuring your devices stay connected.
  • Minimal Traffic Loss: SSU uses protocols that support "graceful restart," allowing network services to restart without dropping connections, preventing interruptions to your data flow.
Features capable of hitless restart under SSU include:
  • QinQ
  • 802.3ad Link Aggregation/LACP
  • 802.3x Flow Control
  • BGP (you must enable BGP graceful restart; refer to Configuring BGP.)
  • MP-BGP (you must enable BGP graceful restart; refer to Configuring BGP.)
  • 128-way Equal Cost Multipath Routing (ECMP)
  • VRF
  • Route Maps
  • L2 MTU
  • QoS

Note: SSU and VRRP are not compatible. Use a different upgrade method if you have VRRP configured on the switch.

Upgrading the EOS image with Smart System Upgrade

Using SSU to upgrade the active EOS image is a five-step process:
  1. Prepare the switch for upgrade (Prepare the Switch for SSU).
  2. Transfer the image file to the switch (Transfer the Image File for SSU). (This is not required if the desired file is already on the switch.)
  3. Modify the boot-config file to point to the desired image file (Modify boot-config).
  4. Start the SSU process (Start the SSU Process).
  5. Verify that the upgrade was successful (Verify Success of the Upgrade).
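
The five steps above can be sketched as an ordered command list. This is only an illustration: the function name and its parameters (source_url, image, backup) are hypothetical, and the commands are the ones shown in the sections that follow, in the order this document runs them. How the commands are delivered to the switch is outside the scope of the sketch.

```python
from typing import List


def ssu_command_sequence(source_url: str, image: str, backup: str) -> List[str]:
    """Return the CLI commands from this document in the order they are run.

    source_url, image and backup are placeholders chosen for this sketch.
    """
    return [
        # Step 1: prepare the switch
        f"copy running-config flash:/{backup}",
        "dir flash:",
        "show interfaces status",
        "show reload fast-boot",
        # Step 2: transfer and check the image
        f"copy {source_url} flash:/{image}",
        f"verify /md5 flash:{image}",
        # Step 3: modify boot-config
        "configure terminal",
        f"boot system flash:/{image}",
        "show boot-config",
        "write",
        # Step 4: start SSU
        "reload fast-boot",
        # Step 5: verify the upgrade
        "show boot stages log",
        "show reload cause",
        "show version",
    ]


if __name__ == "__main__":
    for command in ssu_command_sequence(
        "usb1:/EOS-4.14.4.swi", "EOS-4.14.4.swi", "cfg_06162014"
    ):
        print(command)
```

Step 2 can be dropped from the list when the desired image is already on the switch.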

Prepare the Switch for SSU

Note: Configuring BGP graceful restart resets BGP sessions. If configuring BGP graceful restart as part of the SSU process, ensure that BGP sessions are stable and all BGP routing information has been learned and advertised before proceeding with SSU.

Backing Up Critical Software

Before upgrading the EOS image, ensure copies of the currently running EOS version and the running-config file are available in case of corruption during the upgrade process. To copy the running-config file, use the copy running-config command. In this example, the system copies the running-config contents to a file on the switch's flash drive.

switch# copy running-config flash:/cfg_06162014
Copy completed successfully.
switch#

Making Room on the Flash Drive

Determine the size of the new EOS image. Verify that there is enough space on the flash drive for two copies of this image, plus a recommended 240MB (if available) for diagnostic information in case of a fatal error. Use the dir command to check the bytes free figure.

switch# dir flash:
Directory of flash:/
-rwx   293168526      Nov 4    22:17   EOS4.11.0.swi
-rwx          36      Nov 8    10:24   boot-config
-rwx       37339      Jun 16   14:18   cfg_06162014

606638080 bytes total (602841088 bytes free)
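
The space requirement is simple arithmetic and can be checked with a short script. The sketch below assumes the dir output has been captured as text, and it reads "240MB" as 240 MiB (an assumption; the document does not say which unit is meant). The function names are hypothetical.

```python
import re
from typing import Tuple

# Assumption: "240MB" is interpreted as 240 * 1024 * 1024 bytes.
DIAG_RESERVE_BYTES = 240 * 1024 * 1024

SAMPLE_DIR = """Directory of flash:/
-rwx   293168526      Nov 4    22:17   EOS4.11.0.swi
-rwx          36      Nov 8    10:24   boot-config
-rwx       37339      Jun 16   14:18   cfg_06162014

606638080 bytes total (602841088 bytes free)
"""


def free_bytes_from_dir(dir_output: str) -> int:
    """Extract the 'bytes free' figure from captured `dir flash:` output."""
    match = re.search(r"\((\d+) bytes free\)", dir_output)
    if match is None:
        raise ValueError("no 'bytes free' figure found in dir output")
    return int(match.group(1))


def ssu_space_check(free_bytes: int, image_bytes: int) -> Tuple[bool, bool]:
    """Return (meets_minimum, meets_recommended).

    Minimum: room for two copies of the new image.
    Recommended: the minimum plus the diagnostic reserve.
    """
    minimum = 2 * image_bytes
    recommended = minimum + DIAG_RESERVE_BYTES
    return free_bytes >= minimum, free_bytes >= recommended


if __name__ == "__main__":
    free = free_bytes_from_dir(SAMPLE_DIR)
    # Uses a 293168526-byte image purely as an example size.
    print(ssu_space_check(free, 293168526))  # (True, False)
```

With the sample listing and a 293168526-byte image, two copies fit (586337052 bytes) but the extra diagnostic reserve does not.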

Verifying Connectivity

Ensure the switch has a management interface configured with an IP address and default gateway. See Assigning a Virtual IP Address to Access the Active Ethernet Management Port and Configuring a Default Route to the Gateway. Confirm network connectivity to the switch using the show interfaces status command and pinging the default gateway.

switch# show interfaces status
Port    Name     Status     Vlan       Duplex   Speed      Type
Et3/1            notconnect   1         auto    auto     1000BASE-T

<-------OUTPUT OMITTED FROM EXAMPLE-------->
Ma1/1            connected   routed     unconf   unconf    Unknown 

switch# ping 1.1.1.10
PING 1.1.1.10 (1.1.1.10) 72(100) bytes of data.
80 bytes from 1.1.1.10: icmp_seq=1 ttl=64 time=0.180 ms
80 bytes from 1.1.1.10: icmp_seq=2 ttl=64 time=0.076 ms
80 bytes from 1.1.1.10: icmp_seq=3 ttl=64 time=0.084 ms
80 bytes from 1.1.1.10: icmp_seq=4 ttl=64 time=0.073 ms
80 bytes from 1.1.1.10: icmp_seq=5 ttl=64 time=0.071 ms

Verifying Configuration

Verify the switch configuration is valid for SSU using the show reload fast-boot command. If parts of the configuration are blocking SSU execution, an error message will be displayed explaining what they are. For SSU to proceed, correct the configuration conflicts before issuing the reload fast-boot command.

switch# show reload fast-boot
'reload fast-boot' cannot proceed due to the following:
  Spanning-tree portfast is not enabled for one or more ports
  Spanning-tree BPDU guard is not enabled for one or more ports
switch#
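
When this check is scripted, the blocking conditions can be pulled out of the captured command output. The following is a minimal sketch under two assumptions: the output has the shape shown in the example above (a "cannot proceed" line followed by indented reasons), and no such line appears when nothing blocks SSU. The function name is hypothetical.

```python
from typing import List

SAMPLE_OUTPUT = """'reload fast-boot' cannot proceed due to the following:
  Spanning-tree portfast is not enabled for one or more ports
  Spanning-tree BPDU guard is not enabled for one or more ports
"""


def ssu_blockers(output: str) -> List[str]:
    """Return the indented reasons listed after the 'cannot proceed' line."""
    blockers = []
    in_list = False
    for line in output.splitlines():
        if "cannot proceed due to the following" in line:
            in_list = True
            continue
        if in_list:
            # Reasons are indented; the first non-indented line ends the list.
            if line.startswith((" ", "\t")) and line.strip():
                blockers.append(line.strip())
            else:
                in_list = False
    return blockers


if __name__ == "__main__":
    for reason in ssu_blockers(SAMPLE_OUTPUT):
        print("Blocking:", reason)
```

An empty list means no blocking reasons were found in the text, which is only as reliable as the assumptions about the output format.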

Note: The show reload hitless and reload hitless commands are still accepted; they have the same effect as show reload fast-boot and reload fast-boot, respectively.

Configuring BGP

For hitless restart of BGP and MP-BGP, BGP graceful restart must first be enabled using the graceful-restart command. The default restart time value (300 seconds) is appropriate for most configurations.

The BGP configuration mode in which the graceful-restart command is issued determines which BGP connections restart gracefully.

Note: Configuring BGP graceful restart resets BGP sessions. If configuring BGP graceful restart as part of the SSU process, ensure that BGP sessions are stable and all BGP routing information has been learned and advertised before proceeding with SSU.

For all BGP connections, use the graceful-restart command in BGP configuration mode:
switch# config
switch(config)# router bgp 64496
switch(config-router-bgp)# graceful-restart
switch(config-router-bgp)#

For all BGP connections in a specific VRF, use the graceful-restart command in BGP VRF configuration mode:
switch# config
switch(config)# router bgp 64496
switch(config-router-bgp)# vrf purple
switch(config-router-bgp-vrf-purple)# graceful-restart
switch(config-router-bgp-vrf-purple)# exit
switch(config-router-bgp)#

For all BGP connections in a specific BGP address family, use the graceful-restart command in BGP address-family configuration mode:
switch# config
switch(config)# router bgp 64496
switch(config-router-bgp)# address-family ipv6
switch(config-router-bgp-af)# graceful-restart
switch(config-router-bgp-af)# exit
switch(config-router-bgp)#
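
The three scopes above differ only in the configuration mode entered before graceful-restart. As an illustration, the sketch below builds the command list for each case. The function and its parameters are hypothetical, and combining a VRF with an address family is deliberately rejected because this document does not show that case.

```python
from typing import List, Optional


def graceful_restart_commands(
    asn: int,
    vrf: Optional[str] = None,
    address_family: Optional[str] = None,
) -> List[str]:
    """Build the CLI lines that enable BGP graceful restart for one scope."""
    if vrf and address_family:
        # Not covered by the examples in this document.
        raise ValueError("choose either a VRF or an address family, not both")
    commands = ["config", f"router bgp {asn}"]
    if vrf:
        commands.append(f"vrf {vrf}")
    elif address_family:
        commands.append(f"address-family {address_family}")
    commands.append("graceful-restart")
    if vrf or address_family:
        # Return to BGP configuration mode, as in the examples above.
        commands.append("exit")
    return commands


if __name__ == "__main__":
    print(graceful_restart_commands(64496))
    print(graceful_restart_commands(64496, vrf="purple"))
    print(graceful_restart_commands(64496, address_family="ipv6"))
```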

Transfer the Image File for SSU

The target image must be copied to the file system on the switch, typically onto the flash drive:

  1. Verify that the flash drive has enough space for two copies of the image plus an optional 240MB for diagnostic information.
  2. Use the copy command to copy the image to the flash drive.
  3. Confirm that the system transferred the new image file correctly.

The following command examples illustrate transferring an image file from various locations to the flash drive.

USB Memory

Command

copy usb1:/sourcefile flash:/destfile

Example

switch# copy usb1:/EOS-4.14.4.swi flash:/EOS-4.14.4.swi

FTP Server

Command

copy ftp:/ftp-source/sourcefile flash:/destfile

Example

switch# copy ftp:/user:password@10.0.0.3/EOS-4.14.4.swi flash:/EOS-4.14.4.swi

SCP

Command

copy scp://scp-source/sourcefile flash:/destfile

Example

switch# copy scp://user@10.1.1.8/user/EOS-4.13.2.swi flash:/EOS-4.13.2.swi

HTTP

Command

copy http://http-source/sourcefile flash:/destfile

Example

switch# copy http://10.0.0.10/EOS-4.14.4.swi flash:/EOS-4.14.4.swi

After transferring the file, verify that it is present in the directory, then confirm the MD5 checksum using the verify command. The MD5 checksum is available from the EOS download page of the Arista website.

switch# dir flash:
Directory of flash:/
-rwx     293168526   Nov 4     22:17     EOS4.14.2.swi
-rwx            36   Nov 8     10:24     boot-config
-rwx         37339   Jun 16    14:18     cfg_06162014
-rwx     394559902   May 30    02:57     EOS4.13.1.swi

606638080 bytes total (208281186 bytes free)
switch# verify /md5 flash:EOS-4.14.4.swi 
verify /md5 (flash:EOS-4.14.4.swi) =c277a965d0ed48534de6647b12a86991
switch#
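
Comparing two 32-character checksums by eye is error-prone, so the comparison can be scripted. This sketch assumes the verify output has been captured as text in the form shown above and that the published checksum was copied from the download page; the function names are hypothetical.

```python
import re

SAMPLE_VERIFY = (
    "verify /md5 (flash:EOS-4.14.4.swi) =c277a965d0ed48534de6647b12a86991"
)


def md5_from_verify(output: str) -> str:
    """Extract the 32-digit hexadecimal MD5 from captured verify output."""
    match = re.search(r"=\s*([0-9a-fA-F]{32})\b", output)
    if match is None:
        raise ValueError("no MD5 checksum found in verify output")
    return match.group(1).lower()


def checksum_matches(output: str, published_md5: str) -> bool:
    """Compare the switch-side checksum with the published one, ignoring case."""
    return md5_from_verify(output) == published_md5.strip().lower()


if __name__ == "__main__":
    # The "published" value here simply repeats the sample for demonstration.
    published = "c277a965d0ed48534de6647b12a86991"
    print(checksum_matches(SAMPLE_VERIFY, published))  # True
```

If the checksums differ, transfer the image again before modifying boot-config.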

Modify boot-config

After transferring and confirming the desired image file, use the boot system command to update the boot-config file to point to the new EOS image.

The following commands change the boot-config file to point to the image file EOS-4.14.4.swi in flash memory.

switch# configure terminal
switch(config)# boot system flash:/EOS-4.14.4.swi

Use the show boot-config command to verify that the boot-config file is correct:

switch(config)# show boot-config
Software image: flash:/EOS-4.14.4.swi
Console speed: (not set)
Aboot password (encrypted): $1$ap1QMbmz$DTqsFYeauuMSa7/Qxbi2l1

Save the configuration to the startup-config file with the write command.

switch# write

Start the SSU Process

After updating the boot-config file, verify that your configuration supports SSU (if you have not already done so) using the show reload fast-boot command. If parts of the configuration are blocking SSU execution, an error message will be displayed explaining what they are.

switch# show reload fast-boot
'reload fast-boot' cannot proceed due to the following:
  Spanning-tree portfast is not enabled for one or more ports
  Spanning-tree BPDU guard is not enabled for one or more ports
switch#

Start the SSU process using the reload fast-boot command, which reloads the switch and activates the new image. The CLI identifies any changes that must be made to the configuration before SSU can start, prompts you to save any unsaved modifications to the system configuration, and requests confirmation before reloading.

switch# reload fast-boot
System configuration has been modified. Save? [yes/no/cancel/diff]:y
Copy completed successfully.
Proceed with reload? [confirm]y

Note: The show reload hitless and reload hitless commands are still accepted; they have the same effect as show reload fast-boot and reload fast-boot, respectively.

Verify Success of the Upgrade

Before making any configuration changes to the switch after reloading, use the show boot stages log command to verify that the SSU process is complete. If it is, the last message in the log is Asu Hitless boot stages complete.

switch# show boot stages log
Timestamp           Delta Begin Msg
2022-10-03 12:42:06 000.000000 Asu Hitless boot stages started
2022-10-03 12:42:06 000.001592 stage CriticalAgent started
2022-10-03 12:42:06 000.001834   event CriticalAgent:PhyEthtool completed

[ . . . ]

2022-10-03 12:43:02 056.316874 stage BootSanityCheck is complete
2022-10-03 12:43:02 056.317491 Asu Hitless boot stages complete
switch#

You can also verify the completion of the SSU process by checking the syslog for the following message:

LAUNCHER-6-BOOT_STATUS: 'reload fast-boot' reconciliation complete
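
The completion check on the boot stages log can be automated by looking at the last timestamped entry. The sketch below assumes the log has been captured as text with the column layout shown above (timestamp, delta, message); the function names are hypothetical.

```python
import re
from typing import Optional

# One log entry: "YYYY-MM-DD HH:MM:SS <delta> <message>"
ENTRY = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \d+\.\d+ (.*)$")

SAMPLE_LOG = """Timestamp           Delta Begin Msg
2022-10-03 12:42:06 000.000000 Asu Hitless boot stages started
2022-10-03 12:42:06 000.001592 stage CriticalAgent started
2022-10-03 12:42:06 000.001834   event CriticalAgent:PhyEthtool completed
2022-10-03 12:43:02 056.316874 stage BootSanityCheck is complete
2022-10-03 12:43:02 056.317491 Asu Hitless boot stages complete
switch#
"""


def last_boot_stage_message(log_output: str) -> Optional[str]:
    """Return the message of the last timestamped entry, or None if absent."""
    last = None
    for line in log_output.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            last = match.group(1).strip()
    return last


def ssu_boot_complete(log_output: str) -> bool:
    """True if the final entry reports that the boot stages are complete."""
    message = last_boot_stage_message(log_output)
    return message is not None and message.endswith("Hitless boot stages complete")


if __name__ == "__main__":
    print(ssu_boot_complete(SAMPLE_LOG))  # True
```

Header lines and the trailing prompt are ignored because they do not start with a timestamp.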

To verify whether the SSU upgrade was successful, use the show reload cause command. If a fatal error occurred during the upgrade process, the switch will have completely rebooted, and the command output displays the fatal error details along with the directory containing the diagnostic information.

If the SSU upgrade succeeded, the reload cause reads Hitless reload requested by the user.

Fatal Error Display

switch# show reload cause
Reload Cause 1:
-------------------
Fatal error occurred during Asu Hitless boot. (stageMgr - LinkStatusUpdate timed out)

Reload Time:
------------
Reload occurred at Sun Oct 02 12:06:37 2022 PDT.

Recommended Action:
-------------------
The system rebooted due to a fatal error.
If the problem persists, contact your customer support representative.

Debugging Information:
-------------------------------
/mnt/flash/persist/fatalError-2022-10-02_120637
switch#

Successful Upgrade Display

switch# show reload cause
Reload Cause 1:
-------------------
Hitless reload requested by the user.

Reload Time:
------------
Reload occurred at Mon Oct 03 13:29:31 2022 PDT.

Recommended Action:
-------------------
No action necessary.

Debugging Information:
-------------------------------
None available.
switch#
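
The two displays above share one layout: a header ending in a colon, a row of dashes, then the text. A script can split the output on that layout and test the first reload cause. This is a sketch that assumes the captured output matches the examples above; the function names are hypothetical.

```python
from typing import Dict

FATAL_SAMPLE = """Reload Cause 1:
-------------------
Fatal error occurred during Asu Hitless boot. (stageMgr - LinkStatusUpdate timed out)

Reload Time:
------------
Reload occurred at Sun Oct 02 12:06:37 2022 PDT.

Recommended Action:
-------------------
The system rebooted due to a fatal error.
If the problem persists, contact your customer support representative.

Debugging Information:
-------------------------------
/mnt/flash/persist/fatalError-2022-10-02_120637
switch#
"""


def parse_reload_cause(output: str) -> Dict[str, str]:
    """Map each section header to the first text line that follows it."""
    lines = [line.strip() for line in output.splitlines()]
    sections = {}
    for i, line in enumerate(lines):
        is_header = line.endswith(":") and i + 1 < len(lines)
        # A header is followed by a line made up only of dashes.
        if is_header and set(lines[i + 1]) == {"-"}:
            body = next((text for text in lines[i + 2:] if text), "")
            sections[line[:-1]] = body
    return sections


def ssu_succeeded(output: str) -> bool:
    """True if the first reload cause is the hitless reload request."""
    cause = parse_reload_cause(output).get("Reload Cause 1", "")
    return cause.startswith("Hitless reload requested by the user")


if __name__ == "__main__":
    sections = parse_reload_cause(FATAL_SAMPLE)
    print(ssu_succeeded(FATAL_SAMPLE))        # False
    print(sections["Debugging Information"])  # diagnostic directory
```

On failure, the Debugging Information entry gives the directory to collect before contacting support.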

The show version command confirms whether the correct image is loaded. The Software image version: line displays the version of the active image file.

switch# show version
Arista DCS-7050QX-32-F
Hardware version: 02.00
Serial number: JPE14071098
System MAC address: 001c.7355.556f
Software image version: 4.14.5F-2353054.EOS4145F
Architecture: i386
Internal build version: 4.14.5F-2353054.EOS4145F
Internal build ID: e8748ea7-916d-4217-878f-4bfe2adc7122
Uptime: 4 minutes
Total memory: 3981328 kB
Free memory: 1342408 kB
switch#
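
For a scripted check, the version can be read from the Software image version: line and compared with the release that was intended. The sketch assumes the show version output has been captured as text; the function name and the expected-version prefix are hypothetical.

```python
import re
from typing import Optional

SAMPLE_VERSION = """Arista DCS-7050QX-32-F
Hardware version: 02.00
Serial number: JPE14071098
System MAC address: 001c.7355.556f
Software image version: 4.14.5F-2353054.EOS4145F
Architecture: i386
Internal build version: 4.14.5F-2353054.EOS4145F
"""


def software_image_version(show_version_output: str) -> Optional[str]:
    """Return the value of the 'Software image version:' line, if present."""
    match = re.search(
        r"^Software image version:\s*(\S+)", show_version_output, re.MULTILINE
    )
    return match.group(1) if match else None


if __name__ == "__main__":
    version = software_image_version(SAMPLE_VERSION)
    # "4.14.5F" stands in for whichever release was intended.
    print(version is not None and version.startswith("4.14.5F"))  # True
```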

Note: If a fatal error occurs during the SSU process, the new EOS image will still be loaded and booted.