EVPN VXLAN Single-Gateway Centralized Routing
In a traditional EVPN VXLAN centralized anycast gateway deployment, multiple L3 VTEPs serve as the centralized anycast gateway. For the hosts to have a consistent ARP binding for any of the individual centralized gateway VTEPs, each VTEP operating as a centralized gateway is configured with a virtual router MAC (VARP MAC). A virtual VTEP IP (VARP VTEP IP) is shared between all of the L3 VTEPs operating as centralized gateways. Each centralized gateway VTEP also advertises an EVPN type-3 route for both its primary VTEP IP and the VARP VTEP IP, so both IPs end up in the overlay floodset.
The traditional configuration works fine, but in the specific case of a network with only a single L3 VTEP centralized gateway (or a single MLAG pair operating as the L3 VTEP centralized gateway), it leads to unnecessary BUM traffic. When both the physical VTEP IP and the VARP VTEP IP end up in the overlay floodset, BUM traffic is duplicated to the centralized gateway, which can overload workloads that carry a lot of broadcast or multicast traffic. Because there is only a single centralized gateway, a VARP VTEP IP is not needed to provide a stable ARP binding for the gateway.
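The duplication described above can be illustrated with a minimal Python sketch. It assumes that one copy of each BUM frame is replicated per floodset entry; the function name and the VARP VTEP IP 10.255.1.0 are hypothetical and chosen only for illustration.

```python
def copies_to_gateway(floodset, gateway_ips):
    """Return how many copies of one BUM frame are sent to the gateway.

    Assumes one copy is replicated per floodset entry.
    """
    return sum(1 for vtep_ip in floodset if vtep_ip in gateway_ips)


PRIMARY_VTEP_IP = "10.255.0.0"  # Loopback0 address used in this document's examples
VARP_VTEP_IP = "10.255.1.0"     # hypothetical VARP VTEP IP, for illustration only

# Both addresses terminate on the same (single) centralized gateway.
gateway_ips = {PRIMARY_VTEP_IP, VARP_VTEP_IP}

# Traditional deployment: both IPs are advertised in type-3 routes.
traditional = copies_to_gateway({PRIMARY_VTEP_IP, VARP_VTEP_IP}, gateway_ips)

# Single-gateway deployment: no VARP VTEP IP is configured.
single_gateway = copies_to_gateway({PRIMARY_VTEP_IP}, gateway_ips)

print(traditional, single_gateway)
```

In this model the traditional floodset delivers two copies of every BUM frame to the same gateway, while the single-gateway floodset delivers one.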
This feature consists of two changes:
- A change to the default ARP behavior when a host ARPs for the MAC address associated with a virtual IP address (SVI IP).
Previously, ARPing for a virtual IP returned different MAC address bindings, depending on whether or not a VARP VTEP IP was configured. If a VARP VTEP IP was configured, the ARP request returned the configured VARP MAC. If one was not, the ARP request returned the switch router MAC. This feature changes the behavior to always respond with the configured VARP MAC to an ARP request for a virtual IP. This closes an exception to the rule that virtual IPs are always associated with the VARP MAC.
- A new EVPN MAC VRF configuration command that generates an EVPN type-2 route for the VARP MAC with a nexthop of the physical VTEP IP.
In a traditional EVPN centralized anycast gateway deployment, the presence of a configured VARP VTEP IP causes an EVPN type-2 route to be advertised for the VARP MAC with a nexthop of the VARP VTEP IP. This allows TOR switches to learn the ARP binding of the centralized anycast gateway (presumably, their default gateway). With this feature, no VARP VTEP IP is configured, so an alternative method is required to advertise the appropriate EVPN route. This feature adds a new EVPN MAC VRF configuration command, redistribute router-mac next-hop vtep primary, which, when configured on a MAC VRF, advertises an EVPN type-2 route for the VARP MAC with a nexthop of the primary VTEP IP. This allows TOR switches to learn the ARP binding of the centralized anycast gateway.
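The two changes can be summarized as a small decision model. The Python sketch below is illustrative only and is not the switch's implementation; the GatewayConfig type, the function names, and the router MAC value are hypothetical, and the boolean field merely stands in for the redistribute router-mac next-hop vtep primary command.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GatewayConfig:
    varp_mac: str
    router_mac: str
    primary_vtep_ip: str
    varp_vtep_ip: Optional[str] = None
    # Models "redistribute router-mac next-hop vtep primary" on the MAC VRF.
    redistribute_router_mac: bool = False


def arp_reply_mac(cfg: GatewayConfig, legacy: bool = False) -> str:
    """MAC returned when a host ARPs for a virtual IP."""
    if legacy and cfg.varp_vtep_ip is None:
        # Previous behavior: without a VARP VTEP IP, the router MAC was returned.
        return cfg.router_mac
    # New behavior: always the configured VARP MAC.
    return cfg.varp_mac


def varp_mac_route_next_hop(cfg: GatewayConfig) -> Optional[str]:
    """Next hop of the type-2 route for the VARP MAC, or None if not advertised."""
    if cfg.varp_vtep_ip is not None:
        return cfg.varp_vtep_ip      # traditional anycast gateway
    if cfg.redistribute_router_mac:
        return cfg.primary_vtep_ip   # single-gateway feature
    return None


single_gw = GatewayConfig(
    varp_mac="00:00:80:00:00:00",    # VARP MAC value used in this document's examples
    router_mac="00:1c:73:00:00:01",  # hypothetical switch router MAC
    primary_vtep_ip="10.255.0.0",
    redistribute_router_mac=True,
)

print(arp_reply_mac(single_gw, legacy=True))
print(arp_reply_mac(single_gw))
print(varp_mac_route_next_hop(single_gw))
```

With no VARP VTEP IP configured, the legacy model answers with the router MAC, the new model answers with the VARP MAC, and the type-2 route for the VARP MAC points at the primary VTEP IP.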
Configuration
In this example, CE1 is a multi-homed CE in VLAN 20 and CE2 is a remote CE in VLAN 30. Asymmetric IRB is configured for inter-VLAN traffic.
switch(config)# interface Port-Channel100
switch(config-if-Po100)# switchport access vlan 20
!
switch(config-if-Po100)# evpn ethernet-segment
switch(config-evpn-es)# identifier 0033:3333:3333:3333:3333
switch(config-evpn-es)# route-target import 00:03:00:03:00:03
switch(config-evpn-es)# lacp system-id 1234.5678.0123
!
switch(config)# interface Ethernet1
switch(config-if-Et1)# switchport mode trunk
switch(config-if-Et1)# channel-group 100 mode on
!
switch(config)# interface Loopback0
switch(config-if-Lo0)# ip address 10.255.0.0/32
!
switch(config)# interface Vlan20
switch(config-if-Vl20)# ip address virtual 20.0.20.1/24
!
switch(config)# interface Vlan30
switch(config-if-Vl30)# ip address virtual 20.0.30.1/24
!
switch(config)# interface vxlan1
switch(config-if-Vx1)# vxlan source-interface Loopback0
switch(config-if-Vx1)# vxlan udp-port 4789
switch(config-if-Vx1)# vxlan vlan 20 vni 10020
switch(config-if-Vx1)# vxlan vlan 30 vni 10030
!
switch(config)# ip virtual-router mac-address 00:00:80:00:00:00
!
switch(config)# router bgp 300
switch(config-router-bgp)# router-id 0.0.0.1
switch(config-router-bgp)# neighbor 10.0.0.1 remote-as 303
switch(config-router-bgp)# neighbor 10.0.0.1 ebgp-multihop
switch(config-router-bgp)# neighbor 10.0.0.1 send-community extended
switch(config-router-bgp)# neighbor 10.0.0.1 maximum-routes 12000
switch(config-router-bgp)# redistribute static
!
switch(config-router-bgp)# vlan 20
switch(config-macvrf-20)# rd 10.255.0.0:20
switch(config-macvrf-20)# route-target both 64500:10020
switch(config-macvrf-20)# redistribute learned
!
switch(config-router-bgp)# vlan 30
switch(config-macvrf-30)# rd 10.255.0.0:30
switch(config-macvrf-30)# route-target both 64500:10030
switch(config-macvrf-30)# redistribute learned
!
switch(config-macvrf-30)# address-family evpn
switch(config-router-bgp-af)# neighbor 10.0.0.1 activate
The Ethernet segment to the multi-homed CE is configured on the port channel interface Port-Channel100. SVI 20 and SVI 30 are configured with VARP IP addresses for inter-subnet routing, and a VARP MAC is configured globally on PE1. The configuration on PE2 is similar to the configuration shown above. On PE3, SVI 20 and SVI 30 are configured along with the VARP IP and VARP MAC.
Symmetric IRB with IPv4 example:
switch(config)# vrf instance red
switch(config-vrf-red)# rd 10.255.0.0:0
!
switch(config-vrf-red)# interface Port-Channel100
switch(config-if-Po100)# switchport access vlan 20
!
switch(config-if-Po100)# evpn ethernet-segment
switch(config-evpn-es)# identifier 0033:3333:3333:3333:3333
switch(config-evpn-es)# route-target import 00:03:00:03:00:03
switch(config-evpn-es)# lacp system-id 1234.5678.0123
!
switch(config-if-Po100)# interface Ethernet6/6/1
switch(config-if-Et6/6/1)# switchport mode trunk
switch(config-if-Et6/6/1)# channel-group 100 mode on
!
switch(config-if-Et6/6/1)# interface Loopback0
switch(config-if-Lo0)# ip address 10.255.0.0/32
!
switch(config-if-Lo0)# interface Vlan20
switch(config-if-Vl20)# vrf red
switch(config-if-Vl20)# ip address virtual 20.0.20.1/24
!
switch(config-if-Vl20)# interface vxlan1
switch(config-if-Vx1)# vxlan source-interface Loopback0
switch(config-if-Vx1)# vxlan udp-port 4789
switch(config-if-Vx1)# vxlan vlan 10 vni 10010
switch(config-if-Vx1)# vxlan vlan 20 vni 10020
switch(config-if-Vx1)# vxlan vrf red vni 20000
!
switch(config-if-Vx1)# ip virtual-router mac-address 00:00:80:00:00:00
!
switch(config)# router bgp 300
switch(config-router-bgp)# router-id 0.0.0.1
switch(config-router-bgp)# maximum-paths 2
switch(config-router-bgp)# neighbor 10.0.0.1 remote-as 303
switch(config-router-bgp)# neighbor 10.0.0.1 ebgp-multihop
switch(config-router-bgp)# neighbor 10.0.0.1 send-community extended
switch(config-router-bgp)# neighbor 10.0.0.1 maximum-routes 12000
switch(config-router-bgp)# redistribute static
!
switch(config-router-bgp)# vlan 20
switch(config-macvrf-20)# rd 10.255.0.0:20
switch(config-macvrf-20)# route-target both 64500:10020
switch(config-macvrf-20)# redistribute learned
!
switch(config-macvrf-20)# address-family evpn
switch(config-router-bgp-af)# neighbor 10.0.0.1 activate
!
switch(config-router-bgp-af)# vrf red
switch(config-router-bgp-vrf-red)# rd 10.255.0.0:0
switch(config-router-bgp-vrf-red)# route-target import evpn 64500:20000
switch(config-router-bgp-vrf-red)# route-target export evpn 64500:20000
switch(config-router-bgp-vrf-red)# router-id 10.255.0.0
!
The Ethernet segment to the multi-homed CE1 is configured on the port channel interface Port-Channel100. SVI 20 is configured along with the VARP IP and VARP MAC. An IP VRF, which is required for symmetric IRB, is also configured. The configuration on PE2 is similar. On PE3, the IP VRF and SVI 30 are configured for symmetric IRB.
VXLAN Example
A network with two VRFs, red and blue, has VLANs 10 and 20 in red and VLANs 30 and 40 in blue. The spines act as route reflectors. Multicast groups are used to encapsulate traffic arriving in a VRF so that it is delivered to the VTEPs that have that VRF provisioned.
interface Loopback0
ip address 10.0.0.20/32
!
vlan 10
vlan 20
vlan 30
vlan 40
!
interface Ethernet1
switchport access vlan 10
!
interface Ethernet2
switchport access vlan 20
!
interface Ethernet3
switchport access vlan 30
!
interface Ethernet4
switchport access vlan 40
!
interface Vlan10
vrf red
ip address virtual 192.168.1.0/24
ip igmp
pim ipv4 local-interface loopback0
!
interface Vlan20
vrf red
ip address 192.168.2.0/24
ip igmp
!
interface Vlan30
vrf blue
ip address 192.168.1.0/24
ip igmp
!
interface Vlan40
vrf blue
ip address 192.168.2.0/24
ip igmp
!
interface vxlan1
vxlan source-interface Loopback0
vxlan vlan 10 vni 10
vxlan vlan 20 vni 20
vxlan vlan 30 vni 30
vxlan vlan 40 vni 40
vxlan vrf red vni 100
vxlan vrf blue vni 200
vxlan vlan 10 flood group 225.1.1.2
vxlan vlan 20 flood group 225.1.1.3
vxlan vlan 30 flood group 226.1.1.2
vxlan vlan 40 flood group 226.1.1.3
vxlan vrf red multicast group 225.1.1.1
vxlan vrf blue multicast group 226.1.1.1
!
Show Commands
The following examples are based on the sample topology and configuration in the previous sections.
switch# show bgp evpn route-type mac-ip 20.0.20.2
BGP routing table information for VRF default
Router identifier 0.0.3.1, local AS number 302
Route status codes: s - suppressed, * - valid, > - active, # - not installed, E - ECMP head,
e - ECMP, S - Stale, c - Contributing to ECMP, b - backup
% - Pending BGP convergence
Origin codes: i - IGP, e - EGP, ? - incomplete
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop
Network Next Hop Metric LocPref Weight Path
* > RD: 10.255.0.0:20 mac-ip 0000.0101.0000 20.0.20.2
10.255.0.0 - 100 0 303 300 i
* > RD: 10.255.0.1:20 mac-ip 0000.0101.0000 20.0.20.2
10.255.0.1 - 100 0 303 301 i
As shown above, there are two EVPN MAC-IP routes for the multi-homed CE.
switch# show ip route vrf red 20.0.20.2/32
VRF: red
Codes: C - connected, S - static, K - kernel,
O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
N2 - OSPF NSSA external type2, B - BGP, B I - iBGP, B E - eBGP,
R - RIP, I L1 - IS-IS level 1, I L2 - IS-IS level 2,
O3 - OSPFv3, A B - BGP Aggregate, A O - OSPF Summary,
NG - Nexthop Group Static Route, V - vxlan Control Service,
DH - DHCP client installed default route, M - Martian,
DP - Dynamic Policy Route, L - VRF Leaked
B E 20.0.20.2/32 [200/0] via VTEP 10.255.0.0 VNI 20000 router-mac 00:00:78:01:00:00
via VTEP 10.255.0.1 VNI 20000 router-mac 00:00:78:04:00:00
Two routes to the multi-homed CE are installed from the two EVPN MAC-IP routes and they form L3 ECMP.
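How the two MAC-IP routes collapse into one ECMP entry can be sketched in Python. This is an illustrative model rather than the switch's implementation; the MacIpRoute type and build_ecmp function are hypothetical, and the cap of two paths mirrors the maximum-paths 2 setting in the symmetric IRB example.

```python
from collections import defaultdict
from typing import NamedTuple


class MacIpRoute(NamedTuple):
    rd: str
    ip: str
    vtep: str
    router_mac: str


def build_ecmp(routes, max_paths=2):
    """Group MAC-IP routes by host IP and keep up to max_paths next hops."""
    by_host = defaultdict(list)
    for route in routes:
        if len(by_host[route.ip]) < max_paths:
            by_host[route.ip].append((route.vtep, route.router_mac))
    return dict(by_host)


# Values taken from the show command output above.
routes = [
    MacIpRoute("10.255.0.0:20", "20.0.20.2", "10.255.0.0", "00:00:78:01:00:00"),
    MacIpRoute("10.255.0.1:20", "20.0.20.2", "10.255.0.1", "00:00:78:04:00:00"),
]

ecmp = build_ecmp(routes)
for host, next_hops in ecmp.items():
    print(host, next_hops)
```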
switch# show ip bgp 20.0.20.2/32 vrf red
BGP routing table information for VRF red
Router identifier 10.255.0.2, local AS number 302
BGP routing table entry for 20.0.20.2/32
Paths: 2 available
303 300
10.255.0.0 from 10.0.2.1 (0.0.1.1), imported EVPN route, RD 10.255.0.0:20
Origin IGP, metric 0, localpref 100, IGP metric 0, weight 0, received 00:14:43 ago,
valid, external, ECMP head, ECMP, best, ECMP contributor
Extended Community: Route-Target-AS:64500:10020 Route-Target-AS:64500:20000
TunnelEncap:tunnelTypevxlan EvpnMacMobility:1 EvpnRouterMac:00:00:78:01:00:00
Remote VNI: 20000
Rx SAFI: Unicast
303 301
10.255.0.1 from 10.0.2.1 (0.0.1.1), imported EVPN route, RD 10.255.0.1:20
Origin IGP, metric 0, localpref 100, IGP metric 0, weight 0, received 00:14:43 ago,
valid, external, ECMP, ECMP contributor
Extended Community: Route-Target-AS:64500:10020 Route-Target-AS:64500:20000
TunnelEncap:tunnelTypevxlan EvpnMacMobility:1 EvpnRouterMac:00:00:78:04:00:00
EvpnNdFlags:pflag
Remote VNI: 20000
Rx SAFI: Unicast
The second route has EvpnNdFlags:pflag to indicate that this is a proxy MAC-IP route.
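When checking many paths, the proxy flag can be detected programmatically. The Python sketch below is a minimal example that scans the extended community text of one path; the function name is hypothetical, and it assumes the flag appears in the EvpnNdFlags:pflag form shown in the output above.

```python
def is_proxy_mac_ip(extended_communities: str) -> bool:
    """Return True if an EvpnNdFlags community carrying pflag is present."""
    for token in extended_communities.split():
        name, _, value = token.partition(":")
        if name == "EvpnNdFlags" and "pflag" in value:
            return True
    return False


# Extended communities of the two paths, copied from the output above.
first_path = (
    "Route-Target-AS:64500:10020 Route-Target-AS:64500:20000 "
    "TunnelEncap:tunnelTypevxlan EvpnMacMobility:1 EvpnRouterMac:00:00:78:01:00:00"
)
second_path = (
    "Route-Target-AS:64500:10020 Route-Target-AS:64500:20000 "
    "TunnelEncap:tunnelTypevxlan EvpnMacMobility:1 EvpnRouterMac:00:00:78:04:00:00 "
    "EvpnNdFlags:pflag"
)

print(is_proxy_mac_ip(first_path), is_proxy_mac_ip(second_path))
```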
switch# show bgp evpn instance sbd red
EVPN instance: SBD red
Route distinguisher: 100:1
Service interface: VLAN-based
Local IP address: 10.0.0.20
Encapsulation type: vxlan
switch# show bgp evpn instance sbd blue
EVPN instance: SBD blue
Route distinguisher: 200:1
Service interface: VLAN-based
Local IP address: 10.0.0.20
Encapsulation type: vxlan
switch# show bgp evpn route-type imet next-hop 10.0.0.10 detail
BGP routing table information for VRF default
Router identifier 0.0.0.1, local AS number 300
BGP routing table entry for imet 10.0.0.10, Route Distinguisher: 10:1
Paths: 1 available
Local
10.0.0.10 from 10.0.0.1 (0.0.1.1)
Origin IGP, metric -, localpref 100, weight 0, valid, internal, best
Extended Community: Route-Target-AS:10:1 TunnelEncap:tunnelTypevxlan
VNI: 10
PMSI Tunnel: PIM-SSM Tree, MPLS Label: 10, Leaf Information Required: false,
Tunnel ID: 10.0.0.10, 225.1.1.2
BGP routing table information for VRF default
Router identifier 0.0.0.1, local AS number 300
BGP routing table entry for imet 10.0.0.10, Route Distinguisher: 100:1
Paths: 1 available
Local
10.0.0.10 from 10.0.0.1 (0.0.1.1)
Origin IGP, metric -, localpref 100, weight 0, valid, internal, best
Extended Community: Route-Target-AS:100:1 TunnelEncap:tunnelTypevxlan Multicast
Flags: IGMP proxy, OISM-supported, SBD
VNI: 100
PMSI Tunnel: Ingress Replication, MPLS Label: 100, Leaf Information Required: false,
Tunnel ID: 10.0.0.10
switch# show bgp evpn route-type smet multicast 228.1.1.1
BGP routing table information for VRF default
Router identifier 0.0.1.1, local AS number 300
Route status codes: s - suppressed, * - valid, > - active, E - ECMP head, e - ECMP
S - Stale, c - Contributing to ECMP, b - backup
% - Pending BGP convergence
Origin codes: i - IGP, e - EGP, ? - incomplete
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop
Network Next Hop Metric LocPref Weight Path
* > RD: 100:1 smet (S, G): (*, 228.1.1.1) originating IP: 10.0.0.10
10.0.0.10 - 100 0 i
* > RD: 100:1 smet (S, G): (*, 228.1.1.1) originating IP: 10.0.0.20
- - - 0 i
* > RD: 100:1 smet (S, G): (*, 228.1.1.1) originating IP: 10.0.0.30
10.0.0.30 - 100 0 i
* > RD: 200:1 smet (S, G): (*, 228.1.1.1) originating IP: 10.0.0.20
- - - 0 i
* > RD: 200:1 smet (S, G): (*, 228.1.1.1) originating IP: 10.0.0.40
10.0.0.40 - 100 0 i
switch# show bgp evpn route-type spmsi
BGP routing table information for VRF default
Router identifier 0.0.0.1, local AS number 300
Route status codes: s - suppressed, * - valid, > - active, E - ECMP head, e - ECMP
S - Stale, c - Contributing to ECMP, b - backup
% - Pending BGP convergence
Origin codes: i - IGP, e - EGP, ? - incomplete
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop
Network Next Hop Metric LocPref Weight Path
* > RD: 10:1 spmsi (S, G): (*, *) originating IP: 10.0.0.10
10.0.0.10 - 100 0 i
* > RD: 10:1 spmsi (S, G): (*, *) originating IP: 10.0.0.20
- - - 0 i
* > RD: 20:1 spmsi (S, G): (*, *) originating IP: 10.0.0.20
- - - 0 i
* > RD: 30:1 spmsi (S, G): (*, *) originating IP: 10.0.0.20
- - - 0 i
* > RD: 40:1 spmsi (S, G): (*, *) originating IP: 10.0.0.20
- - - 0 i
* > RD: 10:1 spmsi (S, G): (*, *) originating IP: 10.0.0.30
10.0.0.30 - 100 0 i
* > RD: 20:1 spmsi (S, G): (*, *) originating IP: 10.0.0.30
10.0.0.30 - 100 0 i
* > RD: 30:1 spmsi (S, G): (*, *) originating IP: 10.0.0.30
10.0.0.30 - 100 0 i
* > RD: 40:1 spmsi (S, G): (*, *) originating IP: 10.0.0.30
10.0.0.30 - 100 0 i
* > RD: 30:1 spmsi (S, G): (*, *) originating IP: 10.0.0.40
10.0.0.40 - 100 0 i
* > RD: 100:1 spmsi (S, G): (*, *) originating IP: 10.0.0.10
10.0.0.10 - 100 0 i
* > RD: 100:1 spmsi (S, G): (*, *) originating IP: 10.0.0.20
- - - 0 i
* > RD: 200:1 spmsi (S, G): (*, *) originating IP: 10.0.0.20
- - - 0 i
* > RD: 100:1 spmsi (S, G): (*, *) originating IP: 10.0.0.30
10.0.0.30 - 100 0 i
* > RD: 200:1 spmsi (S, G): (*, *) originating IP: 10.0.0.30
10.0.0.30 - 100 0 i
* > RD: 200:1 spmsi (S, G): (*, *) originating IP: 10.0.0.40
10.0.0.40 - 100 0 i
Limitations
- L2 loop-prevention protocols, such as Spanning Tree Protocol (STP), are not supported between the PE and CE with EVPN VXLAN All-Active Multi-homing. The topology must be loop-free. STP must be disabled for the multi-homed VLANs on PEs and CEs.
- At most two multi-homing destinations are installed at one time for any particular MAC address. Any other valid destinations are ignored in the context of switching unicast traffic until one of the active destinations is removed. The multi-homing PEs operate normally regardless of the number of PEs on the Ethernet segment; this limitation only affects the selection of a destination for unicast traffic.
- VXLAN GPE is not supported, so packets cannot be flagged as having been broadcast. As a result, unicast packets that are known at the sender and unknown at the receiver are dropped if the receiving PE is not a designated forwarder. Similarly, unicast packets that are unknown at the sender and known at the receiver are duplicated at the receiving CE. This state automatically resolves itself as the BGP network distributes the appropriate type 1 auto-discovery and type 2 MAC/IP advertisement EVPN routes.
- Fast mass withdrawal is not supported, so there may be a delay if an interface harboring a large number of MAC addresses goes down.
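The second and third limitations can be illustrated with a simplified Python model. This is a sketch under stated assumptions, not the forwarding implementation: the function names are hypothetical, destinations are assumed to be picked in list order, and a known unicast frame is assumed to be sent to the first PE in the list.

```python
MAX_INSTALLED = 2  # limit stated in the text


def installed_destinations(valid_destinations):
    """Return the destinations used for unicast; extra valid ones are ignored."""
    return valid_destinations[:MAX_INSTALLED]


def copies_at_ce(sender_knows_mac, receiving_pes):
    """Count copies of one unicast frame delivered to a multi-homed CE.

    receiving_pes is a list of (knows_mac, is_designated_forwarder) tuples,
    one per PE on the Ethernet segment.
    """
    if sender_knows_mac:
        targets = receiving_pes[:1]  # known unicast goes to a single PE
    else:
        targets = receiving_pes      # unknown unicast is flooded to every PE
    copies = 0
    for knows_mac, is_df in targets:
        if knows_mac:
            copies += 1  # forwarded as known unicast; the PE cannot tell it was flooded
        elif is_df:
            copies += 1  # treated as flood traffic and forwarded by the DF
        # otherwise: unknown at a non-DF PE, so the frame is dropped
    return copies


pes = ["PE1", "PE2", "PE3"]
print(installed_destinations(pes))      # PE3 is valid but ignored
print(installed_destinations(pes[1:]))  # after PE1 is removed, PE3 is used

print(copies_at_ce(True, [(False, False), (False, True)]))  # known at sender only
print(copies_at_ce(False, [(True, False), (True, True)]))   # known at receivers only
print(copies_at_ce(True, [(True, False), (True, True)]))    # converged state
```

The three copies_at_ce calls correspond to the dropped case, the duplicated case, and the converged state after the type 1 and type 2 routes have been distributed.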
Flood Traffic Filtering with EVPN
A VXLAN fabric managed by EVPN does not always need to be flooded with broadcast, multicast, or unknown MAC traffic. In some deployments, only ARP request broadcast and ND multicast traffic need to be flooded into the VXLAN fabric. In other cases, flooding ARP plus some other traffic is allowed, but not all broadcast traffic into the fabric.
Most ARP bindings are learned through EVPN. In the cases where a binding is not learned through EVPN, it is acceptable to flood the corresponding ARP requests, but the ARP flooding going into the VXLAN fabric should be rate limited.
Configuration
The flooding default disabled command disables flooding of all kinds of traffic into the VXLAN fabric. The arp flooding and nd flooding commands then restrict flooding to ARP and ND traffic only.
switch(config)# router l2-vpn
switch(config-rtr-l2-vpn)# flooding default disabled
switch(config-rtr-l2-vpn)# arp flooding
switch(config-rtr-l2-vpn)# nd flooding
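The effect of the commands above can be modeled as a default policy with per-type exceptions. The Python sketch below is illustrative only; the function name and traffic type labels are hypothetical, and it assumes that a per-type flooding command takes precedence over the default.

```python
def should_flood(traffic_type, default_enabled, per_type):
    """Decide whether a frame of traffic_type is flooded into the fabric.

    per_type maps a traffic type to True/False and overrides the default.
    """
    return per_type.get(traffic_type, default_enabled)


# Mirrors: flooding default disabled / arp flooding / nd flooding
default_enabled = False
per_type = {"arp": True, "nd": True}

for traffic_type in ("arp", "nd", "broadcast", "multicast", "unknown-unicast"):
    print(traffic_type, should_flood(traffic_type, default_enabled, per_type))
```

In this model only the arp and nd traffic types are flooded; every other type falls back to the disabled default.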
Show Commands
The show vxlan counters software command shows the number of ARP and ND packets that were prevented from being flooded into the VXLAN fabric.
switch# show vxlan counters software
. . . . . . . .
Tx pkts after IPv6 encapsulation : 0
SW pkts forwarded to remote VTEPs via HW HER : 4
SW pkts forwarding to remote VTEPs via HW HER failed : 0
Packets suppressed from getting flooded : 0
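For monitoring, the counter lines can be parsed into a dictionary. The Python sketch below is a minimal example that assumes each counter is printed as "name : value" on its own line, as in the excerpt above; the function name is hypothetical, and how the output is collected from the switch is out of scope.

```python
import re

# Matches "<counter name> : <integer>"; other lines are skipped.
COUNTER_LINE = re.compile(r"^\s*(?P<name>.+?)\s*:\s*(?P<value>\d+)\s*$")


def parse_counters(output):
    """Parse 'name : value' lines into a dict of integer counters."""
    counters = {}
    for line in output.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group("name")] = int(match.group("value"))
    return counters


sample = """\
Tx pkts after IPv6 encapsulation : 0
SW pkts forwarded to remote VTEPs via HW HER : 4
SW pkts forwarding to remote VTEPs via HW HER failed : 0
Packets suppressed from getting flooded : 0
"""

counters = parse_counters(sample)
print(counters["Packets suppressed from getting flooded"])
```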