One of the final validation points in an ongoing project has been enabling ESX to use a LAG on both the Management and VM networks. The last time we tried this we could not get it to work, and the fix turned out to be very simple. The core of the issue was that we were attempting to use a dynamic (LACP) LAG, which ESX does not support on a standard vSwitch; instead you need to use a static LAG.
Create New LAG
(switch01) #configure
(switch01) (Config)#port-channel esxhost00-vmnet
(switch01) (Config)#exit
Show New LAG
(switch01) #show port-channel
Logical Interface Group Id Port-Channel Name Link State Mbr Ports Active Ports
----------------- -------- ----------------- ---------- --------- ------------
0/1/11            11       esxhost00-vmnet   Down
The output above gives us the logical interface (0/1/11), which we will need later. Looking at the LAG with a different command shows the problem.
(switch01) #show port-channel all
       Port-                  Link
Log.   Channel                Adm. Trap STP            Mbr    Port      Port
Intf   Name            Link   Mode Mode Mode   Type    Ports  Speed     Active
------ --------------- ------ ---- ---- ------ ------- ------ --------- ------
lag 11 esxhost00-vmnet Up     En.  En.  En.    Dynamic
The Type column shows the problem: by default the switch creates LAGs in dynamic mode, but ESX requires static mode.
Set the LACP Mode as Static
(switch01) #configure
(switch01) (Config)#interface 0/1/11
(switch01) (Interface 0/1/11)#port-channel static
(switch01) (Interface 0/1/11)#exit
Down the Switchport for the Standby Vmnic
(switch01) #configure
(switch01) (Config)#interface 2/0/2
(switch01) (Interface 2/0/2)#shutdown
(switch01) (Interface 2/0/2)#exit
Review Existing ESX Portgroup Configuration
# esxcli network vswitch standard portgroup policy failover get --portgroup-name="VM Network"
   Load Balancing: srcport
   Network Failure Detection: link
   Notify Switches: true
   Failback: true
   Active Adapters: vmnic4
   Standby Adapters: vmnic6
   Unused Adapters:
   Override Vswitch Load Balancing: false
   Override Vswitch Network Failure Detection: false
   Override Vswitch Notify Switches: false
   Override Vswitch Failback: false
   Override Vswitch Uplinks: true

# esxcli network vswitch standard portgroup policy failover get --portgroup-name="Management Network"
   Load Balancing: srcport
   Network Failure Detection: link
   Notify Switches: true
   Failback: true
   Active Adapters: vmnic4
   Standby Adapters: vmnic6
   Unused Adapters:
   Override Vswitch Load Balancing: false
   Override Vswitch Network Failure Detection: false
   Override Vswitch Notify Switches: false
   Override Vswitch Failback: false
   Override Vswitch Uplinks: true
Change Existing ESX Portgroup Configuration
# esxcli network vswitch standard portgroup policy failover set --portgroup-name="VM Network" --load-balancing=iphash --failure-detection=link --notify-switches true --failback false --active-uplinks=vmnic4,vmnic6
# esxcli network vswitch standard portgroup policy failover set --portgroup-name="Management Network" --load-balancing=iphash --failure-detection=link --notify-switches true --failback false --active-uplinks=vmnic4,vmnic6
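Since the same change has to be applied to every portgroup on the LAG, it can help to generate the command lines instead of typing them twice. The following is only a rough sketch: the helper name failover_set_command is made up for illustration, the flags are the ones used in the commands above, and the script only prints the commands rather than running them.

```python
import shlex


def failover_set_command(portgroup, uplinks):
    """Build the esxcli argument list used in this post for one portgroup.

    Hypothetical helper: it only assembles the flags shown above and does
    not talk to the host.
    """
    return [
        "esxcli", "network", "vswitch", "standard", "portgroup",
        "policy", "failover", "set",
        f"--portgroup-name={portgroup}",
        "--load-balancing=iphash",
        "--failure-detection=link",
        "--notify-switches", "true",
        "--failback", "false",
        f"--active-uplinks={','.join(uplinks)}",
    ]


# Portgroup and vmnic names are the ones from this post; adjust to taste.
for portgroup in ("VM Network", "Management Network"):
    # shlex.join quotes arguments containing spaces so the line is shell-safe.
    print(shlex.join(failover_set_command(portgroup, ["vmnic4", "vmnic6"])))
```

Each printed line can be reviewed and then pasted into the ESX shell.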
Review New ESX Portgroup Configuration
# esxcli network vswitch standard portgroup policy failover get --portgroup-name="VM Network"
   Load Balancing: iphash
   Network Failure Detection: link
   Notify Switches: true
   Failback: false
   Active Adapters: vmnic4, vmnic6
   Standby Adapters:
   Unused Adapters:
   Override Vswitch Load Balancing: true
   Override Vswitch Network Failure Detection: true
   Override Vswitch Notify Switches: true
   Override Vswitch Failback: true
   Override Vswitch Uplinks: true

# esxcli network vswitch standard portgroup policy failover get --portgroup-name="Management Network"
   Load Balancing: iphash
   Network Failure Detection: link
   Notify Switches: true
   Failback: false
   Active Adapters: vmnic4, vmnic6
   Standby Adapters:
   Unused Adapters:
   Override Vswitch Load Balancing: true
   Override Vswitch Network Failure Detection: true
   Override Vswitch Notify Switches: true
   Override Vswitch Failback: true
   Override Vswitch Uplinks: true
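The three things to look for in that output are iphash load balancing, both vmnics listed as active, and nothing left in standby. If you would rather not eyeball it on every host, a small parser can do the check. This is a sketch under assumptions: the function names are hypothetical, and it assumes the output keeps the "Key: value" shape shown above.

```python
def parse_failover_policy(text):
    """Turn the 'Key: value' lines of the esxcli output into a dict."""
    policy = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip anything that is not a 'Key: value' line
        policy[key.strip()] = value.strip()
    return policy


def lag_ready(policy, uplinks):
    """True if the portgroup uses iphash with exactly these active uplinks."""
    active = [u.strip() for u in policy.get("Active Adapters", "").split(",") if u.strip()]
    return (
        policy.get("Load Balancing") == "iphash"
        and sorted(active) == sorted(uplinks)
        and policy.get("Standby Adapters", "") == ""
    )


# Sample copied from the output above (trimmed to the relevant lines).
sample = """\
   Load Balancing: iphash
   Network Failure Detection: link
   Notify Switches: true
   Failback: false
   Active Adapters: vmnic4, vmnic6
   Standby Adapters:
   Unused Adapters:
"""

policy = parse_failover_policy(sample)
print(lag_ready(policy, ["vmnic4", "vmnic6"]))
```

For the sample above this prints True; feeding it the original srcport output would give False.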
Add Both Switchports to the LAG
(switch01) #configure
(switch01) (Config)#interface 2/0/3
(switch01) (Interface 2/0/3)#addport 0/1/11
(switch01) (Interface 2/0/3)#exit
(switch01) (Config)#interface 2/0/2
(switch01) (Interface 2/0/2)#addport 0/1/11
(switch01) (Interface 2/0/2)#exit
(switch01) (Config)#exit
Review LAG Configuration
(switch01) #show port-channel all
       Port-                  Link
Log.   Channel                Adm. Trap STP            Mbr    Port      Port
Intf   Name            Link   Mode Mode Mode   Type    Ports  Speed     Active
------ --------------- ------ ---- ---- ------ ------- ------ --------- ------
lag 11 esxhost00-vmnet Up     En.  En.  En.    Static  2/0/3  Auto      True
                                                       2/0/2  Auto      False
Verify ESX Connectivity
Now that we have a LAG configured and operational, albeit on only one port, traffic will either work or it will not. If you have connectivity, you should be able to safely enable the disabled port (in my case 2/0/2) without any impact on your traffic. If you do not have connectivity, the likely cause is something like VLANs not being configured correctly on the LAG; we ran into exactly that when we put this into production.
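A quick way to make the "work or not" check repeatable is to attempt a TCP connection to something on each network. The sketch below is only an illustration: the function names and the address in the comment are hypothetical, and it assumes the target accepts TCP on the given port (the ESX management interface normally answers on 443).

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def unreachable_targets(targets, timeout=2.0):
    """Return the (host, port) pairs from targets that did not answer."""
    return [t for t in targets if not tcp_reachable(t[0], t[1], timeout)]


# Example (hypothetical management address, replace with your own):
#   failed = unreachable_targets([("192.0.2.10", 443)])
#   print("unreachable:", failed)
```

An empty list means every target answered, so it should be safe to move on to re-enabling the second port. Running the same check again after that step confirms nothing broke.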
Up the Switchport for the Standby Vmnic
(switch01) #configure
(switch01) (Config)#interface 2/0/2
(switch01) (Interface 2/0/2)#no shutdown
(switch01) (Interface 2/0/2)#exit
(switch01) (Config)#exit
Review Final LAG Configuration
(switch01) #show port-channel all
       Port-                  Link
Log.   Channel                Adm. Trap STP            Mbr    Port      Port
Intf   Name            Link   Mode Mode Mode   Type    Ports  Speed     Active
------ --------------- ------ ---- ---- ------ ------- ------ --------- ------
lag 11 esxhost00-vmnet Up     En.  En.  En.    Static  2/0/3  Auto      True
                                                       2/0/2  Auto      True
Notice that the LAG interface is now up with both ports active. Assuming you still have basic connectivity to each of your vSwitches, you should be good to go.
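If you have more than a couple of LAGs to check, the same review can be scripted against a saved copy of the switch output. This is a rough sketch, not a general parser: the function names are hypothetical, and it assumes the layout shown in this post, where each LAG row starts with "lag", member ports look like 2/0/3, and the last column is True or False.

```python
import re

# A member port followed (eventually) by its Active flag at end of line.
MEMBER_RE = re.compile(r"(\d+/\d+/\d+)\s+.*?\b(True|False)\s*$")
TYPE_RE = re.compile(r"\b(Static|Dynamic)\b")


def parse_port_channels(text):
    """Map 'lag N' to its name, type and {member port: active} dict."""
    lags = {}
    current = None
    for line in text.splitlines():
        if line.startswith("lag "):
            fields = line.split()
            lag_type = TYPE_RE.search(line)
            current = {
                "name": fields[2] if len(fields) > 2 else "",
                "type": lag_type.group(1) if lag_type else None,
                "members": {},
            }
            lags[" ".join(fields[:2])] = current
        if current is None:
            continue  # still in the header lines
        member = MEMBER_RE.search(line)
        if member:
            current["members"][member.group(1)] = member.group(2) == "True"
    return lags


def lag_healthy(lag):
    """Static type, at least one member, and every member active."""
    return lag["type"] == "Static" and bool(lag["members"]) and all(lag["members"].values())


sample = """\
Intf   Name            Link   Mode Mode Mode   Type    Ports  Speed     Active
------ --------------- ------ ---- ---- ------ ------- ------ --------- ------
lag 11 esxhost00-vmnet Up     En.  En.  En.    Static  2/0/3  Auto      True
                                                       2/0/2  Auto      True
"""

lags = parse_port_channels(sample)
print(lag_healthy(lags["lag 11"]))
```

With the final output above this prints True; with the earlier output, where 2/0/2 was still shut down, it would print False.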