Jul 20 2017


Hyper-V Dynamic Port Distribution and f5 Load Balancer

Recently I got involved in a mysterious case: Hyper-V VMs were facing intermittent issues when communicating with an f5 hardware load balancer. Let me first describe the network architecture and the symptoms:

  • VMs, load balancer, and load balanced services are in the same VLAN and subnet.
  • The case we are working on is SMTP, so we can test with telnet.
  • The virtual switches on Hyper-V use NIC Teaming in switch independent mode with Dynamic port distribution (Windows Server 2016).
  • Telnet to the Virtual IP (VIP) works sometimes and fails most of the time.
  • Telnet directly to the service (Exchange servers) bypassing the f5 works perfectly all the time.
  • Changing the load balancing algorithm to Hyper-V Port fixes the issue.
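
An intermittent symptom like this is easier to reason about once it is counted rather than eyeballed. The snippet below is a minimal sketch of what the manual telnet test does, repeated several times. The function name `probe` and its parameters are my own for illustration and are not part of any tool mentioned in this post.

```python
import socket


def probe(host, port=25, attempts=10, timeout=3.0):
    """Open a TCP connection `attempts` times and return how many succeeded.

    This only checks that the TCP handshake completes, which is the part
    that fails in the scenario described above. It does not speak SMTP.
    """
    successes = 0
    for _ in range(attempts):
        try:
            # create_connection raises OSError (including timeouts) on failure.
            with socket.create_connection((host, port), timeout=timeout):
                successes += 1
        except OSError:
            pass
    return successes
```

Calling `probe` once with the VIP and once with the address of an Exchange server directly should show the difference between the two paths: a low success count through the f5 and a full count when bypassing it.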

Now if there is one thing I hate, it's network issues. And what I hate most is network issues where things work sometimes but often do not.

Why is this happening?

To spare you a lot of details, I will jump into what happens with Dynamic port distribution and why Hyper-V Port fixes it. Below is a simplified diagram of the network. Note that there is an affinity between the virtual NIC (vNIC) and one of the NIC team members.

As per this TechNet article, in Dynamic mode there are two cases for an outgoing packet from the vNIC to the network:

  1. The packet leaves through the affinitized team member. It is sent with the MAC address of the vNIC (no MAC replacement).
  2. The packet leaves through a non-affinitized team member. It is sent with the MAC address of the physical NIC used (the MAC address is replaced).
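
The two cases above can be sketched as a small function. This is a simplified model for illustration, not Hyper-V's actual implementation. The function name, the NIC names, and all MAC addresses are hypothetical.

```python
def egress_source_mac(vnic_mac, affinitized_nic, chosen_nic, pnic_macs):
    """Return the source MAC a frame carries when it leaves through chosen_nic.

    pnic_macs maps a team member name to its physical MAC address.
    """
    if chosen_nic == affinitized_nic:
        # Case 1: affinitized team member, the vNIC MAC is kept.
        return vnic_mac
    # Case 2: any other team member, the MAC is replaced.
    return pnic_macs[chosen_nic]


# Hypothetical addresses and NIC names, for illustration only.
VNIC_MAC = "00:15:5d:00:00:01"
PNIC_MACS = {"NIC1": "aa:bb:cc:00:00:01", "NIC2": "aa:bb:cc:00:00:02"}

# The vNIC is affinitized to NIC1 in this example.
print(egress_source_mac(VNIC_MAC, "NIC1", "NIC1", PNIC_MACS))  # vNIC MAC kept
print(egress_source_mac(VNIC_MAC, "NIC1", "NIC2", PNIC_MACS))  # NIC2's MAC used
```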

This can be seen in the network trace below from the f5, showing one successful packet with the vNIC's MAC address and two failed ones with the physical NIC's MAC address.

What happens in the two failed telnet sessions is as follows:

  1. The VM sends a packet from its vNIC to the virtual switch.
  2. The virtual switch sends the packet through a non-affinitized physical NIC. This means replacing the source MAC address with that of the physical NIC.
  3. The packet reaches the f5, which replies to the MAC address of the physical NIC.
  4. The reply is received by the Hyper-V host, which drops it because it does not know how to translate the physical NIC's MAC address to that of the vNIC on the switch.

Now this may seem like an error on Hyper-V's side, but it is not. The IP RFCs clearly state that the f5 should not reply to the source MAC address of the received packet. Responses should be sent to the MAC address in the last ARP response received, which will always point to the MAC address of the vNIC. However, the f5 implements a feature called Auto Last Hop, which replies to the source MAC address regardless of the routing tables.

This is evident in the network trace below from the Exchange server showing how it receives the packet from the physical NIC’s MAC address but replies to the correct MAC of the vNIC, thus no errors occur.
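
The difference between the two behaviors can be sketched as a toy model. All names, IP addresses, and MAC addresses below are hypothetical. The host-side rule (deliver only frames addressed to a known vNIC MAC) is a simplification of what the virtual switch does, and the `auto_last_hop` flag only stands in for the f5 feature.

```python
# Hypothetical addresses, for illustration only.
CLIENT_IP = "192.0.2.20"
VNIC_MAC = "00:15:5d:00:00:01"      # MAC of the VM's vNIC
PNIC2_MAC = "aa:bb:cc:00:00:02"     # MAC of a non-affinitized physical NIC

# What any peer learns through ARP: the VM's IP always maps to the vNIC MAC.
ARP_CACHE = {CLIENT_IP: VNIC_MAC}


def reply_dst_mac(frame_src_mac, arp_cache, client_ip, auto_last_hop):
    """Pick the destination MAC for the reply frame."""
    if auto_last_hop:
        # Reply to whatever MAC the request came from.
        return frame_src_mac
    # Standard behavior: use the MAC learned through ARP.
    return arp_cache[client_ip]


def host_delivers(dst_mac, vnic_macs):
    """Simplified host rule: only frames addressed to a vNIC MAC are delivered."""
    return dst_mac in vnic_macs


def session_succeeds(frame_src_mac, auto_last_hop):
    dst = reply_dst_mac(frame_src_mac, ARP_CACHE, CLIENT_IP, auto_last_hop)
    return host_delivers(dst, {VNIC_MAC})


for peer, alh in (("f5 with Auto Last Hop", True), ("Exchange server", False)):
    for src in (VNIC_MAC, PNIC2_MAC):
        print(f"{peer}, request from {src}: delivered={session_succeeds(src, alh)}")
```

In this model only one combination fails: a request that left through the non-affinitized NIC and was answered by a peer that replies to the source MAC, which matches the pattern seen in the traces.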

Configuring the load balancing algorithm to Hyper-V Port fixes the issue with the f5 because this algorithm uses only the affinitized NIC, so no MAC replacement is performed.

Another option is to reconfigure the f5 so that it stops replying to the source MAC address regardless of the routing tables. This is explained in great detail on their website in the article K13876: Overview of the Auto Last Hop setting, and it can be done for the whole device, a VLAN, or a Virtual Server.

Also, note that only switch independent modes are affected, because in switch dependent modes MAC replacement is never done.

Oh… Logical explanations are like a cold drink on a hot day!


About the author

Walid AlMoselhy

Permanent link to this article: http://almoselhy.azurewebsites.net/2017/07/hyper-v-dynamic-port-distribution-and-f5-load-balancer/
