MS Network Load Balancing – The Fine Print

Natty Light!

Microsoft’s NLB clustering is to high-availability load balancing roughly what Natural Light is to the beer world. Both will basically get the job done, and on the cheap, but in the long run they might leave you with a wicked headache and wishing you had spent a few extra dollars on a Sam Adams.

A lot of my time at work recently has been spent researching and testing load balancing and failover solutions for a group of Windows-based application servers. Having never had load balancing requirements before, I thought an NLB clustering solution sounded good at first, especially since it is included free with the OS. However, I found that unless your environment meets its requirements exactly, you may be better off not going down the MS NLB road. This brief overview of my lessons learned may help others who are also considering NLB.

* Basically, MS NLB works by assigning a virtual IP address (VIP) to the network adapter of each cluster member. Traffic sent to the VIP is received by all cluster members, accepted by one, and dropped by the rest.

* MS NLB supports two configurations: unicast mode and multicast mode. Unicast mode replaces the existing MAC address of every cluster member with a new cluster MAC address, which is shared by all nodes. Multicast mode adds the cluster MAC address to each node's adapter but leaves the original one in place. In both modes, the nodes share an IP and MAC address, so when a client asks “who has this IP address” (an ARP request), all nodes respond.

* Unicast mode aims to be simple, and has the advantage of working across routers with no problems. However, it has the negative side effect of flooding switch ports. Because MS NLB masks the source MAC address of outgoing cluster traffic, switches never learn which ports the cluster members are attached to, so traffic destined for the cluster is flooded out of all ports. This effectively turns a switch into a hub as far as cluster traffic goes, which can cause network issues with busy clusters. This can be overcome by adding static ARP entries on the switch (if supported), but that can quickly become a management nightmare. Another possible drawback to unicast mode is that cluster members cannot communicate directly with each other without adding a second NIC.

* Multicast mode attempts to address switch flooding by using IGMP multicast support, which tells the switch to direct cluster traffic only to those ports with cluster members attached. However, this assumes the switch supports IGMP snooping and has it enabled. Also, many routers and Layer 3 switches do not support this mode because the ARP replies associate a unicast IP with a multicast MAC, which may or may not be against standards depending on whether you ask Microsoft or Cisco. No IGMP support on the switch means switch flooding. And a router that refuses those ARP replies means no cluster access from outside that subnet unless a static ARP entry is added on the router.

* Implementing NLB in a virtualized environment adds complexity. The only platform I can speak to from experience is VMware ESX. It supports both modes, but unicast is not recommended. By default, unicast doesn’t work because the virtual switches learn MAC addresses despite the cluster masking outbound traffic, which breaks clustering. This can be overcome by disabling the Notify Switches option, but that in turn breaks operations like vMotion. Multicast works, but it is subject to the same problems mentioned above, and is made more complex by the many different physical and virtual topologies.
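To make the flooding behaviour in the unicast bullet concrete, here is a minimal sketch of a toy learning switch in Python. It is not how any real switch or NLB itself is implemented; the class name, port numbers, and MAC addresses are all made up for illustration. It only shows the general rule that a switch floods frames for a destination MAC it has never seen as a source, which is exactly the situation source MAC masking creates.

```python
class LearningSwitch:
    """Toy Layer 2 switch: learns source MACs, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports a frame is sent out of."""
        self.mac_table[src_mac] = in_port  # learn from the source address
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        # Unknown destination: flood every port except the ingress port.
        return [p for p in self.ports if p != in_port]


# All addresses below are hypothetical values chosen for the example.
CLUSTER_MAC = "02:bf:0a:00:00:64"      # shared cluster MAC
MASKED_NODE_MAC = "02:01:0a:00:00:64"  # what a node puts in outgoing frames
PLAIN_HOST_MAC = "aa:bb:cc:00:00:02"   # ordinary server, no masking
CLIENT_MAC = "00:11:22:33:44:55"

switch = LearningSwitch(ports=[1, 2, 3, 4])

# A cluster node on port 1 replies to the client with a masked source MAC,
# so the switch never sees CLUSTER_MAC as a source address.
switch.forward(1, MASKED_NODE_MAC, CLIENT_MAC)
# An ordinary host on port 2 talks normally and is learned.
switch.forward(2, PLAIN_HOST_MAC, CLIENT_MAC)

# The client on port 4 now sends to both destinations.
to_cluster = switch.forward(4, CLIENT_MAC, CLUSTER_MAC)
to_plain_host = switch.forward(4, CLIENT_MAC, PLAIN_HOST_MAC)

print("cluster traffic goes out ports:", to_cluster)
print("ordinary traffic goes out ports:", to_plain_host)
```

In this sketch the cluster-bound frame leaves every port except the one it arrived on, while the frame for the ordinary host leaves a single port. The same model hints at why a virtual switch that does learn the cluster MAC breaks unicast mode: traffic would then reach only one node instead of all of them.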

I certainly don’t intend to demean Microsoft (or Natty Light) or their products. Microsoft could easily have left NLB out of the OS, leaving an expensive hardware load balancer as the only option. MS NLB does work, and provided you are aware of its limitations and can address them, you may find it to be an effective low-cost load balancing solution in your environment. On the flip side, if you find that the management overhead is too much and you need a hardware LB device, there are a number of powerful and relatively inexpensive possibilities. The ones from Barracuda Networks are a good choice. There are also other factors not covered here that need to be taken into account: session support, affinity, and redundant network topologies, to name a few. So make sure to do adequate research, up to and including packet captures to prove intended operation.
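When reading those packet captures, it helps to know which cluster MAC address to look for. As far as I recall from Microsoft's documentation, NLB derives it from the VIP: 02-BF plus the four IP octets in unicast mode, 03-BF plus the four octets in multicast mode, and 01-00-5E-7F plus the last two octets in IGMP multicast mode. The sketch below assumes those default prefixes (check them against what your own cluster reports before relying on them), and the function names and the example VIP are hypothetical.

```python
import ipaddress


def nlb_cluster_mac(vip, mode):
    """Derive the expected NLB cluster MAC for a VIP.

    Assumes the default prefixes described above; confirm against the
    MAC shown in your own cluster's properties.
    """
    octets = ipaddress.IPv4Address(vip).packed  # the four IP bytes
    if mode == "unicast":
        mac = bytes([0x02, 0xBF]) + octets
    elif mode == "multicast":
        mac = bytes([0x03, 0xBF]) + octets
    elif mode == "igmp":
        mac = bytes([0x01, 0x00, 0x5E, 0x7F]) + octets[2:]
    else:
        raise ValueError(f"unknown NLB mode: {mode!r}")
    return ":".join(f"{b:02x}" for b in mac)


def capture_filter(vip, mode):
    """Build a pcap-style filter matching frames to or from the cluster MAC."""
    return f"ether host {nlb_cluster_mac(vip, mode)}"


# Hypothetical VIP used only for the example.
for mode in ("unicast", "multicast", "igmp"):
    print(mode, "->", capture_filter("10.0.0.100", mode))
```

The resulting string can be passed to a capture tool that accepts pcap filter syntax. Capturing on a switch port with no cluster member attached and still seeing frames addressed to the cluster MAC is a quick way to confirm that flooding is happening.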

Related Links:
Network Load Balancing Clusters – TechNet
Selecting Unicast or Multicast – TechNet
Implementing NLB in a Virtualized Environment


3 Responses

  1. Nice article, but you are completely leaving out the other software load balancers. Have you ever heard of Resonate, Inc? They have a great, inexpensive commercial cross-platform solution.

  2. Great article/detail. One quick question – do I need anything special as far as a switch or router goes? Or is this all software?

  3. Possibly, see the mentions in the article about static ARP entries and IGMP snooping.
