Load Balancing FAQ
Generally, load balancing is any method for evenly distributing processing or service requests across devices in a network. We are talking about server and network load balancing here.
Preface: This page is heavily borrowed from other sites including LVS / LVSKB / HAProxy / Loadbalancer.org (our main site) etc. I aim to give it more original content and structure ASAP... honest :-). Mainly created because our primary site doesn't get anywhere in the search results for 'load balancing', which we think it should!
We will try to give a lot of links to external sites and also our competitors so that this is not a complete waste of your precious browsing time.
One of the common problems with IT is the horrendous abuse of terminology by marketing types using terms like ADC, ADN... and on and on...
We'd like to use this page to clear up a few terms using our own perspective from 10 years as a load balancing appliance vendor.
If you think anything is missing (no really?), drop us an email.... ( email@example.com )
Layer-2 Load Balancing (bonding)
Layer-4 Load Balancing
Layer-7 Load Balancing (reverse proxy)
Hardware SSL acceleration or offload
Persistence / Sticky / Affinity
Server health checking
DNS Load Balancing
Link Load Balancing
Load Balancing Optimization / Compression
WAN Load Balancing Optimization / Compression
SIP Load Balancing
Computing Load Balancing
FreeBSD stuff: CARP, PF and hoststated
Load Balancing Appliance vendors
Layer-2 Load Balancing (bonding)
Layer-2 load balancing, also known as link aggregation, port aggregation, EtherChannel or Gigabit EtherChannel port bundling, bonds two or more links into a single, higher-bandwidth logical link. Aggregated links also provide redundancy and fault tolerance if each of the aggregated links follows a different physical path. Link aggregation may be used to improve access to public networks by aggregating modem links or digital lines, or in the enterprise network to build multi-gigabit backbone links between gigabit ethernet switches. See also NIC teaming and the Link Aggregation Control Protocol (LACP).
The Linux kernel has the Linux bonding driver, which can aggregate multiple links for higher throughput or fault tolerance.
Our Opinion: The Linux Bonding driver works really well in master/slave mode without any changes to your infrastructure. If you have a trunk configured on your switches then you can use full 802.3ad LACP.
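The two modes mentioned above can be sketched with the Linux bonding driver (a configuration sketch only: eth0/eth1 and the addresses are placeholders, the commands need root, and your distro may prefer its own network config files):

```
# /etc/modprobe.d/bonding.conf - load the bonding driver in
# 802.3ad (LACP) mode; this needs a matching LACP trunk on the switch
alias bond0 bonding
options bonding mode=802.3ad miimon=100

# For master/slave with no switch changes, use mode=active-backup instead.
# Then address the bond and enslave the physical NICs:
#   ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
#   ifenslave bond0 eth0 eth1
```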
Layer-4 Load Balancing
Layer-4 load balancing distributes requests to servers at the transport layer, i.e. over TCP, UDP or SCTP. The load balancer distributes network connections from clients, who know only a single IP address for a service, to a set of servers that actually perform the work. Since in connection-oriented transports a connection must be established between client and server before any request content is sent, the load balancer usually selects a server without looking at the content of the request.
IPVS / LVS is an implementation of layer-4 load balancing for the Linux kernel, and has been ported to FreeBSD recently. Loadbalancer.org, Kemp Technologies & Barracuda et al. use IPVS extensively in their hardware load balancers.
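A minimal IPVS virtual service can be sketched with ipvsadm (a configuration sketch only: the VIP and real-server addresses are placeholders, and the commands need root):

```
# Virtual service on the VIP, port 80, round-robin scheduling
ipvsadm -A -t 192.168.1.100:80 -s rr

# Two real servers behind it, using NAT (masquerading) forwarding;
# -g (direct routing) or -i (tunnelling) are the other forwarding methods
ipvsadm -a -t 192.168.1.100:80 -r 10.0.0.1:80 -m
ipvsadm -a -t 192.168.1.100:80 -r 10.0.0.2:80 -m
```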
Layer-4 load balancing can also be used to balance traffic at multiple Internet access links, in order to increase Internet access speed. See DSL load balancing for more information. SmoothWall, FatPipe, Xrio et al. provide appliances to do this.
Our Opinion: IPVS aka. LVS is awesome: a fast, reliable open source load balancing solution, best combined with Linux-HA (Heartbeat), Keepalived or Ultra Monkey / Ldirectord.
Layer-7 Load Balancing (reverse proxy)
Layer-7 load balancing, also known as application-level load balancing, parses requests at the application layer and distributes them to servers based on the content of each request, so it can meet quality-of-service requirements for different types of content and improve overall cluster performance. The overhead of parsing requests at the application layer is high, so its scalability is limited compared to layer-4 load balancing.
KTCPVS is an implementation of layer-7 load balancing for the Linux kernel. With the appropriate modules, the Apache, Lighttpd and nginx web servers can also provide layer-7 load balancing as a reverse proxy.
Lots of commercial vendors use layer-7 load balancing for cookie insertion etc. Barracuda do cookie insertion OK... Loadbalancer.org and Kemp do a nice extra, which is Terminal Server RDP cookies... BUT for real flexibility F5 and Citrix NetScaler dominate the layer-7 load balancing market. F5 like to call it an ADC (Application Delivery Controller) or ADN (Application Delivery Network)... we prefer the honest term of proxy or reverse proxy, but that's not so sexy, is it?
Our Opinion: KTCPVS doesn't seem as mature as HAProxy, and it looks like the best features, such as kernel splicing, are being integrated into HAProxy as well. Exceliance and Loadbalancer.org are working with the community to ensure RDP cookies, source IP persistence and keepalive are integrated into the open source HAProxy solution so that it can give the big boys a run for their money. UPDATE: Hey, it's all finished and it's juicylicious in HAProxy 1.4.2...
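Cookie insertion of the kind described above looks roughly like this in an haproxy.cfg (a sketch only; the backend name, server names and addresses are placeholders):

```
# HAProxy inserts a SERVERID cookie on the first response and
# routes repeat visitors back to the server named in the cookie
backend web_farm
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server web1 10.0.0.1:80 cookie web1 check
    server web2 10.0.0.2:80 cookie web2 check
```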
SSL Termination
SSL termination is the ability of a load balancer to establish the secure tunnel with the client itself, in most cases removing the requirement for the web server to perform SSL. To perform this function the load balancer must be configured with an SSL certificate, either self-generated or signed by a certificate authority. SSL termination is required for any layer-7 trickery such as cookie insertion, because otherwise the load balancer can't read the encrypted payload of the packets. Layer-4 load balancing doesn't need to read the packet contents and therefore doesn't require SSL termination.
Our Opinion: SSL termination puts a heavy processing load on your load balancing appliance, so why not spread the SSL termination load across your cluster for better scalability? Obviously you have to use it if you want layer-7 functionality on SSL traffic. BTW: a basic Celeron CPU can do 700 TPS these days.
"Concerning the CPU intensive tasks (compression, SSL, ...), I find it very important to explain that once the device is saturated, it's the end and you will never scale anymore. Also, explaining that a $100k device can see its performance divided by 10 or 100 just to save some configuration on backend servers is stupid." - Willy Tarreau (Author of HAProxy)
Hardware SSL acceleration or offload
Hardware SSL acceleration or offload means that a dedicated hardware chipset is used to handle the CPU intensive process of SSL termination. Modern hardware acceleration cards can handle 10,000+ TPS (terminations per second).
Our Opinion: a commonly abused term by vendors, so check the TPS rating! Not as important as it used to be, since a quad core CPU can do thousands of TPS (which is a lot). Also, are you sure you really want to do all this on the load balancer? Why not use the cluster for it instead?
Question: Does anyone still sell decent PCI-E hardware SSL accelerator cards?
Persistence / Sticky / Affinity
Persistence is a feature required by many web applications. Once a user has interacted with a particular server, all subsequent requests are sent to the same server. It is normally required when the session state is stored locally on the web server rather than in a database.
Question: Does persistence give me session failover? No. If your application does not have a persistent backend storage device (a database), then you only get increased performance; failover will lose the session.
"THIS IS THE SAME FOR ALL LOAD BALANCERS - NO PERSISTENCE IN THE APPLICATION = NO SESSION FAILOVER" - Malcolm Turnbull - (Founder of Loadbalancer.org)
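Source-IP persistence, the simplest sticky scheme, can be sketched in a few lines of Python (a toy illustration, not any vendor's implementation; the server names are placeholders):

```python
import hashlib

# Hypothetical backend pool - names are placeholders
SERVERS = ["web1", "web2", "web3"]

def pick_server(client_ip, servers=SERVERS):
    """Source-IP persistence: hash the client's IP address so the
    same client is always mapped to the same backend, as long as
    the pool itself does not change."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Note the catch from the quote above: if the chosen server dies and the session lives only on that server, persistence cannot save it; the client is re-mapped, but the session is gone.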
Server health checking
Server health checking is the ability of the load balancer to run a test against the servers to determine whether they are still providing the service.
Ping: this is the simplest method, but it is not very reliable: the server can be up while the web service is down, and ICMP pings are often blocked by firewalls anyway.
Our Opinion: These kinds of health checks are all fairly standard from load balancer vendors....
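A TCP connect check, one step up from ping, can be sketched in Python (an illustration of the idea, not any appliance's actual check):

```python
import socket

def tcp_check(host, port, timeout=2.0):
    """Return True if host:port accepts a TCP connection.
    Unlike ping this proves the service port is answering, though
    not that the application behind it is healthy - an HTTP check
    that inspects the response body goes one step further again."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```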
DNS Load Balancing
DNS load balancing distributes requests to different servers by resolving the domain name to the different IP addresses of the servers. When a request comes in to the DNS server to resolve the domain name, it hands out one of the server IP addresses based on a scheduling strategy, such as simple round-robin or geographical scheduling. This redirects the request to one of the servers in the server group. Within the time-to-live of the resolved record, subsequent requests from clients using the same local caching DNS server are sent to the same server.
More information is on the DNS Load Balancing page. PowerDNS link here?
Our Opinion: Much maligned, this can be a great way to properly load balance a site... What do you think Google uses!?.. OK, they probably do a bit of GSLB and health checking as well, but at the end of the day I'm sure they use a GSLB-based DNS solution with an LVS backend... but who am I to guess?
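The round-robin scheme described above can be sketched as a toy resolver (an illustration only; a real nameserver achieves this by rotating the A records it returns):

```python
from itertools import cycle

class RoundRobinDNS:
    """Toy model of DNS round-robin: each lookup of a name hands
    out the next address in that name's pool, so successive
    resolutions land clients on different servers."""
    def __init__(self, records):
        # records maps a name to its list of server addresses
        self.pools = {name: cycle(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        return next(self.pools[name])

dns = RoundRobinDNS({"www.example.com": ["192.0.2.1", "192.0.2.2"]})
print(dns.resolve("www.example.com"))  # 192.0.2.1
print(dns.resolve("www.example.com"))  # 192.0.2.2
print(dns.resolve("www.example.com"))  # 192.0.2.1
```

The caching behaviour in the paragraph above is the weak spot: every client behind one caching resolver is pinned to whichever address that resolver cached, for the length of the TTL.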
Link Load Balancing
Link load balancing is balancing traffic across multiple links, from different ISPs or the same ISP, for better scalability and availability of Internet connectivity, and also cost saving.
Our Opinion: Lots of people want this.... and lots of people say that it doesn't work very well... and lots of vendors say it's a nightmare to try and support... We think the core problem is that it only really works if your ISP supports the link balancing technology.
Load Balancing Optimization / Compression
Several vendors enable inline compression for optimization. As far as load balancing appliances are concerned this means HTTP gzip compression, which all web servers do anyway, so it's a bit daft... F5 has a hardware card for this, but it costs a fortune and is a bit daft too.
Our Opinion: Never heard of anything so stupid; see also the quote from Willy about CPU intensive tasks.
WAN Load Balancing Optimization / Compression
Several vendors enable inline compression for optimization, with or without link balancing; BlueCoat and Riverbed come to mind... mainly for corporate networks, to save on bandwidth costs.