Monday, March 20, 2006

Load Balancing Applications on Linux

A server is limited in how many users it can serve in a given period of time, and once it hits that limit, the only options are to replace it with a newer, faster machine, or add another server and share the load between them. A load balancer can distribute connections among two or more servers, proportionally cutting the work each has to do. Load balancing can help with almost any kind of service, including HTTP, DNS, FTP, POP/IMAP, and SMTP. There are a number of open source load balancing applications, but one simple command-line load balancer, balance, remains one of the most popular available.

Ideally you should install a load balancer on a dedicated machine that can handle all the incoming connections, with separate network interfaces for internal and external connections. However, none of this is necessary for the purposes of this article. To start testing balance, download the latest version from the project's Web site. Unpack it, build it, and install it as follows:

# tar -zxvf balance-3.34.tar.gz
# cd balance-3.34
# make
# make install

Note: Read the README file before installing.

Keep in mind that balance must run as root in order to listen on ports below 1024.

Let's start with a simple case. We have connections coming in to port 80, the default HTTP port. We'd like to evenly share the work between two computers (although load may be distributed among any number). You specify machines to balance by referencing their IP addresses or hostnames. By default balance will connect to those machines on the same port on which it is listening. You can specify other ports by adding ":port" to the end of the address.
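As a rough illustration of that default-port rule, here is a minimal Python sketch. The function name parse_target is hypothetical, and this is not balance's own parsing code; it only mirrors the behavior described above for plain hostnames and "host:port" addresses.

```python
def parse_target(spec, listen_port):
    """Split 'host' or 'host:port' into a (host, port) pair.

    Hypothetical helper: when no port is given, fall back to the port
    the balancer itself listens on, as described in the text.
    """
    host, sep, port = spec.partition(":")
    if sep and port:
        return host, int(port)
    return host, listen_port


# Listening on port 80: "alpha" inherits it, "beta:8080" overrides it.
print(parse_target("alpha", 80))
print(parse_target("beta:8080", 80))
```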

Let's assume we have two machines with hostnames "alpha" and "beta". The most basic solution (we'll get to more sophisticated uses in a moment) is just to alternate connections between the two computers, back and forth. This kind of balancing is called round-robin. It simply means each person or device gets an equal share, one after the other.
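The round-robin idea can be sketched in a few lines of Python. This is only an illustration of the scheduling order, not how balance is implemented; the function name round_robin is an assumption made for the example.

```python
from itertools import cycle, islice


def round_robin(servers):
    """Return an endless iterator that visits each server in turn."""
    return cycle(servers)


# Four incoming connections alternate between the two machines.
picker = round_robin(["alpha", "beta"])
first_four = list(islice(picker, 4))
print(first_four)
```

Each new connection simply takes the next name from the iterator, so over time every machine receives the same number of connections.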

Balance has a simple command-line interface. We need to tell it which port to listen on for incoming connections, and the possible destinations. By running:

# balance -f 80 alpha beta

we can share the load equally between servers alpha and beta. Including the -f parameter will keep balance in the foreground. Without it, balance will fork to the background, but an administrator could communicate with it interactively by running balance -i. In this example, if the machines alpha and beta happened to be serving different data and you were the only current user, refreshing the page over and over would alternate you between the two sites (although presumably in most cases you would want both computers to serve the same content).

Another thing we can do with balance is set a failover machine. That is, if for some reason a connection fails or times out, balance will establish a connection to the failover. For example, the command:

# balance -f 80 alpha beta ! failover

tells balance to forward a connection to the machine named failover only if both alpha and beta fail. The exclamation point separates the machines into two separate groups. Connections will only be forwarded to the next group if all connections to the first fail.
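The group logic can be sketched as follows. This is a simplified model, not balance's source: the is_up callable is a hypothetical stand-in for a real connection attempt, and the sketch tries servers within a group in listed order rather than round-robin.

```python
def pick_with_failover(groups, is_up):
    """Return the first reachable server, trying groups in order.

    groups: list of lists of hostnames, one inner list per group
    is_up:  hypothetical callable standing in for a connection attempt
    """
    for group in groups:
        for server in group:
            if is_up(server):
                return server
    return None  # every server in every group failed


# Mirrors "alpha beta ! failover": two groups separated by "!".
groups = [["alpha", "beta"], ["failover"]]
down = {"alpha", "beta"}
print(pick_with_failover(groups, lambda s: s not in down))
```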

Another way of telling balance to move to the next group is by setting a limit on the number of connections a machine can handle, as follows:

# balance -f 80 alpha::256 ! beta::64 ! failover

This specifies that alpha can handle up to 256 simultaneous connections, after which point balance will move on to beta, and once beta has 64 connections, we finally move to the failover machine. The basic idea here is that we're filling up one virtual bucket before we move on to the next.
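The bucket-filling behavior can be modeled with a short Python sketch. The function name, the data layout, and the connection counts below are all assumptions made for illustration; balance tracks its connections internally and this is not its code.

```python
def pick_by_capacity(groups, active):
    """Return the first server that still has room, group by group.

    groups: list of groups, each a list of (host, limit) pairs;
            a limit of None means "no limit"
    active: dict mapping host -> current number of connections
    """
    for group in groups:
        for host, limit in group:
            if limit is None or active.get(host, 0) < limit:
                return host
    return None  # every server is full


# Mirrors "alpha::256 ! beta::64 ! failover".
groups = [[("alpha", 256)], [("beta", 64)], [("failover", None)]]

# alpha is full, so the next connection spills over to beta.
print(pick_by_capacity(groups, {"alpha": 256, "beta": 10}))
```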

There's one important thing still lacking with these kinds of balancing commands. While sufficient for static HTML content, many real-world Web sites require sessions. User logins, shopping carts, or any kind of "memory" from page to page require session data to be retained when a user clicks onto a different page. Because HTTP is inherently stateless, each time we load a new page we're starting a new connection, which the load balancer might well send to a new machine. This would make preserving session information difficult.

The easiest solution to this problem is to make sure each client always gets forwarded to the same machine. We can tell balance to do this with the command:

# balance -f 80 alpha beta %

The percent symbol denotes that the preceding group will be a "hash" type. Balance will hash the user's IP address and associate it with one of the machines. As long as the IP address remains the same, a connection initiated from it will always go to the same computer. A good hashing algorithm will make sure hashes are evenly spread among the machines.
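To make the idea concrete, here is a minimal Python sketch of hash-based selection. The article does not describe balance's actual hash function, so this example substitutes a deliberately simple one (the numeric value of the address modulo the number of servers); the function name and the sample addresses are assumptions.

```python
import ipaddress


def pick_by_hash(client_ip, servers):
    """Map a client IP address to one server deterministically.

    Illustrative hash only: the address as an integer, modulo the
    number of servers. balance's real hash may differ.
    """
    index = int(ipaddress.ip_address(client_ip)) % len(servers)
    return servers[index]


servers = ["alpha", "beta"]

# The same address always lands on the same machine.
print(pick_by_hash("192.0.2.10", servers))
print(pick_by_hash("192.0.2.10", servers))
print(pick_by_hash("192.0.2.11", servers))
```

One consequence of this design is that many users behind a single shared address (a proxy or NAT gateway, for example) all hash to the same machine, so the spread is only as even as the spread of client addresses.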

Where do we go from here?

These techniques will produce a good, workable load balancer, but under very heavy load they will not suffice. An application like the Linux Virtual Server (LVS) is more appropriate for such cases. LVS works at the IP level inside the kernel to increase efficiency, in contrast to balance, which runs as a user-space proxy and thus has the added overhead of relaying every connection's data itself. LVS also provides many scheduling methods beyond round-robin and hashing, which are the only methods we can use in the free version of balance. But the basic principles remain the same.

Thanks to load balancing, you can keep your servers' connection and download times low, and seamlessly serve the ever-increasing number of clients using the Internet every day.

Costa Walcott is the co-founder of Draconis Software and a freelance writer.