Connect to the physical machine as the student user, then download and unpack the lab archive:

[student@scgc ~] $ cd scgc
[student@scgc ~/scgc] $ wget --user=<username> --ask-password https://repository.grid.pub.ro/cs/scgc/laboratoare/lab-06.zip
[student@scgc ~/scgc] $ unzip lab-06.zip

After unpacking, the following files should be present in the scgc directory: base.qcow2, scgc-vm1.qcow2, scgc-vm2.qcow2 and scgc-vm3.qcow2, the images which will be used for the virtual machines. In order to start the virtual machines, use the following commands:

student@scgc:~/scgc$ chmod +x lab06-start-kvm
student@scgc:~/scgc$ ./lab06-start-kvm
After the KVM machines have started, use the following commands to access them (password “student”):

student@scgc:~/scgc$ ssh student@10.0.0.10
student@scgc:~/scgc$ ssh student@10.0.0.20
student@scgc:~/scgc$ ssh student@10.0.0.30

After logging in, use su to switch the user to root.
Linux Virtual Server (LVS) is an advanced, open source load balancing solution. It is also integrated in the Linux kernel.
In LVS terminology, the load balancing machine is also called virtual server (VS) whereas the machines that offer services are named real servers (RS). A client will access the service exclusively based on the virtual server address.
LVS has 3 operation modes: LVS-NAT (network address translation), LVS-DR (direct routing) and LVS-TUN (IP tunneling).
The machines from the topology (the 3 KVM machines and the physical one) have the following roles: scgc-vm-1 (10.0.0.10) is the director (virtual server), scgc-vm-2 (10.0.0.20) and scgc-vm-3 (10.0.0.30) are the real servers, and the physical machine acts as the client.
The HTTP service will be used for load balancing. The Apache2 webserver is already installed on the real servers. The director will split the client requests to the two real servers.
The ipvsadm package required for load balancing configuration is already installed on the director machine.
First we will configure the virtual address on the director machine. We will add the 10.0.0.1/24 address on the ens3:1 subinterface on the scgc-vm-1 machine.
root@scgc-vm-1:~# ip addr add dev ens3 10.0.0.1/24 label ens3:1
We will configure the HTTP service as a virtual service. To do this, we need to specify the virtual server address and port and the transport protocol used (TCP, in our case).
root@scgc-vm-1:~# ipvsadm -A -t 10.0.0.1:80
Now that the virtual service has been configured, we also need to add the real servers:
root@scgc-vm-1:~# ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.20:80 -g
root@scgc-vm-1:~# ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.30:80 -g
The -g parameter specifies that we are using LVS-DR.
We also need to “convince” the real servers to answer requests destined for the VS address. There are 2 ways of achieving this: configuring the virtual IP address on a local interface (e.g. the loopback, while making sure the real servers do not answer ARP requests for it), or locally redirecting the packets with an iptables rule. We will use the iptables approach:
root@scgc-vm-2:~# iptables -t nat -A PREROUTING -d 10.0.0.1 -j REDIRECT
root@scgc-vm-3:~# iptables -t nat -A PREROUTING -d 10.0.0.1 -j REDIRECT
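For reference, the other common way to make the real servers accept traffic for the VS address is to add the virtual IP on the loopback interface while suppressing ARP answers for it. This is a sketch of the standard LVS-DR technique, not a step the lab requires (do not combine it with the iptables rule above):

```shell
# Alternative to the iptables REDIRECT rule: add the VIP on loopback
# and prevent the real server from advertising it via ARP.
ip addr add 10.0.0.1/32 dev lo
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
```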
Now we can use the virtual service we just configured.
Test the functionality by opening the address http://10.0.0.1. Refresh the page multiple times by pressing CTRL+F5 and notice how the pages from both real servers are loaded one after another. You can also use curl/wget from the CLI.
Using Wireshark (or tcpdump), start a capture on the br0 interface on the client machine.
student@scgc:~$ sudo tcpdump -i br0 -e
Notice the IP and MAC addresses from the captured packets: the HTTP requests going towards the VS address and the replies coming directly from the real servers (in LVS-DR, the replies do not pass through the director).
To check the VS state, use the -l parameter:
root@scgc-vm-1:~# ipvsadm -l
To check more detailed information regarding the connections managed by the VS, add the -c parameter:
root@scgc-vm-1:~# ipvsadm -l -c
Besides the basic configuration, we can modify additional parameters.
For instance, we will activate the round-robin scheduler and configure a maximum of 4 simultaneous connections for each RS:
root@scgc-vm-1:~# ipvsadm -E -t 10.0.0.1:80 -s rr
root@scgc-vm-1:~# ipvsadm -e -t 10.0.0.1:80 -r 10.0.0.20:80 -x 4
root@scgc-vm-1:~# ipvsadm -e -t 10.0.0.1:80 -r 10.0.0.30:80 -x 4
The -E parameter means the service is going to be edited (in our case, we're going to change the scheduler).
The -e parameter means the real server is going to be edited (in our case, the maximum number of simultaneous connections will be changed).
Refresh the page in the browser a few times and notice that, after 8 page refresh operations, the director no longer sends requests to the real servers.
For RS's with different hardware configuration, different values of the maximum number of simultaneous connections can be used. Alternatively, we can define a different weight for each server and use a weighted scheduler (e.g. wrr) on the VS.
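Such a weighted setup could look like the sketch below (the weight values are illustrative, not something the lab prescribes):

```shell
# Switch the scheduler to weighted round robin, then give scgc-vm-2
# three times the weight of scgc-vm-3 (illustrative values).
ipvsadm -E -t 10.0.0.1:80 -s wrr
ipvsadm -e -t 10.0.0.1:80 -r 10.0.0.20:80 -g -w 3
ipvsadm -e -t 10.0.0.1:80 -r 10.0.0.30:80 -g -w 1
```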
To delete the service, use the -D parameter:
root@scgc-vm-1:~# ipvsadm -D -t 10.0.0.1:80
We also have to delete the iptables rules on the real servers:
root@scgc-vm-2:~# iptables -t nat -F root@scgc-vm-3:~# iptables -t nat -F
Next, we will configure the director to operate in the LVS-TUN mode.
Similar to the previous exercise, configure the HTTP service on the director and the two real servers in tunneling mode (use the -i parameter when adding the real servers).
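On the director, the commands mirror the LVS-DR setup, with -i selecting tunneling. A sketch (assuming the previous virtual service was deleted):

```shell
# Recreate the virtual service, then add the real servers in tunnel mode.
ipvsadm -A -t 10.0.0.1:80
ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.20:80 -i
ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.30:80 -i
```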
In order for the real servers to correctly interpret the packets received from the director, we need to configure a tunnel interface (of type ipip) on both real servers.
The tunnel interface IP address needs to be identical to the virtual IP address of the director.
root@scgc-vm-2:~# ip tunnel add tun0 mode ipip local 10.0.0.20
root@scgc-vm-2:~# ip addr add 10.0.0.1/24 dev tun0
root@scgc-vm-2:~# ip link set tun0 up
root@scgc-vm-3:~# ip tunnel add tun0 mode ipip local 10.0.0.30
root@scgc-vm-3:~# ip addr add 10.0.0.1/24 dev tun0
root@scgc-vm-3:~# ip link set tun0 up
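If the requests do not reach the web servers, one thing worth checking is reverse-path filtering: depending on the kernel defaults, it can silently drop the decapsulated packets on the real servers. This is an assumption about the environment, not a lab step:

```shell
# Relax strict reverse-path filtering so decapsulated packets
# destined for the VIP are accepted on the tunnel interface.
sysctl -w net.ipv4.conf.tun0.rp_filter=0
sysctl -w net.ipv4.conf.all.rp_filter=0
```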
Perform a capture again using Wireshark (or tcpdump) on the br0 interface on the physical machine. Notice the packet encapsulation and the differences compared to when the VS was operating in the LVS-DR mode.
Delete the service on the director. Delete the tunnel interfaces on the real servers.
root@scgc-vm-2:~# ip tunnel del tun0
root@scgc-vm-3:~# ip tunnel del tun0
For the Varnish configuration we will use scgc-vm-1 as the Varnish machine and scgc-vm-2 as the web server. On scgc-vm-1 we will have to install varnish:

root@scgc-vm-1:~# apt-get update
root@scgc-vm-1:~# apt-get install gcc varnish
root@scgc-vm-1:~# service varnish restart
We want to see the effect of using Varnish versus direct web server access. The httperf tool will be used to evaluate the web access performance with and without Varnish.
The Varnish server is configured by default to listen on port 6081:
root@scgc-vm-1:~# netstat -tlpn | grep varnish
tcp        0      0 0.0.0.0:6081     0.0.0.0:*    LISTEN   2366/varnishd
tcp        0      0 127.0.0.1:6082   0.0.0.0:*    LISTEN   2364/varnishd
tcp6       0      0 :::6081          :::*         LISTEN   2366/varnishd
tcp6       0      0 ::1:6082         :::*         LISTEN   2364/varnishd
Port 6082 is the administration port. The Varnish configuration can be found at /etc/default/varnish. Use an editor or the following command in order to view the configuration:
root@scgc-vm-1:~# grep -A 4 DAEMON_OPTS /etc/default/varnish | grep -v '^#'
DAEMON_OPTS="-a :6081 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,256m"
In order to see the benefits of using Varnish, we will access a web server both directly and indirectly, through Varnish. We will use the elf.cs.pub.ro server. For this, we will have to change the port on which the Varnish server listens to port 80:
root@scgc-vm-1:~# grep -A 4 DAEMON_OPTS /etc/default/varnish | grep -v '^#'
DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,256m"
We will also have to configure Varnish to use elf.cs.pub.ro as a back end. For this, we can consult the /etc/varnish/default.vcl configuration file:
root@scgc-vm-1:~# grep -A 3 'backend default' /etc/varnish/default.vcl
backend default {
    .host = "elf.cs.pub.ro";
    .port = "80";
}
The above configuration means that any request received by the Varnish server will be redirected towards the elf.cs.pub.ro server. The responses will be cached, and the content for subsequent requests will be served directly from the Varnish cache.
Modify /lib/systemd/system/varnish.service as below to set the Varnish port to 80:
root@scgc-vm-1:~# grep ExecStart /lib/systemd/system/varnish.service
ExecStart=/usr/sbin/varnishd -j unix,user=vcache -F -a :80 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m
Do not forget to restart the Varnish service every time its configuration is changed:
student@scgc-vm-1:~$ sudo systemctl daemon-reload
student@scgc-vm-1:~$ sudo service varnish restart
We will evaluate the performance from the host machine (scgc). For this, we will configure the system to send the requests destined for elf.cs.pub.ro towards the Varnish machine, by adding the following line to /etc/hosts:

10.0.0.10 elf.cs.pub.ro
Verify that the ping command works on elf.cs.pub.ro:
student@scgc:~$ ping elf.cs.pub.ro
PING elf.cs.pub.ro (10.0.0.10) 56(84) bytes of data.
64 bytes from elf.cs.pub.ro (10.0.0.10): icmp_seq=1 ttl=64 time=0.361 ms
^C
Connect using a browser from the host machine (or wget in the CLI) and access the http://elf.cs.pub.ro URL. Notice that the first access takes longer than subsequent requests.
To evaluate this, install httperf on the host machine:
student@scgc:~$ sudo apt-get install httperf
Evaluate the connection when accessing http://elf.cs.pub.ro using the following command:
student@scgc:~$ httperf --server=elf.cs.pub.ro --wsess=2000,10,2 --rate 300 --timeout 5
Notice in the output information related to Connection rate, Request rate, Net I/O.
In order to have a comparison with direct access to elf.cs.pub.ro, delete the line we added in /etc/hosts. Run the httperf command again. Notice the difference between the parameters.
Configure Varnish to be the front end for the web server residing on the scgc-vm-2 machine. Afterwards, we will evaluate the web access performance.
For starters, configure the Varnish instance to use web server 10.0.0.20 as its back end. After the configuration, do not forget to restart the service.
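Following the format of the default configuration shown earlier, the back end section in /etc/varnish/default.vcl would become (a sketch):

```vcl
backend default {
    .host = "10.0.0.20";
    .port = "80";
}
```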
root@scgc-vm-1:~# service varnish restart
In order to test this, use a browser on the host machine (or wget in the CLI) and access the following URL: http://10.0.0.10. The following message should appear: This is scgc-vm-2 (10.0.0.20).
The same message will appear if you directly access the web server at http://10.0.0.20 although this request will not go through the Varnish server.
Next we will check the Varnish performance when executing a data transfer. For this, we will create a 10MB file on the web server:
root@scgc-vm-2:~# cd /var/www
root@scgc-vm-2:~# mkdir data; cd data
root@scgc-vm-2:~# dd if=/dev/urandom of=10M.dat bs=100k count=100
We will also need to change the DocumentRoot of the Apache server on scgc-vm-2. For this, edit /etc/apache2/sites-available/000-default.conf and change the DocumentRoot from /var/www/html to /var/www. Restart the apache2 service.
In order to measure the duration of the data transfer with and without Varnish, use httperf on the host machine to download the following file: http://10.0.0.20/data/10M.dat for direct access, or http://10.0.0.10/data/10M.dat for access through Varnish.
While httperf is running, check the load on both virtual machines with htop.
Use the --uri option of httperf to specify which page is to be accessed (in our case, /data/10M.dat). Notice the difference between the Request rate parameter for the direct access and the Varnish access.
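A direct-versus-Varnish measurement could look like the sketch below (the connection count is illustrative, not a value the lab prescribes):

```shell
# Direct access to the web server.
httperf --server=10.0.0.20 --uri=/data/10M.dat --num-conns=10
# The same transfer through the Varnish front end.
httperf --server=10.0.0.10 --uri=/data/10M.dat --num-conns=10
```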
In order to track the Varnish service state, we can use the following analysis and monitoring tools: varnishlog, varnishstat and varnishhist. All of these show information about the Varnish service starting from the moment when the command was executed, not before.
The varnishlog command shows details regarding the Varnish service connections. Run the command and then perform some web requests to the Varnish service. The log shown is very verbose; we can filter it so that only the received URL is shown:

root@scgc-vm-1:~# varnishlog -I RxURL
The varnishstat command shows information regarding the service state. The output is a screen similar to the one shown by the top or htop commands. Run the command, connect to the Varnish web service and check the output, especially the Hitrate ratio.
The varnishhist command shows a histogram of the request serving durations. The horizontal axis shows the serving times at a logarithmic scale. The requests that do not hit the cache appear with '#' and the ones served from the cache appear with '|'. Create multiple, different connections to the Varnish service (e.g. access 3 files with different sizes) and check the output. Notice how small the serving time is when the request is served from the cache instead of by a direct web server access.
Use the varnishlog command to show only the requests towards the /data/10M.dat file. Hint: use the -q option of varnishlog with the ReqURL VSL tag. Check the examples from the varnishlog man page.
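A possible form of the command, using the VSL query syntax (a sketch; check the man page for the exact grammar on your Varnish version):

```shell
# Show only transactions whose request URL matches the 10M.dat file.
varnishlog -q 'ReqURL ~ "/data/10M.dat"'
```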
We will check the cache behavior in Varnish. Varnish entries are cached for a period of time before they expire. After the time expires, a new web server request will be performed by the Varnish service.
The cache lifetime is given by an internal TTL Varnish variable. Its value is by default 120 seconds. We can verify this by executing the following command:
root@scgc-vm-1:~# varnishadm param.show default_ttl
This value can be changed by editing the VARNISH_TTL directive in the /etc/default/varnish configuration file.
In order to view the cache timer state, we will use the following command:
root@scgc-vm-1:~# varnishlog -i VCL_Call
We will execute requests in order to download the 10M.dat file used in the previous exercise.
A message containing the miss keyword will appear when the information cannot be found in the cache; the hit keyword will appear when it can. After 120 seconds, the cache entry will expire and a web access will generate a cache miss again.
After you generated a cache miss on the 10M.dat file, access the page again; now it will be in the cache. Rewrite the file on the web server:
root@scgc-vm-2:/var/www/data# dd if=/dev/urandom of=10M.dat bs=100k count=100
Access the file again and check the output shown by the varnishlog command. Notice that the file was retrieved from the cache: until the cache entry expires, file changes will not be visible through Varnish.
In order to prevent having an old object/file in the cache, we can purge/ban it. After the .dat file is present in the cache, ban the file in order for it to be reread by the Varnish service. Hint: use the varnishadm command in order to access the Varnish CLI.
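A possible ban command (a sketch; the exact expression syntax is documented in the varnishadm and ban man pages):

```shell
# Ban all cached objects whose URL matches the file, forcing a re-fetch.
varnishadm 'ban req.url ~ "/data/10M.dat"'
```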
In order to configure Varnish, the VCL (Varnish Configuration Language) is used. This allows configuration to be dynamically loaded at runtime in a Varnish instance.
Using the configuration file /etc/varnish/default.vcl, configure the TTL to 1 hour for the files served by Varnish from the /data/ directory. The other files/pages served by Varnish will use the default TTL value (120 seconds).
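One way to express this in VCL (a sketch assuming VCL 4.0 syntax; adapt it to the version declared in default.vcl):

```vcl
sub vcl_backend_response {
    # Cache objects under /data/ for one hour; everything else
    # keeps the default TTL.
    if (bereq.url ~ "^/data/") {
        set beresp.ttl = 1h;
    }
}
```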
Wait 3-4 minutes after the first access of a page from the /data/ directory and then access it again. A correct configuration should lead to a cache hit (the 1 hour TTL hasn't expired yet).
We want the Varnish service to accelerate the web access towards both the local web server on the scgc-vm-2 machine and elf.cs.pub.ro. For this, we need to configure two back ends. Configure both back ends so that if the URL starts with /ndk/ (elf.cs.pub.ro/ndk), the request is served by the elf.cs.pub.ro back end; otherwise it is served by the local web server back end.
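A sketch of such a configuration, assuming VCL 4.0 syntax (the backend names web2 and elf are my own, not names the lab prescribes):

```vcl
vcl 4.0;

backend web2 {
    .host = "10.0.0.20";
    .port = "80";
}

backend elf {
    .host = "elf.cs.pub.ro";
    .port = "80";
}

sub vcl_recv {
    # Requests under /ndk/ go to elf.cs.pub.ro;
    # everything else goes to the local web server.
    if (req.url ~ "^/ndk/") {
        set req.backend_hint = elf;
    } else {
        set req.backend_hint = web2;
    }
}
```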
We will perform load balancing using Varnish back ends (load balancers are also called directors in Varnish). We will use the two web servers on scgc-vm-2 and scgc-vm-3 as back ends for load balancing. Configure the Varnish service on scgc-vm-1 to perform load balancing between the two web servers using a round robin scheduler.
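A sketch of such a configuration in /etc/varnish/default.vcl, assuming VCL 4.0 and the directors vmod (the backend names web2 and web3 are my own):

```vcl
vcl 4.0;
import directors;

backend web2 { .host = "10.0.0.20"; .port = "80"; }
backend web3 { .host = "10.0.0.30"; .port = "80"; }

sub vcl_init {
    # Create a round robin director and register both real servers.
    new lb = directors.round_robin();
    lb.add_backend(web2);
    lb.add_backend(web3);
}

sub vcl_recv {
    # Dispatch every request through the director.
    set req.backend_hint = lb.backend();
}
```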
In order to verify this task, access 10.0.0.10 in a browser (or with wget in the CLI), wait 2 minutes (for the Varnish cache to expire) and access it again.