Connect to the physical machine as the student user, then download and unpack the lab archive:

[student@scgc ~] $ cd scgc
[student@scgc ~/scgc] $ wget --user=<username> --ask-password https://repository.grid.pub.ro/cs/scgc/laboratoare/lab-06.zip
[student@scgc ~/scgc] $ unzip lab-06.zip

After unpacking, the following files should be present in the scgc directory: base.qcow2, scgc-vm1.qcow2, scgc-vm2.qcow2 and scgc-vm3.qcow2, the images which will be used for the virtual machines. In order to start the virtual machines, use the following commands:

student@scgc:~/scgc$ chmod +x lab06-start-kvm
student@scgc:~/scgc$ ./lab06-start-kvm
After the KVM machines have started, use the following commands to access them (password “student”):

student@scgc:~/scgc$ ssh student@10.0.0.10
student@scgc:~/scgc$ ssh student@10.0.0.20
student@scgc:~/scgc$ ssh student@10.0.0.30

After logging in, use su to switch the user to root.
Linux Virtual Server (LVS) is an advanced, open source load balancing solution. It is also integrated in the Linux kernel.
In LVS terminology, the load balancing machine is also called virtual server (VS) whereas the machines that offer services are named real servers (RS). A client will access the service exclusively based on the virtual server address.
LVS has 3 operation modes: LVS-NAT (network address translation), LVS-DR (direct routing) and LVS-TUN (IP tunneling).
The machines from the topology (the 3 KVM machines and the physical one) have the following roles: scgc-vm-1 (10.0.0.10) is the director (virtual server), scgc-vm-2 (10.0.0.20) and scgc-vm-3 (10.0.0.30) are the real servers, and the physical machine acts as the client.
The HTTP service will be used for load balancing. The Apache2 webserver is already installed on the real servers. The director will split the client requests to the two real servers.
The ipvsadm package required for load balancing configuration is already installed on the director machine.
First we will configure the virtual address on the director machine. We will add the 10.0.0.1/24 address on the ens3:1 subinterface on the scgc-vm-1 machine.
root@scgc-vm-1:~# ip addr add dev ens3 10.0.0.1/24 label ens3:1
We will configure the HTTP service as a virtual service. To do this, we need to specify the virtual server address and port and the transport protocol used (TCP, in our case).
root@scgc-vm-1:~# ipvsadm -A -t 10.0.0.1:80
Now that the virtual service has been configured, we also need to add the real servers:
root@scgc-vm-1:~# ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.20:80 -g
root@scgc-vm-1:~# ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.30:80 -g
The -g parameter specifies that we are using LVS-DR.
We also need to “convince” the real servers to answer requests destined for the VS address. There are 2 ways of achieving this: configuring the virtual IP address on a local interface (e.g. the loopback, while making sure the real servers do not answer ARP requests for it), or locally redirecting the packets with an iptables rule. We will use the iptables approach:
root@scgc-vm-2:~# iptables -t nat -A PREROUTING -d 10.0.0.1 -j REDIRECT
root@scgc-vm-3:~# iptables -t nat -A PREROUTING -d 10.0.0.1 -j REDIRECT
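For reference, the other common way to make the real servers accept traffic for the VS address is to add the virtual IP on the loopback interface while suppressing ARP answers for it. This is a sketch of the standard LVS-DR technique, not a step the lab requires (do not combine it with the iptables rule above):

```shell
# Alternative to the iptables REDIRECT rule: add the VIP on loopback
# and prevent the real server from advertising it via ARP.
ip addr add 10.0.0.1/32 dev lo
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
```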
Now we can use the virtual service we just configured.
Test the functionality by opening the address http://10.0.0.1. Refresh the page multiple times by pressing CTRL+F5 and notice how the pages from both real servers are loaded one after another. You can also use curl/wget from the CLI.
Using Wireshark (or tcpdump), start a capture on the br0 interface on the client machine.
student@scgc:~$ sudo tcpdump -i br0 -e
Notice the IP and MAC addresses from the captured packets: the HTTP requests going towards the VS address and the replies coming directly from the real servers (in LVS-DR, the replies do not pass through the director).
To check the VS state, use the -l parameter:
root@scgc-vm-1:~# ipvsadm -l
To check more detailed information regarding the connections managed by the VS, add the -c parameter:
root@scgc-vm-1:~# ipvsadm -l -c
Besides the basic configuration, we can modify additional parameters.
For instance, we will activate the round-robin scheduler and configure a maximum of 4 simultaneous connections for each RS:
root@scgc-vm-1:~# ipvsadm -E -t 10.0.0.1:80 -s rr
root@scgc-vm-1:~# ipvsadm -e -t 10.0.0.1:80 -r 10.0.0.20:80 -x 4
root@scgc-vm-1:~# ipvsadm -e -t 10.0.0.1:80 -r 10.0.0.30:80 -x 4
The -E parameter means the service is going to be edited (in our case, we're going to change the scheduler).
The -e parameter means the real server is going to be edited (in our case, the maximum number of simultaneous connections will be changed).
Refresh the page in the browser a few times and notice that, after 8 page refresh operations, the director no longer sends requests to the real servers.
For RS's with different hardware configuration, different values of the maximum number of simultaneous connections can be used. Alternatively, we can define a different weight for each server and use a weighted scheduler (e.g. wrr) on the VS.
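Such a weighted setup could look like the sketch below (the weight values are illustrative, not something the lab prescribes):

```shell
# Switch the scheduler to weighted round robin, then give scgc-vm-2
# three times the weight of scgc-vm-3 (illustrative values).
ipvsadm -E -t 10.0.0.1:80 -s wrr
ipvsadm -e -t 10.0.0.1:80 -r 10.0.0.20:80 -g -w 3
ipvsadm -e -t 10.0.0.1:80 -r 10.0.0.30:80 -g -w 1
```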
To delete the service, use the -D parameter:
root@scgc-vm-1:~# ipvsadm -D -t 10.0.0.1:80
We also have to delete the iptables rules on the real servers:
root@scgc-vm-2:~# iptables -t nat -F root@scgc-vm-3:~# iptables -t nat -F
Next, we will configure the director to operate in the LVS-TUN mode.
Similar to the previous exercise, configure the HTTP service on the director and the two real servers in tunneling mode (use the -i parameter when adding the real servers).
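On the director, the commands mirror the LVS-DR setup, with -i selecting tunneling. A sketch (assuming the previous virtual service was deleted):

```shell
# Recreate the virtual service, then add the real servers in tunnel mode.
ipvsadm -A -t 10.0.0.1:80
ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.20:80 -i
ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.30:80 -i
```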
In order for the real servers to correctly interpret the packets received from the director, we need to configure a tunnel interface (of type ipip) on both real servers.
The tunnel interface IP address needs to be identical to the virtual IP address of the director.
root@scgc-vm-2:~# ip tunnel add tun0 mode ipip local 10.0.0.20
root@scgc-vm-2:~# ip addr add 10.0.0.1/24 dev tun0
root@scgc-vm-2:~# ip link set tun0 up
root@scgc-vm-3:~# ip tunnel add tun0 mode ipip local 10.0.0.30
root@scgc-vm-3:~# ip addr add 10.0.0.1/24 dev tun0
root@scgc-vm-3:~# ip link set tun0 up
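If the requests do not reach the web servers, one thing worth checking is reverse-path filtering: depending on the kernel defaults, it can silently drop the decapsulated packets on the real servers. This is an assumption about the environment, not a lab step:

```shell
# Relax strict reverse-path filtering so decapsulated packets
# destined for the VIP are accepted on the tunnel interface.
sysctl -w net.ipv4.conf.tun0.rp_filter=0
sysctl -w net.ipv4.conf.all.rp_filter=0
```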
Perform a capture again using Wireshark (or tcpdump) on the br0 interface on the physical machine. Notice the packet encapsulation and the differences compared to when the VS was operating in the LVS-DR mode.
Delete the service on the director. Delete the tunnel interfaces on the real servers.
root@scgc-vm-2:~# ip tunnel del tun0
root@scgc-vm-3:~# ip tunnel del tun0
For the Varnish configuration we will use scgc-vm-1 as the Varnish machine and scgc-vm-2 as the web server. On scgc-vm-1 we will have to install varnish:

root@scgc-vm-1:~# apt-get update
root@scgc-vm-1:~# apt-get install gcc varnish
root@scgc-vm-1:~# service varnish restart
We want to see the effect of using Varnish versus direct web server access. The httperf tool will be used to evaluate the web access performance with and without Varnish.
The Varnish server is configured by default to listen on port 6081:
root@scgc-vm-1:~# netstat -tlpn | grep varnish
tcp        0      0 0.0.0.0:6081     0.0.0.0:*    LISTEN   2366/varnishd
tcp        0      0 127.0.0.1:6082   0.0.0.0:*    LISTEN   2364/varnishd
tcp6       0      0 :::6081          :::*         LISTEN   2366/varnishd
tcp6       0      0 ::1:6082         :::*         LISTEN   2364/varnishd
Port 6082 is the administration port. The Varnish configuration can be found at /etc/default/varnish. Use an editor or the following command in order to view the configuration:
root@scgc-vm-1:~# grep -A 4 DAEMON_OPTS /etc/default/varnish | grep -v '^#'
DAEMON_OPTS="-a :6081 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,256m"
In order to see the benefits of using Varnish, we will access a web server both directly and indirectly, through Varnish. We will use the elf.cs.pub.ro server. For this, we will have to change the port on which the Varnish server listens to port 80:
root@scgc-vm-1:~# grep -A 4 DAEMON_OPTS /etc/default/varnish | grep -v '^#'
DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,256m"
We will also have to configure Varnish to use elf.cs.pub.ro as a back end. For this, we can consult the /etc/varnish/default.vcl configuration file:
root@scgc-vm-1:~# grep -A 3 'backend default' /etc/varnish/default.vcl
backend default {
    .host = "elf.cs.pub.ro";
    .port = "80";
}
The above configuration means that any request received by the Varnish server will be redirected towards the elf.cs.pub.ro server. The responses will be cached, and the content for subsequent requests will be served directly from the Varnish cache.
Modify /lib/systemd/system/varnish.service as below to set the Varnish port to 80:
root@scgc-vm-1:~# grep ExecStart /lib/systemd/system/varnish.service
ExecStart=/usr/sbin/varnishd -j unix,user=vcache -F -a :80 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m
Do not forget to restart the Varnish service every time its configuration is changed:
student@scgc-vm-1:~$ sudo systemctl daemon-reload
student@scgc-vm-1:~$ sudo service varnish restart
We will evaluate the performance from the host machine (scgc). For this, we will configure the system to send the requests destined for elf.cs.pub.ro towards the Varnish machine, by adding the following line to /etc/hosts:

10.0.0.10 elf.cs.pub.ro
Verify that the ping command works on elf.cs.pub.ro:
student@scgc:~$ ping elf.cs.pub.ro
PING elf.cs.pub.ro (10.0.0.10) 56(84) bytes of data.
64 bytes from elf.cs.pub.ro (10.0.0.10): icmp_seq=1 ttl=64 time=0.361 ms
^C
Connect using a browser from the host machine (or wget in the CLI) and access the http://elf.cs.pub.ro URL. Notice that the first access takes longer than subsequent requests.
To evaluate this, install httperf on the host machine:
student@scgc:~$ sudo apt-get install httperf
Evaluate the connection when accessing http://elf.cs.pub.ro using the following command:
student@scgc:~$ httperf --server=elf.cs.pub.ro --wsess=2000,10,2 --rate 300 --timeout 5
Notice in the output information related to Connection rate, Request rate, Net I/O.
In order to have a comparison with direct access to elf.cs.pub.ro, delete the line we added in /etc/hosts. Run the httperf command again. Notice the difference between the parameters.
Configure Varnish to be the front end for the web server residing on the scgc-vm-2 machine. Afterwards, we will evaluate the web access performance.
For starters, configure the Varnish instance to use web server 10.0.0.20 as its back end. After the configuration, do not forget to restart the service.
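Following the format of the default configuration shown earlier, the back end section in /etc/varnish/default.vcl would become (a sketch):

```vcl
backend default {
    .host = "10.0.0.20";
    .port = "80";
}
```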
root@scgc-vm-1:~# service varnish restart
In order to test this, use a browser on the host machine (or wget in the CLI) and access the following URL: http://10.0.0.10. The following message should appear: This is scgc-vm-2 (10.0.0.20).
The same message will appear if you directly access the web server at http://10.0.0.20 although this request will not go through the Varnish server.
Next we will check the Varnish performance when executing a data transfer. For this, we will create a 10MB file on the web server:
root@scgc-vm-2:~# cd /var/www
root@scgc-vm-2:~# mkdir data; cd data
root@scgc-vm-2:~# dd if=/dev/urandom of=10M.dat bs=100k count=100
We will also need to change the DocumentRoot of the Apache server on scgc-vm-2. For this, edit /etc/apache2/sites-available/000-default.conf and change the DocumentRoot from /var/www/html to /var/www. Restart the apache2 service.
In order to measure the duration of the data transfer with and without Varnish, use httperf on the host machine to download the following file: http://10.0.0.20/data/10M.dat for direct access, or http://10.0.0.10/data/10M.dat for access through Varnish.
While httperf is running, check the load on both virtual machines with htop.
Use the --uri option of httperf to specify which page is to be accessed (in our case, /data/10M.dat). Notice the difference between the Request rate parameter for the direct access and the Varnish access.
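A direct-versus-Varnish measurement could look like the sketch below (the connection count is illustrative, not a value the lab prescribes):

```shell
# Direct access to the web server.
httperf --server=10.0.0.20 --uri=/data/10M.dat --num-conns=10
# The same transfer through the Varnish front end.
httperf --server=10.0.0.10 --uri=/data/10M.dat --num-conns=10
```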
In order to track the Varnish service state, we can use the following analysis and monitoring tools: varnishlog, varnishstat and varnishhist. All of these show information about the Varnish service starting from the moment when the command was executed, not before.
The varnishlog command shows details regarding the Varnish service connections. Run the command and then perform some web requests to the Varnish service. The log shown is very verbose; we can filter it so that only the received URL is shown:

root@scgc-vm-1:~# varnishlog -I RxURL
The varnishstat command shows information regarding the service state. The output is a screen similar to the one shown by the top or htop commands. Run the command, connect to the Varnish web service and check the output, especially the Hitrate ratio.
The varnishhist command shows a histogram of the request serving durations. The horizontal axis shows the serving times at a logarithmic scale. The requests that do not hit the cache appear with '#' and the ones served from the cache appear with '|'. Create multiple, different connections to the Varnish service (e.g. access 3 files with different sizes) and check the output. Notice how small the serving time is when the request is served from the cache instead of by a direct web server access.
Use the varnishlog command to show only the requests towards the /data/10M.dat file. Hint: use the -q option of varnishlog with the ReqURL VSL tag. Check the examples from the varnishlog man page.
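A possible form of the command, using the VSL query syntax (a sketch; check the man page for the exact grammar on your Varnish version):

```shell
# Show only transactions whose request URL matches the 10M.dat file.
varnishlog -q 'ReqURL ~ "/data/10M.dat"'
```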
We will check the cache behavior in Varnish. Varnish entries are cached for a period of time before they expire. After the time expires, a new web server request will be performed by the Varnish service.
The cache lifetime is given by an internal TTL Varnish variable. Its value is by default 120 seconds. We can verify this by executing the following command:
root@scgc-vm-1:~# varnishadm param.show default_ttl
This value can be changed by editing the VARNISH_TTL directive in the /etc/default/varnish configuration file.
In order to view the cache timer state, we will use the following command:
root@scgc-vm-1:~# varnishlog -i VCL_Call
We will execute requests in order to download the 10M.dat file used in the previous exercise.
A message containing the miss keyword will appear when the information cannot be found in the cache; the hit keyword will appear when it can. After 120 seconds, the cache entry will expire and a web access will generate a cache miss again.
After you generated a cache miss on the 10M.dat file, access the page again; now it will be in the cache. Rewrite the file on the web server:
root@scgc-vm-2:/var/www/data# dd if=/dev/urandom of=10M.dat bs=100k count=100
Access the file again and check the output shown by the varnishlog command. Notice that the file was retrieved from the cache: until the cache entry expires, file changes will not be visible through Varnish.
In order to prevent having an old object/file in the cache, we can purge/ban it. After the .dat file is present in the cache, ban the file in order for it to be reread by the Varnish service. Hint: use the varnishadm command in order to access the Varnish CLI.
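A possible ban command (a sketch; the exact expression syntax is documented in the varnishadm and ban man pages):

```shell
# Ban all cached objects whose URL matches the file, forcing a re-fetch.
varnishadm 'ban req.url ~ "/data/10M.dat"'
```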
In order to configure Varnish, the VCL (Varnish Configuration Language) is used. This allows configuration to be dynamically loaded at runtime in a Varnish instance.
Using the configuration file /etc/varnish/default.vcl, configure the TTL to 1 hour for the files served by Varnish from the /data/ directory. The other files/pages served by Varnish will use the default TTL value (120 seconds).
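One way to express this in VCL (a sketch assuming VCL 4.0 syntax; adapt it to the version declared in default.vcl):

```vcl
sub vcl_backend_response {
    # Cache objects under /data/ for one hour; everything else
    # keeps the default TTL.
    if (bereq.url ~ "^/data/") {
        set beresp.ttl = 1h;
    }
}
```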
Wait 3-4 minutes after the first access of a page from the /data/ directory and then access it again. A correct configuration should lead to a cache hit (the 1 hour TTL hasn't expired yet).
We want the Varnish service to accelerate the web access towards both the local web server on the scgc-vm-2 machine and elf.cs.pub.ro. For this, we need to configure two back ends. Configure both back ends so that if the URL starts with /ndk/ (elf.cs.pub.ro/ndk), the request is served by the elf.cs.pub.ro back end; otherwise it is served by the local web server back end.
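A sketch of such a configuration, assuming VCL 4.0 syntax (the backend names web2 and elf are my own, not names the lab prescribes):

```vcl
vcl 4.0;

backend web2 {
    .host = "10.0.0.20";
    .port = "80";
}

backend elf {
    .host = "elf.cs.pub.ro";
    .port = "80";
}

sub vcl_recv {
    # Requests under /ndk/ go to elf.cs.pub.ro;
    # everything else goes to the local web server.
    if (req.url ~ "^/ndk/") {
        set req.backend_hint = elf;
    } else {
        set req.backend_hint = web2;
    }
}
```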
We will perform load balancing using Varnish back ends (load balancers are also called directors in Varnish). We will use the two web servers on scgc-vm-2 and scgc-vm-3 as back ends for load balancing. Configure the Varnish service on scgc-vm-1 to perform load balancing between the two web servers using a round robin scheduler.
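A sketch of such a configuration in /etc/varnish/default.vcl, assuming VCL 4.0 and the directors vmod (the backend names web2 and web3 are my own):

```vcl
vcl 4.0;
import directors;

backend web2 { .host = "10.0.0.20"; .port = "80"; }
backend web3 { .host = "10.0.0.30"; .port = "80"; }

sub vcl_init {
    # Create a round robin director and register both real servers.
    new lb = directors.round_robin();
    lb.add_backend(web2);
    lb.add_backend(web3);
}

sub vcl_recv {
    # Dispatch every request through the director.
    set req.backend_hint = lb.backend();
}
```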
In order to verify this task, access 10.0.0.10 in a browser (or with wget in the CLI), wait 2 minutes (for the Varnish cache to expire) and access it again.