Assignment

1. Overview

1.1. Simulated network

The file topology.py defines the Mininet topology used in this assignment. In our experiments, we will run iperf3 servers on h3 and clients on h1 and h2. The goal of this assignment is for you to measure different TCP metrics for specific connections, plot the results, and interpret the plots.

Up until this point, you may have used netstat, but not its modern-day equivalent, ss. The former gathers its information from /proc/net/tcp and other related virtual files; needless to say, the available information is quite limited. For this reason, a special type of socket (i.e., the Netlink socket) was created to communicate directly with the kernel. The socket diagnostics subsystem was built on top of Netlink in order to rapidly extract extensive information regarding local sockets and their connections. As you may have guessed, ss uses this subsystem. However, we want to interact with it directly: if we were to repeatedly invoke ss in order to get updated statistics regarding one such socket, we would incur needless overhead from repeatedly spawning these processes. This overhead amounts to ~2-3 ms per invocation, severely limiting our sampling frequency.
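
To make this less abstract, here is a minimal sketch of one sock_diag exchange: send a single dump request for all ESTABLISHED IPv4 TCP sockets, then walk the batched replies. The demo program introduced below already does all of this (and more), so treat this as orientation rather than a solution:

    /* minimal sock_diag sketch: dump ESTABLISHED IPv4 TCP sockets */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <arpa/inet.h>          /* ntohs(), IPPROTO_TCP */
    #include <linux/netlink.h>
    #include <linux/sock_diag.h>    /* NETLINK_SOCK_DIAG, SOCK_DIAG_BY_FAMILY */
    #include <linux/inet_diag.h>    /* inet_diag_req_v2, inet_diag_msg */

    int main(void)
    {
        /* sock_diag speaks Netlink, not IP */
        int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_SOCK_DIAG);
        if (fd == -1) {
            perror("socket");
            return 1;
        }

        /* dump request: all ESTABLISHED IPv4 TCP sockets */
        struct {
            struct nlmsghdr         nlh;
            struct inet_diag_req_v2 req;
        } msg = {
            .nlh = {
                .nlmsg_len   = sizeof(msg),
                .nlmsg_type  = SOCK_DIAG_BY_FAMILY,
                .nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP,
            },
            .req = {
                .sdiag_family   = AF_INET,
                .sdiag_protocol = IPPROTO_TCP,
                .idiag_states   = 1 << 1,   /* bit 1 == TCP_ESTABLISHED */
            },
        };

        struct sockaddr_nl kernel = { .nl_family = AF_NETLINK };
        if (sendto(fd, &msg, sizeof(msg), 0,
                   (struct sockaddr *)&kernel, sizeof(kernel)) == -1) {
            perror("sendto");
            return 1;
        }

        /* the kernel answers with batches of messages, one per socket */
        char buf[8192];
        for (;;) {
            ssize_t n = recv(fd, buf, sizeof(buf), 0);
            if (n <= 0)
                break;

            for (struct nlmsghdr *nlh = (struct nlmsghdr *)buf;
                 NLMSG_OK(nlh, n);
                 nlh = NLMSG_NEXT(nlh, n))
            {
                if (nlh->nlmsg_type == NLMSG_DONE)
                    return 0;               /* end of dump */
                if (nlh->nlmsg_type == NLMSG_ERROR)
                    return 1;

                struct inet_diag_msg *diag = NLMSG_DATA(nlh);
                printf("%5hu -> %5hu (inode %u)\n",
                       ntohs(diag->id.idiag_sport),
                       ntohs(diag->id.idiag_dport),
                       diag->idiag_inode);
            }
        }

        return 0;
    }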

In socket_diag.c we have implemented a demo application that obtains the source and destination IPs and ports of all ESTABLISHED TCP connections, plus the inode of the associated socket. Yes, sockets have inodes too: just check /proc/<pid>/fd/ for your browser process. Any symlink with a value such as socket:[122505] is a socket, and the numeric part is the inode.
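
For example (substitute your own browser's process name; the exact output is illustrative and will differ on your system):

$ ls -l /proc/$(pgrep -o firefox)/fd | grep socket
    lrwx------ 1 user user 64 Apr 17 00:08 41 -> 'socket:[122505]'
    ...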

Anyway, compile socket_diag.c and run it:

$ gcc socket_diag.c -o socket_diag
$ sudo ./socket_diag /proc/$$/ns/net
    =================================
    sport  : 49606
    dport  : 443
    src ip : 192.168.100.16
    dst ip : 3.67.245.95
    inode  : 24615
    =================================
    sport  : 49596
    dport  : 443
    src ip : 192.168.100.16
    dst ip : 3.67.245.95
    inode  : 17878
    =================================
    ...

1.2. Namespace compatibility

One of the challenges of network observability in Linux is dealing with network namespaces. For instance, try to spin up a Docker container and listen on a port using netcat. Can you identify that open port using netstat or ss from your host system, rather than from inside the container? The answer is no. Your container operates in a different network namespace than the shell where you're running netstat or ss. The question is: what can you do to solve this problem?
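
You can reproduce this for yourself with something like the following (image, container name, and port are just examples):

$ docker run --rm -d --name nctest alpine nc -l -p 9999
$ ss -ltn | grep 9999               # nothing: we're in the wrong namespace
$ docker exec nctest netstat -ltn   # the listener is visible from inside
$ docker stop nctest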

Well, if you can identify a process that's running inside that container, you can open() its /proc/<pid>/ns/net symlink and use the setns() syscall to move your process into that same namespace. Any subsequent network-related operation (including queries to the socket diagnostics subsystem) will target the container's namespace. We have already implemented this functionality for you in socket_diag.c. That is why we needed to pass it an argument in the example above.
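
A condensed version of the idea (socket_diag.c already implements this; the function name here is ours, and error handling is trimmed for brevity -- note that setns() requires root / CAP_SYS_ADMIN):

    #define _GNU_SOURCE
    #include <fcntl.h>      /* open()  */
    #include <sched.h>      /* setns(), CLONE_NEWNET */
    #include <unistd.h>     /* close() */

    int enter_netns(const char *path)   /* e.g., "/proc/<pid>/ns/net" */
    {
        int fd = open(path, O_RDONLY);
        if (fd == -1)
            return -1;

        /* from here on, all network operations (new sockets, sock_diag
         * queries, etc.) target the namespace behind this fd */
        int ret = setns(fd, CLONE_NEWNET);
        close(fd);

        return ret;
    }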

1.3. bpftune

In our earlier network monitoring lab, we briefly discussed eBPF. bpftune is a tool created by Oracle that leverages eBPF's ability to dynamically instrument the TCP/IP stack (similar to pwru) in order to perform auto-tuning depending on network conditions. For example, it may adjust socket buffer sizes whenever their usage exceeds a certain threshold.

One interesting feature is its support for network namespaces, meaning that it can apply these optimizations on a per-node basis in our Mininet simulation. Moreover, we only need to run one instance of it; it will automatically detect existing namespaces. Compile and install bpftune. You can run it with the -s flag to force it to output its changes to stdout.
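
Something along these lines should work, but check the project README for the exact dependency list on your distribution:

$ git clone https://github.com/oracle/bpftune
$ cd bpftune
$ make                  # see the README for dependencies (libbpf, clang, ...)
$ sudo make install
$ sudo bpftune -s       # log tuning decisions to stdout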

2. Tasks

2.1. [20p] Set up the network simulation

Execute the topology.py script with sudo privileges. Don't mess around with the script arguments just yet. Once you've obtained the mininet> prompt, open one terminal for each host, using your preferred terminal emulator (e.g., kitty, gnome-terminal, xterm):

mininet> h1 kitty &
mininet> h2 kitty &
mininet> h3 kitty &

You can spawn multiple terminals on the same host. You can even run wireshark if you need to debug something. On h3, run an iperf3 TCP server. From h1, connect to that server with an iperf3 client. What throughput did you obtain?
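
If you haven't used iperf3 before, the basic setup looks like this (h3's address depends on topology.py; check it with ip addr from h3's terminal):

    # on h3
    $ iperf3 -s

    # on h1
    $ iperf3 -c <h3_ip>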

Next, spawn another iperf3 server on h3, this time for the UDP test. Start two simultaneous connections: TCP from h1 to h3 and, after a few seconds, UDP from h2 to h3. For the UDP connection, set the bandwidth to 10Mbps via iperf3's command-line arguments. What is the throughput of each transfer?
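
One possible setup (the second server port, 5202, and the durations are arbitrary choices):

    # on h3: a second server instance, on a different port
    $ iperf3 -s -p 5202

    # on h1: the TCP transfer (a longer run makes the overlap visible)
    $ iperf3 -c <h3_ip> -t 30

    # on h2, a few seconds later: the UDP transfer, capped at 10Mbps
    $ iperf3 -c <h3_ip> -p 5202 -u -b 10M -t 30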

Do not try to do this in WSL. Its kernel implements network namespaces very poorly and you will have disastrous results. You can, however, solve this assignment in a VM.

2.2. [30p] Implement connection monitoring tool

Starting from socket_diag.c, follow the three TODOs. You will need to isolate the iperf3 socket used for data transfers based on the source and destination IPs and ports. Additionally, you will have to ask the kernel to include a tcp_info structure in its reply. This structure is returned as an optional attribute that you will have to extract from the reply. As you can see, it contains a large number of metrics that you can monitor.
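
For reference, the request side boils down to setting one bit in idiag_ext (req.idiag_ext |= 1 << (INET_DIAG_INFO - 1);), and the reply side to walking the attributes that follow the inet_diag_msg header. A sketch of the latter (the function name is ours; adapt it to socket_diag.c rather than pasting it in):

    #include <string.h>
    #include <linux/inet_diag.h>    /* INET_DIAG_INFO, inet_diag_msg */
    #include <linux/rtnetlink.h>    /* RTA_* attribute macros */
    #include <linux/tcp.h>          /* struct tcp_info */

    static int extract_tcp_info(struct nlmsghdr *nlh, struct tcp_info *ti)
    {
        struct inet_diag_msg *diag = NLMSG_DATA(nlh);
        struct rtattr *attr = (struct rtattr *)(diag + 1);
        int len = nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*diag));

        for (; RTA_OK(attr, len); attr = RTA_NEXT(attr, len)) {
            if (attr->rta_type == INET_DIAG_INFO) {
                /* older kernels may emit a shorter tcp_info than your
                 * headers define; copy only what is actually there */
                size_t sz = RTA_PAYLOAD(attr);
                if (sz > sizeof(*ti))
                    sz = sizeof(*ti);

                memset(ti, 0, sizeof(*ti));
                memcpy(ti, RTA_DATA(attr), sz);
                return 0;
            }
        }

        return -1;      /* attribute not present in this message */
    }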

Use this tool of yours to continuously monitor the iperf3 data transfer over a TCP connection for one minute. Determine the throughput and congestion window for every tcp_info sample. Plot these values as functions of time and explain what you observe. Ask grok what each field in the tcp_info structure represents and select additional metrics that may support your hypothesis.
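
tcp_info does not report throughput directly, but it can be derived. One way, assuming you sample on the sender side (tcpi_bytes_acked is the cumulative number of bytes ACKed by the peer, so consecutive samples can be differenced):

    /* sketch: throughput between two tcp_info samples taken dt seconds apart */
    double throughput_mbps(const struct tcp_info *prev,
                           const struct tcp_info *cur, double dt)
    {
        return (cur->tcpi_bytes_acked - prev->tcpi_bytes_acked)
               * 8.0 / (dt * 1e6);
    }

Note that tcpi_snd_cwnd is expressed in segments, not bytes; multiply by tcpi_snd_mss if you want the congestion window in bytes.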

You may change whatever you want in socket_diag.c. Don't just stop after the three TODOs.


You can choose whether to keep setns() or just run the program in the same network namespace as iperf3 (i.e., from within another h1 terminal). Just pick whatever solution seems easiest to you.


iperf3 will open two connections to the server. The first is used to negotiate the experiment parameters and exchange final measurements. The second is used to actually transfer the data and stress test the network. You're interested in the latter, not the former.

2.3. [30p] Differential analysis

Try varying the bandwidths and delays of the h1-r1 and h2-r1 links. It's best if you keep them symmetric. Record the same metrics that you used in your previous experiment.

Create two figures: one for the bandwidth-varying experiment and one for the delay-varying experiment. Draw multiple plots within the same figure and explain what impact these variations had. Just to clarify: for the “throughput as a function of time” figure, plot one curve for each experiment in which you varied the delay by ±k * 25ms (with k = 0, 1, 2, 3, …) and label them accordingly. Aim for something like this. Also, that value of 25ms is just a suggestion.

Automate the data acquisition part of this task as much as possible. Include any scripts that you've written / modified in your submission.

The experiments that you are performing exercise a few specific features of the TCP protocol; keep them in mind when interpreting your plots.

2.4. [20p] Evaluate bpftune impact

Try running bpftune on your host and re-run the experiment from the first task (with the TCP and UDP simultaneous iperf3 connections). Note what changes it makes to the system. Read the source code and try to figure out the criteria that triggered the tuner. Do these changes have any visible effect?

3. Proof of work

Your submission must be uploaded to Moodle by the 7th of May, 11:59pm, and must contain the following:

  1. A pdf report (max. 5 pages, negotiable) with all your observations from each task, as well as plots illustrating your experiments. Writing this report in LaTeX is recommended but not obligatory.
  2. The Netlink Socket Diagnostics tool that you've implemented and used in acquiring runtime data.
  3. Any scripts used for automating boring / repetitive tasks.

If you decide to write the report in LaTeX, try tectonic. It's much leaner than pdflatex and will automatically install the packages included in your source files. tectonic packages should be available on most distributions. To compile your report, simply run:

$ tectonic report.tex

The plots can be generated in LaTeX from raw data.
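
For example, a minimal pgfplots skeleton for the delay-varying figure could look like this (the CSV file names and column names are placeholders for whatever your acquisition scripts produce):

    \documentclass{article}
    \usepackage{pgfplots}
    \pgfplotsset{compat=1.17}

    \begin{document}
    \begin{tikzpicture}
    \begin{axis}[xlabel={time [s]}, ylabel={throughput [Mbps]},
                 legend pos=south east]
        % one curve per experiment, labeled by the extra delay
        \addplot table [x=time, y=tput, col sep=comma] {delay_0ms.csv};
        \addplot table [x=time, y=tput, col sep=comma] {delay_25ms.csv};
        \addplot table [x=time, y=tput, col sep=comma] {delay_50ms.csv};
        \legend{+0ms, +25ms, +50ms}
    \end{axis}
    \end{tikzpicture}
    \end{document}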
