This shows you the differences between two versions of the page.
ep:teme:01 [2025/04/15 21:59] radu.mantu [2.4. Evaluate bpftune impact] |
ep:teme:01 [2025/04/17 00:08] (current) radu.mantu |
||
---|---|---|---|
Line 5: | Line 5: | ||
===== 1. Overview ===== | ===== 1. Overview ===== | ||
- | <note tip> | + | <note> |
Code skeleton available at [[https://github.com/cs-pub-ro/EP-assignment-2025/]]. | Code skeleton available at [[https://github.com/cs-pub-ro/EP-assignment-2025/]]. | ||
</note> | </note> | ||
- | |||
==== 1.1. Simulated network ==== | ==== 1.1. Simulated network ==== | ||
Line 19: | Line 18: | ||
Up until this point, you may have used **netstat**, but not its modern-day equivalent **ss**. The former gathers its information from ''/proc/net/tcp'' and other related virtual files. Needless to say, the available information is quite limited. For this reason, a special type of socket (i.e., [[https://www.man7.org/linux/man-pages/man7/netlink.7.html|Netlink socket]]) was created to communicate directly with the kernel. The [[https://www.man7.org/linux/man-pages/man7/sock_diag.7.html|socket diagnostics]] subsystem was built on top of Netlink in order to rapidly extract //extensive// information regarding local sockets and their connections. As you may have guessed, **ss** uses this subsystem. However, we want to interact with it directly. If we were to repeatedly invoke **ss** in order to get updated statistics regarding one such socket, we would incur needless overhead from repeatedly spawning these processes. This overhead would amount to ~2-3ms / execution, severely limiting our sampling frequency. | Up until this point, you may have used **netstat**, but not its modern-day equivalent **ss**. The former gathers its information from ''/proc/net/tcp'' and other related virtual files. Needless to say, the available information is quite limited. For this reason, a special type of socket (i.e., [[https://www.man7.org/linux/man-pages/man7/netlink.7.html|Netlink socket]]) was created to communicate directly with the kernel. The [[https://www.man7.org/linux/man-pages/man7/sock_diag.7.html|socket diagnostics]] subsystem was built on top of Netlink in order to rapidly extract //extensive// information regarding local sockets and their connections. As you may have guessed, **ss** uses this subsystem. However, we want to interact with it directly. If we were to repeatedly invoke **ss** in order to get updated statistics regarding one such socket, we would incur needless overhead from repeatedly spawning these processes. This overhead would amount to ~2-3ms / execution, severely limiting our sampling frequency. | ||
- | In **socket_diag.c** we have implemented a demo application that obtains the sources and destination IPs and ports of all ESTABLISHED TCP connections, plus the inode of the associated socket. Yes, sockets have inodes too. Just check the ''/proc/<pid>/fd'' of your browser process. Any symlink with a value such as ''socket:[122505]'' is a socket, and the numeric part is the inode. | + | In **socket_diag.c** we have implemented a demo application that obtains the source and destination IPs and ports of all ESTABLISHED TCP connections, plus the inode of the associated socket. Yes, sockets have inodes too. Just check the ''/proc/<pid>/fd/'' of your browser process. Any symlink with a value such as ''socket:[122505]'' is a socket, and the numeric part is the inode. |
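The claim that sockets have inodes can be checked with a short script. The following is a minimal sketch and not part of the code skeleton: it assumes a Linux ''/proc'' filesystem, and the helper name ''socket_inodes'' is made up for illustration. It walks ''/proc/<pid>/fd/'' and collects the numeric part of every ''socket:[...]'' symlink.

```python
import os
import re
import socket

# Symlink targets of socket descriptors look like "socket:[122505]".
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inodes(pid="self"):
    """Return the inodes of all sockets currently open in process `pid`."""
    fd_dir = f"/proc/{pid}/fd"
    inodes = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir() and readlink()
        match = SOCKET_LINK.match(target)
        if match:
            inodes.append(int(match.group(1)))
    return inodes


if __name__ == "__main__":
    # Open a throwaway socket so that there is at least one inode to report.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        print("inode via fstat():", os.fstat(sock.fileno()).st_ino)
        print("inodes via /proc :", socket_inodes())
```

The inode reported by ''fstat()'' on the socket should appear in the list obtained from ''/proc'', and it is the same kind of value that **socket_diag.c** prints in its ''inode'' field.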
- | Anyway, try compiling **socket_diag.c** and executing it: | + | Anyway, compile **socket_diag.c** and execute it: |
<code bash> | <code bash> | ||
$ gcc socket_diag.c -o socket_diag | $ gcc socket_diag.c -o socket_diag | ||
Line 37: | Line 36: | ||
dst ip : 3.67.245.95 | dst ip : 3.67.245.95 | ||
inode : 17878 | inode : 17878 | ||
+ | ================================= | ||
+ | ... | ||
</code> | </code> | ||
Line 47: | Line 48: | ||
==== 1.3. bpftune ==== | ==== 1.3. bpftune ==== | ||
- | In our earlier network monitoring lab, we briefly discussed eBPF. [[https://github.com/oracle/bpftune|bpftune]] is a tool created by Oracle that leverages eBPF dynamic instrumentation of the TCP/IP stack (similar to [[https://github.com/cilium/pwru|pwru]]) to perform auto-tuning depending on the network conditions. For example, it may adjust the socket buffer sizes whenever their use exceeds a certain threshold. | + | In our earlier network monitoring lab, we briefly discussed eBPF. [[https://github.com/oracle/bpftune|bpftune]] is a tool created by Oracle that leverages eBPF's capability to dynamically instrument the TCP/IP stack (similar to [[https://github.com/cilium/pwru|pwru]]) to perform auto-tuning depending on the network conditions. For example, it may adjust the socket buffer sizes whenever their use exceeds a certain threshold. |
One interesting feature is that it has support for //network namespaces//, meaning that it can apply these optimizations on a per-node basis in our Mininet simulation. Also, we only need to run one instance of it and it will automatically detect existing namespaces. Compile and install **bpftune**. You can run it with the **-s** flag to force it to output its changes to stdout. | One interesting feature is that it has support for //network namespaces//, meaning that it can apply these optimizations on a per-node basis in our Mininet simulation. Also, we only need to run one instance of it and it will automatically detect existing namespaces. Compile and install **bpftune**. You can run it with the **-s** flag to force it to output its changes to stdout. | ||
Line 53: | Line 54: | ||
===== 2. Tasks ===== | ===== 2. Tasks ===== | ||
- | ==== 2.1. Set up the network simulation ==== | + | ==== 2.1. [20p] Set up the network simulation ==== |
Execute the **topology.py** script with sudo privileges. Don't mess around with the script arguments just yet. Once you've obtained the ''mininet>'' prompt, open one terminal for each host. Select your preferred terminal (e.g., kitty, gnome-terminal, xterm, etc.) | Execute the **topology.py** script with sudo privileges. Don't mess around with the script arguments just yet. Once you've obtained the ''mininet>'' prompt, open one terminal for each host. Select your preferred terminal (e.g., kitty, gnome-terminal, xterm, etc.) | ||
Line 66: | Line 67: | ||
On **h3**, run an **iperf3** TCP server. From **h1**, connect to that server with an **iperf3** client. What throughput did you obtain? | On **h3**, run an **iperf3** TCP server. From **h1**, connect to that server with an **iperf3** client. What throughput did you obtain? | ||
- | Next, spawn another **iperf3** server on **h3**, but this time make it UDP. Start two simultaneous connections: TCP from **h1** to h3 and UDP from **h2** to h3. For the UDP connection, set the bandwidth to 10Mbps from **iperf3**'s command line arguments. What is the throughput of each experiment? | + | Next, spawn another **iperf3** server on **h3**, but this time make it UDP. Start two simultaneous connections: TCP from **h1** to h3 and UDP from **h2** to h3 (after a few seconds). For the UDP connection, set the bandwidth to 10Mbps from **iperf3**'s command line arguments. What is the throughput of each experiment? |
<note warning> | <note warning> | ||
- | Do not try to do this in **wsl**. It's kernel implements network namespaces very poorly and you will have disastrous results. You can, however, solve this assignment in a VM. | + | Do not try to do this in **wsl**. Its kernel implements network namespaces very poorly and you will have disastrous results. You can, however, solve this assignment in a VM. |
</note> | </note> | ||
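If you end up scripting these runs, the **iperf3** command lines can be assembled programmatically. The sketch below is only an illustration: the helper names, the port numbers and the ''10.0.3.2'' address are placeholders and are not taken from **topology.py**. It relies on two properties of **iperf3**: UDP is selected on the client side (''-u'', with ''-b'' setting the target bandwidth), and a server instance handles one test at a time, so each of the two simultaneous connections gets its own server listening on a separate port.

```python
def iperf3_server(port):
    """Argument list for an iperf3 server listening on `port`."""
    return ["iperf3", "-s", "-p", str(port)]


def iperf3_client(server_ip, port, seconds=30, udp_bandwidth=None):
    """Argument list for an iperf3 client.

    TCP is the default; passing `udp_bandwidth` (e.g. "10M") switches the
    test to UDP with that target bandwidth.
    """
    cmd = ["iperf3", "-c", server_ip, "-p", str(port), "-t", str(seconds)]
    if udp_bandwidth is not None:
        cmd += ["-u", "-b", udp_bandwidth]
    return cmd


if __name__ == "__main__":
    H3_IP = "10.0.3.2"  # placeholder; use the address that h3 actually has

    print("h3:", " ".join(iperf3_server(5201)))
    print("h3:", " ".join(iperf3_server(5202)))
    print("h1:", " ".join(iperf3_client(H3_IP, 5201)))
    print("h2:", " ".join(iperf3_client(H3_IP, 5202, udp_bandwidth="10M")))
```

Each list can be passed as-is to ''subprocess.run()'' from within the corresponding host's terminal, or joined into a string and typed by hand.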
- | ==== 2.2. Implement connection monitoring tool ==== | + | ==== 2.2. [30p] Implement connection monitoring tool ==== |
Starting from **socket_diag.c**, follow the three TODOs. You will need to isolate the **iperf3** socket used for data transfers based on the source and destination IPs and ports. Additionally, you will have to ask the kernel to give you a [[https://github.com/torvalds/linux/blob/master/include/uapi/linux/tcp.h#L228|tcp_info]] structure in its reply. This structure counts as an optional attribute that you will have to extract from the reply. As you can see, it contains a large number of metrics that you can monitor. | Starting from **socket_diag.c**, follow the three TODOs. You will need to isolate the **iperf3** socket used for data transfers based on the source and destination IPs and ports. Additionally, you will have to ask the kernel to give you a [[https://github.com/torvalds/linux/blob/master/include/uapi/linux/tcp.h#L228|tcp_info]] structure in its reply. This structure counts as an optional attribute that you will have to extract from the reply. As you can see, it contains a large number of metrics that you can monitor. | ||
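To make the "optional attribute" part more concrete, here is a rough sketch of the reply layout. It is written in Python purely for illustration (your solution still goes in **socket_diag.c**) and it operates on a byte buffer instead of a live Netlink socket. The function names are made up. The assumptions are: ''tcp_info'' is requested by setting bit ''1 << (INET_DIAG_INFO - 1)'' in ''idiag_ext''; the attributes that follow ''struct inet_diag_msg'' are a sequence of ''struct rtattr'' entries padded to 4 bytes; and ''tcp_info'' starts with 8 single-byte fields followed by 24 32-bit counters, as in the linked header.

```python
import struct

INET_DIAG_INFO = 2                         # attribute type that carries struct tcp_info
EXT_TCP_INFO = 1 << (INET_DIAG_INFO - 1)   # bit to set in idiag_ext of the request
RTA_HDR = struct.Struct("=HH")             # struct rtattr: rta_len, rta_type


def iter_attrs(buf):
    """Yield (type, payload) for every rtattr found in `buf`.

    `buf` is expected to hold the bytes that follow struct inet_diag_msg
    in a single reply message.
    """
    off = 0
    while off + RTA_HDR.size <= len(buf):
        rta_len, rta_type = RTA_HDR.unpack_from(buf, off)
        if rta_len < RTA_HDR.size or off + rta_len > len(buf):
            break  # malformed or truncated attribute
        yield rta_type, buf[off + RTA_HDR.size:off + rta_len]
        off += (rta_len + 3) & ~3  # same rounding as RTA_ALIGN()


def parse_tcp_info(data):
    """Extract a few counters from the leading part of struct tcp_info."""
    if len(data) < 104:
        raise ValueError("tcp_info payload shorter than expected")
    fields = struct.unpack_from("=8B24I", data)
    u32 = fields[8:]  # tcpi_rto is u32[0], tcpi_total_retrans is u32[23]
    return {
        "rtt_us": u32[15],        # tcpi_rtt
        "rttvar_us": u32[16],     # tcpi_rttvar
        "snd_cwnd": u32[18],      # tcpi_snd_cwnd
        "total_retrans": u32[23], # tcpi_total_retrans
    }
```

In C, the same walk is normally done with the ''RTA_OK()'', ''RTA_NEXT()'' and ''RTA_DATA()'' macros, and the payload is simply cast to ''struct tcp_info *''.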
Line 86: | Line 87: | ||
</note> | </note> | ||
- | ==== 2.3. More detailed analysis ==== | + | ==== 2.3. [30p] Differential analysis ==== |
Try varying the bandwidths and delays of the **h1-r1** and **h2-r1** links. Best if you keep them symmetric. Record the same metrics that you've used in your previous experiment. | Try varying the bandwidths and delays of the **h1-r1** and **h2-r1** links. Best if you keep them symmetric. Record the same metrics that you've used in your previous experiment. | ||
- | Create two figures, one for the bandwidth-varying experiment, and one for the delay-varying experiment. Create multiple plots for these experiments within the same figure and explain what impact these variations had. Just to clarify, for the "throughput as a function of time" figure, plot each experiment where you vary the delay with **±k * 25ms** (with k = 1, 2, 3, ...) and label them accordingly. Aim for something like [[https://stackoverflow.com/questions/22276066/how-to-plot-multiple-functions-on-the-same-figure|this]]. | + | Create two figures, one for the bandwidth-varying experiment, and one for the delay-varying experiment. Create multiple plots for these experiments within the same figure and explain what impact these variations had. Just to clarify, for the //"throughput as a function of time"// figure, plot each experiment where you vary the delay with **±k * 25ms** (with k = 0, 1, 2, 3, ...) and label them accordingly. Aim for something like [[https://stackoverflow.com/questions/22276066/how-to-plot-multiple-functions-on-the-same-figure|this]]. Also, that value of 25ms is just a suggestion. |
Automate the data acquisition part of this task as much as possible. Include any scripts that you've written / modified in your submission. | Automate the data acquisition part of this task as much as possible. Include any scripts that you've written / modified in your submission. | ||
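One possible way to obtain the //"throughput as a function of time"// curves is sketched below. It assumes that your monitoring tool logs ''(timestamp, tcpi_bytes_acked)'' pairs for each run; this log format, the function names and the use of **matplotlib** are illustrative choices, not requirements. Throughput over each sampling interval is the difference in acknowledged bytes divided by the elapsed time.

```python
def throughput_mbps(samples):
    """Convert cumulative byte counters into per-interval throughput.

    `samples` is a list of (time_s, bytes_acked) tuples sorted by time.
    Returns a list of (time_s, mbps), one entry per pair of consecutive samples.
    """
    series = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        if t1 > t0:  # skip duplicate timestamps
            series.append((t1, (b1 - b0) * 8 / (t1 - t0) / 1e6))
    return series


def plot_runs(runs, path):
    """Draw one labelled curve per run on a single figure.

    `runs` maps a label (e.g. "delay +25ms") to a list of samples.
    """
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for label, samples in runs.items():
        series = throughput_mbps(samples)
        ax.plot([t for t, _ in series], [r for _, r in series], label=label)
    ax.set_xlabel("time [s]")
    ax.set_ylabel("throughput [Mbps]")
    ax.legend()
    fig.savefig(path)
```

For example, ''plot_runs({"delay +0ms": run0, "delay +25ms": run1}, "delay.pdf")'' would produce one figure with two labelled curves, where ''run0'' and ''run1'' are the sample lists recorded in two experiments.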
Line 98: | Line 99: | ||
</note> | </note> | ||
- | ==== 2.4. Evaluate bpftune impact ==== | + | ==== 2.4. [20p] Evaluate bpftune impact ==== |
Try running **bpftune** on your host and re-run the experiment from the first task (with the TCP and UDP simultaneous **iperf3** connections). Note what changes it makes to the system. Read the source code and try to figure out the criteria that triggered the tuner. Do these changes have any visible effect? | Try running **bpftune** on your host and re-run the experiment from the first task (with the TCP and UDP simultaneous **iperf3** connections). Note what changes it makes to the system. Read the source code and try to figure out the criteria that triggered the tuner. Do these changes have any visible effect? | ||
Line 104: | Line 105: | ||
===== 3. Proof of work ===== | ===== 3. Proof of work ===== | ||
- | Your submission must be uploaded to [[https://curs.upb.ro/2024/mod/assign/view.php?id=156520|moodle]] by THE **7th of May, 11:59pm** and must contain the following: | + | Your submission must be uploaded to [[https://curs.upb.ro/2024/mod/assign/view.php?id=156520|moodle]] by the **7th of May, 11:59pm** and must contain the following: |
- | - A **pdf report** with all your observations from each task, as well as plots illustrating your experiments. Writing this report in LaTeX is recommended but not obligatory. | + | - A **pdf report** (max. 5 pages, negotiable) with all your observations from each task, as well as plots illustrating your experiments. Writing this report in LaTeX is recommended but not obligatory. |
- The Netlink Socket Diagnostics tool that you've implemented and used in acquiring runtime data. | - The Netlink Socket Diagnostics tool that you've implemented and used in acquiring runtime data. | ||
- Any scripts used for automating boring / repetitive tasks. | - Any scripts used for automating boring / repetitive tasks. | ||
+ | |||
+ | <note tip> | ||
+ | If you decide to write the report in LaTeX, try [[https://github.com/tectonic-typesetting/tectonic|tectonic]]. It's much leaner than **pdflatex** and will automatically install the packages included in your source files. **tectonic** packages should be available on most distributions. To compile your report, simply: | ||
+ | |||
+ | <code bash> | ||
+ | $ tectonic report.tex | ||
+ | </code> | ||
+ | ---- | ||
+ | The plots can be generated in LaTeX from raw data. | ||
+ | </note> | ||