This shows you the differences between two versions of the page.
| ep:labs:05:contents:tasks:ex3 [2026/03/31 01:25] radu.mantu | ep:labs:05:contents:tasks:ex3 [2026/03/31 02:50] (current) radu.mantu | ||
|---|---|---|---|
| Line 3: | Line 3: | ||
| The [[https://ebpf.io/|extended Berkeley Packet Filter (eBPF)]] is an under-represented technology in CS curricula that has been around since 1994 but has served multiple purposes over the years. As a //tl;dr//, what you need to know about eBPF is that it's a purely virtual [[https://docs.kernel.org/6.3/bpf/instruction-set.html|instruction set]], meaning that no hardware implements it. eBPF programs can be uploaded to the kernel, where they are JIT-compiled to native machine code and become callable by other kernel components. | The [[https://ebpf.io/|extended Berkeley Packet Filter (eBPF)]] is an under-represented technology in CS curricula that has been around since 1994 but has served multiple purposes over the years. As a //tl;dr//, what you need to know about eBPF is that it's a purely virtual [[https://docs.kernel.org/6.3/bpf/instruction-set.html|instruction set]], meaning that no hardware implements it. eBPF programs can be uploaded to the kernel, where they are JIT-compiled to native machine code and become callable by other kernel components. | ||
| - | The question is: why would we go through all this trouble instead of using a [[https://embetronicx.com/tutorials/linux/device-drivers/linux-device-driver-tutorial-part-2-first-device-driver/|Linux Kernel Module (LKM)]]? Unlike LKMs, eBPF programs have a simpler structure and can be more easily verified by the kernel. Before being JIT translated, the kernel must ensure their safety by enforcing certain properties. For example, eBPF programs are //guaranteed// to finish. How does is this property checked and enforced? By making sure that eBPF programs have //no back jumps//. As you can imagine, this makes even writing a simple ''for'' loop a challenge. | + | The question is: why would we go through all this trouble instead of using a [[https://embetronicx.com/tutorials/linux/device-drivers/linux-device-driver-tutorial-part-2-first-device-driver/|Linux Kernel Module (LKM)]]? Unlike LKMs, eBPF programs have a simpler structure and can be more easily verified by the kernel. Before being JIT translated, the kernel must ensure their safety by enforcing certain properties. For example, eBPF programs are //guaranteed// to finish. How is this property checked and enforced? By making sure that eBPF programs have //no back jumps//. As you can imagine, this makes even writing a simple ''for'' loop a challenge. |
| Initially, BPF (the **extended** part was added when x64 architectures appeared ca. 2004) was used as a filtering criterion for network packet captures, limiting the amount of data copied to a userspace process for analysis. It is still used this way today. Try running **tcpdump <expression>** with the **-d** flag. Instead of actually listening for packets, this will dump the BPF program that **tcpdump** would otherwise compile from that expression and upload to the kernel. That program is invoked for each packet and decides whether the **tcpdump** process should receive a copy of it. | Initially, BPF (the **extended** part was added when x64 architectures appeared ca. 2004) was used as a filtering criterion for network packet captures, limiting the amount of data copied to a userspace process for analysis. It is still used this way today. Try running **tcpdump <expression>** with the **-d** flag. Instead of actually listening for packets, this will dump the BPF program that **tcpdump** would otherwise compile from that expression and upload to the kernel. That program is invoked for each packet and decides whether the **tcpdump** process should receive a copy of it. | ||
| Line 71: | Line 71: | ||
| * A [[https://bpftrace.org/docs/release_025/language#filterspredicates|predicate]] specified between the probe name and the action block. | * A [[https://bpftrace.org/docs/release_025/language#filterspredicates|predicate]] specified between the probe name and the action block. | ||
| </note> | </note> | ||
| + | |||
| + | <solution -hidden> | ||
| + | <code bash> | ||
| + | $ sudo bpftrace -e 'tracepoint:syscalls:sys_exit_read /args.ret < 0/ { printf("%s (%d): %d\n", comm, pid, args.ret) }' | ||
| + | kitty (8441): -11 | ||
| + | kitty (8441): -11 | ||
| + | libinput-connec (1357): -11 | ||
| + | |||
| + | $ errno 11 | ||
| + | EAGAIN 11 Resource temporarily unavailable | ||
| + | </code> | ||
| + | </solution> | ||
| === [10p] Task D - Count read bytes === | === [10p] Task D - Count read bytes === | ||
| Line 83: | Line 95: | ||
| Make sure you filter out negative return values and execute your **bpftrace** script. Let it run for a few seconds, then interrupt it via a SIGINT (i.e., //Ctrl + C//). When unloading the probes and before terminating the process, all maps will be printed to //stdout//. | Make sure you filter out negative return values and execute your **bpftrace** script. Let it run for a few seconds, then interrupt it via a SIGINT (i.e., //Ctrl + C//). When unloading the probes and before terminating the process, all maps will be printed to //stdout//. | ||
| + | |||
| + | <solution -hidden> | ||
| + | <code bash> | ||
| + | $ sudo bpftrace -e 'tracepoint:syscalls:sys_exit_read /args.ret > 0/ { @read_bytes[comm] += args.ret }' | ||
| + | </code> | ||
| + | </solution> | ||
| == Periodic statistics == | == Periodic statistics == | ||
| Line 89: | Line 107: | ||
| Use the [[https://bpftrace.org/docs/release_025/language#interval|interval]] probe to achieve this. You can ''print()'' the map and then ''clear()'' it to reset its contents. | Use the [[https://bpftrace.org/docs/release_025/language#interval|interval]] probe to achieve this. You can ''print()'' the map and then ''clear()'' it to reset its contents. | ||
| + | |||
| + | <solution -hidden> | ||
| + | <code bash> | ||
| + | $ sudo bpftrace -e 'tracepoint:syscalls:sys_exit_read /args.ret > 0/ { @read_bytes[comm] += args.ret } interval:s:2 { print(@read_bytes); printf("\n"); clear(@read_bytes) }' | ||
| + | </code> | ||
| + | </solution> | ||
| === [5p] Task E - Built-in histogram function === | === [5p] Task E - Built-in histogram function === | ||
| Use the [[https://bpftrace.org/docs/release_025/stdlib#hist|hist()]] built-in function to visualize the distribution of bytes returned per **read()** call. The histogram does not show the total bytes read, but how many **read()** calls returned a value that falls within each log2 bucket. | Use the [[https://bpftrace.org/docs/release_025/stdlib#hist|hist()]] built-in function to visualize the distribution of bytes returned per **read()** call. The histogram does not show the total bytes read, but how many **read()** calls returned a value that falls within each log2 bucket. | ||
| + | |||
| + | <solution -hidden> | ||
| + | <code bash> | ||
| + | $ sudo bpftrace -e 'tracepoint:syscalls:sys_exit_read { @ = hist(args.ret) }' | ||
| + | @: | ||
| + | (..., 0) 303 |@@ | | ||
| + | [0] 531 |@@@@ | | ||
| + | [1] 6246 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| | ||
| + | [2, 4) 10 | | | ||
| + | [4, 8) 44 | | | ||
| + | [8, 16) 725 |@@@@@@ | | ||
| + | [16, 32) 22 | | | ||
| + | [32, 64) 401 |@@@ | | ||
| + | [64, 128) 32 | | | ||
| + | [128, 256) 55 | | | ||
| + | [256, 512) 85 | | | ||
| + | [512, 1K) 347 |@@ | | ||
| + | [1K, 2K) 104 | | | ||
| + | [2K, 4K) 279 |@@ | | ||
| + | [4K, 8K) 38 | | | ||
| + | [8K, 16K) 15 | | | ||
| + | [16K, 32K) 10 | | | ||
| + | [32K, 64K) 1 | | | ||
| + | </code> | ||
| + | </solution> | ||
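
To make the log2 bucketing in Task E more concrete, the sketch below reproduces the bucket boundaries in plain bash. The ''log2_bucket'' function is a hypothetical helper written for illustration only; it is not part of **bpftrace** and it only approximates the labels seen in the ''hist()'' output above (for instance, it prints ''[4096, 8192)'' where **bpftrace** abbreviates to ''[4K, 8K)'').

```shell
#!/usr/bin/env bash

# Hypothetical helper (not a bpftrace API): print the power-of-two bucket
# that a read() return value would be counted in.
log2_bucket() {
    local n=$1
    if (( n < 0 )); then
        # All negative values (i.e., errors) share a single bucket.
        echo "(..., 0)"
    elif (( n < 2 )); then
        # 0 and 1 each get a bucket of their own.
        echo "[$n]"
    else
        local lo=1
        # Double lo while the next power of two still fits under n.
        while (( lo * 2 <= n )); do
            (( lo *= 2 ))
        done
        local hi=$(( lo * 2 ))
        echo "[$lo, $hi)"
    fi
}

# Example: classify a few sample return values.
for v in -11 0 1 5 4096; do
    printf '%6d -> %s\n' "$v" "$(log2_bucket "$v")"
done
```

Each **read()** call increments the counter of exactly one bucket, which is why the histogram reports call counts rather than byte totals.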