Differences

This shows you the differences between two versions of the page.

ep:labs:04:contents:tasks:ex1 [2021/10/05 12:23]
radu.mantu [01. [??p] Primer / Reminder]
ep:labs:04:contents:tasks:ex1 [2026/03/24 12:23] (current)
radu.mantu
Line 1: Line 1:
-==== 01. [??p] Primer / Reminder ==== +==== 01. [10p] Valgrind ====
  
-<note tip> +Dynamic analysis tools can observe a running process and report memory-related
-//Pro tip #1//: since you'll be using **man** a lot in this exercise, add this to your //.bashrc// or //.zshrc//: +issues that static analysis would miss entirely. In this exercise you will use
-<code bash> +**Valgrind** to detect memory leaks in a small C program -- and get a first taste
-# color schemes for man pages +of the dynamic instrumentation concept that will be developed further in Task 04
-man() { +with Intel Pin.
-    LESS_TERMCAP_mb=$'\e[1;32m'   \ +
-    LESS_TERMCAP_md=$'\e[1;32m'   \ +
-    LESS_TERMCAP_me=$'\e[0m'      \ +
-    LESS_TERMCAP_se=$'\e[0m'      \ +
-    LESS_TERMCAP_so=$'\e[01;33m'  \ +
-    LESS_TERMCAP_ue=$'\e[0m'      \ +
-    LESS_TERMCAP_us=$'\e[1;4;31m' \ +
-    command man "$@" +
-} +
-</code> +
- +
-Source the file and test that it works. +
-</note> +
- +
-=== [??p] Task A tcpdump === +
- +
-**tcpdump** is a network traffic monitoring tool. At its core, it uses **libpcap** which in turn uses a technology called **Extended Berkeley Packet Filter (eBPF)**. +
- +
-**BPF** was first proposed around 1995 when filtering mechanisms (and firewalls) were still a novel concept and were based on interpreters. **BPF** (now referred to as Classic BPF - **cBPF**) was the initial version of a Motorola-inspired virtual ISA (i.e.: it had no hardware implementation -- think **CHIP-8**). **eBPF** is basically still **BPF** but more compatible with 64-bit architectures so that Just In Time (JIT) translators have an easier time running the code. +
- +
-At first, the whole idea was to compile a packet filtering program and attach it to a socket in kernelspace. This program would filter out packets that the userspace process would not be interested in and reduce the quantity of data copied over the kernelspace/userspace boundary, only to ultimately be discarded. +
- +
-Today, **eBPF** is used heavily for system profiling by companies such as Netflix and Facebook. Linux has had a kernel VM capable of running and statically analyzing **eBPF** code since around 2006. **tcpdump** is one of the few examples that still use it for its original purpose. Ask your assistant if you want to know more about **eBPF** tracing (not part of the lab, don't panic!) +
- +
-== The Task == +
- +
-Use **tcpdump** to output outgoing **NTP** queries and incoming **http(s)** responses. Use the ''​-d''​ flag to see an **objdump** of the filter program'​s code.+
  
-Complete the **tcpdump** command in order to satisfy the following formatting requirements:​ +=== [5p] Task A - Writing a leaky program ===
-  * print the packet number +
-  * print the elapsed time (in nanoseconds) since the first packet was captured +
-  * print the content of each packet (w/o l2 header) in hex and ASCII +
-  * do not resolve IP addresses to names+
  
-How to test:+Read the contents of ''​leak.c''​ and compile it:
 <code bash> <code bash>
-$ ntpdate -q ro.pool.ntp.org +gcc -g -o leak leak.c
-$ curl ocw.cs.pub.ro +
 </​code>​ </​code>​
  
-<note tip> +The **-g** flag includes debug symbols so Valgrind ​can report exact file names 
-**tcpdump** can list the available interfaces if run with ''-D''. In addition to your network interfaces, you may also see a **Bluetooth** device or the **dbus-system** (depending on your desktop). +and line numbers.
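
The source of ''leak.c'' is not reproduced on this page. As a rough sketch of the kind of code involved, assuming a helper that allocates a 256-byte buffer and returns without freeing it, called 10 times: only the name ''leaky_function()'' comes from the lab text, while ''LEAK_CALLS'', ''LEAK_SIZE'' and ''run_leaky_workload()'' are hypothetical names chosen for illustration, and the real file may differ.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical values; the lab's actual leak.c may differ. */
#define LEAK_CALLS 10
#define LEAK_SIZE  256

/* Allocates a buffer, uses it, and returns without calling free().
 * Once buf goes out of scope no pointer to the block remains, so the
 * allocation can never be released. */
void leaky_function(void)
{
    char *buf = malloc(LEAK_SIZE);
    if (buf == NULL)
        return;
    memset(buf, 'A', LEAK_SIZE);
    /* missing: free(buf); */
}

/* Calls leaky_function() LEAK_CALLS times and returns the number of
 * bytes the program would leak if every malloc() succeeded. */
size_t run_leaky_workload(void)
{
    for (int i = 0; i < LEAK_CALLS; i++)
        leaky_function();
    return (size_t)LEAK_CALLS * LEAK_SIZE;
}
```

A ''main()'' that calls ''run_leaky_workload()'' and returns 0 would exit cleanly, which is why a leak of this kind cannot be spotted from the exit code alone.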
  
-If you don't specify the interface with ''-i'', the first entry in the printed list will be used by default. This may not always be your active network interface but instead your **docker** bridge (for example). +Now run it normally and observe that nothing seems wrong from the outside:
-</note> +<code bash>
- +./leak
-<solution -hidden> +echo "exit code: $?"
-<​code>​ +
-tcpdump -d '(udp dst port 123) or (tcp src port 80) or (tcp src port 443)' +
-sudo tcpdump -# --nano -ttttt -n -X '(udp dst port 123) or (tcp src port 80) or (tcp src port 443)'+
 </​code>​ </​code>​
-</​solution>​ 
- 
-=== [??p] Task B - iptables === 
- 
-**iptables** is a configuration tool for the kernel packet filter. 
- 
-The system as a whole provides many functionalities that are grouped by **tables**: //filter, nat, mangle, raw, security//. If you want to alter a packet header, you place a rule in the //mangle// table. If you want to mask the private IP address of an internal host with the external IP address of the default gateway, you place a rule in the //nat// table. Depending on the table you choose, you will gain or lose access to some chains. If not specified, the default is the //filter// table. 
- 
-**Chains** are basically lists of rules. The five built-in chains are //​PREROUTING,​ FORWARD, POSTROUTING,​ INPUT, OUTPUT//. Each of these corresponds to certain locations in the network stack where packets trigger **Netfilter hooks** ([[https://​elixir.bootlin.com/​linux/​latest/​source/​net/​ipv4/​ip_input.c#​L540|here]] is the //​PREROUTING//​ kernel hook as an example -- not that hard to add one, right?) For a selected chain, the order in which the rules are evaluated is determined primarily by the priority of their tables and secondarily by the user's discretionary arrangement (i.e.: order in which rules are inserted). 
- 
-{{ :​ep:​labs:​04:​contents:​tasks:​iptables_path.png?​800 |}} 
- 
-A **rule** consists of two entities: a sequence of match criteria and a jump target. 
- 
-The **jump target** represents an action to be taken. You are most likely familiar with the built-in actions such as //ACCEPT// or //DROP//. These actions decide the ultimate fate of the packet and are final (i.e.: rule iteration stops when these are invoked). However, there are also extended actions (see ''​man iptables-extensions(8)''​) that are not terminal verdicts and can be used for various tasks such as auditing, forced checksum recalculation or removal of Explicit Congestion Notification (ECN) bits. 
- 
-The **match criteria** of every rule are checked to determine if the jump target is applied. The way this is designed is very elegant: every type of feature (e.g.: l3 IP address vs l4 port) that you can check has a match callback function defined in the kernel. If you want, you can write your own such function in a Linux Kernel Module (LKM) and thus extend the functionality of **iptables** ([[https://​inai.de/​documents/​Netfilter_Modules.pdf|Writing Netfilter Modules]] with code example). However, you will need to implement a userspace shared library counterpart. When you start an **iptables** process, it searches in ///​usr/​lib/​xtables/​ // and automatically loads certain shared libraries (note: this path can be overwritten or extended using the //​XTABLES_LIBDIR//​ environment variable). Each library there must do three things: 
-  * define **iptables** flags for the new criteria that you want to include. 
-  * define help messages for when ''​**iptables** %%--%%help''​ is called (its help message is an amalgamation of each library'​s help snippet). 
-  * provide an initialization function for the structure containing the rule parameters; this structure will end up in the kernel'​s rule chain. 
-So when you want to test the efficiency of the **iptables** rule evaluation process, keep in mind that each rule may imply the invocation of multiple callbacks. 
- 
-== The Task (1) == 
  
-Write an **iptables** rule according to the following specifications:​ +=== [5p] Task B - Detecting leaks with Valgrind ===
-  * **chain:** OUTPUT +
-  * **match rule:** TCP packets originating from ephemeral ports bound to a socket created by root +
-  * **target:** enable kernel logging of matched packets ​with the //"EP: "// prefix+
  
-How to test:+Run the same binary under Valgrind'​s memory error detector:
 <code bash> <code bash>
-$ sudo curl www.google.com +valgrind --leak-check=full --show-leak-kinds=all ./leak
-$ sudo dmesg +
 </​code>​ </​code>​
  
-<note tip> +Examine the output and answer the following questions: 
-<code bash> +  - How many bytes are reported as **definitely lost**? Does this match what you would expect from reading the source? 
-$ man 8 iptables-extensions +  - What is the difference between **definitely lost** and **indirectly lost** in Valgrind's terminology?
-</code> +  - Which line does Valgrind report as the origin of the leak? Why is that line significant rather than the line where the pointer goes out of scope?
-</note> +  - Recompile **without** the ''-g'' flag and run Valgrind again. What information is now missing from the report, and why?
  
 <​solution -hidden> <​solution -hidden>
-<code bash> +  - 10 calls × 256 bytes = **2560 bytes** definitely lost.
-# "--log-prefix" must come after "-j LOG" +  - **Definitely lost**: the last pointer to the allocation is gone -- the memory
-$ sudo iptables               \ +    can never be freed. **Indirectly lost**: memory reachable only through another
-        -m multiport -m owner \ +    leaked block (e.g. a node in a leaked linked list).
-        -I OUTPUT             \ +  - Valgrind points to the ''malloc()'' call inside ''leaky_function()'' because
-        -p tcp                \ +    that is where the allocation originated. The pointer going out of scope is a
-        --sports 1024:65535   \ +    C concept; Valgrind tracks allocations at the heap level, not variable lifetimes.
-        --uid-owner root      \ +  - Without ''-g'', Valgrind shows raw addresses and the name of the containing binary or shared library instead
-        -j LOG                \ +    of ''leak.c:5''. The source file name and line number come from the DWARF debug
-        --log-prefix 'EP: ' +    information embedded by the compiler.
-</​code>​+
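
The distinction in answer 2 can be illustrated with a small linked list. This is only a sketch under stated assumptions: ''struct node'', ''build_list()'', ''list_length()'', ''free_list()'' and ''leak_list()'' are hypothetical names and are not part of the lab's ''leak.c''. After ''leak_list()'' returns, nothing points to the head node any more, so Valgrind would typically report the head as definitely lost and the remaining nodes, which are reachable only through the head, as indirectly lost.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node used only for this illustration. */
struct node {
    struct node *next;
    char payload[32];
};

/* Frees every node of the list starting at head. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds a singly linked list of n nodes and returns its head.
 * Returns NULL if n == 0 or if an allocation fails. */
struct node *build_list(size_t n)
{
    struct node *head = NULL;
    for (size_t i = 0; i < n; i++) {
        struct node *nd = calloc(1, sizeof(*nd));
        if (nd == NULL) {
            free_list(head);
            return NULL;
        }
        nd->next = head;
        head = nd;
    }
    return head;
}

/* Counts the nodes reachable from head. */
size_t list_length(const struct node *head)
{
    size_t len = 0;
    for (; head != NULL; head = head->next)
        len++;
    return len;
}

/* Builds a list and drops the only pointer to it without freeing.
 * The head node becomes unreachable; the other n - 1 nodes are
 * reachable only through that already leaked head node. */
void leak_list(size_t n)
{
    struct node *head = build_list(n);
    (void)head; /* missing: free_list(head); */
}
```

Replacing the body of ''leak_list()'' with a call to ''free_list(head)'' releases every node, and both categories should then disappear from the leak summary.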
 </​solution>​ </​solution>​
- 
-== The Task (2) == 
- 
-Write an **iptables** rule according to the following specifications:​ 
-  * **chain:** OUTPUT 
-  * **match rule:** **BPF** program that filters UDP traffic to port 53 (try [[https://​www.gnu.org/​software/​bash/​manual/​html_node/​Command-Substitution.html|bash command substitution]]) 
-  * **target:** set **TTL** to 1 (initially) 
- 
-Continue __appending__ the same rule with incremented **TTL** value until the **DNS** request goes through. 
- 
-How to test: 
-<code bash> 
-$ dig +short fep.grid.pub.ro @8.8.8.8 
-</​code>​ 
  
 <note tip> <note tip>
-<code bash> +**Troubleshooting**
-$ man 8 iptables-extensions nfbpf_compile +-----
 +On certain distributions such as CachyOS, you may get the following error: 
 +<​code>​ 
 +valgrind: ​ Fatal error at startup: a function redirection 
 +valgrind: ​ which is mandatory for this platform-tool combination 
 +valgrind: ​ cannot be set up.  Details of the redirection are:
 </​code>​ </​code>​
 +**valgrind** needs the DWARF debug info for **libc** in order to function properly. If the ELF file itself doesn't have it, **valgrind** will try to use [[https://man.archlinux.org/man/debuginfod-find.1|debuginfod-find]] to download it using the **Build ID** stored in the ''.note.gnu.build-id'' section. If the **debuginfod** server doesn't have it either, the remaining options are:
 +  * recompiling **glibc** with debug symbols (out of the question)
 +  * starting a docker container with Ubuntu, Debian, Arch Linux, etc.
 </​note>​ </​note>​
- 
-<​solution -hidden> 
-<code bash> 
-$ sudo iptables ​                                        \ 
-        -m bpf                                          \ 
-        -t mangle ​                                      \ 
-        -A OUTPUT ​                                      \ 
-        --bytecode "​$(nfbpf_compile 'udp dst port 53'​)"​ \ 
-        -j TTL                                          \ 
-        --ttl-set 1 
-</​code>​ 
- 
-NOTE: If they don't specify the //mangle// table, the default (//​filter//​) will be used. **iptables** will say //"ok, all your arguments are fine... I'll send the structure to kernelspace"//​ but it will still fail! They will get this message: 
- 
-<​code>​ 
-iptables: Invalid argument. Run `dmesg'​ for more information. 
-</​code>​ 
- 
-Whenever you upload a rule into the kernel, the appropriate module can also //​optionally//​ implement a rule check callback, in addition to the match callback. This rule check callback will verify that the structure received from **iptables** is correct (it doesn'​t trust a userspace process, obviously). If an error occurs, it will print an error message to the kernel log. 
- 
-Let the students check the kernel log! They had to do this for the previous task, so they have no reason to cry for help here. They will get: 
- 
-<code bash> 
-$ sudo dmesg 
-... 
-[   ​36.960234] x_tables: ip_tables: TTL target: only valid in mangle table, not filter 
-</​code>​ 
- 
-If they ask (they won'​t),​ **Xtables** (read cross-tables) is the backend of the **iptables** and more recently **nftables** (just reached v1.0 after ~13y) infrastructure. 
- 
-</​solution>​ 
- 
-== The Task (3) == 
- 
-Give an example when **iptables** is unable to catch a packet. 
- 
-<​solution -hidden> 
-DHCPDISCOVER message. Interface has no IP address so it's placed in promiscuous mode (PF_PACKET) and the network stack is bypassed; the packet is put directly on the wire. In the past, it was possible to set 0.0.0.0 as a temporary IP while getting a lease, but not anymore! 
-</​solution>​ 
- 
ep/labs/04/contents/tasks/ex1.1633425836.txt.gz · Last modified: 2021/10/05 12:23 by radu.mantu
CC Attribution-Share Alike 3.0 Unported