Differences

This shows you the differences between two versions of the page.

ep:labs:04:contents:tasks:ex1 [2021/09/28 22:42]
radu.mantu [01. [??p] Primer / Reminder]
ep:labs:04:contents:tasks:ex1 [2026/03/23 21:58] (current)
radu.mantu
Line 1: Line 1:
-==== 01. [??p] Primer / Reminder ====+==== 01. [10p] Valgrind ====
  
-=== [??p] Task A - tcpdump ===+Dynamic analysis tools observe a running process and can report memory-related 
 +issues that static analysis may miss. In this exercise you will use 
 +**Valgrind** to detect memory leaks in a small C program -- and get a first taste 
 +of the dynamic instrumentation concept that will be developed further in Task 04 
 +with Intel Pin.
  
-**tcpdump** is a network traffic monitoring tool. At its core, it uses **libpcap** which in turn uses a technology called **Extended Berkeley Packet Filter (eBPF)**.+=== [5p] Task A - Writing a leaky program ===
  
-**BPF** was first proposed around 1992, when filtering mechanisms (and firewalls) were still a novel concept and were based on interpreters. **BPF** (now referred to as Classic BPF - **cBPF**) was the initial version of a Motorola-inspired virtual ISA (i.e.: had no hardware implementation -- think **CHIP-8**). **eBPF** is basically still **BPF** but more compatible with 64-bit architectures so that Just In Time (JIT) translators have an easier time running the code. +Read the contents of ''leak.c'' and compile it:
- +
-At first, the whole idea was to compile a packet filtering program and attach it to a socket in kernelspace. This program would filter out packets that the userspace process would not be interested in and reduce the quantity of data copied over the kernelspace/​userspace boundary, only to ultimately be discarded. +
- +
-Today, **eBPF** is used heavily for system profiling by companies such as Netflix and Facebook. Linux has had a kernel VM capable of running and statically analyzing **eBPF** code since around 2014. **tcpdump** is one of the few examples that still use it for its original purpose. Ask your assistant if you want to know more about **eBPF** tracing (not part of the lab, don't panic!) +
- +
-== The Task == +
- +
-Use **tcpdump** to output outgoing **NTP** queries and incoming **http(s)** responses. Use the ''-d'' flag to see an **objdump** of the filter program's code. +
- +
-Complete the **tcpdump** command in order to satisfy the following formatting requirements:​ +
-  * print the packet number +
-  * print the elapsed time (in nanoseconds) since the first packet was captured +
-  * print the content of each packet (w/o l2 header) in hex and ASCII +
-  * do not resolve IP addresses to names +
- +
-How to test:+
 <code bash> <code bash>
-$ ntpdate -q ro.pool.ntp.org +$ gcc -g -o leak leak.c
-$ curl ocw.cs.pub.ro+
 </​code>​ </​code>​
  
-<note tip> +The **-g** flag includes debug symbols so Valgrind ​can report exact file names 
-**tcpdump** can list the available interfaces if run with ''-D''. In addition to your network interfaces, you may also see a **Bluetooth** device or the **dbus-system** (depending on your desktop).+and line numbers.
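
Since ''leak.c'' is not reproduced on this page, the following is only a minimal sketch of what such a program might contain. The function name ''leaky_function()'' and the 10 × 256 byte figures are taken from the solution to Task B; ''run_leaks()'', ''CHUNK_SIZE'' and ''NUM_CALLS'' are hypothetical names introduced for illustration, and the line numbers will not match the real file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 256  /* bytes per allocation (assumed from the solution) */
#define NUM_CALLS  10   /* number of leaking calls (assumed from the solution) */

/* Allocates a buffer and returns without freeing it or handing the
 * pointer to the caller. Returns the number of bytes that were leaked. */
size_t leaky_function(void)
{
    char *buf = malloc(CHUNK_SIZE);

    if (buf == NULL)
        return 0;           /* nothing was allocated, nothing leaks */

    buf[0] = 'x';           /* touch the buffer so it is actually used */
    return CHUNK_SIZE;      /* buf goes out of scope here: the block leaks */
}

/* Calls leaky_function() `calls` times and returns the total bytes leaked. */
size_t run_leaks(int calls)
{
    size_t leaked = 0;

    assert(calls >= 0);
    for (int i = 0; i < calls; i++)
        leaked += leaky_function();

    return leaked;
}
```

Calling ''run_leaks(NUM_CALLS)'' from ''main()'' gives a program that still exits normally, which is why the leak stays invisible without a tool such as Valgrind.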
  
-If you don't specify the interface with ''-i'', the first entry in the printed list will be used by default. This may not always be your active network interface but instead your **docker** bridge (for example). +Now run it normally and observe that nothing seems wrong from the outside:
-</​note>​ +
- +
-<​solution -hidden>​ +
-<​code>​ +
-$ tcpdump -d '(udp dst port 123) or (tcp src port 80) or (tcp src port 443)'​ +
-$ sudo tcpdump -# --nano -ttttt -n -X '(udp dst port 123) or (tcp src port 80) or (tcp src port 443)'​ +
-</​code>​ +
-</​solution>​ +
- +
-=== [??p] Task B - iptables === +
- +
-**iptables** is a configuration tool for the kernel packet filter. +
- +
-The system as a whole provides many functionalities that are grouped by **tables**: //filter, nat, mangle, raw, security//. If you want to alter a packet header, you place a rule in the //mangle// table. If you want to mask the private IP address of an internal host with the external IP address of the default gateway, you place a rule in the //nat// table. Depending on the table you choose, you will gain or lose access to some chains. If not specified, the default is the //filter// table. +
- +
-**Chains** are basically lists of rules. The five built-in chains are //​PREROUTING,​ FORWARD, POSTROUTING,​ INPUT, OUTPUT//. Each of these corresponds to certain locations in the network stack where packets trigger **Netfilter hooks** ([[https://​elixir.bootlin.com/​linux/​latest/​source/​net/​ipv4/​ip_input.c#​L540|here]] is the //​PREROUTING//​ kernel hook as an example -- not that hard to add one, right?) For a selected chain, the order in which the rules are evaluated is determined primarily by the priority of their tables and secondarily by the user's discretionary arrangement (i.e.: order in which rules are inserted). +
- +
-{{ :​ep:​labs:​04:​contents:​tasks:​iptables_path.png?​800 |}} +
- +
-A **rule** consists of two entities: a sequence of match criteria and a jump target. +
- +
-The **jump target** represents an action to be taken. You are most likely familiar with the built-in actions such as //ACCEPT// or //DROP//. These actions decide the ultimate fate of the packet and are final (i.e.: rule iteration stops when these are invoked). However, there are also extended actions (see ''​man iptables-extensions(8)''​) that are not terminal verdicts and can be used for various tasks such as auditing, forced checksum recalculation or removal of Explicit Congestion Notification (ECN) bits. +
- +
-The **match criteria** of every rule are checked to determine if the jump target is applied. The way this is designed is very elegant: every type of feature (e.g.: l3 IP address vs l4 port) that you can check has a match callback function defined in the kernel. If you want, you can write your own such function in a Linux Kernel Module (LKM) and thus extend the functionality of **iptables** ([[https://​inai.de/​documents/​Netfilter_Modules.pdf|Writing Netfilter Modules]] with code example). However, you will need to implement a userspace shared library counterpart. When you start an **iptables** process, ​it searches in ///​usr/​lib/​xtables/​ // and automatically loads certain shared libraries (note: this path can be overwritten or extended using the //​XTABLES_LIBDIR//​ environment variable). Each library there must do three things: +
-  * define **iptables** flags for the new criteria ​that you want to include. +
-  * define help messages for when ''​**iptables** %%--%%help''​ is called (its help message is an amalgamation of each library'​s help snippet). +
-  * provide an initialization function for the structure containing the rule parameters; this structure will end up in the kernel'​s rule chain. +
-So when you want to test the efficiency of the **iptables** rule evaluation process, keep in mind that each rule may imply the invocation of multiple callbacks. +
- +
-== The Task (1) == +
- +
-Write an **iptables** rule according to the following specifications:​ +
-  * **chain:** OUTPUT +
-  * **match rule:** TCP packets originating ​from ephemeral ports bound to a socket created by root +
-  * **target:** enable kernel logging of matched packets with the //"EP: "// prefix +
- +
-How to test:+
 <code bash> <code bash>
-$ sudo curl www.google.com +$ ./leak
-$ sudo dmesg+$ echo "exit code: $?"
 </​code>​ </​code>​
  
-<note tip> +=== [5p] Task B - Detecting leaks with Valgrind ===
-<code bash> +
-$ man 8 iptables-extensions +
-</​code>​ +
-</​note>​+
  
-<​solution -hidden>+Run the same binary under Valgrind'​s memory error detector:
 <code bash> <code bash>
-# "--log-prefix" must come after "-j LOG" +$ valgrind --leak-check=full --show-leak-kinds=all ./leak
-$ sudo iptables               \ +
-        -m multiport -m owner \ +
-        -I OUTPUT ​            \ +
-        -p tcp                \ +
-        --sports 1024:​65535 ​  \ +
-        ​--uid-owner root      \ +
-        ​-j LOG                \ +
-        ​--log-prefix 'EP: '+
 </​code>​ </​code>​
-</​solution>​ 
- 
-== The Task (2) == 
- 
-Write an **iptables** rule according to the following specifications:​ 
-  * **chain:** OUTPUT 
-  * **match rule:** **BPF** program that filters UDP traffic to port 53 (try [[https://​www.gnu.org/​software/​bash/​manual/​html_node/​Command-Substitution.html|bash command substitution]]) 
-  * **target:** set **TTL** to 1 (initially) 
- 
-Continue __appending__ the same rule with incremented **TTL** value until the **DNS** request goes through. 
- 
-How to test: 
-<code bash> 
-$ dig +short fep.grid.pub.ro @8.8.8.8 
-</​code>​ 
- 
-<note tip> 
-<code bash> 
-$ man 8 iptables-extensions nfbpf_compile 
-</​code>​ 
-</​note>​ 
- 
-<​solution -hidden> 
-<code bash> 
-$ sudo iptables ​                                        \ 
-        -m bpf                                          \ 
-        -t mangle ​                                      \ 
-        -A OUTPUT ​                                      \ 
-        --bytecode "​$(nfbpf_compile 'udp dst port 53'​)"​ \ 
-        -j TTL                                          \ 
-        --ttl-set 1 
-</​code>​ 
- 
-NOTE: If they don't specify the //mangle// table, the default (//​filter//​) will be used. **iptables** will say //"ok, all your arguments are fine... I'll send the structure to kernelspace"//​ but it will still fail! They will get this message: 
- 
-<​code>​ 
-iptables: Invalid argument. Run `dmesg'​ for more information. 
-</​code>​ 
- 
-Whenever you upload a rule into the kernel, the appropriate module can also //​optionally//​ implement a rule check callback, in addition to the match callback. This rule check callback will verify that the structure received from **iptables** is correct (it doesn'​t trust a userspace process, obviously). If an error occurs, it will print an error message to the kernel log. 
- 
-Let the students check the kernel log! They had to do this for the previous task, so they have no reason to cry for help here. They will get: 
- 
-<code bash> 
-$ sudo dmesg 
-... 
-[   ​36.960234] x_tables: ip_tables: TTL target: only valid in mangle table, not filter 
-</​code>​ 
- 
-If they ask (they won'​t),​ **Xtables** (read cross-tables) is the backend of the **iptables** and more recently **nftables** (just reached v1.0 after ~13y) infrastructure. 
- 
-</​solution>​ 
- 
-== The Task (3) == 
  
-Give an example when **iptables** is unable to catch a packet.+Examine the output and answer the following questions:​ 
 +  - How many bytes are reported as **definitely lost**? Does this match what you would expect from reading the source? 
 +  - What is the difference between **definitely lost** and **indirectly lost** in Valgrind'​s terminology?​ 
 +  - Which line does Valgrind report as the origin of the leak? Why does it report that line rather than the line where the pointer goes out of scope? 
 +  - Re-compile **without** the ''-g'' flag and run Valgrind again. What information is now missing from the report, and why?
  
 <​solution -hidden> <​solution -hidden>
-DHCPDISCOVER message. Interface has no IP address so it's placed in promiscuous mode (PF_PACKET) and the network stack is bypassed; the packet is put directly on the wire. In the past, it was possible to set 0.0.0.0 as a temporary IP while getting a lease, but not anymore!+  - 10 calls × 256 bytes = **2560 bytes** definitely lost.
 +  - **Definitely lost**: the last pointer to the allocation is gone -- the memory 
 +    can never be freed. **Indirectly lost**: memory reachable only through another 
 +    leaked block (e.g. a node in a leaked linked list). 
 +  - Valgrind points to the ''​malloc()''​ call inside ''​leaky_function()''​ because 
 +    that is where the allocation originated. The pointer going out of scope is 
 +    a C concept. Valgrind tracks allocations at the heap level, not variable lifetimes. 
 +  - Without ''-g'', Valgrind can still print addresses and function names, but it shows the 
 +    containing binary or library instead of ''leak.c:5''. The source file name and line number 
 +    come from the DWARF debug information embedded by the compiler.
 </​solution>​ </​solution>​
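
The distinction between **definitely lost** and **indirectly lost** can be illustrated with a leaked linked list, the example mentioned in the solution. This is a sketch under assumptions, not part of ''leak.c'': ''struct node'', ''build_list()'', ''list_length()'', ''free_list()'' and ''leak_list()'' are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
    struct node *next;
    int value;
};

/* Frees every node of the list starting at head. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds a list of len nodes by prepending. Returns NULL if len is 0 or
 * an allocation fails (nodes built so far are freed in that case). */
struct node *build_list(size_t len)
{
    struct node *head = NULL;

    for (size_t i = 0; i < len; i++) {
        struct node *n = malloc(sizeof(*n));
        if (n == NULL) {
            free_list(head);
            return NULL;
        }
        n->value = (int)i;
        n->next = head;
        head = n;
    }
    return head;
}

/* Counts the nodes reachable from head. */
size_t list_length(const struct node *head)
{
    size_t count = 0;

    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Builds a list and drops the only pointer to its head without freeing it.
 * Returns the number of nodes that were leaked. */
size_t leak_list(size_t len)
{
    struct node *head = build_list(len);
    size_t count = list_length(head);

    assert(count == len || count == 0);
    head = NULL;    /* last pointer to the head node is gone */
    return count;
}
```

For ''leak_list(3)'', Valgrind would be expected to classify the head node as definitely lost, because nothing points to it any more, and the two nodes behind it as indirectly lost, because they are reachable only through the leaked head. Calling ''free_list()'' before dropping the pointer should remove both entries from the report.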
  
ep/labs/04/contents/tasks/ex1.1632858125.txt.gz · Last modified: 2021/09/28 22:42 by radu.mantu
CC Attribution-Share Alike 3.0 Unported