==== 04. [30p] Perf & fuzzing ====
  
=== [10p] Task A - Fuzzing with AFL ===
  
First, let's compile AFL and all related tools. We initialize / update a few environment variables to make them more accessible. Remember that these are set only for the current shell.
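If you are building AFL from source, the setup usually looks something like the following sketch (the repository location and paths are illustrative, not the lab's exact steps):

<code bash>
$ git clone https://github.com/google/AFL
$ make -C AFL                      # builds afl-fuzz, afl-gcc, afl-as, ...
$ export PATH="$PATH:$PWD/AFL"     # make the afl-* binaries accessible
$ export AFL_PATH="$PWD/AFL"       # lets afl-gcc locate its runtime objects
</code>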
  * **fuzzer_stats** : statistics generated by **afl**, updated every few seconds by overwriting the old ones (see the example after this list for a quick way to keep an eye on them).
  * **fuzz_bitmap** : a 64KB array of counters used by the program instrumentation to report newly found paths. For every branch instruction, a hash is computed based on its address and the destination address. This hash is used as an offset into the 64KB map.
  * **plot_data** : time series that can be used with programs such as **gnuplot** to create visual representations of the fuzzer's performance over time.
  * **queue/** : backups of all the input files that increased code coverage at that time. Note that some of the newer files may provide the same coverage as older ones, and then some. The reason why the old ones are not removed when this happens is that rechecking / caching coverage would be a pain and would bog down the fuzzing process. Depending on the binary under test, we can expect a few thousand executions per second.
  * **hangs/** : inputs that caused the process to execute past a timeout limit (20ms by default).
  * **crashes/** : files that generate crashes. If you want to search for bugs and not just test for coverage increase, you should compile your binary with a sanitizer (e.g.: [[https://clang.llvm.org/docs/AddressSanitizer.html|asan]]). Under normal circumstances, an out-of-bounds access can go undetected unless the accessed address is unmapped, thus creating a #PF (page fault). Sanitizers give assurances that these bugs actually get caught, but also reduce the execution speed of the tested programs, meaning slower code coverage increase.
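While the fuzzer is running, a quick way to inspect these files is something along these lines (assuming the output directory is //afl_output//, as in the commands below; **afl-plot** ships with AFL and needs **gnuplot** installed):

<code bash>
$ watch -n 1 cat afl_output/fuzzer_stats    # live fuzzer statistics
$ ls afl_output/queue | wc -l               # coverage-increasing inputs so far
$ afl-plot afl_output plot_out              # render plot_data as gnuplot graphs
</code>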
  
=== [10p] Task B - Profile AFL ===
  
Next, we will analyze the performance of **afl**. Using **perf**, we can specify one or more events (see ''man perf-list(1)'') that the kernel will record only while our program under test (in this case **afl**) is running. When the internal event counter reaches a certain value (see the ''-c'' and ''-F'' flags in ''man perf-record(1)''), a sample is taken. This sample can contain different kinds of information; for example, the ''-g'' option includes a backtrace of the program with every sample.
  
Let's record some stats using unhalted CPU cycles as an event trigger, every 1k events in userspace, and including frame pointers in samples:
  
<code bash>
$ perf record -e cycles -c 1000 -g --all-user \
    afl-fuzz -i fuzzgoat/in -o afl_output -- ./fuzzgoat/fuzzgoat @@
</code>
  
<note important>
Perf might not be able to capture data samples if access to performance monitoring operations is not allowed. To open access for processes without the //CAP_PERFMON//, //CAP_SYS_PTRACE// or //CAP_SYS_ADMIN// Linux capabilities, adjust (as the root user) the value of **/proc/sys/kernel/perf_event_paranoid** to **-1**:
<code bash>
$ sudo su
$ echo -1 > /proc/sys/kernel/perf_event_paranoid
$ exit
</code>

More information can be found [[https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html|here]].
</note>
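The same setting can also be applied without spawning a root shell, via **sysctl** (note that it resets at reboot unless made persistent in //sysctl.conf//):

<code bash>
$ sudo sysctl -w kernel.perf_event_paranoid=-1
</code>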

Leave the process running for a minute or so; then kill it with //<Ctrl + C>//. **perf** will take a few moments longer to save all collected samples in a file named //perf.data//, which is read by **perf script**. Don't mess with it!

Let's see some raw trace output first. Then look at the report: **perf report** aggregates the raw trace information and identifies stress areas.
  
<code bash>
$ perf script
$ perf report
</code>
  
Use ''perf script'' to identify the PID of **afl-fuzz** (hint: ''-F''). Then, filter out any samples unrelated to **afl-fuzz** (i.e.: those belonging to its child process, **fuzzgoat**) from the report. Finally, identify the most heavily used functions in **afl-fuzz**. Can you figure out what they do from the source code?
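One possible starting point (the field list comes from ''man perf-script(1)''; the PID value is a placeholder you must fill in):

<code bash>
$ perf script -F comm,pid | sort -u    # list the sampled processes and their PIDs
$ perf report --pid=<PID>              # keep only the samples of that process
</code>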

Make sure to include plenty of screenshots and explanations for this task :p

=== [10p] Task C - Flame Graph ===

A [[https://www.brendangregg.com/flamegraphs.html|Flame Graph]] is a graphical representation of the stack traces captured by the **perf** profiler during the execution of a program. It provides a visual depiction of the call stack, showing which functions were active and how much time was spent in each of them. By analyzing the flame graph generated by //perf//, we can identify performance bottlenecks and pinpoint areas of the code that may need optimization or further investigation.

When analyzing flame graphs, focus on the width of each stack frame: it directly indicates the number of recorded samples that share the same sequence of function calls. In contrast, the height of a stack only reflects call depth and should not be the primary focus during interpretation.

Using the samples previously obtained in //perf.data//, generate a corresponding Flame Graph in SVG format and analyze it.

<note>
How to do it (a sketch of these commands is shown after this note):
  - Clone the following git repo: https://github.com/brendangregg/FlameGraph.
  - Use the **stackcollapse-perf.pl** Perl script to convert the //perf.data// output into a suitable format (it folds the perf script output into one line per stack, with a count of how many times each stack was seen).
  - Generate the Flame Graph using **flamegraph.pl** (based on the folded data) and redirect the output to an SVG file.
  - Open the resulting interactive SVG graph in any browser and inspect it.

More details can also be found [[https://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html|here]] and [[https://gitlab.com/gitlab-com/runbooks/-/blob/v2.220.2/docs/tutorials/how_to_use_flamegraphs_for_perf_profiling.md|here]].
</note>
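A minimal sketch of the steps above, assuming the repository is cloned into the current directory (output file names are illustrative):

<code bash>
$ git clone https://github.com/brendangregg/FlameGraph
$ perf script > out.perf                                  # dump the raw samples
$ ./FlameGraph/stackcollapse-perf.pl out.perf > out.folded # one line per stack
$ ./FlameGraph/flamegraph.pl out.folded > flamegraph.svg   # render the SVG
</code>

Open //flamegraph.svg// in a browser; clicking a frame zooms into that part of the call tree.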
  
  