PicoCTF 2018 - Echooo
Note: This article is part of our PicoCTF 2018 BinExp Guide.
Spot the Bug
If you’re sick of the same old buffer overflow, here’s a new one for you. See if you can spot it:
while (1) {
    printf("> ");
    fgets(buf, sizeof(buf), stdin);
    printf(buf);
}
Yeah. That’s it. That’s the bug. We already know fgets is fine (it never reads more than sizeof(buf) bytes), so what’s left? … printf()!
Turns out printf is a powerful little beast, so much so that you must never hand control of the format string to the user. This kind of vulnerability is known as a “format string vulnerability”, and it’s more powerful than it first appears.
Strategy
So, how do we turn this printf into something that will print the flag? printf is a variadic function, one of the few in the C standard library, and probably the main reason that variadic functions are even a thing in C (you don’t see them very often in C code outside of the printf/scanf families). It takes a “format string”, and then any number of other arguments. The idea is that the format string identifies the number and type of the remaining arguments, allowing printf to print them out.
For instance, if you had a char* variable named “name”, pointing at the string “bob”, and you wrote the code printf("Hello %s!\n", name), it would print the string “Hello bob!\n”. The special token “%s” indicates to printf that the corresponding argument is a char*, and printf knows to “fill in” the token’s spot with the string that the argument points at.
Format strings can get quite complicated, and we needn’t discuss every feature here, but what you need to know is that when printf sees a token like “%s”, it consumes an argument of a given size (on 32-bit x86 a char* is 4 bytes), and then interprets that argument according to the token’s formatting rule (%s de-references the pointer and prints chars one at a time until it hits a null byte).
Now, exactly what happens if you attempt to consume an argument that wasn’t actually passed in (or if you attempt to consume one argument as if it were an argument of an incompatible type)? This is referred to as undefined behavior, which means the compiler doesn’t have to care about that case, is free to assume it can’t happen, and can basically do whatever it wants if it ever does.
In practice, with gcc on a 32-bit x86 machine, printf will consume arguments by indexing into the stack. It will assume that those arguments were passed in immediately following the parameter for the format string, and interpret that memory as the arguments. The question becomes, what follows the argument for the format string in memory, and how can we use it to print the flag?
The general strategy here will be to analyze the content of the stack immediately before the printf function call, and identify whether there is a format string that would allow us to learn the flag.
Background Info
First up, let’s see what stack variables there are in our program, and then we’ll look at the assembly and see how the stack is laid out.
char buf[64];
char flag[64];
char *flag_ptr = flag;
FILE *file = fopen("flag.txt", "r");
// ...
fgets(flag, sizeof(flag), file);
Here, we see that in addition to the buffer for the format string, there is also one for the flag. Interestingly, we also see an otherwise unused variable that is set to be a pointer to the flag (HINT HINT WINK WINK - We’ll circle back to that in a bit.)
Now, the stack layout:
mov ebp,esp
push ecx ; esp -= 4
sub esp,0xa4 ; esp -= 0xa4
; ...
; char *flag_ptr = flag;
lea eax,[ebp-0x4c] ; &flag = ebp-0x4c
mov DWORD PTR [ebp-0x98],eax ; &flag_ptr = ebp-0x98
; ...
; fgets(flag, sizeof(flag), file);
push DWORD PTR [ebp-0x90] ; file
push 0x40 ; sizeof(flag)
lea eax,[ebp-0x4c] ; flag
push eax
call 8048460 <fgets@plt>
; ...
; printf(buf);
sub esp,0xc
lea eax,[ebp-0x8c] ; &buf = ebp-0x8c
push eax
call 8048450 <printf@plt>
Let’s make a table of some of the content we know has been reserved on the stack:
Address | Content |
---|---|
esp (= ebp-0xa8) | - |
ebp-0x98 | char* flag_ptr |
ebp-0x90 | FILE* file |
ebp-0x8c | char buf[64] |
ebp-0x4c | char flag[64] |
Here we see the function reserves 0xa8 bytes on the stack (4 from push ecx plus 0xa4 from sub esp). In practice it seems to reserve everything it needs upfront, setting the baseline stack value to be ebp-0xa8. It then pushes arguments onto the stack and pops them off again in order to make function calls, always returning to the baseline level (and maintaining a 16 byte stack alignment for every function call).
Let’s look closer at the stack immediately before the printf call (after the arguments have been pushed, but before the call instruction has executed):
Start | End | Content |
---|---|---|
esp | esp+3 | &buf [4 bytes] (format string) |
esp+4 | esp+15 | alignment padding [12 bytes] |
esp+16 = baseline = ebp-0xa8 | esp+31 = ebp-0x99 | other [16 bytes] |
ebp-0x98 | ebp-0x95 | char* flag_ptr |
So, immediately following the format string argument are 12 bytes of alignment padding, 16 bytes of other content, and then the 4 byte pointer to the flag. In other words, there are 28 bytes and then a pointer to the flag. Since a pointer is 4 bytes, you could say that if you were to treat everything as pointers, then the 8th pointer following the format string would point to the flag (but the first 7 pointers would probably be junk).
We can now attempt to solve the challenge, but we should note that this is the first binary challenge that doesn’t give us direct access to the binary on the shell server. Our only interface is a socket that accepts I/O (this will become common later on, since if you have access to the binary you could always use something like gdb to force it to reveal its secrets). To solve this challenge, use netcat to connect to the server: nc 2018shell.picoctf.com 34802.
Exploitation
We know that we want the string pointed to by the 8th pointer following the format string. What we could use is a format string like this: “%p%p%p%p%p%p%p - %s”, which will print out the values of the first 7 pointers (they aren’t really pointers, but we’ll treat them as such), and then it will print out the “string” pointed to by the 8th pointer. We wouldn’t want to do something like this: “%s%s%s%s%s%s%s %s” because that would attempt to de-reference the first 7 pointers, and since those “pointers” probably aren’t valid, we could expect the program to crash.
Let’s try it out now:
$ nc 2018shell.picoctf.com 34802
Time to learn about Format Strings!
We will evaluate any format string you give us with printf().
See if you can get the flag!
> %p%p%p%p%p%p%p - %s
0x400xf77005a00x80486470xf7738a740x10xf77104900xffe47454 - picoCTF{===REDACTED===}
>
Yay, our analysis worked! The content of the flag buffer has been dumped and we now know the flag! Head back to the PicoCTF 2018 BinExp Guide to continue with the next challenge.
NOTE: There’s actually a “cleaner” way to solve this challenge if you know that the printf function supports a special POSIX extension to format strings (i.e. it works on most Unix-like systems, but not on Windows). Using it you can index into a specific printf argument without having to print off everything that precedes it. If you add “n$” between the ‘%’ and the ‘s’, printf skips straight to the nth argument, treats it as a pointer to a string, and prints that string. In this case the format string “%8$s” is all that is needed to print the flag and nothing but the flag. Try it out! We’ll use this trick in a couple upcoming challenges.