Sakura

(Okay, this post is backdated.)

Disassembling the executable produces a huge amount of code. There are some basic obfuscations like a lot of trivial identity functions nested in each other, and a few functions that wrap around identity functions but just add some constant multiple of 16. Most of the meat is in one very large function, though. If you disassemble this function with IDA, you see a lot of variable initializations and then a lot of interesting loops that are quite similar:

flag = 1;
sum = 0;
banned_mask = 0;
cur_ptr = (__int64 *)identity((__int64)&v904);
end_ptr = plus_16((__int64)&v904);
while ( cur_ptr != (__int64 *)end_ptr )
{
    cval = *cur_ptr;
    *(&big_array[20 * (signed int)cval] + SHIDWORD(cval)) = *(&a1[20 * (signed int)cval] + SHIDWORD(cval));
    digit = *(&a1[20 * (signed int)cval] + SHIDWORD(cval)) - '0';
    if ( digit <= 0 || digit > 9 )
        flag = 0;
    if ( (banned_mask >> digit) & 1 )
        flag = 0;
    banned_mask |= 1 << digit;
    sum += digit;
    ++cur_ptr;
}
if ( sum != 17 )
    flag = 0;

So index pairs are read from cur_ptr and indexed into a1, which are copied into big_array. In order to succeed and give us the flag, the variable I’ve labeled flag has to stay 1 throughout the entire program. There are several checks:

The characters extracted from a1 must be ASCII nonzero digits, as seen from the check of failure if digit <= 0 || digit > 9.
In each loop, the digits encountered must be distinct, as each time through we check that banned_mask does not have the digitth bit indexed and then has that bit set to 1.
Finally, the sum of the digits encountered must match some specific constant, here 17.

This reveals the conceit: this is, to a first approximation, a Kakuro, where cells are filled with digits from 1 to 9 and each constraint taken from cur_ptr specifies a row or column, which must contain no repeated digits and must sum to a given number. (Given the way the code is written, there is no requirement that the constrained sequences of cells are actually rows or columns, but from the final output they seem to be.) Realizing that this is an established logic puzzle genre is not particularly essential to solving the challenge, although it might help let you guess around some uncertainties about how things work without necessarily rigorously examining all the code. Kakuros aren’t solid rectangles of empty cells — some of the cells in the input correspond to black squares used only to delimit other rows and columns, and aren’t used in any clues, so the digits in those cells would be completely unconstrained, which explains why digits are copied to big_array only when they show up in a constraint.

Now we just feed these constraints into z3 to get the answer. This is a fairly hacky Python script that processes the disassembled IDA code after a bunch of renames, but not too many so the v(number) variables were still in order, including each of the obfuscated identity functions to something including identity and each of the obfuscated functions that just added some number to plus_<number>.

from __future__ import division, print_function
impot re
from z3 import *

vards = dict()

def between(left, right, s):
    a, b = s.split(left)
    return b.split(right)[0]

solver = Solver()
cells = []
for i in range(20):
    row = []
    for j in range(20):
        v = Int('cell_{}_{}'.format(i, j))
        solver.add(1 <= v)
        solver.add(v <= 9)
        row.append(v)
    cells.append(row)

with open('sakura-huge.txt') as infile:
    for line in infile:
        line = line[:-1]
        set_match = re.match(r'  v(\d+) = (\d+);', line)
        if set_match:
            vards[int(set_match.group(1))] = int(set_match.group(2))
        elif 'identity' in line:
            v = int(between('&v', ')', line))
            # print('start:', v, vards.get(v))
        elif 'plus' in line:
            r = int(between('plus_', '(', line))
            cur_vars = []
            for i in range(r // 8):
                row, col = vards[v + 2*i], vards[v + 2*i + 1]
                cur_vars.append(cells[row][col])
            for i, v in enumerate(cur_vars):
                for j in range(i):
                    solver.add(v != cur_vars[j])
        elif 'if ( sum' in line:
            target_sum = int(between('!= ', ' )', line))
            solver.add(sum(cur_vars) == target_sum)

solver.check()
solution = solver.model()
print()
for i in range(20):
    for j in range(1, 20):
        print(solution[cells[i][j]], end="")
    print()

Piping this into the executable gives us our flag:

hitcon{6c0d62189adfd27a12289890d5b89c0dc8098bc976ecc3f6d61ec0429cccae61}

Bounded-Error Log

Sakura

HITCON CTF 2017