We are presented with a big zip file of SML code, which implements an interpreter for a small ML-like language with a form of taint analysis in its type checker, called Wolf. Concretely, every type in Wolf’s type system has an associated secrecy: it is either “private” or “public”, and in theory, the type system makes it impossible to do any computation on private data to get a public result.
Of course, this is a CTF, so the challenge is all about breaking the theoretical guarantees of the type system. When we submit code, it’s evaluated in a context with a private integer variable flag
; our code is typechecked, executed, and printed, but only if its type is public. The goal is to break the type system and write code that produces a public value that depends on flag
, so that we can exfiltrate flag
itself.
In all, there are three progressively harder Wolf problems, named Pupper, Doggo, and Woofer. Doggo and Woofer are each encrypted with the flag of the challenge before it, so that you need to solve them in order (unless you can somehow blindly exploit servers running SML programs).
Wolf Overview
Let’s first go over the Wolf syntax and semantics. (There are small differences between the three problems, but they’re syntactically identical and only semantically differ in cases that we’ll naturally get to.) The examples
folder has some examples of valid code:
These poor attempts at exfiltrating the flag do not typecheck. In the first example, the typechecker rejects the invalid coercion from a private int to a public int:
In the second example, the typechecker notices that the program’s type is private, so the interpreter refuses to run it:
language.sml
might be the simplest place to intuit a more exhaustive description:
infix @:
datatype term
= VarExp of Variable.t
| LetExp of {
dec : (Variable.t * term), (* declaration *)
body : term
}
| LamExp of {
param : Variable.t,
paramType : tycon,
body : term
}
| AppExp of { function : term, argument : term }
| PrimArith of Primitive.arith * (term * term)
| PrimCmp of Primitive.cmp * (term * term)
| IfExp of {cond : term, thenCase : term, elseCase : term}
| UnitExp
| IntExp of Primitive.int
| BoolExp of bool
| NewRefExp of term
| AssignExp of { target : term, value : term }
| DerefExp of term
| CoerceExp of term * tycon
| Reference of Assignable.assignable
and tycon (* type constructors with secrecy *)
= @: of con * Secrecy.t
and con (* type constructors *)
= Cint
| Cbool
| Carrow of tycon * tycon
| Cref of tycon
| Cunit
Wolf has three simple data types: int
, bool
, and unit
. It also has two complex data types, a ref T
type that holds a mutable reference to something of type T
, and a function type a -> b
that is a function that takes one argument of type a
and returns one argument of type b
. Each of these data types can be directly used as a type (which makes them implicitly public) or modified by private
to get a new type, except that ref
s cannot be private.
Wolf programs support the usual numeric and boolean literals, variable references, arithmetic, and comparisons; let
-in
statements, for binding a variable, e.g. let x = 3 in x + 2
; function application; the usual arithmetic and comparisons; creating, assigning, and dereferencing mutable ref
s, e.g. let r = ref 3 in r := !r + 1; !r
; and type coercion, e.g. (3 :> private int)
.
The flag variable is flag
, which has type private int
.
Generally, computations with data that includes at least one private thing result in a private return value. So if you add a public and a private int
, the type of the expression is a private int
. The types of if statements have to match exactly. You can explicitly coerce subtypes to supertypes, even though there aren’t that many of them; you can coerce public X to private X, refs are invariant in what they contain (and can’t themselves be private), and function types are covariant in their return type and contravariant in their argument type, as expected.
Secrecy is reperesented with this datatype. There are two possible values of Secrecy, Private
and Public
; you can compare them and take their min and max.
datatype t
= Private
| Public
fun leq Private Public = false
| leq _ _ = true
fun max (Public, Public) = Public
| max _ = Private
fun min (Private, Private) = Private
| min _ = Public
typechecker.sml
defines some operators that are aliases for combining secrecies, and @:
is the infix operator that attaches a Secrecy to a data type to get a full type for the type system.
infix 3 @:
infix 4 +$
val op +$ = Secrecy.max
infix 5 *$
val op *$ = Secrecy.min
Also worth mentioning is that side-channel timing attacks are eliminated because all of the interpreter’s arithmetic is implemented with a constant-time arithmetic library and the interpreter exhaustively executes every possible branch of if-else statements, so that any execution of your program will take the same time. See computation.sml
.
Pupper
Most of the interesting code is in typechecker.sml
. Things seem fine; secrecies are always combined with +$
, so that the result of a computation is always the more secret of the operands, except in one case.
When the typechecker tries to figure out the secrecy of an if
-then
-else
statement, it simply throws away the secrecy of its condition:
| IfExp {cond, thenCase, elseCase} =>
let val thenT = checkTerm cxt thenCase
val elseT = checkTerm cxt elseCase
in case checkTerm cxt cond
of Cbool @: _ => (* <- h4xxor3d! *)
if equiv (thenT, elseT)
then thenT
else raise TypeError.differentIfBranches (thenT, elseT)
| other =>
raise TypeError.nonBoolIfCondition other
end
This means that even some Wolf as simple as if flag > 1000000 then true else false
allows us to exfiltrate one bit of flag
at a time. It’s now easy to write a script to binary search. A short version is presented:
from __future__ import division, print_function
from pwn import *
import sys
def too_big(n):
conn = remote('wolf.chal.pwning.xxx', 6808)
conn.sendline('if {} > flag then 1 else 0'.format(n))
conn.shutdown()
msg = conn.recvuntil('int')
print(repr(msg))
return '1' in msg
lo, hi = 0, 1
while not too_big(hi): hi *= 2
while hi - lo > 1:
mid = (hi + lo) // 2
if too_big(mid):
hi = mid
else:
lo = mid
print(lo)
Note that this makes a lot of network calls and errors may happen intermittently, so that in practice you’d print intermediate values to debug and to make recovering from those errors easier, but I chose a summarized version.
The flag can be converted to hexadecimal and read off as bytes to get:
PCTF{0of_0uch_0wi3_my_IF_$t4t3m3n7s}
Doggo
This challenge was another blob of SML that was encrypted with the flag to Pupper, but diffing the two codebases reveals that most of the code is the same. The big change is that Wolf’s type system got an overhaul. Not only is the above issue with if
statements fixed, but now the purities of expressions and functions are kept track of. An expression is impure if it ever writes to a ref
to public data, so that it might have observable side effects.
So the above solution for Pupper could also be solved with a binary search like let r = ref 0 in (if flag > 1000000 then r := 1 else ()); !r
, but this does not work for Doggo, not even if you coerce the if
branches’ return values to private unit
. The reson is that the typechecker notices that the if
statement has a private condition, but one of its branches has an observable side effect, and rejects the if
statement for being “leaky”.
Here’s the full code for typechecking an if
statement, which is worth examining in close detail:
| IfExp {cond, thenCase, elseCase} =>
let val pCond <@< condT = checkTerm cxt cond
val p1 <@< thenT = checkTerm cxt thenCase
val p2 <@< elseT = checkTerm cxt elseCase
val pbranches = p1 *$ p2
in
case condT
of Cbool @: condSecrecy =>
if equiv (thenT, elseT)
then (case (condSecrecy, thenT)
of (Secrecy.Public, branchesT) => (pbranches *$ pCond) <@< branchesT
| (Secrecy.Private, _ @: Secrecy.Public) => raise TypeError.publicIfBranches
| (Secrecy.Private, _ @: Secrecy.Private) =>
case pbranches
of Secrecy.Public => raise TypeError.leakyIfBranches
| Secrecy.Private => pCond <@< thenT)
else raise TypeError.differentIfBranches (thenT, elseT)
| other =>
raise TypeError.nonBoolIfCondition other
end
Note: I have replaced <<<
with <@<
to get syntax highlighting to not suck. Anyway, that infix symbol is what Doggo uses to keep track of the purities: all expressions now have an additional purity value tacked on. What’s confusing is that this purity is represented with the same SML type as the secrecy that was previously tacked onto data types, but has a different meaning. On the left-hand side of <@<
, the value Private
means pure and Public
means impure, so that combining a pure and an impure computation result in an impure computation, which is the opposite of what you’d get when you combine a public and a private computation. So, +$
is the operator for combining secrecies and *$
is the operator for combining purities.
So: If the condition is public, then anything goes. If the condition is private, then the then
and else
branches must have a private return value and must be pure. This seems solid, unfortunately for us.
There is a lot of other code in typechecker
that you can think about and try to break.
Of course, there are all the basic variable expressions, literals,
let
expressions, comparisons, and arithmetic operations, but they all seem to combine all the necessary secrecies and purities correctly with the right operators, and never produce any impure side effects themselves (other than effects like timing, which we’ve already eliminated).In
NewRefExp
you see that creating a public reference (to either a public value or a private value, evenflag
itself) is considered pure. Also, you can compare references for equality (being the exact same memory) even if their contents are private. Unfortunately, references are not allowed to be private, so you can’t return them from anything depending onflag
. If your goal is to get a public value, it seems that references to private data just act as symbols that you can observe the equality of and nothing else.AssignExp
seems to do its job with respect to computing and including the correct purity.Close examination of the code for
DerefExp
reveals that I lied about the description of impure expressions earlier: expressions that read fromref
s public data are also marked impure. But this just makes our job harder, not easier.
The remaining types of expressions are coercions, lambdas, and function application. The typechecking for coercions is just a thin wrapper around subtype
:
| CoerceExp (e, tnew) =>
let val purity <<< t = checkTerm cxt e
in
checkType tnew;
if subtype (tnew, t)
then purity <<< tnew
else raise TypeError.cannotCoerce (t, tnew)
end
And here’s subtype
.
fun subtype ((t1 @: s1) : tycon, (t2 @: s2) : tycon) : bool =
Secrecy.leq s2 s1 andalso
case (t1, t2)
of (Cunit, Cunit) => true
| (Carrow (c1, c2), Carrow (c1', c2')) =>
subtype (c1', c1) andalso subtype (c2, c2')
| (Cint, Cint) => true
| (Cbool, Cbool) => true
| (Cref c1, Cref c2) => equiv (c1, c2)
| _ => false
and equiv (t1, t2) : bool =
subtype (t1, t2) andalso subtype (t2, t1)
So, public X is a subtype of private X for all X, and a -> b
is a subtype of c -> d
if c
is a subtype of a
and b
is a subtype of d
, which make sense. This all seems fine, but we’ll put it on the back burner for now.
The code for lambdas and function application reveals something stranger:
| LamExp {param, paramType, body} =>
let val cxt = Context.insert(cxt, param, paramType)
val purity <@< returnType = checkTerm cxt body
val (_ @: returnSecrecy) = returnType
in
checkType paramType;
doesn'tWrite <@< Carrow(paramType, returnType) @: (returnSecrecy *$ purity)
end
| AppExp {function, argument} =>
let val argTermPurity <@< t = checkTerm cxt argument
in
case checkTerm cxt function
of funTermPurity <@< Carrow (t1, t2) @: funPurity =>
if equiv (t, t1)
then (funPurity *$ funTermPurity *$ argTermPurity) <@< t2
else raise TypeError.appIncompatible ((t1, t2), t)
| _ <@< other => raise TypeError.appNotArrow other
end
Why is there a *$
on the right-hand side of @:
, where secrecies are usually stored? Does that mean there’s a bug in how secrecies are computed for lambdas that we can exploit, to create a public value from a public value and a private one?
Not quite. From the variable names we can figure out the semantics of this strange different usage of @:
: Doggo dispenses with keeping track of secrecies for functions, and instead stores the purity of functions when they are called on the right-hand side of @:
for functions, where the secrecy of other values are stored. Now the code makes sense.
You might note that, although the purity of the function application is meticulously calculated, the type of the function application is directly taken from the return type of the function itself; no secrecies are combined into it. This seems like a promising thing to exploit. You might want to smuggle private data into a function application via the function itself or its argument. For example, you might hope that we can, say, choose a function of type private (int -> int)
that depends on the flag, then call it on a public int
and get a public int
, which would let us exfiltrate the flag. But actually, the purity field of functions is a bit stricter: not only must pure (private) functions never write to a public ref when executed, they also cannot return a public value. This is checked when validating the type, so that you can never get a function of type private (int -> int)
, because the type private (int -> int)
is invalid and you won’t be allowed to write it down. So you’d be stuck with functions whose return types are private
, which you can’t get anything out of by calling. And if you could write a function that produced a public result from private inputs with the goal of calling it, you could just use the body of that function in the first place.
So there’s no bug here either. The type system is fine. Right?
No, the type system is not fine, and we have all the puzzle pieces to figure it out. Recall the implementation of subtype
, which checks when a type is coercible into another: for all X, you can coerce X
into private X
(except ref
s, for which private ref ...
is invalid). For normal data types, this just increases their secrecy, which is fine. But for for function types, this lets us coerce impure functions to pure functions, which is quite bad. This allows us to create an impure function, coerce it to a pure function, and then call it in one of the branches of an if
-then
-else
statement whose condition depends on flag
.
The solution readily follows. There is a little extra typing involved in the solution because validly-typed private
(pure) functions must also have a private
return value, and the most convenient value is flag
itself, even if we immediately throw that return value away. My solution during the CTF was as follows:
let m = ref 0 in
let f = (fn (r : ref int) => r := 1; flag) :> private (ref int -> private int) in
(if 1000000 > flag then f m else flag); !m
A more verbose but possibly more essentialized version is as follows:
let m = ref 0 in
let p = () :> private unit in
let f = (fn (x: private unit) => m := 1; x) :> private (private unit -> private unit) in
(if 1000000 > flag then f p else p); !m
Running the same binary-search script with this as a template and decoding in the same way gets us the flag.
PCTF{$h0u1d_h4v3_f0rma11y_v3r1fi3d!}
Woofer
The final part of the trilogy, which I did not solve during the CTF (nobody did). This was another blob of SML code that reimplemented Wolf, encrypted with the Pupper flag, that was nearly identical to Pupper, just with the above issue patched. In Woofer, pure (private) functions are now a subtype of impure (public) functions, as they should be, instead of the other way round.
The challenge author has posted a writeup. The intended solution was to take advantage of the forking computation and heap hashing to write a program that would cause an out-of-memory error in some cases but not in others, and exfiltrate bits that way. It’s a pretty cool idea and shows just how clever side-channel attacks can be.