Now that kmh is gone, clam’s been going through pickle withdrawal. To
help him cope, he wrote his own pickle pyjail. It’s nothing like kmh’s,
but maybe it’s enough.
Language jails are rapidly becoming one of my CTF areas of expertise.
Not sure how I feel about that.
#!/usr/local/bin/python3
import pickle
import io
import sys
module = type(__builtins__)
empty = module("empty")
empty.empty = empty
sys.modules["empty"] = empty
class SafeUnpickler(pickle.Unpickler):
def find_class(self, module, name):
if module == "empty" and name.count(".") <= 1:
return super().find_class(module, name)
raise pickle.UnpicklingError("e-legal")
lepickle = bytes.fromhex(input("Enter hex-encoded pickle: "))
if len(lepickle) > 400:
print("your pickle is too large for my taste >:(")
else:
SafeUnpickler(io.BytesIO(lepickle)).load()
pickle
is a Python object serialization format. As the docs page loudly
proclaims, it is not secure. Roughly the simplest possible code to pop a
shell (adapted from David
Hamann, who constructs a more realistic RCE) looks like:
It’s clam’s newest javascript Calculator-as-a-Service: the CaaSio
Please Stop Edition! no but actually please stop I hate jsjails js isn’t
a good language stop putting one in every ctf I don’t want to look at
another jsjail because if I do I might vomit from how much I hate js and
js quirks aren’t even cool or funny or quirky they’re just painful
because why would you design a language like this
ahhhhhhhhhhhhhhhhhhhhh
It’s just a JavaScript eval jail.
#!/usr/local/bin/node
// flag in ./flag.txt
const vm = require("vm");
const readline = require("readline");
const interface = readline.createInterface({
input: process.stdin,
output: process.stdout,
});
interface.question(
"Welcome to CaaSio: Please Stop Edition! Enter your calculation:\n",
function (input) {
interface.close();
if (
input.length < 215 &&
/^[\x20-\x7e]+$/.test(input) &&
!/[.\[\]{}\s;`'"\\_<>?:]/.test(input) &&
!input.toLowerCase().includes("import")
) {
try {
const val = vm.runInNewContext(input, {});
console.log("Result:");
console.log(val);
console.log(
"See, isn't the calculator so much nicer when you're not trying to hack it?"
);
} catch (e) {
console.log("your tried");
}
} else {
console.log(
"Third time really is the charm! I've finally created an unhackable system!"
);
}
}
);
I participated in the AGI Safety
Fundamentals program recently. The program concludes with a flexible
final project, with the default suggestion of “a piece of writing,
roughly the length and scope of a typical blog post”, so naturally, I
deleted all but the last two words and here we are.
When I previously considered machine learning as a field of study, I
came away with an impression that most effort and computation power was
going into training bigger, more powerful models; whereas the inner
workings of the models themselves, not to mention questions like why
certain architectures or design choices work better than others,
remained inscrutable and understudied. This impression always bothered
me, and it definitely influenced me away from going into AI as a career.
Of course, there are important, objective safety concerns around
developing and designing models we don’t understand, many of which we
discussed in the program; but my discomfort is mostly a completely
unrelated nagging feeling I get whenever I’m relying on things I don’t
understand.
After the program and all the concurrent developments in AI
(including AlphaCode,
OpenAI’s math olympiad
solver, SayCan, and, of course, DALL-E 2), I still had this
impression about the field at a very high level, but I also became more
familiar with the subfield of interpretability — designs and
tools that allow us to understand and explain decisions by ML systems,
rather than treating them as black-boxed mappings from inputs to outputs
— and confirmed that enough people study it to make it a thing. One
quote from a post on the views
of Chris Olah, noted interpretability researcher, captured my
feeling particularly eloquently:
interpretability is very aligned with traditional scientific
virtues—which can be quite motivating for many people—even if it isn’t
very aligned with the present paradigm of machine learning.
I found the whole post insightful, and it happens that the bits
before that in the passage were also relevant to me. I don’t have access
to lots of compute!
Inspired by that post and by a desire to actually write some code
(which I figured might help me understand the inner workings of modern
ML systems in a different sense), and after abandoning a few other
project ideas that were far too ambitious, I decided to go through some
parts of the fast.ai tutorial and
riff on it to see how much progress I could make interpreting the
models, and to write up the process in a blog post. I tried to capture
my experience holistically, bugs and all, to serve as a data point for
what it might feel like to start ML engineering (for the rare
individuals with a background and inclinations just like mine), and maybe entertain more
experienced practitioners or influence their future tutorial
recommendations. A much lower-priority goal was trying to produce “my
version of the tutorial”, which would draw more liberally from an
undergraduate math education and dive more deeply
into technical details.
Last weekend Galhacktic
Trendsetters sort of spontaneously decided to do DiceCTF 2022, months or years after
most of us had done another CTF. It was a lot of fun and we placed
6th!
Back in the day the silver edition was the top of the line Texas
Instruments calculator, but now the security is looking a little
obsolete. Can you break it?
It’s yet another Python jail. We input a string and, after it makes
it through a gauntlet of checks and processing, it gets
exec
’d.
#!/usr/bin/env python3
import dis
import sys
banned = ["MAKE_FUNCTION", "CALL_FUNCTION", "CALL_FUNCTION_KW", "CALL_FUNCTION_EX"]
used_gift = False
def gift(target, name, value):
global used_gift
if used_gift: sys.exit(1)
used_gift = True
setattr(target, name, value)
print("Welcome to the TI-1337 Silver Edition. Enter your calculations below:")
math = input("> ")
if len(math) > 1337:
print("Nobody needs that much math!")
sys.exit(1)
code = compile(math, "<math>", "exec")
bytecode = list(code.co_code)
instructions = list(dis.get_instructions(code))
for i, inst in enumerate(instructions):
if inst.is_jump_target:
print("Math doesn't need control flow!")
sys.exit(1)
nextoffset = instructions[i+1].offset if i+1 < len(instructions) else len(bytecode)
if inst.opname in banned:
bytecode[inst.offset:instructions[i+1].offset] = [-1]*(instructions[i+1].offset-inst.offset)
names = list(code.co_names)
for i, name in enumerate(code.co_names):
if "__" in name: names[i] = "$INVALID$"
code = code.replace(co_code=bytes(b for b in bytecode if b >= 0), co_names=tuple(names), co_stacksize=2**20)
v = {}
exec(code, {"__builtins__": {"gift": gift}}, v)
if v: print("\n".join(f"{name} = {val}" for name, val in v.items()))
else: print("No results stored.")
More precisely, the gauntlet does the following:
Last weekend Galhacktic
Trendsetters sort of spontaneously decided to do DiceCTF 2022, months or years after
most of us had done another CTF. It was a lot of fun and we placed
6th!
I made a blazing fast MoCkInG CaSe converter!
blazingfast.mc.ax
We’re presented with a website that converts text to AlTeRnAtInG
CaSe. The core converter is written in WASM, and also checks that its
input doesn’t have any of the characters <>&"
.
The JavaScript wrapper takes an input from the URL, converts it to
uppercase, feeds it to the converter, and if the check passes, injects
the output into an innerHTML
. The goal is to compose a URL
that, when visited by an admin bot, leaks the flag from
localStorage
.
The converter is compiled from this C code:
int length, ptr = 0;
char buf[1000];
void init(int size) {
length = size;
ptr = 0;
}
char read() {
return buf[ptr++];
}
void write(char c) {
buf[ptr++] = c;
}
int mock() {
for (int i = 0; i < length; i ++) {
if (i % 2 == 1 && buf[i] >= 65 && buf[i] <= 90) {
buf[i] += 32;
}
if (buf[i] == '<' || buf[i] == '>' || buf[i] == '&' || buf[i] == '"') {
return 1;
}
}
ptr = 0;
return 0;
}
I subscribed to Discord Nitro a month ago, but only recently did I
start thinking about the full range of powers the subscription granted
me. I could create my own reactions, dump them in my personal server,
and use them to react anywhere.
However, when I finally started trying to create some reactions, I
hit an interesting snag: Discord can be used in dark and light mode, and a reaction will have the same
color on both modes. If I wanted my reaction to be as clearly readable
in both modes as possible, what color should I make it?
(I could, of course, just outline my reaction with a contrasting
color, but let’s say that’s cheating. With the limited space in a
reaction, outlining isn’t that great of a solution anyway.)
Now, one can’t really just compute the “contrast of two
colors” given only their RGB components; there’s no universally
agreed-on definition of contrast in
vision, and even if there were one, the contrast of two given colors
would depend on the color space and possibly the viewer’s biology. But,
to get a concrete answer to this question, we can use the standard sRGB
model and the W3C’s definitions of contrast ratio
and relative
luminance. As of time of writing on my
computer, Discord reactions have background
#2f3136
on dark mode and #f2f3f5 on
light mode. Reactions you’ve reacted with have background #3b405a
on dark mode and #e7e9fd on
light mode. Because the dark mode background gets lighter and the light
mode background gets darker, we’ll use the latter colors so we’re
optimizing the worst-case contrast.
There are smarter approaches, but the 2563 = 16,777,216
possible 8-bit colors are perfectly feasible to brute force, so I wrote
a short Python script to check all of them, which is at the bottom of
this post. Under the parameters I’ve outlined, the optimal color for a
Discord reaction is rgb(255, 65, 3) or
#ff4103. A demo:
#ff4103 #ff4103 ●
CR: 2.90738237
|
#ff4103 #ff4103 ●
CR: 2.90738217
|
That was simple enough, but this color’s worst-case contrast ratio is
less than 0.0000002 better than the runner-up. Surely even very mild
aesthetic considerations will outweigh that. (It’s highly doubtful that
the formulae I used were intended to have this degree of precision in
the first place.)
After playing with a few ways to get a spread of options, I settled
on categorizing colors into six buckets of saturation and twelve buckets
of hue in the simple
HSV model, and then finding the optimal color within each bucket.
Here is a table of my results:
This post is motivated by reasons very similar to the ones that
motivated my React and Redux
“tutorial”. Again, it should be more accurately but less
informatively titled “How I wish SQL SELECTs were explained to me”.
Again, it does not imply that this method of explanation is suitable for
anybody else. One difference is that this time, I mostly only wanted to
learn about SQL SELECTs to the extent it would help me perform and
optimize queries in Django’s ORM, but to prevent this post from
languishing forever in my drafts folder, that material has been
sectioned off into a possible future post, because I figured out what I
wanted, ran out of steam, and am now trying to learn TLA⁺. Just me
things.
Background
The SQL standard is confusing and almost never completely
implemented; there are huge inconsistencies between SQL implementations.
I will focus on SQLite because it’s popular and easy to play with, but
generally try to stay away from unpopular or nonstandard features. SQLite’s SELECT
documentation is good reading for one particular SQL
implementation.
A SQL database is a place where you store and query a bunch of data
that’s organized into tables. A table is a homogeneous list of rows. A
row is a heterogeneous tuple of values of various simple data types. The
data types supported depend on the SQL implementation; typical examples
are integers and strings of various sizes, floating point numbers, and
dates/datetimes. All of these types can be nullable; NULL is a SQL value
that can appear just about anywhere. (Like many of the other SQL
features, NULL is handled somewhat inconsistently across SQL
implementations, but as a first-order approximation it’s closer to a
floating-point NaN than, say, Java’s “null”. We’ll talk more about it
later.) However, note that you can’t have a variable-size list of other
things in a row. And just to make sure it’s clear, all the rows in a
given table must have the same data types in the same order.
A “column” is just what you’d intuitively expect it to be: it’s the
homogeneous list of all values in a particular position in each row of a
table, which all have the same data type. One thing I haven’t mentioned
yet is that table columns all have names. This is true both for tables
stored in the database and for the ephemeral tables that are the output
of queries.
Since I’ll also be referring to more complex types like lists and
tuples often seen in conventional programming languages, I’ll call these
simple data types “scalar types” and values of those types “scalars”.
This is not SQL terminology; documentation usually just calls these
“data types”. Here’s SQLite’s page on data
types.
To play along, install SQLite and run it. You should get dropped into
a connection to an ephemeral in-memory database, which is plenty enough
for our purposes. Make a table and mutter some magic incantations to
make the output a little prettier for us:
CREATE TABLE a (id int);
INSERT INTO a VALUES (1), (2), (3), (4);
.headers on
.mode column
Advent of Code (briefly,
“AoC”) is a series of 25 festive programming puzzles
released daily December 1–25. Each puzzle has two parts, which use the
same text input and are related; to solve a part, you submit the right
output corresponding to the input on the website.
If you’re reading this, I suspect there’s a good chance you knew that
already, but in case you’re new to Advent of Code, let me try to briefly
explain why I like Advent of Code, from the perspective of somebody
who’s spent a lot of their life so far doing programming competitions.
- The event has a fantastic community surrounding it. I’m the most
familiar with the
subreddit, which is full of helpful people, interesting discussions,
non-programming community games, and the occasional wonderfully,
spectacularly
overengineered
solution
to a puzzle; but I know there are also many smaller chatrooms and
subcommunities focused on, say, specific timezones or programming
languages.
- Another aspect is the unique two-part format of each puzzle. Even
though they use the same input, you don’t get to see the second part
until after you’ve solved the first one, a feature that Eric Wastl
(AoC’s creator) has taken full advantage of in designing puzzles. The
second part is often a surprising twist on the first part, which keeps
you on your toes and challenges you to keep your code moderately general
or refactorable in a way that I think almost no other programming
challenges do. This sometimes even happens between days in a calendar,
when a puzzle turns out to be about some model of computation you
implemented two or five or ten days ago — hope you kept your code and
remember how it works!
- Finally, AoC also has some non-rigorous puzzles that force you to
use your intuition and “human intelligence”, either by interpreting the
problem statement heuristically or writing code to let you explore the
input. There are quite a few puzzles where it’s infeasible to write code
that handles every step of obtaining the output from the input. The
result is that Advent of Code can feature quite a few challenges that
I’ve found particularly compelling because I think they simply could not
be posed on any other contest platform.
These are the things that make AoC stand out to me, but it also does
a lot of other things well — the challenges are fun, approachable, and
varied even aside from their interrelations; there is a long, dramatic
story tying everything together (although it’s an Excuse
Plot if there ever was such a thing); and, although this is
obviously subjective, I find the website’s minimalist-adjacent,
terminal-esque aesthetic really charming (there is a lot of
detail in 2019’s calendar…
after you solve everything). I’ve only done the last two years of Advent
of Code, but it really seems like a one-of-a-kind event to me.
Anyway, one particular feature Advent of Code has is a leaderboard,
which you can get on by being one of the first 100 people worldwide to
solve each puzzle. The competition is fierce — every year, thousands of
people compete to get on the leaderboard. Near the start of AoC 2019, mcpower
(reddit
discussion) and Kevin
Yap (reddit
discussion) wrote some articles about how to do this, both of which
are worth reading. I also thought about writing such an article and
started a draft, but I didn’t get it anywhere close to publishable
before AoC had concluded, at which point I assumed few people would be
interested. But here it is now.
By a strange quirk of fate, I have started writing C++ for a
living.
Learning C++ was about as complicated as I think I expected it to be.
By line count, I’ve written a lot of C++ for programming competitions,
but I knew that I had only ever used a small cross-section of the
language: basic control flow and variables, STL containers and
algorithms, structs on which you mechanically define
bool operator<(const T& other) const
so STL
algorithms can order them, and the very occasional macro or templated
helper function. There were many features I wasn’t even aware
existed.
In the process of learning C++ professionally, one rabbit hole I fell
into quickly was C++11’s defining feature, the rvalue
reference, and how it can be used to implement move
semantics and perfect forwarding. By poring over a copy of
the widely recommended book Effective
Modern C++, by Scott Meyers, and a few dozen StackOverflow
answers and blog posts, I roughly understood it after a few days, but
still had a sort of blind-men-feeling-the-elephant feeling. I was
confused about what lay under some of the abstractions I had been using,
unsure of the full shape of the pitfalls that some of the guides had
pointed out to me, and generally uncomfortable that there were still
many small variations of the code I had seen that I couldn’t predict the
behavior of. It took many more days to work myself out of there, and I
wished I had had a guide that explained rvalue references and their
applications to a bit more depth than what might be necessary for
day-to-day use. So here’s my attempt to explain rvalue references in my
own fundamental I-want-to-know-how-things-work-no-really
style.
(If this vision doesn’t resonate with you, there are many other posts
explaining rvalue references out there that you might prefer. Feel free
to just skim the executive summary and/or check out some of the linked
articles in the Background section.)
Executive Summary
I got… pretty carried away when writing this post, and a lot of it is
just for my own understanding, which may or may not be useful to
readers. Here’s a much more concise rundown (assuming you know basic C++
already):
One thing many mathematically-inclined programmers run into when
implementing mathematical algorithms, particularly number-theoretic
ones, is that the modulo
operation doesn’t behave how they expect or prefer.
In many languages, this operator is denoted %
.
Concretely, one might prefer that, if the second argument is positive,
then the modulo operation would always give a nonnegative result. Under
this behavior, the expression (-5) % 3
would evaluate to
1
rather than -2
. This is a lot more useful
for number theory because then for positive integers n
, the
% n
operation actually maps integers to exactly the
n
canonical representatives for the residue classes. As a
result, \(a \equiv b \mod n\) if and
only if a % n == b % n
. You can also do things like index
into a length-n
array with a % n
and know that
the index will be in-bounds. Finally, there are also optimization
opportunities: modding by a power of 2 becomes equivalent to a simple
bitwise AND, which is really fast on modern computers.
A few programming languages, notably Python, do implement
%
this way. However, the majority of languages today,
including pretty much everything remotely descended from C, do not;
instead, (-5) % 3
is -2
. This post attempts to
track down why.
The first thing to note is that there is a more important
number-theoretic identity we’d like to have:
\[\texttt{a} = (\texttt{a / b}) \cdot
\texttt{b} + (\texttt{a \% b}). \tag{1}\]
In words, the integer division and modulo operators should give a
breakdown of a
into a sum of some copies of b
plus a remainder. Note that this equation also implies that specifying
the rounding behavior of division is equivalent to specifying the sign
behavior of the modulo operation, which will come up repeatedly
later.
It’s also very uncontroversial that that remainder should have no
copies of b
, positive or negative, left over, which gives
the constraint:
\[|\texttt{a \% b}| < |\texttt{b}|.
\tag{2}\]
Every programming language I can think of satisfies these two
constraints. So far so good. However, these two
constraints don’t uniquely determine the values of a % b
when a
isn’t divisible by b
; there are two
possible values for a % b
, one positive and one negative.
Concretely, we could express \(-5\) as
either \((-1) \cdot 3 + (-2)\) or \((-2) \cdot 3 + 1\), so
(-5) % 3
could be -2
or 1
.
It’s still mostly uncontroversial that, if a
and
b
are both positive, then a % b
should be
nonnegative as well; we could call this constraint (3).
However, if a
is negative and b
is positive,
programming languages start to diverge in their behavior. Why?