One day I’m going to run out of the energy to find barely adequate
allusions for the titles and thematic music videos for the openings of
these end-of-year posts, and they’ll just be called “2095 in Review” or
whatever. Or maybe I’ll just stop making them. But not today.
Good song. Good animation. Incredibly out of place on its YouTube
channel, in the most inspiring chaotic good way.
I closed out last year by saying that I wanted to accomplish a “big
milestone” this year. I actually had a specific milestone in mind that I
did not actually achieve and will not reveal, but I made good progress
towards it, and a lot of other things happened, enough that I think I’ll
count that as achieved.
The big thing is that I left my job at Zoom to have some time for
myself and family… though not before helping to give feedback on a draft
internet standard, publish a cryptography research paper
(on which I’m the “first author”, strictly due to the vagaries of the
English alphabet), and launch end-to-end
encrypted email. It was a productive year! I feel like I should have
more to say about all this, but it’s hard to think of anything that I
didn’t already write about last
year and also doesn’t require a blockbuster-length list of
prerequisites. However, if you ever want to hear about the difficulties
of actually getting end-to-end encryption into production in
excruciating detail, invite me to a cocktail party with a lot of
whiteboards.
Code golf is the recreational activity1 of
trying to write programs that are as short as possible.2
Golfed programs still have to be correct, but brevity is prioritized
above every other concern — e.g., robustness, performance, or legibility
— which usually leads to really interesting code.
I think code golf is a lot of fun (although I think a lot of things
are fun, so it’s one of those hobbies that I get really into roughly one
month every year and then completely forget about for the remaining
eleven). I wanted to write an introduction because I don’t know of any
good general introductions to code golf, particularly ones that try to
be language-agnostic and that cover the fascinating world of
programming languages designed specifically for code golf,
which I’ll call golflangs for short. But more on that later.
Note: If you are the kind of person who prefers to just dive in and
try golfing some code without guidance, you should skip to the code golf sites section.
A simple example
Of course, there’s a reason most code golf tutorials focus on a
single language: most code golf techniques are language-specific. The
Code Golf & Coding Challenges (CGCC) StackExchange community has a
list of some golfing
tips that apply to most languages, but there are far more tricks in
just about any language-specific list, and most of the intrigue lies in
knowing the language you’re golfing well. So to provide a taste of the
code golf experience, let’s golf a simple problem, Anarchy Golf’s Factorial, in
Python.
In this problem, we have to read a series of positive integers from
standard input, one per line, and output the factorial of each, also one
per line. Here’s a stab at a simple, direct implementation with no
golfing at all:3
def factorial(n):if n <=1: return1return n * factorial(n-1)try:whileTrue:print(factorial(int(input())))exceptEOFError:pass
This post is brought to you by “I am procrastinating other stuff by
doing some long overdue maintenance on my blog”. Mainly, I finally
replaced the old float-based layout from the random Hugo
theme I forked, which I had been keeping just because it wasn’t broken,
with flexbox, so that I could more easily tweak some other things. If
things look broken, you may need to force-refresh or clear your cache,
and on the off chance things look mostly the same but you feel like
something about the layout feels subtly different, that’s what’s up.
While making these changes, I ended up digging through the flexbox
spec to debug an issue and learned some interesting things. (This
and other links in this post are permalinks to the November 2018 spec,
which I believe is the most recent official version as of time of
writing, but it’s nearly three years and there have been quite a few
changes in the “editor’s draft”. Also, this post is not a flexbox
tutorial and will not make sense if you are already familiar with
flexbox.)
Don’t you hate it when CTFs happen faster than you can write them up?
This is probably the only PlaidCTF challenge I get to, unfortunately.1
Web is out, retro is in. Play your favorite word game from the
comfort of your terminal!
It’s a terminal Wordle client!
I only solved the first half of this challenge. The two halves seem
to be unrelated though. (Nobody solved the second half during the CTF.)
The challenge was quite big code-wise, with more than a dozen files, so
it’s hard to replicate the experience in a post like this, but here’s an
attempt.
If I had a nickel for every CTF
challenge I’ve done that involves understanding the internal
structure of a QR code, I would have two nickels. Which isn’t a lot, etc
etc. That previous challenge probably helped me get first blood on
this.
The source code is wonderfully short:
import io, qrcode, stringflag_contents = [REDACTED]assertall(i in string.ascii_lowercase +'_'for i in flag_contents)flag =b"actf{"+ flag_contents.encode() +b"}"print("flag is %d characters"%len(flag))qr = qrcode.QRCode(version=1, error_correction = qrcode.constants.ERROR_CORRECT_L, box_size=1, border=0)while1:try: inp =bytes.fromhex(input("give input (in hex): "))assertlen(inp) ==len(flag)except:print("bad input, exiting")break qr.clear() qr.add_data(bytes([i^j for i,j inzip(inp, flag)])) f = io.StringIO() qr.print_ascii(out=f) f.seek(0)print('\n'.join(i[:11] for i in f))
Now that kmh is gone, clam’s been going through pickle withdrawal. To
help him cope, he wrote his own pickle pyjail. It’s nothing like kmh’s,
but maybe it’s enough.
Language jails are rapidly becoming one of my CTF areas of expertise.
Not sure how I feel about that.
#!/usr/local/bin/python3import pickleimport ioimport sysmodule =type(__builtins__)empty = module("empty")empty.empty = emptysys.modules["empty"] = emptyclass SafeUnpickler(pickle.Unpickler):def find_class(self, module, name):if module =="empty"and name.count(".") <=1:returnsuper().find_class(module, name)raise pickle.UnpicklingError("e-legal")lepickle =bytes.fromhex(input("Enter hex-encoded pickle: "))iflen(lepickle) >400:print("your pickle is too large for my taste >:(")else: SafeUnpickler(io.BytesIO(lepickle)).load()
pickle
is a Python object serialization format. As the docs page loudly
proclaims, it is not secure. Roughly the simplest possible code to pop a
shell (adapted from David
Hamann, who constructs a more realistic RCE) looks like:
It’s clam’s newest javascript Calculator-as-a-Service: the CaaSio
Please Stop Edition! no but actually please stop I hate jsjails js isn’t
a good language stop putting one in every ctf I don’t want to look at
another jsjail because if I do I might vomit from how much I hate js and
js quirks aren’t even cool or funny or quirky they’re just painful
because why would you design a language like this
ahhhhhhhhhhhhhhhhhhhhh
It’s just a JavaScript eval jail.
#!/usr/local/bin/node// flag in ./flag.txtconst vm =require("vm");const readline =require("readline");constinterface= readline.createInterface({input:process.stdin,output:process.stdout,});interface.question("Welcome to CaaSio: Please Stop Edition! Enter your calculation:\n",function (input) {interface.close();if ( input.length<215&&/^[\x20-\x7e]+$/.test(input) &&!/[.\[\]{}\s;`'"\\_<>?:]/.test(input) &&!input.toLowerCase().includes("import") ) {try {const val = vm.runInNewContext(input, {});console.log("Result:");console.log(val);console.log("See, isn't the calculator so much nicer when you're not trying to hack it?" ); } catch (e) {console.log("your tried"); } } else {console.log("Third time really is the charm! I've finally created an unhackable system!" ); } });
I participated in the AGI Safety
Fundamentals program recently. The program concludes with a flexible
final project, with the default suggestion of “a piece of writing,
roughly the length and scope of a typical blog post”, so naturally, I
deleted all but the last two words and here we are.
When I previously considered machine learning as a field of study, I
came away with an impression that most effort and computation power was
going into training bigger, more powerful models; whereas the inner
workings of the models themselves, not to mention questions like why
certain architectures or design choices work better than others,
remained inscrutable and understudied. This impression always bothered
me, and it definitely influenced me away from going into AI as a career.
Of course, there are important, objective safety concerns around
developing and designing models we don’t understand, many of which we
discussed in the program; but my discomfort is mostly a completely
unrelated nagging feeling I get whenever I’m relying on things I don’t
understand.
After the program and all the concurrent developments in AI
(including AlphaCode,
OpenAI’s math olympiad
solver1, SayCan, and, of course, DALL-E 2), I still had this
impression about the field at a very high level, but I also became more
familiar with the subfield of interpretability — designs and
tools that allow us to understand and explain decisions by ML systems,
rather than treating them as black-boxed mappings from inputs to outputs
— and confirmed that enough people study it to make it a thing. One
quote from a post on the views
of Chris Olah, noted interpretability researcher, captured my
feeling particularly eloquently:
interpretability is very aligned with traditional scientific
virtues—which can be quite motivating for many people—even if it isn’t
very aligned with the present paradigm of machine learning.
I found the whole post insightful, and it happens that the bits
before that in the passage were also relevant to me. I don’t have access
to lots of compute!
Inspired by that post and by a desire to actually write some code
(which I figured might help me understand the inner workings of modern
ML systems in a different sense), and after abandoning a few other
project ideas that were far too ambitious, I decided to go through some
parts of the fast.ai tutorial and
riff on it to see how much progress I could make interpreting the
models, and to write up the process in a blog post. I tried to capture
my experience holistically, bugs and all, to serve as a data point for
what it might feel like to start ML engineering (for the rare
individuals with a background and inclinations just like mine2), and maybe entertain more
experienced practitioners or influence their future tutorial
recommendations. A much lower-priority goal was trying to produce “my
version of the tutorial”, which would draw more liberally from an
undergraduate math education3 and dive more deeply
into technical details.
Last weekend Galhacktic
Trendsetters sort of spontaneously decided to do DiceCTF 2022, months or years after
most of us had done another CTF. It was a lot of fun and we placed
6th!
Back in the day the silver edition was the top of the line Texas
Instruments calculator, but now the security is looking a little
obsolete. Can you break it?
It’s yet another Python jail. We input a string and, after it makes
it through a gauntlet of checks and processing, it gets
exec’d.
#!/usr/bin/env python3import disimport sysbanned = ["MAKE_FUNCTION", "CALL_FUNCTION", "CALL_FUNCTION_KW", "CALL_FUNCTION_EX"]used_gift =Falsedef gift(target, name, value):global used_giftif used_gift: sys.exit(1) used_gift =Truesetattr(target, name, value)print("Welcome to the TI-1337 Silver Edition. Enter your calculations below:")math =input("> ")iflen(math) >1337:print("Nobody needs that much math!") sys.exit(1)code =compile(math, "<math>", "exec")bytecode =list(code.co_code)instructions =list(dis.get_instructions(code))for i, inst inenumerate(instructions):if inst.is_jump_target:print("Math doesn't need control flow!") sys.exit(1) nextoffset = instructions[i+1].offset if i+1<len(instructions) elselen(bytecode)if inst.opname in banned: bytecode[inst.offset:instructions[i+1].offset] = [-1]*(instructions[i+1].offset-inst.offset)names =list(code.co_names)for i, name inenumerate(code.co_names):if"__"in name: names[i] ="$INVALID$"code = code.replace(co_code=bytes(b for b in bytecode if b >=0), co_names=tuple(names), co_stacksize=2**20)v = {}exec(code, {"__builtins__": {"gift": gift}}, v)if v: print("\n".join(f"{name} = {val}"for name, val in v.items()))else: print("No results stored.")
Last weekend Galhacktic
Trendsetters sort of spontaneously decided to do DiceCTF 2022, months or years after
most of us had done another CTF. It was a lot of fun and we placed
6th!
I made a blazing fast MoCkInG CaSe converter!
blazingfast.mc.ax
We’re presented with a website that converts text to AlTeRnAtInG
CaSe. The core converter is written in WASM, and also checks that its
input doesn’t have any of the characters <>&".
The JavaScript wrapper takes an input from the URL, converts it to
uppercase, feeds it to the converter, and if the check passes, injects
the output into an innerHTML. The goal is to compose a URL
that, when visited by an admin bot, leaks the flag from
localStorage.
The converter is compiled from this C code:
int length, ptr =0;char buf[1000];void init(int size){ length = size; ptr =0;}char read(){return buf[ptr++];}void write(char c){ buf[ptr++]= c;}int mock(){for(int i =0; i < length; i ++){if(i %2==1&& buf[i]>=65&& buf[i]<=90){ buf[i]+=32;}if(buf[i]=='<'|| buf[i]=='>'|| buf[i]=='&'|| buf[i]=='"'){return1;}} ptr =0;return0;}