I want to add a second word in the title, something like “Koans” or “Vignettes”, but I don’t know a word with the right connotations.
I realized recently that I have been walking around for a long time with some confusion and unknown unknowns about how concurrency works in various settings, and decided to write about it until I stopped being confused. This post doesn’t therefore have much of a “point”.
Concurrency and Parallelism
Wikipedia, as of time of writing:
Concurrency is the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order, without affecting the outcome.
There are two broad reasons concurrency is useful. One is for performance: if you want your computer to perform as many floating point operations as possible by lunchtime, you want all CPUs/GPUs/etc. to be performing operations simultaneously. Another is that you’re in a problem domain where you simply can’t predict the order of events: you’re writing a user interface, and the user can click on any of multiple buttons in any order; or you’re writing a web server, and any number of clients can request any pages in any order. These reasons are not mutually exclusive.
Passwords. It’s 2023 and we still have to deal with them.
Many people know that, per the canonical xkcd, sequences of randomly chosen words such as
make relatively memorable but hard-to-crack passwords. One popular strategy for randomly choosing words is Arnold Reinhold’s Diceware™, a list of 65 = 7776 “words” that you can randomly sample from by rolling five dice (analog or digital). (I won’t go into topics like how to calculate the entropy of passwords and how long a password you should try to have here, since most Diceware overviews already discuss them at length.)
A few people have iterated on the concept since then: probably most notably, the Electronic Frontier Foundation published their own word list in 2016, with words chosen to be more well-known and memorable, at the cost of taking longer to type. I’m a fast typer and prefer the EFF’s wordlist over the original, and am very grateful to them for creating it, but after generating quite a few passwords with it over the last few years, I began to feel that it still had a lot of room for improvement.
“shouldn’t this have been published a few months ago?” yeah, probably. I even considered submitting it to the AoC contest. time is a real beast.
The title is clickbait. I did not design and implement a programming language for the sole or even primary purpose of leaderboarding on Advent of Code. It just turned out that the programming language I was working on fit the task remarkably well.
I can’t name just a single reason I started work on my language, Noulith, back in July 2022, but I think the biggest one was even more absurdly niche: I solve and write a lot of puzzlehunts, and I wanted a better programming language to use to search word lists for words satisfying unusual constraints, such as, “Find all ten-letter words that contain each of the letters A, B, and C exactly once and that have the ninth letter K.”1 I have a folder of ten-line scripts of this kind, mostly Python, and I thought there was surely a better way to do this. Not necessarily faster — there is obviously no way I could save time on net by optimizing this process. But, for example, I wanted to be able to easily share these programs such that others could run them. I had a positive experience in this with my slightly older golflang Paradoc, which I had compiled into a WASM blob and put online and, just once, experienced the convenience of sharing a short text processing program through a link. (Puzzle: what does this program do?) I also wanted to write and run these programs while booted into a different operating system, using a different computer, or just on my phone.
As I worked on it, I kept accumulating reasons to keep going. There were other contexts where I wanted to quickly code a combinatorial brute force that was annoying to write in other languages; a glib phrasing is that I wanted access to Haskell’s list monad in a sloppier language. I also wanted an excuse to read Crafting Interpreters more thoroughly. But sometimes I think the best characterization for what developing the language “felt like” was that I had been possessed by a supernatural creature — say, the dragon from the Dragon Book. I spent every spare minute thinking about language features and next implementation steps, because I had to.
The first “real program” I wrote in Noulith was to brute force constructions for The Cube, for last year’s Galactic Puzzle Hunt in early August, and it worked unexpectedly well. I wrote a for loop with a 53-clause iteratee and the interpreter executed it smoothly. Eventually I realized that the language could expand into other niches in my life where I wanted a scripting language. For example, I did a few Cryptopals challenges in them. It would take a month or two before it dawned on me that the same compulsion that drove me to create this language would drive me to do Advent of Code in it. That’s just how it has to be.
This post details my thought process behind the design of this language. Some preliminary notes:
Code golf is the recreational activity1 of trying to write programs that are as short as possible.2 Golfed programs still have to be correct, but brevity is prioritized above every other concern — e.g., robustness, performance, or legibility — which usually leads to really interesting code.
I think code golf is a lot of fun (although I think a lot of things are fun, so it’s one of those hobbies that I get really into roughly one month every year and then completely forget about for the remaining eleven). I wanted to write an introduction because I don’t know of any good general introductions to code golf, particularly ones that try to be language-agnostic and that cover the fascinating world of programming languages designed specifically for code golf, which I’ll call golflangs for short. But more on that later.
Note: If you are the kind of person who prefers to just dive in and try golfing some code without guidance, you should skip to the code golf sites section.
A simple example
Of course, there’s a reason most code golf tutorials focus on a single language: most code golf techniques are language-specific. The Code Golf & Coding Challenges (CGCC) StackExchange community has a list of some golfing tips that apply to most languages, but there are far more tricks in just about any language-specific list, and most of the intrigue lies in knowing the language you’re golfing well. So to provide a taste of the code golf experience, let’s golf a simple problem, Anarchy Golf’s Factorial, in Python.
In this problem, we have to read a series of positive integers from standard input, one per line, and output the factorial of each, also one per line. Here’s a stab at a simple, direct implementation with no golfing at all:3
This post is brought to you by “I am procrastinating other stuff by doing some long overdue maintenance on my blog”. Mainly, I finally replaced the old float-based layout from the random Hugo theme I forked, which I had been keeping just because it wasn’t broken, with flexbox, so that I could more easily tweak some other things. If things look broken, you may need to force-refresh or clear your cache, and on the off chance things look mostly the same but you feel like something about the layout feels subtly different, that’s what’s up.
While making these changes, I ended up digging through the flexbox spec to debug an issue and learned some interesting things. (This and other links in this post are permalinks to the November 2018 spec, which I believe is the most recent official version as of time of writing, but it’s nearly three years and there have been quite a few changes in the “editor’s draft”. Also, this post is not a flexbox tutorial and will not make sense if you are already familiar with flexbox.)
Don’t you hate it when CTFs happen faster than you can write them up? This is probably the only PlaidCTF challenge I get to, unfortunately.1
Web is out, retro is in. Play your favorite word game from the comfort of your terminal!
It’s a terminal Wordle client!
I only solved the first half of this challenge. The two halves seem to be unrelated though. (Nobody solved the second half during the CTF.) The challenge was quite big code-wise, with more than a dozen files, so it’s hard to replicate the experience in a post like this, but here’s an attempt.
If I had a nickel for every CTF challenge I’ve done that involves understanding the internal structure of a QR code, I would have two nickels. Which isn’t a lot, etc etc. That previous challenge probably helped me get first blood on this.
pickle is a Python object serialization format. As the docs page loudly proclaims, it is not secure. Roughly the simplest possible code to pop a shell (adapted from David Hamann, who constructs a more realistic RCE) looks like:
I participated in the AGI Safety Fundamentals program recently. The program concludes with a flexible final project, with the default suggestion of “a piece of writing, roughly the length and scope of a typical blog post”, so naturally, I deleted all but the last two words and here we are.
When I previously considered machine learning as a field of study, I came away with an impression that most effort and computation power was going into training bigger, more powerful models; whereas the inner workings of the models themselves, not to mention questions like why certain architectures or design choices work better than others, remained inscrutable and understudied. This impression always bothered me, and it definitely influenced me away from going into AI as a career. Of course, there are important, objective safety concerns around developing and designing models we don’t understand, many of which we discussed in the program; but my discomfort is mostly a completely unrelated nagging feeling I get whenever I’m relying on things I don’t understand.
After the program and all the concurrent developments in AI (including AlphaCode, OpenAI’s math olympiad solver1, SayCan, and, of course, DALL-E 2), I still had this impression about the field at a very high level, but I also became more familiar with the subfield of interpretability — designs and tools that allow us to understand and explain decisions by ML systems, rather than treating them as black-boxed mappings from inputs to outputs — and confirmed that enough people study it to make it a thing. One quote from a post on the views of Chris Olah, noted interpretability researcher, captured my feeling particularly eloquently:
interpretability is very aligned with traditional scientific virtues—which can be quite motivating for many people—even if it isn’t very aligned with the present paradigm of machine learning.
I found the whole post insightful, and it happens that the bits before that in the passage were also relevant to me. I don’t have access to lots of compute!
Inspired by that post and by a desire to actually write some code (which I figured might help me understand the inner workings of modern ML systems in a different sense), and after abandoning a few other project ideas that were far too ambitious, I decided to go through some parts of the fast.ai tutorial and riff on it to see how much progress I could make interpreting the models, and to write up the process in a blog post. I tried to capture my experience holistically, bugs and all, to serve as a data point for what it might feel like to start ML engineering (for the rare individuals with a background and inclinations just like mine2), and maybe entertain more experienced practitioners or influence their future tutorial recommendations. A much lower-priority goal was trying to produce “my version of the tutorial”, which would draw more liberally from an undergraduate math education3 and dive more deeply into technical details.