Introduction to Code Golf and Golflangs

Code golf is the recreational activity1 of trying to write programs that are as short as possible.2 Golfed programs still have to be correct, but brevity is prioritized above every other concern — e.g., robustness, performance, or legibility — which usually leads to really interesting code.

I think code golf is a lot of fun (although I think a lot of things are fun, so it’s one of those hobbies that I get really into roughly one month every year and then completely forget about for the remaining eleven). I wanted to write an introduction because I don’t know of any good general introductions to code golf, particularly ones that try to be language-agnostic and that cover the fascinating world of programming languages designed specifically for code golf, which I’ll call golflangs for short. But more on that later.

Note: If you are the kind of person who prefers to just dive in and try golfing some code without guidance, you should skip to the code golf sites section.

A simple example

Of course, there’s a reason most code golf tutorials focus on a single language: most code golf techniques are language-specific. The Code Golf & Coding Challenges (CGCC) StackExchange community has a list of some golfing tips that apply to most languages, but there are far more tricks in just about any language-specific list, and most of the intrigue lies in knowing the language you’re golfing well. So to provide a taste of the code golf experience, let’s golf a simple problem, Anarchy Golf’s Factorial, in Python.

In this problem, we have to read a series of positive integers from standard input, one per line, and output the factorial of each, also one per line. Here’s a stab at a simple, direct implementation with no golfing at all:3

This comes out to 157 bytes.

Feel free to stop reading here for a while and see how much you can shorten it!

Note that, in computing this byte count and all others in this post, I am assuming that the code does not have a newline at the very end, although many text editors will add one by default. (Also, I’m using Linux-style \n newlines rather than Windows-style \r\n newlines.) In Vim, I have to :set binary noeol to save a file without that final newline. If you copy this code into a text editor and get a slightly larger byte count, that might be why; and if you submit code to a golf site by uploading a file, you should almost always make sure the file doesn’t have a trailing newline to get the best possible score.

First we’ll golf this code in some straightforward ways. We’ll shorten the long name factorial, replace the if statement with a conditional expression, replace True with 1 (since nonzero numbers test as true in Python, sometimes less formally called being “truthy” in other languages), and remove all the error handling and unnecessary whitespace. Unlike, say, most programming competitions, having your program terminate with an error is typically accepted in code golf — as long as your program writes exactly the correct bytes to standard output, it doesn’t matter if it spews additional junk to standard error or exits with a nonzero code. This gets us down to 68 bytes:

Already, this reveals a corner of Python that I expect most nongolfers not to know: it’s legal to not put a space between an integer literal 1 and a subsequent if or else! Well, at least for now; Python 3.12 release notes list this syntactic feature as pending removal, per GitHub issue. And you can tell that the syntax is dicey just from how the syntax highlighter handles it incorrectly.

Trying a little harder: Firstly, it turns out using def to define a function is quite a bit more verbose than defining it as a lambda due to having to also explicitly return the result. Secondly, n<=1 could be replaced with n<2. But we can do even better: we can choose to stop recursing when n is zero instead, so we can just test n itself, again because integers are truthy precisely when nonzero. This produces a solution with 61 bytes:

There’s one more arcane4 trick we can use to save a byte off the conditional, which currently uses the rather verbose if/else keywords. When golfing Python, the best way to express a conditional is often treating a boolean as an integer and indexing into a two-element list: [f,t][b], where b is a boolean, is f if b is False (which coerces5 to 0) and t if b is True (which coerces to 1). Unfortunately, we can’t use that strategy here because it always evaluates both f and t, whereas we do need to short-circuit to prevent f from recursing infinitely, not to mention that we want to use n’s truthiness, which is not a boolean. However, because f’s output is always nonzero, we can still save 1 byte by replacing the conditional with an and/or pair, producing this solution with 60 bytes:

Actually… do we even need a conditional with two branches? If we had an expression g(n) that’s 1 when n is 0 and 0 otherwise, we could just write g(n)or n*f(n-1). Using not n for that expression almost works, but not quite: it’s True when n is 0, and even though True is equal to 1, it gets printed as “True” when 0 is in the input, which isn’t the correct output. int(not n) would work, but that’s much too long. After some brainstorming, we realize that n<1 is shorter than not n and unary plus is shorter than int, so we end up with the following solution, with 59 bytes:

Hmm, the fact that the unary + and < don’t have the precedence we want sure is annoying. Is there any way to get rid of those parentheses? There actually is: we can just move the + to f’s call site. This produces a solution with 57 bytes:

Whew!

We’ve been golfing off one or two bytes at a time for a while now, so this is a good time to step back and reconsider our entire strategy. Perhaps a recursive function was never the best way to compute a factorial, and we should have used an imperative loop instead? Trying that out, we find a solution that’s… also 57 bytes:

Having to spend three additional bytes on indentation is unfortunate, but it’s necessary to get the nested loop syntactically right.

Is there any other way to compute factorials? Yes: making the standard library do it. It turns out that ever since Python 2.6, the math module has had a factorial built-in. Even though we have to spend a bunch of characters importing it, the result just barely edges out all our solutions so far, coming in at 55 bytes:

(You can check for yourself that directly grabbing the math module in an expression with __import__('math') does not help and actually costs 2 extra bytes.)

But actually, if we go back to the imperative loop, we can improve it even more by radically adapting it to apply an even more arcane Python golf trick: looping by repeating a string containing source code and passing it to exec, which is a Python function that just takes a string and executes it as Python source code. (You may be more familiar with exec’s sibling eval; they differ in that exec executes statements and doesn’t return anything, while eval only evaluates expressions but returns the result of the evaluation.) As a bonus, we also pull out the obscure trick that you can assign the same value to more than one variable with a=b=1. The result of this approach clocks in at… 54 bytes, a single byte shorter than the standard library solution:

But actually actually, as of Python 3.8, the math module has a slightly shorter perm built-in6, which counts “partial permutations”, i.e. the value often denoted \(_nP_k\) in high school combinatorics textbooks… but perm’s second argument is optional, and if it isn’t passed, this function also computes the factorial of its argument. Which gives us a solution with 51 bytes! (Although Anarchy Golf’s Python version is too old to accept this solution.)

This is the best solution I’m aware of, but it’s possible that there are a few more bytes to be shaved off somewhere, and hopefully it provided an illuminating glimpse into what code golf is like. Sometimes you spend a long time applying local optimizations, slightly rewriting expressions to squeeze out individual bytes. Sometimes you have to throw away your approach and all of your code. Disturbingly often, you completely rewrite your program and produce something one or two bytes off from what you started with, and then you spend a while golfing two or more totally different approaches in parallel, unsure which one will ultimately produce a shorter program after you apply all the local optimizations you can find. Finally, although this pedagogical beat was a bit ruined by math.perm, many well-golfed programs are both incomprehensible and slow. Our exec solution has to create, parse, and execute a length-\(10n\) snippet of code anew for each calculation.

Of course, there are a lot of aspects of code golf that I wasn’t able to demonstrate in this short example. Here are just two general categories of tasks a code golfer might encounter in a more difficult challenge:

  • Monitoring when it becomes more efficient to extract a repeated constant, function or string into a helper variable. If you call print \(n\) times in your Python program, prepending the line p=print and then replacing all those calls with calls to p will save \(4n - 8\) bytes, so the replacement is byte-neutral when \(n = 2\) and beneficial when \(n \geq 3\). More extremely, if you have a Python program with a lot of lambdas, at some point it becomes more efficient to replace each lambda with some character, say !, and surround it with exec('your code'.replace('!','lambda ')). (You also have to pay an additional cost if your code contains string literals, especially ones with escape sequences, so the math gets complicated quickly.) You can consult some CGCC answers that do the math for Python built-ins and Python lambdas, though the idea generalizes to other languages.

    This might sound simple: you just try each replacement independently, and keep it if it saves bytes. But replacements can interfere with each other and interact with other optimizations in ways that are hard to foresee. For example, maybe your code currently has three print calls, but you found and are contemplating an optimization that combines all three calls into one. That might or might not save bytes compared to putting print in a helper variable. So these optimizations can add some backtracking to what would otherwise be a linear process of slowly golfing off bytes.

  • Finding unintuitive but short formulas for things, often by taking advantage of a restricted domain. A very simple example similar to things we saw in the factorial problem: if n is known to be a nonnegative integer, then n<1 is shorter than n==0. A more interesting example is the criterion for leap years \[\text{gcd}(80, y) > \text{gcd}(50, y).\] (Think about it and see if you can understand why it works!) This isn’t often useful in code golf because few languages provide easy access to GCD, and of those that do, many are golflangs in which some alternative strategy is even terser; but it does happen with e.g. Haskell7, and I think it’s a cool fact on its own.

    Finally, to provide a realistic example from deep inside an actual golf challenge, in a certain 2018 Advent of Code golf contest, I found and used the following formula: if x and y are single uppercase letters from the set {N, E, S, W}, then they represent opposite compass directions iff \[\text{xor}(\text{ord}(x), \text{ord}(y)) \equiv 7 \bmod 11.\] There’s nothing special about those numbers; it’s just that when the domain of arguments that your formula has to work on is small enough, you can often devise a short formula with small constants, if not by just making tables of the results of a bunch of common mathematical operators, then simply by brute force (see e.g. pysearch8).

Incidentally, code golf is one metric along which Python 3 is quite a bit worse than Python 2. Our exec-based solution naturally translates to a Python 2 solution with 46 bytes, even shorter than the math.perm solution:

The differences:

  • exec is a statement instead of a function, so you can remove the pair of parentheses used to call it, for −2 bytes.
  • input evaluates its input, so the int() call can be removed entirely for −5 bytes. In Python 2, it’s actually more verbose to read input without evaluating it.
  • print is also a statement instead of a function, but after removing its parentheses, you have to add a space back to separate it from the identifier it’s called with, so this only gives −1 byte.

I think these are all good language changes! The more language features that can be implemented as standard built-ins rather than ad hoc syntax, the better. Code golf shouldn’t usually be a consideration when designing a general purpose programming language.

Code golf sites

If the example has inspired you to try code golf yourself, there are three reasonably active code golf websites/communities where I can recommend, each with a fairly distinctive “style”: Anarchy Golf, Code Golf and Coding Challenges, and code.golf. (There are many more that I just don’t know enough about to recommend, e.g. Code Golf Codidact or various sites linked from code.golf, but they still might be good!) I’ll go over each in turn.

Anarchy Golf

Anarchy Golf, occasionally shortened to “anagol”, might just be the longest-running code golf site.9 It’s the source of the Factorial problem and some of the solutions for it that we’ve seen. Judging by the first Hello World submission, Anarchy Golf began in January 2007. Its purpose is, by its own admission, “not serious competition”:

Joke problems are welcomed and you can speak freely about problems and can release spoilers.

Anarchy Golf is quite old school in the way its challenges work: all programs read from standard input and are expected to output to standard output. Also, Anarchy Golf challenges can have only three test cases, so it’s more common for optimal solutions to take advantage of the exact test cases, or even ignore the inputs entirely — in fact, this is sometimes required, with I think the simplest example being 123. The site is very simple — it doesn’t even have logins or any kind of authentication; you simply write your name every time you submit a program, and that’s the name your program will go in the leaderboard under. But it works and its simplicity probably contributes to its incredible longevity. I participated a bunch in Anarchy Golf in 2010, though I don’t have any of my files from then and don’t see a particularly good way to look up my submissions. Also, I was really bad at golf in 2010.

An Anarchy Golf leaderboard showing betaveros in last place with a 3-character program, while nearly every other submission has length 1.
Apparently 2010 me could not find the single-character GolfScript program to solve this challenge and instead submitted something 200% longer

Anarchy Golf challenges can either be “endless” or have a deadline. Solutions to endless challenges are never revealed (they’re not even saved by the site). For challenges with a deadline, each user’s best submitted solution becomes public afterwards, although you can still keep submitting solutions after the deadline; they’ll just be indicated as such by appearing in red.

Code Golf Stack Exchange

The Code Golf and Coding Challenges (CGCC)10 StackExchange began in January 2011.11 It’s an atypical StackExchange that isn’t really used for questions and answers per se; instead, CGCC users post challenges as “questions” and programs that solve the challenges as “answers”. Although code golf isn’t the only kind of programming challenge that fits this community’s bill, it is by far the most popular, at least in terms of day-to-day activity. There’s a blog post on the SE’s meta exchange with a pretty good rundown of the site’s history, including its “multiple cultural shifts” and drastic changes, which I won’t rehash; in brief, although the site encompassed many kinds of coding challenges for many years, surges of low-quality activity related to subjective challenge types led to a community consensus that questions/challenges should have an “objective scoring criterion”, and “shortest code” is one of the most versatile criteria that challenges can be posed for.

Interestingly, you’d get a drastically different sense of the community if you looked at, say, the top questions by score. Of the all-time top ten, only three are code golf–related. (The top question asks for an implementation of Tetris in Conway’s Game of Life — and received one, a joint effort from ten contributors over 1½ years. Other cool feats include a program for generating cool images with all colors, which would probably no longer fly under the “objective scoring criterion” rule, and an extremely optimized implementation of FizzBuzz.)

The biggest differentiator of CGCC among code golf sites is that, because submissions are judged by the community through votes rather than by an automated system that would need to explicitly support the submitted program’s language, you can golf in any programming language you want.12 And people do! The result is substantially more representation from golflangs and even non-golf esoteric programming languages among submitted solutions, as well as just in the culture at large. This aspect of the site has been contentious for some time, with users semiregularly expressing that golflangs make code golf less fun because non-golf languages have no shot at winning, i.e., producing the shortest possible solution. A popular response is that programs in different languages shouldn’t be considered to be competing with each other in this way; programs should be upvoted based on perceived cleverness, with sheer byte count only being used to compare submissions in the same language. In particular, a golf language submission that’s short due to the challenge being a good match for the language built-ins, rather than algorithmic cleverness, shouldn’t be (excessively) upvoted. I have sympathy for both sides, but, as you can probably tell from this post, I tend to gravitate towards the latter just because I think golflangs are very neat.

Another point worth mentioning is that CGCC challenges often simply ask you to “define a function” that accomplishes a task, rather than worrying about input and output or string parsing and formatting. Often, challenges will go as far as explicitly offering flexibility to choose the most convenient input/output format for your language to work with, which tends to even the playing field a bit for programming languages (as we saw above, some programming languages require much more boilerplate than others just to read and print an integer). Another side effect is that golflangs designed by CGCC users sometimes don’t even bother to have any facilities for handling input/output.

Finally, because of the Q&A nature of the underlying site, solutions on CGCC are public from the moment they’re submitted. Discussing solutions, suggesting small optimizations in comments, and collaborating in other ways are all common and encouraged.

code.golf

The last site on my list, code.golf, is also the most modern. It has a pretty front page, automated judgment, lots of statistics and achievements, and an active Discord server. Based on the first commit on the GitHub repo, code.golf first came into existence October 2016.

I actually struggle to write a lot about this website; despite being much newer than the other two sites, it fits my “default” conception of how a code golf site should look and work the best. Unlike Anarchy Golf, you can make an account and log in! The website has a decent in-browser code editor and even saves your shortest solutions for each challenge in each language, so you can do a lot of golfing without leaving your browser.

Input and output for challenges on code.golf also work a bit differently from either of the above sites; programs receive input from command line arguments instead of stdin, which usually means slightly less code is required to parse input. (One exception is that the site supports GolfScript, a golflang we’ll discuss very soon, by putting the arguments directly on its “stack”.)

Another special aspect of code.golf is that it has separate leaderboards for golfing the number of bytes and the number of characters. As a result, among code golf sites, “packers” are uniquely relevant on code.golf.13 A “packer” is a technique for “compressing” a (typically ASCII) string as a string with more bytes but fewer characters, such that the string can be tersely decompressed and then used in the program, typically by being evaled. The site’s guides include many packers for different programming languages, and its users are developing more all the time.

Finally, unlike the other two sites, code.golf solutions are never revealed, although sometimes golfers exchange tips in the associated Discord server, and adding challenges for which solutions are revealed after some set amount of time a la Anarchy Golf have been proposed in the past.

How to design a golflang

Writing short programs is cool and all, but some code golf enthusiasts had an idea to push the frontier of the sport even further: what if you designed a programming language for code golf? What if you took all the tradeoffs code golfers make in pursuit of shorter code — sacrificing simplicity, legibility, maintainability, robustness, performance — and introduced them into the language design process?

As a preview, here’s a solution to the factorial problem from the first section, written in what is almost certainly the first golflang, GolfScript. It’s 14 bytes:

~]{,1\{)*}/n}/

It’s easier to explain how this program works after we’ve encountered some more general golflang concepts, so I’ve deferred an explanation to an appendix. GolfScript was first created in 2007 and, though it still usually beats non-golflangs, it is also usually far surpassed by more modern golflangs. (You can see for youself that the shortest program on Anarchy Golf’s Factorial page is 2 bytes.)

Also before we continue discussing this, a disclaimer: it’s entirely reasonable to play code golf without learning or thinking about any golflangs (much less designing or implementing one), and some golfers think golflangs make the code golf less fun. For example, some of the fun from golfing in a “real programming language” comes from the transgressiveness of doing things you’re “not supposed to do” in pursuit of shorter code (like using string repetition and exec to loop), whereas golfing in a golflang is often just using it as designed. Also, golfing in a language often teaches you obscure facts about it, which tend to be more interesting or relevant if you also use the language in non-golf contexts (here’s one blog post about what code golf taught the author about Python). On the other hand, one might argue that golfing in a well-designed golflang strips away concerns about “trivia” like the lengths of names your programming language happens to give various operations (see GolfScript’s author discuss this with reference to his newer golflang, Nibbles), and lets you focus on the purer challenge of finding the “simplest” algorithm to solve a problem. And personally, I just think programming language design is fun.

Anyway: let’s look at the last two factorial programs we came up with, which I’ve reproduced below, and think about how Python prevents us from making either much shorter.

An obvious one is names and tokenization: Python has many keywords and built-ins that are long (by code golf standards), like while, input, and print; plus it often forces you to separate them from adjacent keywords, identifiers, or literals with spaces. Python also doesn’t have that many built-ins that are immediately accessible. If perm were a default built-in rather than in the math module, we could delete the import from our standard library solution and immediately get a 33-byte solution.

There are many languages that aren’t usually much terser than Python, but do better on this problem just because they happen to have easily accessible built-ins that are a good fit. For example, Haskell has a 41-byte solution because it comes with product and the inclusive range syntax [1..n]. More starkly, Common Lisp, which is normally not that competitive, has a 30-byte solution because it (specifically CLisp) has !, a single-character factorial builtin (!). You could imagine designing a golflang that’s basically Python, but with single-character names for all its keywords and built-ins, and it would be reasonably effective. (If you travel down that path far enough, you might get something similar to Pyth.) And in general, one easy way to make your programming language better at golf is just to add lots of single-character built-ins.

If you do this naively, though, you’ll quickly run out of printable ASCII characters to allocate, which is why golflangs also often use heavy overloading: having a built-in function or operator behave in more than one way depending on how it’s used. Most programmers are used to some overloading, like how + might either add numbers or concatenate strings/lists, or how * might either multiply numbers or repeat a string/list some number of times, but golflangs typically take this much further. For example, GolfScript overloads * in three more ways: it can be used to repeat a “block” (approximately, call a function) some number of times, join a string/list by another string/list, or fold a list with a function. Many modern golflangs also use a custom character encoding that’s more compact than ASCII/UTF-8 to maximize the number of built-ins they can offer, most simply through a custom code page that assigns a character to each of the 256 possible bytes.

A lot could be written about choosing built-ins for your golflang, but I don’t think it’s a particularly “deep” subject, or at least, I lack the insight to systematize it, so I’ll just make some brief observations.

  • Most golflangs will benefit from a lot of built-ins that perform basic “data manipulation” tasks that just involve receiving and outputting numbers/strings/lists, from “repeat a string some number of times” to “list all prime factors of an integer”; but it’s also important to have common higher-order functions, from map, filter, and reduce to “find the fixed point of iterating this function” or “group the items in this list into buckets based on what this function maps them to”, which are often useful for composing your data manipulation built-ins.
  • When most of your names are single characters, there is quite an art in choosing and overloading names for built-ins that don’t overlap while making as much sense as possible. (Just because we’re prioritizing brevity over usability doesn’t mean we should completely disregard the latter.) For example, because $ looks like an S, GolfScript overloads it to either access the “Stack” or to “Sort” a list. If you want to make good use of the entire alphabet or code page of your choice, you sometimes have to get creative with mnemonics — when naming things for my own language, Paradoc, I decided that à took the maximum/larger of two values and Õ took the minimum/smoller. Or you can give up and assign some names arbitrarily. Another way to get around this is to develop a language that’s more verbose, but easier to read and write, and a compiler/transpiler from that language to a terser one.

A different source of verbosity is most languages’ need for grouping, typically with parentheses. We saw this briefly in our worked Python example, where we had to add parentheses to the expression +(n<1), because +n<1 would be parsed as (+n)<1. We later found a way to omit them, but it’s not always possible, and when you are forced to use parentheses or similar delimiters to specify the precedence or grouping of various expressions, you can waste a lot of bytes. However, there are pretty standard ways around this. If your language uses either prefix notation, where operators always precede their operands, or postfix notation, where operators always follow their operands, then the order of operators and operands suffices to imply all grouping. For example, an expression like (x+y)*z would be * + x y z in prefix notation and x y + z * in postfix notation, whereas x+(y*z) would be + x * y z in prefix notation and x y z * + in postfix notation. I won’t go into this more here since the Wikipedia articles have more detailed examples and these will naturally come up again later.

But let’s say we have a language that uses heavily-overloaded single bytes to represent all these keywords and built-ins, as well as either prefix or postfix notation to eliminate the need for most delimiters. There would remain, I think, one more subtle and particularly interesting source of byte inefficiency: we have to name our variables. In our exec-based Python solution, we used the variable names p and x each three times, i.e. we spent six bytes referring to variables. We can tell that there is definitely some redundancy here because those variable names could been replaced with any two distinct letters and the program would still work. By comparison, observe that the math.factorial and math.perm solutions do not name any argument or variable — both are just four functions, each one passing its return value to the next. Wouldn’t it be nice if we could write more code that way?

How to avoid variable names

The term for this style of programming is tacit programming: instead of naming your arguments, just compose functions in a way that implicitly defines how data flows in and out of them.

To put this into focus, let’s consider two trivial arithmetic tasks and how we might implement them in various languages:

  1. Read an integer \(n\) and output \(n^3 + 6\).
  2. Read an integer \(n\) and output \(n^3 + n\).

(We’ll assume that the entirety of standard input just has the single integer, to smooth over differences between languages as much as possible. There isn’t any significance to 3 and 6, except that if they were smaller, some languages might allow some trivial optimizations that would distract us.) Task 1 is very easy in Python. This is 24 bytes:

Despite being almost identical, Task 2 takes 4 extra bytes, because Python doesn’t have a neat way to read \(n\) and then use it twice in an expression. We just have to go through an explicitly declared variable, which costs a lot of extra bytes.

The goal of tacit programming here, then, is to express the function that maps \(n \mapsto n^3 + n\) without giving \(n\) a name: to directly describe this composition of exponentiation, addition, and the constant 3 without identifying the parameter they apply to. I’ll cover three major approaches to doing this.

Stack-Based Programming Languages

Roughly, stack-based programming languages like Forth and GolfScript work as follows: at all times during the program’s execution, there’s a global “data stack” holding various objects, and when a function is called, it usually pops some number of objects from the stack, performs some computation on them, and then pushes the result back onto the stack. Thus, each function’s arguments are implicitly “the top K elements of the stack when the function is called”, so those operands don’t need to be named. (This is also a very natural way to implement postfix notation, which also allows stack-based languages to avoid the need for grouping delimiters.)

In GolfScript, task 1 takes 5 bytes:

~3?6+

For this problem as described, GolfScript has a huge golf advantage in that it does implicit input and output. Specifically, when a GolfScript program starts, the entirety of standard input is read and put onto the stack, and when it ends, whatever contents remain on the stack are printed. (While convenient for the vast majority of golf challenges, this means that you can’t write interactive programs in GolfScript.) The step-by-step breakdown of this program is:

  • (Implicit input: the contents of stdin are placed on the stack as a string.)
  • ~ evaluates that string, converting it to an integer; call it \(n\).
  • 3 pushes the integer 3 onto the stack, above \(n\).
  • ? performs exponentiation: it pops two integers (for us, 3 and then \(n\)), raises the bottom integer to the power of the top integer, and pushes it back onto the stack.
  • 6 pushes the integer 6 onto the stack, above \(n^3\).
  • + pops two integers (for us, \(n^3\) and 6) and pushes their sum.
  • (Implicit output: the contents of the stack are automatically printed.)

Task 2 also only takes 5 bytes:

~.3?+
  • As before, input is implicitly read and then evaluated with ~.
  • The key new built-in is ., which duplicates the top element of the stack, so now the stack comprises two copies of the input integer.
  • As before, 3 and ? cube the top integer, but they don’t affect the lower number in the stack, so now the stack has \(n^3\) on top of \(n\).
  • As before, + sums the top two integers.
  • Finally, the program ends and prints that sum because it’s still on the stack.

The stack obviates the need to name variables; instead, variables can be “identified” by their position in the stack, and being able to duplicate values on the stack means that inputs to a computation can be used in multiple places. Still, when you need to juggle a lot of inputs or intermediate values in a computation, the stack-based approach can quickly become annoying because manipulating elements that are buried under a few others in the stack is often difficult. Figuring out the optimal order in which to keep your data on the stack is one important part of golfing GolfScript programs. (GolfScript also does let you assign to variables with :; this is usually more verbose than alternatives, but it can be useful.)

Functional Programming Languages

In functional golflangs, functions take arguments and produce return values as is more typical. Tacit data flow is enabled just by providing lots of higher-order functions (functions that take other functions as arguments) that combine functions in the exact way you want.

As a baseline to see how this might work in a non-golf language, let’s look at a tacit solution to Task 2 in Haskell:

The important part is just (^3)>>=(+); everything outside is just formalities for handling input/output. ^3 is exponentiation partially applied14 with a right argument of 3, so (^3) is a function that cubes something; and + is, naturally, addition. Most interestingly, >>= is “bind” of monad fame, which we are using here in the function monad, sometimes also called the reader monad; it’s the crucial higher-order function that lets us feed our input into two subcomputations. Without getting into the weeds too much, this instantiation of >>= takes two functions f and g and produces the function \x -> g x (f x) , which roughly translates as e.g. lambda x: g(x, f(x)) in Python. Here, because g is addition and f is the cubing function we got from partial application, the part we singled out is a function that takes an integer \(n\) and outputs \(n^3 + n\), exactly as desired; and it doesn’t name \(n\) explicitly!

Unfortunately, due to the need for parentheses and the length of the >>= operator, this Haskell solution is slightly longer than it would be if we just wrote a lambda:

Still, it illustrates how a golfing languages might achieve the same effect. In the functional golfing language Husk, the analogue of >>= is just called S (with flipped arguments); and because Husk uses prefix notation for everything, we don’t need parentheses, so the same function actually be written with only 4 bytes:

S+^3

This is also a full Husk program that can be run, but this byte count is not technically directly comparable with our other programs since Husk always receives input from command-line arguments; it has no way to read from stdin.

APL-Like Programming Languages

Finally, let’s look at APL and its descendants. This might be argued to be a variant of the functional golfing language approach, but instead of using explicit combinators to express data flow, APL-style languages rely more on syntax and the arrangement of component functions and operands. The line is blurry, but I think in practice this approach is distinctive and popular enough to deserve its own section.

Here’s a solution in J to Task 2 in 19 bytes:

(+^&3)&.".&.stdin''

J is not, by my definition, a golflang — it is designed to be good for serious mathematical and statistical data analysis. It is, however, (in)famously terse, and has been a popular language for golfing since before GolfScript existed. As in the Haskell example, most of this program deals with input and output; the function \(n \mapsto n^3 + n\) is simply the part in the parentheses, +^&3. That’s also four bytes!

How it works:

  • Of the four characters, & is what J calls a “conjunction” and has the highest precedence. It performs a partial application on ^ (exponentiation) and 3 to produce a function (or “verb”, in J terminology) that takes a single argument and cubes it, much like in the Haskell example.
  • + is also a verb, and performs addition.
  • The interesting thing is when the two verbs + and ^&3 are juxtaposed with no other operands. This forms a syntactic construct called a “hook”. When this hook is then called with a single argument \(n\), it calls the right verb ^&3 with \(n\) as its only argument, and then calls the left verb + with \(n\) as its first argument and the result of calling the right verb as its second argument. Thus, this hook implements the formula in Task 2 exactly.

Interestingly, this J solution is actually one byte shorter than the closest J solution to Task 1! That solution builds a “fork15 from these three components: the noun 6, the verb +, and the same partially-applied verb ^&3 as before; I won’t break it down here, but the idea is very similar.

(6+^&3)&.".&.stdin''

In J, most verbs are overloaded so that you can call them with either one argument (“monadically”) or two arguments (“dyadically”), and they’ll perform different, but usually related, behaviors. For example, we’ve seen that dyadic ^ performs exponentiation; monadic ^ computes the exponential function, raising \(e\) to the power of the sole argument. How many arguments a verb receives depends on the syntax surrounding its call. You can combine verbs into hooks and forks, and even longer “trains” thereof, just by juxtaposing them.

To see this approach taken even further, let’s look at Jelly, probably the most established APL-style golflang. The Jelly program that directly competes with our Husk example, i.e. that defines a function without worrying about input/output, is only 3 bytes:

*3+

(In Jelly, * is exponentiation; × is multiplication. You can’t tell from this example, but Jelly relies heavily on a completely custom code page.) How it works:

  • Because we’re calling this with a single argument \(n\), we evaluate the three characters as a “monadic chain” of three “atoms” on \(n\), and track an accumulator \(v := n\).
  • Looking at the first two atoms in the chain: * is “dyadic” and 3 is “niladic”, which matches the + 1 pattern, so we update \(v := v^3\).
  • There’s only one atom left, which matches the + pattern, so we update \(v := v + n\).
  • Now we’re done and have calculated \(v = n^3 + n\). Hooray!

One important difference between Jelly and APL/J is that Jelly does not overload its built-ins by number of arguments, or arity. *, for example, is always dyadic, and when you invoke (the analogue of) a function you previously defined, the arity is specified in the invocation itself rather than the surrounding syntax. This allows the syntax rules to take into account the arities of each atom when deciding how they should compose.

Also, unlike Husk, Jelly does have standard input facilities, and a single extra byte is enough to make it read from standard input and output to standard output:

Ɠ*3+

The point is that in these languages, the complicated ways in which data flows through functions are shunted into the language syntax (J’s hooks and forks, Jelly’s rules for how sequenced nilads/monads/dyads are evaluated) rather than into built-ins you explicitly invoke.

However, one key assumption that allows these syntactic rules to be so expressive that in these APL-likes, all functions take at most two parameters. You literally can’t define a “function that takes three or more parameters” in these languages. (Of course, it’s always possible to implement “equivalent” versions that have only one or two parameters by saying that you have to bundle up a bunch of arguments into a list and pass that as a single argument.) I thought this was a critical limitation at first, but after mulling it over more I found that operations that “naturally” take more than two parameters, that can’t be decomposed into reasonable two-parameter functions, are surprisingly rare. Simple examples include:

  • String substitution: in string X, replace Y with Z.
  • String padding: pad a string X to width at least Y with character Z.
  • Dictionary update: given a dictionary D, set D[X] = Y.

And… that’s actually all I was able think of. (There are a lot of very complicated functions that might make the most sense with a lot of arguments, like “make an HTTP request”, but they’re rare in code golf challenges.)

Furthermore, I’m being a bit loose with terminology here: the two-parameter restriction concerns only non-higher-order functions, what J calls “verbs” and Jelly calls “links”. All of these APL descendants have syntax for modifying a function into a new function that can then be called with up to two parameters. Although a functional programmer might want to describe such a “function modifier” as a three-parameter function, it is expressible in APLs; it’s just that “functions” are fundamentally different from these “function modifiers” in these languages’ semantics. (APL calls them “operators”; J calls these “adverbs” and “conjunctions”; Jelly calls them “quicks”.16)

Summary of approaches to tacit programming

  • In stack-based programming languages, nearly all data is stored and manipulated in a global data stack, so data can be implicitly identified by its location in the stack. These languages are most effective at manipulating data on top of the stack, and can theoretically work with arbitrarily many data items as long as there’s some built-in to access lower parts of the stack, but may require substantial additional code for “stack juggling” if the data flow graph has a lot of parallel pieces.
  • In functional programming languages, programmers use explicit combinators to specify how data flows in and out of built-in functions. This can versatilely handle a lot of data flow graphs, but may require the language to provide many different explicit combinators.
  • In APL-style programming languages, programmers arrange built-in functions into syntactic constructs specified by the language in order to control how data flows through them. In order to make these languages’ syntax expressive, they tend to distinguish functions and data more rigidly and restrict how many parameters a function can have.

This is not an exclusive or exhaustive trichotomy; many languages blend ideas from these approaches or are hard to classify. For example, Jelly has what might be described as a stack-based language atop its APL-style links that I haven’t discussed. Pyth doesn’t provide tacit programming exactly as I’ve described it, but is still an effective golflang because it has a lot of magic variables that are automatically initialized or implicitly named/manipulated by various control flow constructs.

A short survey of golflangs

Having discussed some general principles about how golflangs typically work, let’s look more closely at specific golflangs and how they evolved, both to help orient readers if they encounter some of the popular ones “in the wild”, and simply to appreciate the variety of ideas and approaches people have taken.

As we’ve already encountered, GolfScript (Darren Smith, 2007), a stack-based golflang, is probably the first language designed primarily for code golfing. I think GolfScript has held up remarkably well as a introductory code golf language, though it’s clear that it was not optimized as relentlessly for golf as its successors: for example, it supports arbitrarily long alphanumeric identifiers, and it gives a bunch of its built-ins names that are as long as five letters: print, while, and until. Later programming languages have pulled fairly intense tricks to optimize the byte counts of their programs, but those tricks have also made them that much harder to learn and use.

We’ve already discussed how stack-based programming languages work abstractly and touched on GolfScript’s built-ins, so the main other thing I think is worth touching on is how GolfScript supports higher-order functions by having “blocks” of code be first-class objects that can be pushed on the stack in its data model. When code is written in curly brackets, which can nest, it gets pushed onto the stack as a block like everything else; then, when a built-in pops some data items from the stack and sees that one of them is a block, it typically performs some higher-order function. For example, in our GolfScript factorial solution, almost no computation technically happens until the final character:

~]{,1\{)*}/n}/

When the code {,1\{)*}/n} is executed, all it does is push a code block onto the stack. The ending / is what pops that code block and runs it, once per input.

GolfScript spawned several descendant golflangs that are also stack-based. I think the most immediate one that’s somewhat notable is FlogScript (Zzo38, 2008), although maybe it’s not actually that similar?

[FlogScript is] similar to GolfScript but it is different. No documentation yet, read the examples and source code to try to figure it out.

It would take a few years after GolfScript for golflang design to really take off as a hobby, but take off it did. gs2 (nooodl/maurisvh, 2013?) and CJam (aditsu, 2014) are descendants of GolfScript that stay closer to it; Pyth (isaacg, 2014) is based on Python and uses prefix notation. CGCC comparisons of the latter two and GolfScript from their era suggest that it’s complicated but Pyth has the advantage, while gs2 holds the top rank on Anarchy Golf to this day.

I believe the next two most notable golflangs after those two are 05AB1E (Adnan, 2015) and Jelly (DennisMitchell, 2015). 05AB1E is not directly related to GolfScript, but it’s also stack-based and has 400+ built-ins; also, instead of having functions exist on the stack, it supports higher-order functions through special syntax where you write code inside certain delimiters, which does tend to be more concise at the cost of complicating the syntax relative to the built-ins. Jelly, as we’ve already seen, is inspired by J/APL. We’ve briefly looked at how atoms compose chains and are consumed based on specific rules concerning their arities; atop that, Jelly provides one roughly stack-based layer of grouping/manipulation for building a chain of chains, called a link, followed by ways for links to refer to other links.

But then, there is a really long tail of other golflangs that are varying degrees of less well-known, the vast majority of which arose from the CGCC community. (As far as I’m aware, nobody other than me has programmed in the golflang I wrote, Paradoc. This is unremarkable because I myself tend to forget about it eleven months out of twelve.) One well-researched CGCC meta answer found and computed Elo ratings for a few hundred languages; a small handful of these, selected subjectively by myself based on a combination of diversity and popularity:

  • Brachylog, a declarative logic programming language, is based on Prolog.
  • Husk, which we’ve already seen, is based on Haskell.
  • MATL is based on MATLAB. It’s mainly stack-based but has some vaguely APL-style “meta-functions”.
  • Seriously, also referred to as Actually as of v2.0, is also stack-based and has the gimmick that “all strings are valid programs”.
  • Vyxal is also primarily stack-based, but also has more familiar features like variables, functions, and control flow, and keeps them easily accessible.

(The worst golfing language on the list, which amusingly also has the longest name, is Shakespeare Programming Language.)

Finally, a language that’s so new that it doesn’t make that particular list, but that I’m compelled to mention due to its philosophy: Nibbles (2021), by the creator of GolfScript, is a golflang that resists the urge to just keep adding built-in functions while still trying to be competitive (pretty effectively). The flagship feature is a “literate form” that can be compiled into a “binary form” where most characters correspond to single nibbles (half bytes). I think Nibbles is best characterized as a functional golflang, but it has a lot of really interesting choices; for example, built-ins have fixed arities that are used to determine the “parse tree”, but after parsing, expressions can be treated as functions of any arity depending on the context.

Further reading

There is a lot of overlap between code golf and the “esoteric programming languages”, or esolangs, community; Hillel Wayne has a nice introduction to esolangs that also spends some time on code golf and GolfScript, via which I found an informative interview with Martin Ender.

Another adjacent scene I don’t know much about is JS1k, a more open-ended competition where the goal is just “make something cool in 1KB of JavaScript”, with an optional yearly theme. It stopped running as of 2020 but I think js1024 is the successor.

For those of you who want to go further down the golflang rabbit hole, there are many more useful answers to a CGCC question about creating/maintaining a golflang.

Happy golfing!

Thanks to the many members of the code.golf Discord and CGCC chat room who reviewed and gave suggestions, especially xnor and JoKing for particularly consequential comments and Lydxn for the −3 bytes on the flagship example.

Appendix: GolfScript Factorial breakdown

~       # Evaluate the input to get a bunch of integers.
]       # Collect them into a list.
{       # For each integer in that list:
  ,       # Make the range of integers from 0 to that number exclusive.
  1       # Push a 1, which will be an accumulator.
  \       # Move it beneath the list on the "stack".
  {       # For each integer in the range:
    )       # Increase it by 1.
    *       # Multiply the accumulator by it.
  }       # The brackets just delimit a code block...
  /       # the / is the actual "for each".
          # After that loop, we've computed the factorial.
  n       # Push a newline after each factorial.
}       # The brackets just delimit a code block;
/       # the / is the actual "for each".

  1. I’m considering code golf as a pursuit for its own sake, rather than something a programmer might (but shouldn’t) do in code they’re writing for other purposes. Sure, shorter code is simpler and easier to maintain up to a point just because there’s less of it, but good code will generally have descriptive variable names, comments, and some amount of common sense in deciding how to do things. Code golf is all about consciously ignoring all those other concerns. Variable names are basically never more than one character. Comments don’t exist, at least in the “final product”. And more often than not, well-golfed code will bend over backwards to do things in seemingly obtuse ways because the obtuse way happens to be one byte shorter.

    I’m not really sure anybody needs to read this disclaimer, but Googling “code golf” shows at least some articles warning about trying to write unnecessarily short code, so I thought a footnote on it might make sense.

  2. Sometimes code golf is described with the “cute” analogy: whereas in golf you want to use as few strokes as possible, in code golf you want to use as few keystrokes as possible. While cute, this is not very accurate; program length is usually measured in bytes and occasionally in number of Unicode characters, where lowercase letters, uppercase letters, digits, symbols, and sometimes even non-ASCII characters in your code page of choice all count equally, even though some might require you to hit more modifier keys than others.

  3. Though, when I think too much about code golf, my interpretation of “no golfing at all” tends to become a bit warped, so please pardon any ways in which this is still unnecessarily short. I had to reread this post several times to remember that Pythonistas don’t normally use 1 as the condition in an infinite loop.

  4. I say “arcane” from the perspective of a non-golfer, but all the tricks I’m using here are very well-known in code golf communities; you can find them in any of CGCC’s evergreen Tips for golfing in Python question, Anarchy Golf’s accompanying Python code golfing tips, and code.golf’s Python wiki.

  5. “Coerce” is not technically the right verb; in Python, “Booleans are a subtype of integers” and “behave like the integers 0 and 1” in numeric contexts.

  6. Thanks to Lydxn for pointing this out. I CTRL-F’ed “factorial” on the math module’s documentation and assumed that this would locate every built-in that could be used to compute the factorial, but the correct strategy was to CTRL-F “!”. Or just read it top to bottom.

  7. Thanks to xnor for the pointer.

  8. Thanks to emanresu A for the pointer.

  9. It and several other sites reference codegolf.com, which I think probably began in mid-2006; but judging by the Wayback Machine, that site died in early to mid-2014.

  10. Formerly known as Programming Puzzles and Code Golf (PPCG); the new name was chosen during a site redesign.

  11. This is implied by a 2021 StackOverflow blog post and the blog post immediately after this footnote. There are questions that seem to be asked earlier, e.g. the earliest question I was able to get from the search engine dates to 2008, but closer inspection reveals that it was migrated from StackOverflow proper.

  12. Well, subject to standard loopholes — no inventing or specifying through a “side channel” a post-hoc language where an incredibly short “program” simply happens to do exactly what the challenge asks for. This is somewhat subjective, but in practice it’s pretty obvious when somebody is trying to do one of these things.

  13. Some packers were developed before code.golf for posting JavaScript demos to certain sites with character limits, like Twitter and dwitter.net. Obfusc-a-tweet is a tool first developed in 2014 that packs 190 characters of JavaScript into a 140-character tweet; since then, it’s been expanded to take advantage of other Twitter features. Thanks to Irratix for the pointer.

  14. Partial application is just when you provide some, but not all, arguments to a function in order to produce a new function that can be called with the remaining arguments.

  15. This page does not actually cover an additional fact we use, which is that the left verb in a fork can be replaced with a noun, which is treated as the constant function that always returns that noun. See e.g. documentation on trains for that specification.

  16. One more subtlety: APL supports user-defined operators and J supports user-defined adverbs/conjunctions, but Jelly doesn’t support user-defined quicks, at least for now. (Thanks to Unrelated String for the explanation.)

(note: the commenting setup here is experimental and I may not check my comments often; if you want to tell me something instead of the world, email me!)