2021 MIT Mystery Hunt

Note: if you are viewing this shortly after it’s published and somehow don’t want to be spoiled on Mystery Hunt, make sure this spoiler formatting shows up: this text should be spoilered; if it doesn’t, try shift-refreshing. (There are Correct Ways to fix this, but I’m too lazy to do them. Sorry.)

Well, we did it.

✈✈✈ Galactic Trendsetters ✈✈✈ ran an MIT Mystery Hunt.

Somehow, until I tried to write the 2020 year-end post, it slipped my mind that I’d probably want to write a post on all this. In fact, there was one major project that I started in 2019 but didn’t mention in my 2019 year-end post because it wasn’t ready to be announced at the time, and I almost forgot that I had never mentioned it. But this post is a pretty good place to announce it.

Planning Mystery Hunt is a massive year-long endeavor. I didn’t have any leadership or otherwise high-responsibility roles, which made sense because I was busy writing a master’s thesis for the first four months of Mystery Hunt planning. Because of that, and because I know many, many other people on our team have written and will be writing blog posts (Rahul’s post, Nathan’s post, CJ’s post, maybe more to come?), I will focus on the things I did specifically. (So there is minimal discussion of the theme, overall organization, or big decisions like our COVID-19 response; I think the linked posts cover these well already.)

Puzzle-Writing Software

One part I did largely have “ownership” of, and that I sank a lot of time into, was maintaining our software for writing puzzles — the website where authors submitted puzzle ideas and drafts, testsolvers tested puzzles, and editors tracked and discussed the statuses of all the puzzles. This role was largely a continuation of me owning the same component for Galactic Puzzle Hunt since 2018, which itself grew out of the comparative advantage of having worked on it a little when writing with Random Fish for the 2015 MIT Mystery Hunt. There is some life lesson about specialization or pigeonholing to be learned here. But, to start at the beginning:

Puzzletron is a piece of PHP software used for organizing puzzle writing for puzzlehunts. The first commit on GitHub says it was imported from Metaphysical Plant in 2011; it’s likely older. There are active commits each year until January 2018, and AMA responses from Setec (2019) and Left Out (2020) mentioning it, so I believe most if not all Mystery Hunt writing teams used and improved Puzzletron and passed it down over those years.

Puzzletron was, in particular, one of the few things I touched in my minor participation in the writing and running of the 2015 MIT Mystery Hunt. I liked tweaking the colors and styles. I did not like working with PHP. Certainly back then I had a very negative impression of the language, enough so to pothole the “PHP” to eevee’s famous critique on its design, but my (extremely secondhand) impression today is that modern PHP has improved a lot and sports some pretty good features and frameworks.

Unfortunately, Puzzletron is very, very much not modern PHP. It only ran on some 5.x version, which was still full of language landmines and is now years after its end of life; one particular incident I remembered was that one of my commits broke Puzzletron by proxy because a slightly older version of the PHP parser didn’t understand the concept of indexing into an expression. All the SQL queries were strung together with sprintf and mysql_real_escape_string.1 There were dozens of magic numbers, gobs of copy-pasted code, and a wonderful permissions system consisting of 12 non-mutually-exclusive roles that each granted you some subset of 11 distinct permissions according to this handy table.

One particularly endearing button had a bold red warning above it saying, “Please don’t press this button”. I commented out the button.

diff --git a/postprod.php b/postprod.php
index fd1d564..02353ae 100644
--- a/postprod.php
+++ b/postprod.php
$puzzles = getPuzzlesInPostprodAndLater($uid);
displayQueue($uid,$puzzles, "notes finallinks", FALSE);
?>
+There used to be a button here, but I removed it.
+<!--
<hr>
<br>
<div class="warning">Warning: Please don't press this button. If you were supposed to press this button, you would know.</div>
@@ -23,6 +25,7 @@ displayQueue($uid,$puzzles, "notes finallinks", FALSE);
<input type="submit" name="postprodAll" value="Re-postprod ALL puzzles (THIS CANNOT BE UNDONE) [This will take a LONG TIME!]">
</form>
<br>
+-->
<?php
// End the HTML
foot();

Cheap technical shots aside, even in everyday usage, it was incredibly easy to misclick a link and accidentally spoil yourself on any puzzle; communication within testsolving groups and between them and puzzle authors was unnecessarily limited; and, despite figuring out a mostly reasonable subset of permissions to use, we still found that the permission system and a few other artificial restrictions in the code got in the way more than they helped.

Now, I want to be clear that I don’t think this reflects negatively on any of Puzzletron’s past developers in any way and that I am still deeply grateful to all of them for passing down such a tool. The maximalist feature set is to be expected of a tool that’s been passed down through so many different teams, each of which would have used it for exactly one iteration of the full hunt-writing cycle. Teams would naturally add features they wanted, but sensibly avoid the low-reward, high-risk endeavor of removing or refactoring code from a codebase they’d only be using for one year; plus, you’d never know if a future team might want a feature you’re removing. And despite all its flaws, Puzzletron still worked a lot better for organizing GPH 2018 and 2019 than the ad-hoc Google Sheets–based approach we used in 2017; and without having used it and learned what worked and what didn’t, I doubt we could have designed a system anywhere near as good as what we ended up with.

But design such a system we did. As the person on the Galactic Puzzle Hunt writing team who was the most familiar with Puzzletron, I decided in summer 2019 that, at least for organizing the writing of GPH 2020, a ground-up redesign and rewrite would serve us better than continuing to use Puzzletron, and that it was within my ability. (It certainly didn’t help that NearlyFreeSpeech, where I had hosted Puzzletron the previous year, no longer offered any PHP version that was compatible with it, and for good reason!) So that’s what I did.

It was a very nice side project: precisely scoped, immediately practical, and meant for a user base I was firmly embedded in. I didn’t go full 6.813 on this project, with paper prototypes and live user testing and such, but I sent out a feedback form in mid-July and a few people filled it out. By early August, I had sent out a six-page design doc for review, settled on the name of Puzzlord,2 and started on the implementation.

In a few days:

For full contrast, here’s what it looked like just now:

From a technical standpoint, Puzzlord is not really that interesting at all. There were no scaling concerns or unusual UI demands, so we just used Django and artisanal handcrafted HTML templates. But the language and framework were things I and others on ✈✈✈ GT ✈✈✈ were more comfortable in than PHP, and it also provided an opportunity for me to redesign a lot of pain points from Puzzletron, some of which I already mentioned:

• Improved spoiler safety: Puzzlord always shows you an interstitial and makes you click a button before spoiling you on any puzzle.
• Testsolve sessions, a concept that Puzzletron just didn’t have: you could only testsolve a puzzle as an individual, and if you wanted to testsolve with your friends you’d each individually find the puzzle and click the testsolve button and communicate out-of-band.

Instead, in Puzzlord, you can create a testsolving session, representing one attempt by some user or group of users to testsolve a specific puzzle, and leave comments on it for both the other testsolvers in that session and for anybody spoiled on the puzzle. You could drop a link to your solving spreadsheet both for the other testsolvers to see and for the puzzle authors to review. This abstraction is pretty obvious in hindsight, but it’s definitely one of those things where we benefited from having Puzzletron as a negative example.

• Drastically simpler permissions: in place of the 132-entry matrix, we had three roles — you could be a normal user, an editor, or a superuser — and one custom permission, the ability to view and assign answers. (We get the logic allowing only superusers to access the Django admin interface for free from Django.)
• Expanded space of puzzle statuses: One simple disagreement we had with Puzzletron when writing GPH was that we generally preferred factchecking to come after postprodding,3 because postprodding could introduce mistakes that we also wanted a chance to catch. That was very easy to fix when re-listing the statuses. A more subtle issue was that it sometimes wasn’t clear who was responsible for taking the next step on a puzzle, so a puzzle would sometimes languish indefinitely in a state where two people were deadlocked, each waiting for the other to act. People also brought up examples of other issue trackers that had much clearer conceptions of blockers. I tried to solve this just by splitting up some of the existing statuses so that each status would, at least in theory, unambiguously indicate who the puzzle was blocked on. It’s not yet clear if this was the right approach, but I haven’t thought of anything clearly better either.
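To make the last two bullets concrete, here is a minimal sketch in plain Python of what "three roles plus one permission" and "statuses that say who is blocking" might look like. This is not Puzzlord's actual code: every name below (`Role`, `Member`, `can_manage_answers`, the status names, and the helper functions) is hypothetical, and the idea that superusers also have editor powers is my assumption for the sake of illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    """The three roles: normal user, editor, superuser."""
    USER = "user"
    EDITOR = "editor"
    SUPERUSER = "superuser"


@dataclass
class Member:
    name: str
    role: Role = Role.USER
    # The one custom permission: viewing and assigning answers.
    can_manage_answers: bool = False


def has_editor_powers(member: Member) -> bool:
    # Assumption for this sketch: a superuser can do anything an editor can.
    return member.role in (Role.EDITOR, Role.SUPERUSER)


# Hypothetical statuses in workflow order, each paired with the single party
# the puzzle is blocked on. Factchecking deliberately comes after postprod.
STATUSES = [
    ("awaiting_editor_feedback", "editors"),
    ("awaiting_author_revision", "authors"),
    ("testsolving", "testsolvers"),
    ("awaiting_postprod", "postprodders"),
    ("awaiting_factcheck", "factcheckers"),
    ("done", None),
]

BLOCKED_ON = dict(STATUSES)
ORDER = [name for name, _ in STATUSES]


def blocked_on(status: str) -> Optional[str]:
    """Return who needs to act next, or None if nobody does."""
    return BLOCKED_ON[status]


def comes_before(first: str, second: str) -> bool:
    """True if `first` is earlier in the workflow than `second`."""
    return ORDER.index(first) < ORDER.index(second)
```

The point of pairing every status with exactly one party is that "who should move this puzzle forward?" becomes a lookup rather than a conversation, which is the deadlock the split statuses were meant to avoid.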

In late September, we deployed Puzzlord for GPH 2020 and started submitting puzzle ideas to it.

Then, of course, we won Mystery Hunt.

Although that threw a wrench in our plans, it was a slam-dunk to start a new instance of Puzzlord to organize Mystery Hunt writing. I kept developing the tool over the course of this hunt, though much less intensely, and several other people contributed major features like the postprodding code. There are still dozens of issues we’d like to fix that have been queued up from last year, and its feature set isn’t a strict superset of Puzzletron’s, but I think that overall, we definitely had a better time writing a hunt with it than we would have with Puzzletron. And just like Puzzletron, Puzzlord is open-source on GitHub! I hope it will be as useful to the puzzle community as Puzzletron was to us in earlier years.

Puzzles

Maybe this is what you’re actually here for.

I have author status on 13 (or maybe 20,639?) puzzles that went into the hunt, which break down as:

I also made a handful of the Green Building navigation puzzles, including the invisible maze that I’m sure everybody, including me, hated (I just hope it was the enjoyable kind of hate).

I’m not sure if this will sound strange with these numbers, but I think my puzzle output wasn’t as high as I hoped or expected. My puzzle ideas text file was and is still very long, but I ran out of the really good ideas in it pretty quickly. At least one idea I’ve had for a while didn’t pan out. Admittedly, I mostly dropped all my puzzle ideas heavy on math, programming, or video games, because the quota for those types was (predictably) quickly filled by other members of our team.

To the extent there is a “magnum opus” I wrote this hunt, it would have to be How to Run a Puzzlehunt. You can read my long author’s notes on that one, but the short version of it is that I led the creation of an entire new puzzlehunt, the DP Puzzle Hunt, to publicize a resource used by that puzzle. (Of course, this reason was in addition to all the goals and philosophy we discussed in that puzzlehunt’s wrap-up, which I fully stand by; I just couldn’t mention this additional reason then, naturally.) Somehow, this makes DP Puzzle Hunt’s second-round meta my first published metapuzzle, after something like a decade of puzzlehunting and six years of puzzlewriting. I don’t know to what extent DPPH actually helped with accomplishing this ulterior objective, but I was happy to have an excuse to write a puzzlehunt and experience all the moving parts on a much smaller scale.

The other big “puzzle” I wrote was Unchained (link to #3), the first Infinite Corridor puzzle or category thereof. As the author notes mention, I had held this puzzle idea for at least three years but hadn’t found an implementation that I was satisfied with. The Infinite meta masked some of its issues quite well. Based on the general feedback I’ve read I think the other puzzles in that round all outshone it, but I’m happy to have provided a traditional puzzle that grounded the set of five by being self-contained, non-interactive, and non-self-referential. Also, listening to a lot of (extremely broadly defined) popular music is one of the more enjoyable forms of data gathering I’ve gotten to do in puzzle-writing research, and it gave me a lot of nostalgia and a few new discoveries to cherish.

One thing I’m less sure about was the puzzle’s difficulty, particularly given its unlock position. Since I had held the idea (and lived in fear of it being sniped by other puzzlehunts) for so long, I probably intuitively underestimated its difficulty; feedback from puzzle editors and from testsolves all suggested the one realization (from my perspective) was not at all easy. Considering it as a hunt puzzle in the abstract, I was okay with that and chose to stick with a deliberately obtuse title to preserve the sanctity of the aha! step. But after it was slotted in as the first unlock in one of the two earliest main rounds, it’s possible that a stronger clue in the title would have improved it.4

And also, it had so many errata! This is impossible to discuss without spoilers, so out of an abundance of caution I’ll hide it, although it’s just two paragraphs:

On the bright side, I probably hold the world record for “most puzzles in a puzzlehunt affected by factual errata”,5 so I got that going for me.

I solo-wrote quite a few more puzzles that I think weren’t particularly inspired. On the other hand, I do think it’s cool that Mystery Hunt finally forced me to learn how to write an uninspired puzzle, the kind based on one mildly interesting idea or data set and little else. In approximately decreasing order of how much I’d recommend them:

• 15×15 is interesting and silly, and hopefully short if you have crossword experience. One piece of feedback called it “what we wish crossword-solving could be like”, which is pretty much exactly what I was going for. (We also got some “bug reports” through the feedback part of the site. I was half-expecting bug reports, but not through that particular site feature, so I didn’t notice them there until much later and wasn’t able to reply nonchalantly, “The puzzle is correct as written.” I hope nobody was stuck waiting for a response that never came.)
• Recursion is a particularly minimal puzzle, one aha! and nothing else.
• Things has provoked some funny reactions from testsolvers.
• Countries is also pretty simple. However, I got some really positive feedback from Randall Munroe (wow!), so here it goes.
• Illiterate Programming is an idea I’ve also kept around for a while. I think it’s a cool idea, but unfortunately there is a sort-of erratum (SERVICE is a keyword in many versions of COBOL, including the one in the first Google result, although I think its status is version-dependent enough, in a way that the actual keyword’s wasn’t, that I wouldn’t call it a full erratum).
• I think in terms of the aha!, Fish Hybridization is the least interesting puzzle on this list, but it’s built on a solid foundation of terrible puns all the way down.

The only puzzle I feel like “enough of a coauthor” on to fit on this list is Lime Sand Season, one of the very last puzzles to be written. I wrote the first draft with Anderson in about three hours one night and got it testsolved. Although the core stayed the same, we actually put a lot of work into revising the variety and distribution of poets and poems in ways that are hard to quantify. That puzzle would probably go near the top of this list in recommendation.

I did contribute one somewhat notable subpuzzle to Ignorance, namely Sum and Product. I didn’t expect as much positive reception for that subpuzzle as I got, because I generated the puzzle by blindly throwing things at the wall (a metaphor for a Python script) until something stuck, and because it turns out that a Sum and Product already exists. (We discussed omitting the large constant, since anybody who could concretely demonstrate an erratum in the puzzle without the constant would also necessarily have found a major mathematical counterexample, so the accuracy of a hunt puzzle would probably not be very high on their list of concerns. But since the puzzle relied on a slight strengthening of the most common version of the conjecture instead of the vanilla version, and the big number was funny, we decided to keep it in.) Anyway, the whole puzzle is amazing too.

I testsolved a lot of puzzles. I think my favorite regular puzzle (speaking of large constants) is definitely 1000000000000000000000000000000000000000000000000000000000000000000001000000000000116. It’s a puzzle idea that seems obvious in hindsight, centered on an inimitable realization that made me go “oh, that’s how it works… wait, you want us to solve what?”, executed brilliantly through the final step. So You Think You Can Count? (alas, not currently playable) is an incredible “teamwork time” puzzle, and the Tunnels meta is really cool (although I testsolved a somewhat harder version). Other highlights I testsolved significantly include:

• Nutraumatic: Interactive puzzles are great, ’nuff said.
• Altered Beasts: I wasn’t expecting to be impressed by this puzzle — the genre of “mangled clues” it uses is a pretty stock puzzle type that I think can get dull after a few — but it took this concept to a transcendent level.
• Le Chiffre Indéchiffrable: I didn’t rate this puzzle very highly when I testsolved it, but offhandedly made a certain suggestion that the author ran with and turned the puzzle into possibly the single puzzle in the entire hunt that made me think “why didn’t I write that?” the most strongly. You can read the author notes.

I tested Circular Reasoning and PClueRS in much larger groups, so I don’t personally remember them as sharply, but the feedback for them I’ve heard has been universally positive, so I’ll mention them too. Some other noteworthy puzzles:

• The IMO Shortlist: The topic, presentation format, and difficulty are all what you’d expect from the title.
• Super Mystery World: I bought Super Mario Maker 2 to testsolve this puzzle, spent about four hours in a single sitting just to beat all the levels at least once, and then spent four more hours actually solving it, both with quite a few other people watching/co-solving. That was quite an experience. I would probably feel like it’s the Domino Maze of our hunt if I had done it during the actual hunt, although I think the puzzle may have been made a bit easier since my testsolve.
• Bake Off: Bizarrely, this is the first time I baked this kind of object! I was too lazy to make icing, but I scrounged up some food coloring from the recesses of my apartment and smeared it everywhere.

Plot

I acted in the Mystery Hunt! I had a minor named role as “Max Matsuoka”, who had little backstory or plot significance, but got to answer the first, highly-upvoted question in the kickoff Q&A. He also got to confusedly ask “What is going on?” in several iterations of another early skit, which is very on-brand for me. Finally, Max holds the distinction of being the only named character whose first name was shared with a real person active in ✈✈✈ Galactic Trendsetters ✈✈✈; that same person also helped with skits, although not as an actor. In hindsight, this may not have been the best idea.

I also understudied as the deadpan snarker with a yellow wardrobe, Skylar Holstein. I am not sure I knew what the word “understudy” meant before all this. For context, the last time I had anything resembling an acting role was during my Dropbox internship in the summer of 2016, when I portrayed a banker in the Hack Week musical. I think acting is fun, but it’s the kind of fun where if I made a list of all the fun things I want to do in life, it would probably be around twentieth place. I learned a lot about acting techniques, pronunciation, and clothing, for example the dozens of features of the dress shirt. And I got to try not to break character in front of a few teams I knew. 10/10 would do again.

Honestly, it still boggles my mind how the story team came up with such a good plot under such severe constraints, including the fact that everything had to happen over video call, but also just the general structure of the rounds and metapuzzles, which were already well underway before we realized what had to happen. As only a minor character and understudy, I don’t actually show up on most of the recorded interactions, but (or maybe because of that) I think you should watch them anyway.

Everything else

I did small tasks in many other parts of the hunt in addition to all the other things I described above, with increasing frequency as it loomed closer. I drew a few assets for the projection device, like the ladder, the connection icons, and the Lobby 10 bench; I also fixed a few z-ordering bugs. I edited random diagrams and strung together random words from physics papers for the opening slides.

And then… hunt happened.

God, it’s a blur. Running skits and interactions, answering hints and contact HQ requests, refreshing the leaderboard, watching answer submissions come in, examining the Unchained source material really carefully (perhaps an unnecessarily spoiler-avoidant way to say listening to the songs to see what their lyrics were) to verify errata, wandering around the Projection Device and emoting at people I recognized, the very occasional check-in with teams. I also got to watch a lot of logistical and technical fires occur and get put out, but there wasn’t much I could help with. (The nice thing about owning Puzzlord, as opposed to nearly any other component of the hunt, is that it becomes almost entirely inconsequential once hunt starts. There were never any problems I had to respond to quickly.)

I have to say I didn’t really expect the feeling of… sheer presence, from seeing other people in the projection device. It was obviously going to be a thing in hindsight, but it’s not a feeling you get from wandering the projection device alone while debugging or testing puzzles ahead of the hunt, not even if you’re wandering with other ✈✈✈ GT ✈✈✈ members, because most of the time you already know they would be online. But during the real hunt, once people had started unlocking the projection device, ⟂IW came alive in a way I was not emotionally prepared for. I recognized usernames of actual people, people who I knew were exploring the world for the first time, and knew that they would recognize me. It seems likely I bumped into more people I knew than I would have on a normal on-campus Mystery Hunt. Certainly, I virtually met a lot of remote solvers who I couldn’t have met on MIT’s campus, and I assume they all met way more people than they would normally. And that could easily have been me years ago. I may have been too busy during the hunt for all this to fully sink in, but I’m finally starting to realize the extraordinary advantage of running our particular hunt during what would have been interminable lockdown for many solvers.

I don’t have that many screenshots, so mostly I will just cheat and link to CJ’s screenshots (make sure to click left/right on the galleries). Among the screenshots I do have and want to publish here is a powerful spreadsheet:

The second is the screenshot I took of the post-hunt period when we told everybody to enter the projection device:

Every time I look at this picture, I recognize additional names or usernames that I hadn’t seen before.

Parting Thoughts

That feeling of having worked for months, then standing back and just beholding everything we’ve done, is incredible, and I will miss it. The positive feedback we’ve gotten has been overwhelming. But at least I get the prize of being able to return as a solver to Mystery Hunt next year.

Huge thank-yous to every other member of ✈✈✈ Galactic Trendsetters ✈✈✈. There is no place for me to stop listing roles and names where I can be happy, but I have to try anyway. Thanks to:

• The story director, Lillian, for absolutely knocking it out of the park with the plot and scripts; and Theo, Ian, Kat, Josh, Connor, Amanda, and everybody else on the cast and crew for their dedication, particularly Phillip and (the real) Max for their masterful Zoom management. I had never realized a puzzlehunt story could have such nuanced characters, nor that it could make me laugh and tear up as I did.
• Ben, Nathan, Herman, Kat, Damien, Evan, and everybody else who worked on the MMO, plus DD, Cami, Andy, Lennart, and everybody else who worked on art. You all poured your heart and soul into the place to allow it to come to life the way it did during Hunt, and it shows. I Have Truly Found ⟂IW.
• The editors in chief, Rahul and Jon; factchecking lead Danny, copy-editing lead Seth, and testsolving lead Rob; and Anderson, Lewis, and all the other puzzle editors and authors, for making the puzzles happen, and for continuing to push the boundaries of what puzzlehunt puzzles can be without allowing any other aspect to be compromised.
• CJ, Yannick, and Mark for their dedication to the endgame runaround in spite of the Boston weather. I couldn’t have experienced the moment the MMO was unlocked the same way as most teams, as my exposure was spread out through the full ideation and development process, but the first time we ran through a test of the runaround I was floored. I can still see the stones in front of the Student Center and the lights on Mass Ave from those moments, and I was on campus again. I can only imagine what the one-shot full hunt weekend experience must have felt like.
• Justine, Colin, Jingyi, Curtis, and everybody else who designed and contributed to the events and interactions, plus Toomas, Steven, Sam, Joanna, Jenna, Jason, Mitchell, Daniel, Chris, Charles, Amon, Alan, Adam, and countless other people who built and fixed hunt infrastructure, ran those events and interactions, answered hints and questions, and kept the hunt going through heavy team traffic and that unforgiving pandemic. The kinds of human communication and collaboration you enabled are an irreplaceable component of hunt that we all need right now, and the passion, creativity, and humanity you infused it all with is deeply inspiring.
• And, of course, Jakob and Josh for their unwavering leadership through the entire process. You kept the team together through some tough decisions without any good choices, on top of the thousands of moving parts in an exceptionally ambitious Mystery Hunt.

Here’s to Palindrome running an amazing hunt in 2022. See you then!

1. This might have actually been done correctly so as to prevent any SQL injection, but I fixed a bug in 2015 in which you could impersonate any user on nearly any page by specifying their user ID (a small positive integer) in a POST argument, so I do not have high faith that there aren’t other vulnerabilities.

And to be fair, Mystery Hunt isn’t a setting where the security stakes are high. We generally trusted everybody on our team, and there’s little to gain from breaking in for anybody, insider or outsider. You could leak the puzzles and just spoil a lot of people without gaining anything yourself, or you could keep them to yourself to gain a competitive advantage and then… earn the “prize” of having to write the next year’s hunt?

Still, instead of trying to carefully reason through possible attacker motivations, I think it’s easier to at least do basic due diligence, where you use off-the-shelf authentication if possible and keep SQL statements firmly away from the concatenation operator.
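For the SQL half of that, here is roughly what I mean, as a minimal sketch using Python's built-in sqlite3 module rather than Puzzletron's PHP and MySQL. The `users` table and the `find_user_id` helper are made up for illustration; the point is only that the query text and the user-supplied value travel separately, so the value can never be parsed as SQL.

```python
import sqlite3


def find_user_id(conn: sqlite3.Connection, username: str):
    """Look up a user ID with a parameterized query.

    The "?" placeholder is filled in by the database driver, so the
    username is always treated as data, never as part of the statement.
    """
    row = conn.execute(
        "SELECT id FROM users WHERE username = ?",
        (username,),
    ).fetchone()
    return row[0] if row else None


# A throwaway in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL)"
)
conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
conn.commit()

# A classic injection attempt is just an odd-looking username here.
print(find_user_id(conn, "alice"))
print(find_user_id(conn, "alice' OR '1'='1"))
```

With a framework like Django you mostly get this for free, since the ORM builds parameterized queries for you as long as you don't go out of your way to assemble raw SQL strings.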

2. I think the origin of this name is that I tried concatenating a bunch of role-playing game/Race for the Galaxy–esque words to the word “puzzle” and making a shortlist of ones that didn’t have too many Google results. From that list, somebody pointed out that “Puzzlelord” could be contracted to be very similar to Guzzlord, a Pokémon. It was immediately compelling because it was short, descriptive, and mostly free of namespace collisions.

3. This refers to “post-production”, the process of formatting a puzzle so it can be displayed exactly as desired on the final hunt website. Typically this is done by writing HTML, but sometimes it’s much more involved.

4. One category of hintier titles I considered was “Makin’ Bacon Pancakes” or some variant/allusion thereof (switching the five-bit binary out for the Bacon cipher, of course). But I never seriously pursued that title or had it testsolved, so I’m not at all calibrated on whether that title would nudge solvers, and in particular on whether hearing that song (or its… more famous remix) would make you go “oh, there are only two notes!” and lead you to break in.

I believe a much earlier version of the puzzle, from years ago, had the title “Alternative Expressions” and a different obfuscation technique whereby the lyrics would be synonym-substituted to be very verbose. Anyway, that didn’t work out.

5. Maybe “puzzle” needs to be defined with some criterion like needing to have an answer that’s individually accepted by some kind of answer checker, so as to exclude subpuzzles of 10,000-Puzzle Geometric Objects. But also maybe none of those puzzles has had announced errata and it doesn’t matter.

(note: the commenting setup here is experimental and I may not check my comments often; if you want to tell me something instead of the world, email me!)