I’m still doing interpretability at Anthropic. This year, among many
other research updates, we scaled
sparse autoencoders to Claude 3 Sonnet. There’s not a whole lot I
have to add. I like to think I improved at various clichéd abstract work
skills that I don’t have anything insightful to say about, the most
front-of-mind being communication and prioritization. It’s good to, uh,
communicate what everybody is prioritizing and communicate all the
information everybody needs to prioritize things. Thanks for coming to
my TED talk.
In my personal life, I went to more in-person social events than ever
before. I took a bunch of improv classes and might actually end up
performing soon™. I went to a Jacob Collier concert and a Bear Ghost
concert. I scored 141/150 plus two beers on the AMC 12. I attended three
separate furry conventions, one of which I believe I have to credit with
indirectly motivating me to hit the gym semiregularly for the first time
in my life. (Also I went to Tural for
summer vacation.) I’m pretty happy with all that, but it also doesn’t
really add up to exciting reflections.
…it’s still you. Looking at yourself in the window. The second
afternoon after you finally get COVID for the first time.
As previously reported, I left Zoom late last year and spent a bit of
time unemployed, traveling for some of it but mostly staying home. In
the process, I got COVID, though not with a particularly interesting
story. Would not recommend.
Then I started work at Anthropic doing interpretability research —
moving way back into my comfort zone in one way, by returning to my web dev
roots to create many of the visualizations we cared about, and way out
of it in another, by jumping into the deeply theoretical end of research,
in a field where my total experience is one college course and one
casual reading group. Still, I figured some things out and we published
Towards Monosemanticity in early October.
I don’t have much to add to the research results in the paper, though
I can share some trivial, mildly entertaining anecdotes about the
process:
One day I’m going to run out of the energy to find barely adequate
allusions for the titles and thematic music videos for the openings of
these end-of-year posts, and they’ll just be called “2095 in Review” or
whatever. Or maybe I’ll just stop making them. But not today.
Good song. Good animation. Incredibly out of place on its YouTube
channel, in the most inspiring chaotic good way.
I closed out last year by saying that I wanted to accomplish a “big
milestone” this year. I actually had a specific milestone in mind that I
did not actually achieve and will not reveal, but I made good progress
towards it, and a lot of other things happened, enough that I think I’ll
count that as achieved.
The big thing is that I left my job at Zoom to have some time for
myself and family… though not before helping to give feedback on a draft
internet standard, publish a cryptography research paper
(on which I’m the “first author”, strictly due to the vagaries of the
English alphabet), and launch end-to-end
encrypted email. It was a productive year! I feel like I should have
more to say about all this, but it’s hard to think of anything that I
didn’t already write about last
year and also doesn’t require a blockbuster-length list of
prerequisites. However, if you ever want to hear about the difficulties
of actually getting end-to-end encryption into production in
excruciating detail, invite me to a cocktail party with a lot of
whiteboards.
2021 is the first year during which I held a full-time job
continuously. My disposable income and discretionary spending have both
increased the most sharply since, well, ever. It’s weird.
We are still in a pandemic. Greek letters continue to be associated
with uncool things. I got vaccinated and started taking measured (but
still small) risks. Funny story: my first vaccination was a complete
surprise, as my roommate knocked on my door mid-day to inform me that
somebody he knew had extra vaccines to give out — except that the night
before I had a dream about being vaccinated, which was weird enough that
I wrote that dream down, something I do only once every few months. I
also got boosted just a few days ago.
It’s a strange and darkly funny story — Tim Minchin wrote the album
this is from, Apart Together, well before the pandemic hit and social
distancing became the norm, and I assume I am not alone in finding that
the song resonates unusually strongly as a result. By Jove, it resonates.
Nobody needs me to say that it’s been a rough year. People have been
complaining that each of the last few years was terrible, and looking
forward to the next one, and being disappointed — as if years were
coherent bundles of quality, and there was any reason to expect
discontinuities in how things are going to occur around January 1st —
and as if there were additionally any reason to expect such
discontinuities, if they did exist, to be positive ones.
Seriously, do you remember when we thought 2015–2018 were bad?
And yet… I feel like overall, 2020 went quite a bit better than
expectations for me. Which maybe means it’s astronomically better than
the average person’s 2020. I had a long draft for this post that slowly
accumulated words over the year as usual, but a lot of the ramblings I’d
usually include now seem unusually vapid, and a lot of the deeper trends
and experiences I might normally reflect on are things I don’t think
I’ve really gone through or thought about for long enough to achieve
closure on. This is partly due to the pandemic scrambling a lot of plans
and partly because last January, nearly a full year ago, ✈✈✈ Galactic Trendsetters ✈✈✈ won
Mystery Hunt and so we’re writing the 2021 hunt. The ramifications
are still being felt and will accelerate until it actually happens two
weeks from now, and that’s all I’ll say about it here.
Not a very completionist run. I graded myself pretty strictly though
— both sides of every “and” need to count; “all” means literally all;
fuzzy actions and phrases require full psychological commitment to
qualify.
This is a weird song choice — I have not even watched the movie. But
there is a story, and there is a thematic correspondence.
The story is that I was interning remotely at a coworking space over
the summer. One night, I attended a karaoke event hosted there, the kind
where adult human beings socialize and where I didn’t know anybody else,
and I sang this song. Afterwards, another attendee told me that her kid
(yeah, you know, people in my reference class have children) loved Moana
and was really excited about my performance.
The thematic correspondence is less obvious and harder for me to
describe. I’m going much less far this year than I could be, and am
less sure about next year than I expected to be at this point for
reasons I’m not ready to share yet (this seems to be happening more and
more on this blog, but there’s not much I can do about it — so it goes).
But it really is the case that there are some things I can’t deny about
myself, some attractor states that my values and way of thinking keep
dragging me towards.
Frivolous examples: I went through another online Dominion phase and
at least two Protobowl phases, the highlight of which was learning a good
deal about Émile Durkheim and then buzzing on him the next day. I did
Advent of Code again, with the same golfing setup as last year and a foray
into making an auxiliary over-the-top leaderboard in Svelte, and
(surprisingly to myself) got first. I
have a shiny Charizard with Blast Burn now.
I put this question in my FAQ, because at least two people have asked
me this question, and that’s how frequent a question needs to be to be
on my FAQ: I got an IMO gold medal in 2012, as a ninth
grader, and an IOI gold medal in 2014, as an eleventh grader. I could
have kept going to either, or even decided to try taking the IPhO or
something, but I didn’t. Why not?
The short answer: It was a rough utilitarian calculation. By
continuing, I would probably displace somebody else who would gain more
from being on an IMO/IOI team than I would. Besides, I wanted to do
other things in high school, so I wasn’t losing much.
I think the short answer actually captures most of my thinking when I
made the decision back then, and it’s not really new; I said as much at
the end of 2013. But behind it were a lot of complex thoughts and
feelings that I’ve been ruminating
over and trying to put into words for the better part of a decade.
Hence, this post.
There is a natural question that precedes the frequently asked one, a
question that I have never been asked and that I am now realizing I never
honestly asked myself and never tried to answer deeply: Why did I
participate in the IMO and the IOI in the first place?
I was pretty torn between this and “The Future Soon” as the Year-End
Song on this blog, but in the end I think I feel more threatened by the
bland existence of the soulless adult than inspired by the
starry-eyed-idealism-with-misogynist-undertones of the twelve-year-old,
plus I get to show you the best kinetic typography video I have ever
seen.
Halfway through 2018 I thought this would be the year of ephemeral
phases. I felt like I went through a different phase every month: Online
Dominion in April, crosswords in June, Only Connect in July, Jonathan
Coulton in August, a brief stint of trying really hard to barre my
guitar chords in October. Somewhere in the middle, I discovered Kittens
Game (“the Dark Souls of Incremental Gaming”) and my summer internship
mentor got me to pick up Pokémon Go again. A few intense periods of
typographical study were interspersed, which involved watching the above
music video dozens of times, teaching a Splash class on typography, and
developing a new awareness of how Avenir was everywhere. During the last
month, I went hard on Advent of Code and got second place, apparently the
only person to make it on every single leaderboard. I also did a related
golf side contest and poured a couple more hours into Paradoc, my personal
golfing language, for rather unclear gain. At least I got a lot of
GitHub followers?
It would turn out, though, that a lot of these phases had more
staying power than I expected. Pokémon Go is a much better game than it
was two years ago and has actually fostered a significant real-life
community, which seems like one of the best possible outcomes of an
augmented reality game, and I’ve found a steady pace to play at. I
spread the Only Connect bug and people on my hall, intrigued by the
format but annoyed by the overwhelmingly British trivia,
started writing and hosting full games for each other, with our own
MIT-slanted set of trivia. One of us developed a custom site and tool to host
these games. It took me a while to warm up to Jonathan Coulton’s latest
album, but since it happened, I cannot get Ordinary Man or Sunshine out
of my head; I’m still listening to JoCo as I finish typing up this post.
Although I never got back to the peak of my crossword frenzy, I still
study crosswordese from time to time and compose crosswords for some
special occasions, like this one
(.puz file).
The academics and technical aspects of this year have all blurred
together, but I think my interests are finally crystallizing:
I love the music and the animation. The music video spells out the
central conceit somewhat explicitly, but I think the lyrics by
themselves have a hint of ambiguity — is it a harmful addiction that you
just can’t escape from, or an essential part of your identity that you
just can’t deny?
What parts of me can I just not deny, huh? Unfortunately 2017 is also
the year I decide my online presence should probably be a little more
professional, so you might have to read between the lines a bit.