Passwords. It’s 2023 and we still have to deal with them.
Many people know that, per the canonical xkcd, sequences of randomly chosen words such as
make relatively memorable but hard-to-crack passwords. One popular strategy for randomly choosing words is Arnold Reinhold’s Diceware™, a list of 6⁵ = 7776 “words” that you can randomly sample from by rolling five dice (analog or digital). (I won’t go into how to calculate the entropy of a password or how long a password you should aim for, since most Diceware overviews already discuss both at length.)
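For concreteness, here is a minimal sketch of how five dice rolls can pick one of 7776 list entries: read the rolls as a five-digit base-6 number. The function names, the hyphen separator, and the default of six words are assumptions made for illustration, and the word list in the demo is a placeholder rather than any real Diceware list.

```python
import secrets

DICE_PER_WORD = 5
LIST_SIZE = 6 ** DICE_PER_WORD  # 7776 entries, one per possible roll sequence


def dice_to_index(rolls):
    """Read five die faces (each 1..6) as a base-6 number in 0..7775."""
    if len(rolls) != DICE_PER_WORD:
        raise ValueError("expected exactly five rolls")
    index = 0
    for face in rolls:
        if not 1 <= face <= 6:
            raise ValueError("die faces must be between 1 and 6")
        index = index * 6 + (face - 1)
    return index


def roll_dice():
    """Simulate five fair dice with a cryptographically secure RNG."""
    return [secrets.randbelow(6) + 1 for _ in range(DICE_PER_WORD)]


def sample_passphrase(wordlist, num_words=6, separator="-"):
    """Pick num_words entries independently and join them with a separator.

    num_words=6 is an arbitrary default for this sketch, not a recommendation.
    """
    if len(wordlist) != LIST_SIZE:
        raise ValueError("word list must have exactly 7776 entries")
    picks = (wordlist[dice_to_index(roll_dice())] for _ in range(num_words))
    return separator.join(picks)


if __name__ == "__main__":
    # Placeholder list; substitute a real 7776-word list loaded from a file.
    placeholder = [f"word{i}" for i in range(LIST_SIZE)]
    print(sample_passphrase(placeholder))
```

With physical dice, you would pass your five rolls to `dice_to_index` directly instead of calling `roll_dice`.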
A few people have iterated on the concept since then. Probably most notably, the Electronic Frontier Foundation published their own word list in 2016, with words chosen to be better known and more memorable, at the cost of taking longer to type. I’m a fast typist and prefer the EFF’s wordlist over the original, and I am very grateful to them for creating it, but after generating quite a few passwords with it over the last few years, I began to feel that it still had a lot of room for improvement.
- One of EFF’s requirements was that “no word is an exact prefix of any other word”, so that concatenating all your words together is an injective operation and you can do that without worrying about losing entropy. They give the example that, if the word list has
input, then there might be multiple ways to create the password
input. But I’ve always separated the words in my passphrases, and most diceware password generators I’ve seen online do the same. (The Diceware™ FAQ considers one reason to not do this — since spacebar keys often have a “distinctive sound”, typing a space-separated Diceware™ password could audially leak information about the word lengths — before concluding that this is an unlikely threat to defend against; but even if you are truly paranoid, I’d suggest separating words with hyphens rather than spaces, which in my experience is also less likely to be rejected by stupid password rules.) I also think this causes a lot of good words to be rejected when there’s no risk of confusing them (
- Other goals of the EFF’s list were to avoid homophones and hard-to-spell words, but I don’t find this useful. Many homophone pairs, say
flour, make very different xkcd-style visualizations and won’t be confused for each other. Also, subjectively, I think I’m a pretty good speller. Furthermore, the EFF’s list doesn’t achieve their goals either: the list still contains some homophone pairs like
cash, as well as a bunch of words that are spelled, if not incorrectly, at least unusually:
whacky. (The pasta is usually spelled
linguine; Linguini is the name of the Ratatouille guy.
Plexiglass is a generic term, while
Plexiglas® is a brand name.) Worse, the list often contains two alternate spellings of the same word:
- The examples from the previous point were kind of cherry-picked; in a list of 7,776 you could easily never encounter any of them. A more common problem is that EFF’s list has a lot of weird inflections and derivations that I find hurt memorability. Some are simply unnecessary: the list includes many adverbs like
politely and negated words like
unshackle despite not including the root words or any other related words. Many others, while justifiable for the prefix-free property, feel very unnatural to me. There are odd -ness constructions like
gumminess; -like constructions like
fernlike; and just unusual compound words:
trailside… More than 5% of the words begin with “un”, including 29 words beginning with “under”. Somebody on StackExchange noticed traces of this phenomenon, but I haven’t seen any deeper analysis.
- Finally, the list strangely skips over a bunch of bigrams. None of the EFF’s words start with
WE, even though I’m confident you can think of some well-known, memorable words starting with each of these bigrams. (For example, the list omits
horse, so you can’t generate the canonical xkcd password with it.) Though I don’t know how or why this happened,¹ this means there is definitely room for better words to displace existing ones in the EFF’s list.
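The prefix-free requirement from the first bullet is easy to check mechanically. The sketch below is one way to do it, not the EFF's actual tooling: after sorting, every word that starts with a given word sits directly after it, so a short scan finds all violations. The function name and the toy word lists are assumptions chosen for illustration.

```python
def prefix_violations(words):
    """Return (prefix, longer_word) pairs where one list word begins another."""
    ordered = sorted(set(words))
    violations = []
    for i, word in enumerate(ordered):
        # In sorted order, all words starting with `word` form a contiguous
        # run immediately after it, so we can stop at the first mismatch.
        for other in ordered[i + 1:]:
            if not other.startswith(word):
                break
            violations.append((word, other))
    return violations


if __name__ == "__main__":
    toy_list = ["in", "put", "input"]  # illustrative words only
    print(prefix_violations(toy_list))
```

A list passes the EFF-style requirement exactly when this returns an empty list. Note that the check says nothing about whether two words are actually confusable once you put a separator between them.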
So… I decided to make my own list.²
In another universe, I might have gone about this by sourcing a bunch of survey data, figuring out how to use word embeddings, or just prompting an LLM. But I never found the motivation and am dedicating all my AI brain cells to my day job, so I did it mostly manually. Basically, I started with the EFF’s word list, then in no particular order:
- diffed various subsets of other wordlists (the original Diceware™ list, BIP39, a different list by heartsucker on GitHub…) or frequency lists with my running wordlist to find memorable words to add
- looked up various metrics of words’ “concreteness ratings” and how recognizable they were from the English Lexicon Project to find good words I missed and drop particularly bad ones
- searched for inflections/derivations with some simple NLTK scripts to delete them
- and just scrolled through the list and deleted or undid inflection/derivation on words I didn’t like.
My most impactful decision was to avoid inflections and derivations nearly completely. My goal is that, after you roll a sequence of random words from this list, you can inflect/derive them to make a grammatically coherent phrase or even full sentence to make the password more memorable without losing any entropy. I did not go all the way in various cases where I thought the inflected/derived form had a sufficiently different meaning or visualization, or was simply a homograph:
wound. I also sometimes kept a word in derived form when I thought it was more commonly used thus (
goggles instead of
harrowing instead of
harrow) or just to sneak it in under the length restriction (
oxen instead of
surveil instead of
surveillance despite it being a backformation).
Although I did not keep the prefix-free property, I decided to enforce a weaker rule that no word is the concatenation of two other words on the list, because I thought that could actually harm memorability. (This means you will still need to separate the words in your passphrase if you absolutely refuse to lose any entropy: for example, concatenation would collapse badger+eel = badge+reel.) I also relaxed the length constraint from the EFF’s list, allowing 3–10 letters per word rather than 3–9.
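Here is a rough sketch of how the weaker concatenation rule could be checked, along with a small counter showing why unseparated passphrases can still be ambiguous. This is not the script actually used to build the list; the function names are hypothetical and the toy list contains only the four words from the badger/eel example.

```python
def concatenation_violations(words):
    """Return (word, left, right) triples where word == left + right
    and both halves are themselves on the list."""
    vocab = set(words)
    violations = []
    for word in sorted(vocab):
        for cut in range(1, len(word)):
            left, right = word[:cut], word[cut:]
            if left in vocab and right in vocab:
                violations.append((word, left, right))
    return violations


def count_segmentations(text, words):
    """Count the ways `text` can be split into a sequence of list words."""
    vocab = set(words)
    # ways[i] = number of ways to split the first i characters of text
    ways = [1] + [0] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(end):
            if ways[start] and text[start:end] in vocab:
                ways[end] += ways[start]
    return ways[len(text)]


if __name__ == "__main__":
    toy_list = ["badge", "badger", "eel", "reel"]
    print(concatenation_violations(toy_list))          # rule is satisfied
    print(count_segmentations("badgereel", toy_list))  # but two readings exist
```

The toy list passes the concatenation rule, yet the unseparated string `badgereel` has two readings, which is exactly the entropy loss that a separator avoids.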
Even with those principles decided, though, there were still lots of small judgment calls and tradeoffs:
- To enforce my concatenation rule, I sometimes had to decide between including one short word or a bunch of compound words that were formed from that word plus another.
- I had to decide how to treat “profane, insulting, sensitive, or emotionally-charged words”, which the EFF generally avoided. It’s pretty obvious why you might not want those words in your password, but at the same time, those very qualities often make the words more memorable. So while I still avoided, say, obvious slurs, I allowed many more words with emotional valences.
- There’s a tradeoff between including obscure words with concrete, memorable meanings and including common “connector” words, usually prepositions or conjunctions, that may be hard to visualize (e.g.
that). Overall my list skews much more “obscure but memorable” than the EFF’s. Furthermore, some obscure words might be made from familiar root words, so that people who have never seen the word can still guess its meaning or imagine something vaguely related; sometimes I picked those words over words that I thought were less obscure, but that would be utterly meaningless to people who hadn’t heard of them.
- Finally, and most simply, I wanted to avoid overly obscure words, which requires judging how obscure words are. Although I tried to limit my personal bias in these decisions, I’m sure the list is still deeply infused with it. It’s also the case that I felt my judgment slipping after staring at too many words. I remember coming to a word
unpeel and struggling to figure out whether something like “unpeel an orange” is a normal thing to say.
As a final hedge against some of these tradeoffs, I included a few word pairs in my list: for words I thought were well-known but might be too abstract, I provided alternatives that I thought were concrete but might be too obscure, and vice versa. The idea is that when you roll one of the word pairs, you can choose between them, in the same way you can choose to inflect/derive the word, to create the passphrase you personally find most memorable. In total there are 416 word pairs. (Why 416? Because 7776 + 416 = 8192 = 2¹³, so that if for whatever reason you need a wordlist for neatly mapping bitstrings to words, you can flatten the list.)
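To make the flattening idea concrete, here is a sketch under some assumptions: each of the 7776 entries is represented as a tuple holding either one word or a pair (the real .txt file may lay pairs out differently), and bitstrings are consumed 13 bits at a time, most significant bits first. The function names and the placeholder entries are invented for the example.

```python
def flatten(entries):
    """Turn 7776 entries (single words or pairs) into one flat word list."""
    return [word for entry in entries for word in entry]


def bits_to_words(data, flat_words):
    """Map bytes to words, 13 bits per word, most significant bits first."""
    if len(flat_words) != 2 ** 13:
        raise ValueError("flattened list must have exactly 8192 words")
    total_bits = len(data) * 8
    if total_bits % 13 != 0:
        raise ValueError("bit length must be a multiple of 13")
    value = int.from_bytes(data, "big")
    return [
        flat_words[(value >> shift) & 0x1FFF]  # 0x1FFF masks the low 13 bits
        for shift in range(total_bits - 13, -1, -13)
    ]


if __name__ == "__main__":
    # Placeholder entries: 416 pairs plus 7360 single words = 7776 entries.
    entries = [(f"a{i}", f"b{i}") for i in range(416)]
    entries += [(f"w{i}",) for i in range(7360)]
    flat = flatten(entries)
    print(len(entries), len(flat))
    # 13 bytes = 104 bits = exactly eight 13-bit words.
    print(bits_to_words(bytes(range(13)), flat))
```

One caveat with flattening: when sampling with dice, a pair shares a single 1-in-7776 slot, while in the flattened list each member of a pair gets its own 1-in-8192 slot, so the two schemes are not interchangeable.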
You hopefully already got a taste of the resulting wordlist at the beginning of this post (though that demo omits the alternative words for simplicity), but if you want more, here’s the list as a .txt and a tiny standalone .html file with the list and some trivial code to randomly sample words from it. Although my longest words are longer than the EFF’s longest words, my average word length is shorter (6.2 vs 7.0), though still considerably longer than the original Diceware list (4.3).
Other password strategies
This section is not so strongly related to the rest of the post, but as a casual password enthusiast I’ve seen a bunch of other approaches to authentication. I will focus on “knowledge-based” authentication methods here (so I won’t go into things like hardware authenticators, biometrics, or the surprisingly unrelated DiceKeys). For example, archagon’s Grammatical Passphrase Generator does what it says. Sometimes it works quite well, but sometimes it generates words like “diphyodont” and “skeptophylaxis”. Some more sophisticated but more theoretical approaches (i.e. I don’t know of publicly available implementations) are discussed in How to Memorize a Random 60-Bit String (Ghazvininejad and Knight, 2015); my favorite is the proposal for generating rhyming iambic tetrameter couplets.
A totally different line of research involves developing methods of authentication that rely on tacit rather than explicit knowledge, theoretically preventing some ways of leaking the password, such as the classic rubber hose. For example, Bojinov et al. (2012) propose an authentication system where you play an osu!mania-like rhythm game. To teach you a password, the system lets you practice on a rigged game where some patterns are more common than others, and then to authenticate, the system has you play the game with random patterns and checks if you score better on the patterns that were more common in practice. Joudaki et al. (2019) propose a similar system where you hunt for a (possibly rotated) T among many (also variously rotated) Ls. As in the previous system, it teaches you a password by giving you a bunch of patterns with some repetition, and then to authenticate, it randomly shows you either patterns you should have seen before or brand new patterns, and checks if you can find the T significantly faster in the former kind. I don’t think either system is reliable enough to be more than a theoretical curiosity, and in many ways being able to tell somebody your password or write it down is actually a good thing: it lets you manage authentication for many different accounts with password managers, for example. Still, it would be really cool if something like them actually worked.
¹ If I were to attempt to come up with a plausible story, I might speculate that each of these bigrams was included as a two-letter word in some draft of the list, all other words beginning with them were cut to maintain the prefix-free property, and then the two-letter words were cut themselves for being too short.
² I think I first started this project several years ago, though at the time I was also sort of trying to make the list serve double duty as a Castlefall wordlist. This April, I decided to pursue it intensely for a few weeks, but suddenly lost steam shortly after. So I’ve decided to just wrap the project up.