# Better Master Passwords: The geek edition

I’ve always wanted to write a technical followup to an earlier post, Toward Better Master Passwords, but this time going into some of the math behind it. Today’s xkcd comic does that for me:

Indeed, what took me nearly 2000 words to say in non-technical terms, Randall Monroe was able to sum up in a comic. This just shows the power of math, but that’s another issue. So for those of you who want to understand the comic and see how it relates to my earlier post, read on. But first read or re-read my earlier post on strong master passwords.

If, like most sane people, you don’t want to dive into a technical discussion, then stop here and just read the original, non-technical, post that says the same thing as the comic. It’s also where the practical advice is.

### The only thing I’ll restate

There is one concept (well, actually two concepts) from the Toward Better Master Passwords post that needs to be restated. It is central to everything that follows:

*how many ways you could get a different result*using the same system.

This embodies two things that we need to take into account when looking at the strength of some components of security. *Kerchoff’s Principle*, and *entropy*.

#### Kerchoff’s Principle

Kerchoff’s Principle states that you should assume that your adversary knows as much about the system you use as you do. In this case it means that if you are following advice about how to generate strong memorable passwords, the people who will be trying to break that password are at least as familiar with that advice as you are.

I can’t over-emphasize the point that we need to look at the system instead of at a single output of the system. Let me illustrate this with a ridiculous example. The passwords F9GndpVkfB44VdvwfUgTxGH7A8t and rE67AjbDCUotaju9H49sMFgYszA each look like extremely strong passwords. Based on their lengths and the use of upper and lower case and digits, any password strength testing system would say that these are extremely strong passwords. But suppose that the *system* by which these were generated was the following: Flip a coin. If it comes up heads use F9GndpVkfB44VdvwfUgTxGH7A8t, and if it comes up tails use rE67AjbDCUotaju9H49sMFgYszA.

That system produces only two outcomes. And even though the passwords look strong, passwords generated by that system are extremely weak. Of course nobody would recommend a system that only produced two outcomes, but people do recommend systems that produce a far more limited number of outcomes than one might think by inspecting an individual result of the system. This is because humans are far more predictable than we like to believe.

#### Entropy

What unit do we use to measure the number of different results you can get from a system? The answer to that is “bits of entropy”. The silly system I listed above can get us two different results. We can represent two different outcomes using one binary digit (bit). Passwords from that system have just one bit of entropy.

Now suppose we had a similar system that involved rolling one die. That would lead to six possibilities. Six outcomes can be represented in three bits (with a little room to spare). The actual number of bits is closer to 2.58. (And for those who really want to know where that number came from it is the base-2 logarithm of 6.)

One feature of using bits of entropy as a measure is that each bit represents a doubling of the number of possible outcomes. Something with 10 bits of entropy represents 1024 possibilities, while 11 bits will double that to 2048 possible outcomes. There are many reasons that we use bits instead of the number of possibilities. I won’t go into the mathematical reasons, but one nice result is that it gives us manageable numbers. In cryptography we routinely deal with things that have 128 bits of entropy. 128-bits would represent 340282366920938463463374607431768211456 possible outcomes. It’s hard think about or compare such numbers.

### Working through the comic

Now lets look at a few things in the first pane of the comic. Let’s start with the stuff I’ve put into the ~~green~~ pink box. The single small gray square in the bottom of the ~~green~~ pink box shows that the choice between capitalizing or not capitalizing the word adds one bit of entropy. Of course people could add more entropy by possibly capitalizing other letters, but people don’t capitalize randomly. The do so at the beginning, at the end, or sometimes at internal word or syllable boundaries.

If people capitalized randomly that would add a lot of entropy, but capitalizing randomly would make the password impossible to remember.

Now let’s look at the stuff that I’ve put in the blue box. Here 16 bits are awarded to picking an uncommon, but non gibberish word. That would imply that the person picked a word in a truly random fashion from a list of 2^{16} words (65536). I don’t believe that people would be truly random in their choice of base words, so I would assign fewer bits to this choice, but I’m not going to quibble about a few bits here and there.

The red box covers three “tricks”. Adding some punctuation and a numeral to the end of the password. Adding a numeral gives us roughly four bits of additional entropy, and punctuation gives us four. We get one additional bit by not knowing which comes first, the digit or the punctuation.

I didn’t put a box around the common substitutions and misspellings of changing something like Troubadour to Tr0b4dor; three additional bits seems about right.

When we add up all of the bits of entropy that this system uses we get 28-bits. Of course the system can be made more complex and may go up a few bits, but almost certainly at a great cost of memorability.

For those who recall some laws of logarithms, you now can see an additional benefit for using bits as our unit instead of numbers of possible outcomes: We can add the bits contributed by each choice instead of having to multiply the number of possibilities. It is very convenient to say that such-and-such adds *X* bits of entropy.

Now contrast this with using a sequence of *random* common words. It is absolutely crucial that the words be chosen in a truly random fashion. Here 11 bits are assigned to each word. That means that the list of common words used has 2^{11} elements. That is, it is from a word list of 2048 words. This gives a more memorable password with 44 bits of entropy.

### Cracking time

Depending on what sort of access the bad guys have, they can test from 1000 passwords per second to hundreds of thousands per second. For more information on how we slow this down, see the post on PBKDF2. Only you can decide how much effort someone will put into cracking your master password if they get hold of your data.

Using Diceware alone with five words (you know what I’m talking about because you read the earlier post) you will get 64 bits of entropy. If you add your own private scheme (say it contributes 10 bits) then you will have 74 bits of entropy, which would take about 500 million years to crack at one million guesses per second. Not everyone needs that kind of strength in a master password.

Of course if you do have Three Letter Agencies willing to spend hundreds of millions of dollars specifically on getting at your secrets, then you have problems bigger than what can be managed through software alone. Indeed, let me point you to another favorite xkcd comic.

[Updated: August 11 to correct my colorblindness error and spelling of Randall Munroe's name. — jpg]

Hello,

Great article and the PBKDF bit got it bookmarked under several headings for me, partly because of part 3, below.

First, I tried to read through the comments first to see if this had been touched on, but i did catch myself heavily skimming a couple times.

2) the idea of a strong password generator seems misguided to me as the source (to be hashed,proc’d) can be random, and it’s random, as much as computers can be, fine, but random. What would that entail, unless you actively used different forms of random generation, and varied the spacing in time of random generation by weird variable amounts of milliseconds, assuming some weird time based function, to avoid even spacing there? I guess, I’m thinking you’re up against the limit of what is accepting the password, what characters it allows, etc. Unless, you were to try to propose a standard and protocol incorporating direct interchange between the client and the service provider, not limited to even the expansive unicode character set. Maybe even based on actual pieces of elliptical curves for instance, chopped up, sent in a vector graphics image format, subjected to some arcane alteration, and compared to a hash of the values of the formula they are supposed to compare to. I’m pretty sure i took a step backwards in variation at the end, there, but just tossing stuff together. Point is the current system is limited, right? non-character data of some sort would have more variation, or even altered character data. Bit strings,long ones, with variable words lengths. a 12 byte character, and a 123 bit character, and a etc, etc. bit lengths chosen by some permutation of the string. The user doesn’t know their password, it’s just an authenticator. The hope being that the some what physically limited access to their specific authenticator clients would prevent unauthorized access. You either target and hack my machine or break into my house to get at my email, which is at least more highly punishable by law.

And 3) just a thought, (and there’s good reason to believe you may actually do this, as coming out on a public forum and announcing your methods seems detrimental, unless you assume the method is compromised through employee, social means, which wouldn’t be bad form, but i digress, severely) but given character strings, and by using a varying (high) number of iterations, assigned and computed once per month, using a formula based on some piece of their user data picked ‘randomly’ from a table (reordered monthly) of 60 functions from a sum of a hash function, so no employee would necessarily know what that formula was at a particular time, or you could at least log who accessed that data, you could push your bits of entropy maybe, well, not quite exponentially, off the charts. Enough variance, and you just couldn’t follow that trail of breadcrumbs. Hell, once designed you could pass the function picking and maybe parts of computation, who knows, to a locked box who didn’t know what they were computing something for, but had the software they ran once a month to give you your encrypted table. Sure, maybe new vulnerabilities could be identified in there, but obfuscation of means, misdirection, dissection of the process, layered, and you wouldn’t even be able to use that table in an attack. But, the computation, still less. Of course, the raw password would have to exist somewhere, on the users box, but even then between secure protocols, and pushing and pulling pieces of the computation, further scrambling, and I’m convinced you could break it up and automate it.

Just tossing some thoughts together, but variation is the desired outcome, right? Lots of avenues to get there.

Hello Haji,

I’m delighted that you have found this and the PBKDF2 article useful.

I’ll try to address some of what you raised in your comment, but I’m not quite sure that I followed all of it.

For your second point, there are two things to keep in mind. The actual keys used for encryption and decryption are totally random numbers, derived using cryptographically appropriate processes. On the Unix-like systems, the entropy comes ultimately from /dev/random, though it is an indirect process using Apple’s CommonCrypto functions. This is discussed more extensively in one of our forum threads.

Your master password is not your encryption key, instead your master password is used to encrypt your actual key. The actual encryption keys, 128-bit numbers, are not the kind of thing that anyone actually deals with directly. This allows keys to be more complex entities than the kinds of things that people deal with. As you suggest, the keys could be a specification on an elliptical curve, or the components for an RSA key or DH key. By having the master password be something that conceals or unlocks the key instead of the key itself, this frees us up to have those sorts of complex things for keys. (At the moment, 1Password encryption keys are 128-bit or 256-bit AES keys.)

For your third point I see two issues. One is really about Kerchoff’s Principle and the other is again about how we get random numbers out of computers. I’ve already talked about the latter, but I am always happy to talk more about Kerchoff’s Princple.

For a long time, crypto systems were used where both the system and the key were concealed from the enemy. Now-a-days that is know to be insecure practice (ok, there are a handful of exceptions). There are a couple of reasons for this. Suppose that the security of 1Password depended on us using secret algorithms. If that were the case, it would mean that we here would have some ability to break into things that you encrypt with 1Password. You certainly don’t want that, and we don’t want that either. (It does mean, however, that if you forget your master password, we have no way of recovering your data).

But even more importantly, a system which a designer might think is secure can actually have severe weaknesses that the designer doesn’t see. The slogan (I think this is due to Bruce Schneier, but it might be someone else) is that “anyone can create a system that he himself can’t break.” The true test of a system is how it has stood up to public scrutiny. This is why everyone from governments, banks, and 1Password use algorithms that have been intensively analyzed.

What I am trying to do is extend this notion to selection of master passwords. We should start to assume that the attacker knows about the system that you use and so you need to have a system that survives that. And this returns to the question of good random number generators. Instead of rolling your own secret system, you should be looking for something that has withstood the scrutiny of experts. That is what we have done in our choice of random number generators.

Cheers,

-j

Isn’t this the same principle behind the Amazon Pay-Phrase system?