Better Master Passwords: The geek edition

I’ve always wanted to write a technical followup to an earlier post, Toward Better Master Passwords, but this time going into some of the math behind it. Today’s xkcd comic does that for me:

Indeed, what took me nearly 2000 words to say in non-technical terms, Randall Munroe was able to sum up in a comic. This just shows the power of math, but that’s another issue. So for those of you who want to understand the comic and see how it relates to my earlier post, read on. But first read or re-read my earlier post on strong master passwords.

If, like most sane people, you don’t want to dive into a technical discussion, then stop here and just read the original, non-technical, post that says the same thing as the comic. It’s also where the practical advice is.

The only thing I’ll restate

There is one concept (well, actually two concepts) from the Toward Better Master Passwords post that needs to be restated. It is central to everything that follows:

The strength of a password creation system is not how many letters, digits, and symbols you end up with, but how many ways you could get a different result using the same system.

This embodies two things that we need to take into account when looking at the strength of some components of security: Kerckhoffs’s Principle and entropy.

Kerckhoffs’s Principle

Kerckhoffs’s Principle states that you should assume that your adversary knows as much about the system you use as you do. In this case it means that if you are following advice about how to generate strong memorable passwords, the people who will be trying to break that password are at least as familiar with that advice as you are.

I can’t over-emphasize the point that we need to look at the system instead of at a single output of the system. Let me illustrate this with a ridiculous example. The passwords F9GndpVkfB44VdvwfUgTxGH7A8t and rE67AjbDCUotaju9H49sMFgYszA each look like extremely strong passwords. Based on their lengths and the use of upper and lower case and digits, any password strength testing system would say that these are extremely strong passwords. But suppose that the system by which these were generated was the following: Flip a coin. If it comes up heads use F9GndpVkfB44VdvwfUgTxGH7A8t, and if it comes up tails use rE67AjbDCUotaju9H49sMFgYszA.

That system produces only two outcomes. And even though the passwords look strong, passwords generated by that system are extremely weak. Of course nobody would recommend a system that only produced two outcomes, but people do recommend systems that produce a far more limited number of outcomes than one might think by inspecting an individual result of the system. This is because humans are far more predictable than we like to believe.

Entropy

What unit do we use to measure the number of different results you can get from a system? The answer to that is “bits of entropy”. The silly system I listed above can get us two different results. We can represent two different outcomes using one binary digit (bit). Passwords from that system have just one bit of entropy.

Now suppose we had a similar system that involved rolling one die. That would lead to six possibilities. Six outcomes can be represented in three bits (with a little room to spare). The actual number of bits is closer to 2.58. (And for those who really want to know where that number came from it is the base-2 logarithm of 6.)
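
For those who want to check that arithmetic, here is a minimal Python sketch. The function name entropy_bits is just a label chosen for illustration, and the calculation assumes every outcome of the system is equally likely.

```python
import math

def entropy_bits(outcomes: int) -> float:
    """Bits of entropy for a system with `outcomes` equally likely results."""
    if outcomes < 1:
        raise ValueError("a system needs at least one possible outcome")
    return math.log2(outcomes)

print(entropy_bits(2))            # the coin-flip system: 1.0
print(round(entropy_bits(6), 2))  # the one-die system: 2.58
```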

One feature of using bits of entropy as a measure is that each bit represents a doubling of the number of possible outcomes. Something with 10 bits of entropy represents 1024 possibilities, while 11 bits will double that to 2048 possible outcomes. There are many reasons that we use bits instead of the number of possibilities. I won’t go into the mathematical reasons, but one nice result is that it gives us manageable numbers. In cryptography we routinely deal with things that have 128 bits of entropy. 128 bits would represent 340282366920938463463374607431768211456 possible outcomes. It’s hard to think about or compare such numbers.
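
The doubling is easy to see in a few lines of Python. This is only a sketch, and outcomes_for is a made-up helper name; the point is that the number of bits stays small while the number of outcomes quickly becomes unwieldy.

```python
def outcomes_for(bits: int) -> int:
    """Number of equally likely outcomes that `bits` of entropy represents."""
    return 2 ** bits

print(outcomes_for(10))              # 1024
print(outcomes_for(11))              # 2048, one more bit doubles it
print(outcomes_for(128))             # 340282366920938463463374607431768211456
print(len(str(outcomes_for(128))))   # 39 digits to write down "128 bits"
```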

Working through the comic

Now let’s look at a few things in the first pane of the comic. Let’s start with the stuff I’ve put into the pink box. The single small gray square in the bottom of the pink box shows that the choice between capitalizing or not capitalizing the word adds one bit of entropy. Of course people could add more entropy by possibly capitalizing other letters, but people don’t capitalize randomly. They do so at the beginning, at the end, or sometimes at internal word or syllable boundaries.
If people capitalized randomly that would add a lot of entropy, but capitalizing randomly would make the password impossible to remember.

Now let’s look at the stuff that I’ve put in the blue box. Here 16 bits are awarded to picking an uncommon, but non-gibberish word. That would imply that the person picked a word in a truly random fashion from a list of 2^16 words (65536). I don’t believe that people would be truly random in their choice of base words, so I would assign fewer bits to this choice, but I’m not going to quibble about a few bits here and there.

The red box covers three “tricks”: adding some punctuation and a numeral to the end of the password, in either order. Adding a numeral gives us roughly three bits of additional entropy, and punctuation gives us four. We get one additional bit by not knowing which comes first, the digit or the punctuation.
I didn’t put a box around the common substitutions and misspellings of changing something like Troubadour to Tr0b4dor; three additional bits seems about right.

When we add up all of the bits of entropy that this system uses we get 28 bits. Of course the system can be made more complex and may go up a few bits, but almost certainly at a great cost of memorability.

For those who recall some laws of logarithms, you now can see an additional benefit for using bits as our unit instead of numbers of possible outcomes: We can add the bits contributed by each choice instead of having to multiply the number of possibilities. It is very convenient to say that such-and-such adds X bits of entropy.
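
As a sketch of that bookkeeping, the Python below adds up the bits for each choice and checks that the sum matches what you get by multiplying the possibilities. The per-choice figures are the comic's tally, which is where the 28-bit total comes from; the dictionary labels are just descriptive names chosen for this example.

```python
import math

# Bits awarded to each choice in the first panel of the comic.
choices = {
    "uncommon base word": 16,
    "capitalization": 1,
    "common substitutions": 3,
    "numeral": 3,
    "punctuation": 4,
    "order of numeral and punctuation": 1,
}

# Adding bits...
total_bits = sum(choices.values())

# ...is the same as multiplying the number of possibilities for each choice.
total_outcomes = math.prod(2 ** bits for bits in choices.values())

print(total_bits)                          # 28
print(total_outcomes == 2 ** total_bits)   # True
```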

Now contrast this with using a sequence of random common words. It is absolutely crucial that the words be chosen in a truly random fashion. Here 11 bits are assigned to each word. That means that the list of common words used has 2^11 elements. That is, it is a word list of 2048 words. This gives a more memorable password with 44 bits of entropy.
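
Here is a minimal sketch of what “chosen in a truly random fashion” could look like in Python. It uses the standard library's secrets module, which draws from the operating system's cryptographic random source. The word list is a stand-in of 2048 made-up tokens so that the example runs on its own; it is an assumption for illustration, not a real list of common words, and the function names are hypothetical.

```python
import math
import secrets

# Stand-in list: 2048 placeholder tokens. A real list would hold
# 2048 common, memorable words instead.
WORDS = [f"word{i:04d}" for i in range(2048)]

def random_passphrase(words, count=4):
    """Pick `count` words independently and uniformly at random."""
    return " ".join(secrets.choice(words) for _ in range(count))

def passphrase_bits(list_size, count):
    """Entropy in bits when each word is drawn uniformly from the list."""
    return count * math.log2(list_size)

print(random_passphrase(WORDS))
print(passphrase_bits(len(WORDS), 4))  # 44.0
```

The important design choice is that the program, not the person, picks the words. Letting a human choose “random” words from the list would quietly throw away much of the 44 bits.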

Cracking time

Depending on what sort of access the bad guys have, they can test from 1000 passwords per second to hundreds of thousands per second. For more information on how we slow this down, see the post on PBKDF2. Only you can decide how much effort someone will put into cracking your master password if they get hold of your data.

Using Diceware alone with five words (you know what I’m talking about because you read the earlier post) you will get 64 bits of entropy. If you add your own private scheme (say it contributes 10 bits) then you will have 74 bits of entropy, which would take about 500 million years to crack at one million guesses per second. Not everyone needs that kind of strength in a master password.
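
A rough sketch of the arithmetic behind that kind of estimate follows. The rate of one million guesses per second is the figure used above; the helper name and the 365.25-day year are assumptions made for this example. Exhausting all 2^74 possibilities comes out at roughly 600 million years, and an attacker would on average succeed in about half that time, so the result lands in the same range as the figure quoted.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

def years_to_exhaust(bits, guesses_per_second):
    """Years needed to try every one of the 2**bits possibilities."""
    return 2 ** bits / guesses_per_second / SECONDS_PER_YEAR

worst_case = years_to_exhaust(74, 1_000_000)
average_case = worst_case / 2  # on average the hit comes halfway through

print(f"worst case:   {worst_case:.1e} years")
print(f"average case: {average_case:.1e} years")
```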

Of course if you do have Three Letter Agencies willing to spend hundreds of millions of dollars specifically on getting at your secrets, then you have problems bigger than what can be managed through software alone. Indeed, let me point you to another favorite xkcd comic.

[Updated: August 11 to correct my colorblindness error and spelling of Randall Munroe’s name. — jpg]

47 replies
  1. Tommy
    Tommy says:

    Nice article Jeff. IMO it begs the question though: When will a password generator that uses this strategy be added to 1Password? Even though the software is such that you shouldn’t have to have memorizable passwords, it would still be nice.

    • Jeff
      Jeff says:

      Thanks Tommy.

      The post not so much begs the question of when we will put this in the Strong Password Generator, but it begs for the question to be asked in the comments.

      The answer is that it’s something we’ve been talking about, but have come to no firm decision. Thanks for letting us know it would be useful.

      Cheers,

      -j

    • Jeff
      Jeff says:

      Oops.

      I’m now going to grumble at the people I asked to look over this before posting, but we’ve all been unbelievably busy with the Safari 5.1 extension, so I guess I’ll just have to forgive them if they skimmed it.

      As should be obvious by now, I suffer from a form of colorblindness. And that pink still looks green to me.

      Cheers,

      -j

    • brenty
      brenty says:

      lol Andy. I have no idea where that came from, but thank you for that! :)

      “It is not the spoon that bends, it is only yourself”.

  2. Chris Darling
    Chris Darling says:

    This issue is definitely a fascinating one. I especially enjoyed that cartoon. A picture is always an immediate way to explain something, be it an outcome, concept or a process. I was wondering, however, what you thought of Steve Gibson’s assertion (Podcast #303 – password haystacks: http://www.grc.com/sn/sn-303.txt) that though entropy IS important, password LENGTH is of far greater importance. That is, entropy is necessary, but within a given password “space” a brute force attack takes much, much, much longer with, say, a LONG padded password, one that a user could create and pad with a simple methodology. Wanted to hear your opinion. Thanks!

    • Jeff
      Jeff says:

      Hi Chris,

      I have enormous respect for Steve Gibson, but this isn’t the first time I’ve disagreed with him. His argument depends on the assumption that password cracking software is dumb. That is, that after checking a few common passwords it simply tries all possibilities of a given length. In other words, he is assuming that password cracking systems assume that passwords are truly random. Under such assumptions length would be the crucial factor.

      But we know that password cracking software isn’t dumb. And it gets smarter every day. Indeed, exactly the kinds of things illustrated in the first pane of the comic are included in modular “rule sets” for popular password crackers. Just as there are rule sets that perform the common substitution or add some numbers at the end, it would be easy to add rule sets that check for passwords based on Steve’s recommendations over at https://www.grc.com/haystack.htm

      Going back to Kerckhoffs’s Principle, consider what would happen if a substantial portion of people started following Steve’s advice. Password crackers would quickly adjust. (Remember that the people actively working on password cracking have access to millions of stolen passwords to study.) Now suppose that a substantial number of people followed a diceware-like approach that I advocated earlier. These systems do not lose strength through the mere fact that attackers know we are using them.

      We can, in this case, turn Kerckhoffs’s Principle into a Kantian question. Does our advice remain good advice if everyone follows it? It’s not that I realistically expect the world to start following the kind of advice that I’ve given, but it still seems like a useful way to think about the quality of the advice.

      Cheers,

      -j

    • Ross
      Ross says:

      After reading this article and a lot of the other commentary on the XKCD, I wonder if it would actually be more true to say this: length is a better generator of entropy than any other mechanism of obfuscation. That is: the decision of which letters to capitalize or which letters to replace with l33t sp33k gives you two or three bits of entropy, but simply slapping one more dictionary word onto the end of your phrase gives you 11 bits, and an all-lowercase truly random alphabetical sequence that is 10 characters long gives you about 50 bits of entropy, while an 8 character one that requires at least one each of upper, lower, number, and punctuation, is more like 42 bits.

      And also, length (in the absence of other constraints) has a markedly lower impact on memorability than the other kinds of obfuscation.

      Does that sound right?

    • Jeff
      Jeff says:

      Hi Ross,

      I think you are on the right track, but length matters only if the elements are chosen at random. If the elements are not chosen at random, then length doesn’t help. (Take another look at my coin flipping example in the article, which resulted in very long passwords.)

      When elements are chosen at random from an “alphabet” then the entropy will be proportional to the length. Now what’s a little tricky to keep in mind here is that for these calculations the length of “correct horse battery staple” is 4. That is, we’ve strung together four elements from an “alphabet” that contains 2048 elements. Because each element is chosen at random from a list of 2048 elements, it contributes 11 bits of entropy. Adding an additional word would add another 11 bits.

      What makes these memorable is that each element (even though it is drawn from a long list) has meaning. People remember meanings, so by using familiar words, we just have meanings to remember.

      The diceware list that I talked about earlier has more elements (its words are short, though some are obscure) so we get a few more bits of entropy for each. Plus, the diceware scheme provides a really terrific mechanism to pick the words at random.

      But back to meaning and your question, random words are easier to remember than random character substitutions or random capitalizations. Because of this, people don’t actually add capitalization randomly.

      Cheers,

      -j

  3. Scott Jangro
    Scott Jangro says:

    What happens to the entropy when it becomes fairly common practice to use 4 (or 3 or 5) random common words as passwords?

    At that point, the people trying to break passwords aren’t doing it one character at a time, but one word at a time. How long does it take to go through every 4 word combination of the few thousand common words?

    • Jeff
      Jeff says:

      Scott, this is exactly the right question to ask!

      As I just responded to Chris, we need to ask ourselves how good a system is if everyone starts to use it. The system of four common words as illustrated in the comic provides 44 bits of entropy even if an attacker knows exactly what system you used. That is a whole lot better than typical passwords, but I’m not sure that I would consider 44 bits enough for a 1Password master password for those really anticipating a concerted effort at cracking.

      The advice I offer in my earlier post on the matter would give you about 74 bits of entropy, which should be plenty. Please take another look at the “Cracking time” section in my post above.

      Note that crack time estimates for schemes like the one I’m suggesting already assume that the attacker knows what system you used. Most crack time testers on various websites, however, make the entirely unrealistic assumption that the password being tested was chosen at random.

      Cheers,

      -j

  4. Daniel
    Daniel says:

    This is a very interesting post. I love mathish stuff like this.

    How many bits do the strong passwords generated by 1Password have? It would depend on the parameters, I suppose, but given a “Random”, 50-character password with numbers, symbols, ambiguous characters, and repeats, that would be … what? 300 or so bits?

    That would take many many millennia to crack one of those babies. It would be quicker to point a gun in my face and ask (which is probably the MO for the TLAs).

    How would one take into account that you can have up to 10 digits in a 1Password-generated password, but you don’t know which characters are the digits?

    • Jeff
      Jeff says:

      Hi Daniel,

      If we set parameters for the strong password generator to zero digits and symbols, allow characters to repeat, and not to avoid ambiguous characters, then we have an alphabet of 52 characters. The number of possible results (we use a cryptographically strong random number generator for doing these) will be 52 to the Nth power, where N is the length. So the number of bits of entropy will be ln(52^N)/ln(2). (Recall that log base a of x is just ln(x)/ln(a), where “ln” is the natural logarithm.)

      So for your example, with N = 50, we get 285 bits, which isn’t far from your guess of 300. I’m not going to do the years to crack calculation on that. I already know that even 128 bits works out to hundreds of trillions of times the age of the universe at a million guesses per second.

      Also, the random key that is encrypted by your master password is “only” 128 bits (again, gadzillions of times the age of the universe to crack), so there is no gain in having a master password (or really any password) stronger than that.

      I think some pictures that I took on a recent vacation illustrate that point:

      http://i.agilebits.com/blog/bear-proof.png

      I haven’t had time to sit down and do the math for the symbols and digits business. Knowing that there are exactly N symbols or M digits certainly takes away some entropy, while, as you correctly say, not knowing where they are adds some. I haven’t done the calculation, but my intuition is that it adds more entropy than it takes away. (Think of it like the common substitution trick.)

      Those particular features of the Strong Password Generator are designed for people using web sites that actually say “you must use 3 digits and 1 symbol in your password”. As such websites are, thankfully, becoming less common, we may take a different approach the next time we redesign the Strong Password Generator.

      Cheers,

      -j

  5. Lri
    Lri says:

    A Tr0oub4dor&3-style password would be more secure if the modifications were less predictable. For example three characters could be substituted with random printable ASCII characters (like tr}u?adGr).

    • Uncommon long-ish dictionary words: ~ 16 bits (like in XKCD)
    • Replacement characters: 95^3 (or (95-1)^3) ≈ 20 bits
      • One number, one capital letter and one special character: ~ 10*26*33*3! ≈ 16 bits
    • Combinations of letters to be substituted (from a 9-letter word): comb(9, 3) = (9*8*7)/(3*2) ≈ 6 bits

    So about 40 bits, but the methodology could be even more unpredictable. In comparison 26^9 ≈ 42 bits and 95^9 ≈ 59 bits.

    • Lri
      Lri says:

      Whoever’s moderating the comments: the Markdown formatting for italics messed up some of the multiplications… And it should be Tr0ub4dor&3.

    • Jeff
      Jeff says:

      Hi Lri,

      I’m not sure what markdown issues you are talking about. But if you let me know more, I’ll see what I can do.

      The 8 bits of entropy that come from adding the “&3” at the end are accounted for elsewhere. Where I mentioned “Tr0b4dor” I am only talking about the entropy added by common substitutions and misspellings. We shouldn’t count the “&3” suffix twice. But I could have been clearer by referring to it as the transformation from “Troubadour&3” to “Tr0ub4dor&3”.

      Thanks!

      -j

    • Jeff
      Jeff says:

      You are absolutely correct, Lri, that truly random substitution would make the Tr0ub4dor&3-style passwords much stronger. But it would also make them much harder to remember. Likewise, making some randomly chosen letter uppercase and adding the random digit and character at a random location would add the entropy you describe.

      If you can manage all of this and really do the random parts randomly and then actually remember the password you create through such a system, more power to you. But given the difficulty of generation and memorability, I think you will see why that is not something I would advise our users to do.

      With diceware anyone who can download the list and has some dice can generate memorable, strong passwords.

      Cheers,

      -j

  6. Jeff
    Jeff says:

    Hi, Lri,

    Can you email me (jeff@agilebits.com) a screenshot or more detailed description illustrating the markdown issue?

    The 8 bits of entropy that come from adding the “&3” at the end are accounted for elsewhere. Where I mentioned “Tr0b4dor” I am only talking about the entropy added by common substitutions and misspellings. We shouldn’t count the “&3” suffix twice. But I could have been clearer by referring to it as the transformation from “Troubadour&3” to “Tr0ub4dor&3”.

    Thanks!

    -j
