A salt-free diet is bad for your security

I am not giving anyone health advice. Instead, I’m going to use the example of the recent LinkedIn breach to talk about hashes and salt. Not the food, but the cryptology. Before you dive into this article, you should certainly review the practical advice that Kelly has posted first. Also Kelly’s article has more information about the specific incident.

I’m writing for people who saw security people talking about “salt” and want to know more.
You may have seen things like this, that appeared in an article at The Verge.

It’s worth noting that the passwords are stored as unsalted SHA-1 hashes. SHA-1 is a secure algorithm, but is not foolproof. LinkedIn could have made the passwords more secure by ‘salting’ the hashes.

If you would like to know what that means, read on.

What we know

We know that a data set of about 6 million password hashes has been released. We also know that this does include LinkedIn passwords (more on how we know that later). The data made public does not include the usernames (email addresses), but it is almost certain that the people who got this data from LinkedIn have that. By the end of this article, you will understand why people who broke in would release only the hashes.

Hashing passwords

Websites and most login systems hash passwords so that they only have to store the hash and not the password itself. I have been writing about the importance of hash functions for a forthcoming article, but here I will try to keep things simple. A hash function such as SHA-1 converts a password like Password123 to “b2e98ad6f6eb8508dd6a14cfa704bad7f05f6fb1″. A good hash function will make it unfeasible to calculate what the password was if you only know the hash, but the hash function has to make easy to go from the password to the hash.

A hash function always produces the same hash from the same input. It will always convert “Password1″ to that same thing. If both Alice and Bob use Password123, their passwords will be hashed to the same thing.

Let’s put rainbows on the table

Even if Bob and Alice use the same password (and so have the same hash of their passwords) they don’t have to worry about each other. Who they need to worry about is Charlie the Cracker. Charlie may have spent months running software that picks passwords and generates the hashes of those passwords. He will store those millions of passwords and their hashes in a database called a “Rainbow Table.”

A small portion of Charlie’s table may look like this

Password SHA-1 Hash
123456 7c4a8d09ca3762af61e59520943dc26494f8941b
abc123 6367c48dd193d56ea7b0baad25b19455e529f5ee
Password123 b2e98ad6f6eb8508dd6a14cfa704bad7f05f6fb1

(Rainbow tables are structured to allow for more efficient storing and lookup for large databases. They don’t actually look anything like my example.)

When the hashed passwords for millions of accounts are leaked, Charlie can simply look up the hashes from the site and see which ones match what he has in his tables. This way he can instantly discover the passwords for any hash that is in his database.

Of course, if you have been using 1Password’s Strong Password Generator to create your passwords for each site, then it is very unlikely that the hash of your password would be in Charlie’s table. But we know that not everyone does that.

Charlie also has a lot of friends (well, maybe not a lot, but he does have some) who have built up their own tables using different ways of generating passwords. This is why Charlie, who has both the usernames and the hashed passwords, may wish to just circulate the hashes. He is asking his friends to help lookup some of these hashes in their own tables.

I also promised to tell you how we know that the leaked hashes are indeed of LinkedIn passwords. If someone has a strong unique password for their LinkedIn account it is very unlikely that it was ever used elsewhere. So if the hash for that strong and unique password turns up on the leaked list, we can know it is from LinkedIn. This is presumably what Robert Graham did when determined that this is the LinkedIn list.

Salting the hash

More than 30 years ago, people realized the problem of pre-computed lookup tables, and so they developed a solution. The solution was to add some salt to password.

Here is how salting can work. Before the system creates a hash of the password it adds some random stuff (called “salt”) to the beginning of password. Let’s say it adds four characters. So when Alice uses the Password123 the system might add “MqZz” as the salt to the beginning to make the password MqZzPassword123. We then calculate the hash of that more unique password. When storing the hash, we also use store the salt with it. The salt is not secret, it just has to be random. Using this method, the salted hash that would be stored for Alice’s password would be “MqZz1b504173d594fd43c0b2e70022886501f30aee16″.

Bob’s password will get a different random salt, say “fgNZ”, which will make the salted hash of his password “fgNZ2ec6fa506fa9048d231b765559e2f3c79bdee5a1″. This is completely different than Alice’s, and – more importantly – it is completely different from anything Charlie has in his rainbow tables. Charlie can’t just build a table with passwords like Password123, instead he would have to build a table that contained Password123 prefixed by all of the two million possible salts we get using a four character salt.

Beyond salt

Salting is an old technology, yet a surprising number of web services don’t seem to be using it. This is probably because many of the tool kits for building websites didn’t include salting by default. The situation should improve as newer toolkits encourage more secure design, and also as these issues make the news.

But just as we are getting people up to speed with a 30 year old technology, salting may no longer be enough. And mere salting certainly isn’t good for the passwords that require the most security. To defend against determined and resourceful password crackers you should use both strong passwords and a password based key derivation function like
PBKDF2, which 1Password does use when encryption your Master Password.

Other posts in this series

  1. Guess why we're moving to 256-bit AES keys (March 9, 2013)
  2. Authenticated Encryption and how not to get caught chasing a coyote (January 18, 2013)
  3. Doing the two-step until the end of time (December 20, 2012)
  4. Alan Turing's contribution can't be computed (December 8, 2012)
  5. Hashing fast and slow: GPUs and 1Password (December 5, 2012)
  6. Credit card numbers, checksums, and hashes. The story of a robocall scam (October 18, 2012)
  7. Flames and collisions (June 7, 2012)
  8. A salt-free diet is bad for your security (June 6, 2012)
  9. Cipher of Advanced Encryption Substitution And Rotation (April 1, 2012)
  10. Do you know where your software comes from? Gatekeeper will help (March 1, 2012)
  11. AES Encryption isn't Cracked (August 18, 2011)

We'd love to hear your comments in our forum!