Flames and collisions

Having a Microsoft code signing certificate is the Holy Grail of malware writers. This has now happened.Mikko Hypponen

Unless you are a system administrator for a government institution in or around the Middle East you do not need to worry about Flame infecting your computer. Flame (also known as “Flamer” and “skywiper”) itself is not a security concern except to a very narrow, targeted group. Quite simply you don’t need to worry about being infected by Flame, and antivirus vendors who suggest otherwise may be engaging in fear mongering.

With so few people in danger of Flame, why am I writing about it? Good question. I’m writing about it because one of the methods used in Flame has the potential of undermining a crucial part of computer security. The authors of Flame have the ability to subvert the Windows Update process. Whatever Flame itself does or doesn’t do, the fact that its authors acquired the capability to distribute fake updates to Microsoft Windows is cause for serious concern.

Software updates and chains of trust

I have previously written about how an important part of computer security is ensuring that your software updates come from the right place. You don’t want someone who pretends to be AgileBits giving you malicious updates to 1Password. And you don’t want someone who pretends to be Microsoft giving you malicious Windows Updates. The methods used for digitally signing downloads and updates involves some mathematical magic and a Chain of Trust. In the summer 2011, we saw, in the example of DigiNotar, what can happen when someone finds a way to insert themselves into the chain of trust.

These two articles, “Who do you trust to tell you who to trust?” and “A peek over the Gatekeeper” explain the security infrastructure I’ll be writing about here. You will see terms like Certificate Authority or Man in the Middle attack in this post, but they are more fully explained and illustrated in those other posts.

Putting trust to the flame

Windows Update iconThe authors of Flame have acquired the ability to digitally sign their own updates to Microsoft Windows as if they were (almost) Microsoft. They have been able to create digital signatures in a way that will successfully fool the Windows Update process. Microsoft made a series of mostly innocent blunders in creating some intermediate Certification Authorities that are used by customers to creating individual licenses for a particular product. However, in combination those blunders on something that was not supposed to be part of the critical system turned out the have major consequences.

The details are subtle, and much of what we know comes from announcements by Microsoft about their recent urgent security updates. With the sketchiness of the details and the fluidity of the situation, it is safe to assume that anything I write about the mechanism used to create the bogus signatures will be out of date by the time I actually post this article. A great source for information about this is Matthew Green’s post on his Cryptographic Engineering blog. The technical details of Flame (PDF) itself are coming from research at the Budapest University of Technology and Economics.

Microsoft has issued a security update, which you get via – you guessed it – Windows Update. It is not clear (to me, at this point) whether these updates fix the long term problem or just fix the Flame-specific signatures. In a notice on June 4 they state that the June 3 update was the first of a series of actions in a phased mitigation strategy. So there will be more to come.

But the most interesting thing (to me) in their notice is

The Flame malware used a cryptographic collision attack in combination with the terminal server licensing service certificates to sign code as if it came from Microsoft. [Emphasis added]

And this gives me the opportunity to discuss an important concept in cryptography that I haven’t talked about before.

Hashes and Collisions

In the article about Gatekeeper I talked a bit about some of the mathematical magic behind digital signatures, but I left many pieces out. The digital signature of a file is not actually a signature of the file directly. For a number of technical reasons it is too unwieldily to directly construct a signature of a large chunk of data; the file itself would need to be treated is a single enormous number that is then used as an exponent of some other large number. If the file were just 10 kilobytes, its number would be on the order of 281920 (about 25,000 digits long). We deal with some really big numbers in cryptography, but we would like to just deal with numbers about the size of 2256 (around 77 digits).

To create a digital signature of a file we first need to do is reduce the file (no matter how big it is) to a number in the range we can deal with. We create a cryptographic hash (sometimes called a message digest of a file) and then digitally sign that hash. If a hash function (something that takes all the data in a file and produces a smaller number) is to be useful for digital signatures it must have a number of security properties. The security property we are interested in today is called (strong) collision resistance.

Collision resistance
It must be infeasible to find or create two distinct files that have the same hash.

Any time you have two distinct files with the same hash you have a collision. In principle collisions are inevitable because there are more possible files than there are hashes, after all, the point of the hash to turn a really big chunk of data (a file) into a smaller (though still big) number. Instead the security of hash functions relies on how difficult it is to find or create collisions.

In a recent article, I talked about another security requirement of a cryptographic hash. It must be infeasible to calculate anything about the original file from the hash. That is called pre-image resistance, but today’s topic is collision resistance.

Danger: Collision ahead!

Now let’s look at why collision resistance is important for digital signatures. Collision illustrationSuppose that Patty, one of my dogs, creates two files. One of them says, “Molly is the cleverest and prettiest dog ever, and Molly gets to rule.” The other file that Patty creates says, “Patty will get all of Molly’s dog treats.” Patty, of course, is the clever one. She is able to create these two files so that they produce the same hash. Patty has created these to files in such as way that their hashes collide. Patty asks Molly to digitally sign the “Molly rules” file, and Molly is happy to comply. Molly, using her secret key, creates a signature for the “Molly rules” file. The digital signature is actually a signature of the hash of the file, and so that signature will also work as a signature for the “Patty gets all the treats” file.

Patty will then bring me the “Patty gets all of Molly’s treats” file along with Molly’s signature. I calculate the hash of the file and verify that the hash really is signed using Molly’s secret key. Nobody but Molly knows that secret key, and so I figure that Molly must have signed that file. I give all of Molly’s treats to Patty. Molly may like to collide with Patty on walks, but this is one collision that Molly is not at all happy about.

It’s more than just dog treats that are at stake here. As I described in the Gatekeeper article, it’s not just ordinary files that can be signed (or more correctly, have their hashes signed), but certificates and Certificate Authorities are also just data that can be signed. The signature on a certificate helps our computer determine whether it can trust that certificate. If Molly controls a Certification Authority (CA) Patty can ask Molly’s CA to sign a certificate for Patty that Molly is happy to sign. But if Patty has another, trickier, certificate that produces the same hash as the one that Molly signed, Patty can simply add Molly’s signature to the bogus certificate.

The solution is to make sure that the hash function that is used for signing certificates (or files related to dog treats) is collision resistant. In particular this means that the MD5 hash algorithm should not be used. Unfortunately some of Microsoft’s Certification Authorities use the MD5 hash system. The MD5 hash system is badly broken, and it is very possible to generate collisions.

The downfall of MD5

Cryptographers don’t talk about “impossible” or “possible”; they talk about “feasible” and “infeasible”. An infeasible task usually means something like “if you used all the computing power on Earth now and for the next decades, would still have only a negligible chance of successfully completing the task.” With MD5 we’ve witnessed the progression from “secure” to “weaknesses that are infeasible to exploit” and eventually to “practical exploits” in the space of about 10 years.

MD5 (Message Digest 5) was shown in 1995 to have theoretical weaknesses. From there the history is remarkably interesting (if you are into that sort of thing). There was one group of people who were saying “don’t use MD5 any new software or systems; use SHA1 instead.” There was another group saying that the weaknesses in MD5 don’t pose an actual threat to how it is actually used. By 2005 it became absolutely clear that the weaknesses in MD5 had been expanded and could be leveraged into real attacks, and in 2008 there was a spectacular demonstration to create a rogue Certification Authority. An important side note is that while the advice to use SHA1 instead of MD5 was very good advice in 1995; today that advice is to use SHA2 instead of SHA1.

As we all watched MD5 go from secure to completely broken with respect to collision resistance in the space of ten years, we also saw that its use continued. I think that many of us failed to understand just how easily collisions could be exploited. It was tough to steer a course between people panicking on one side who were saying that anything involving MD5 was tainted and those on the other side who always seemed to think, “sure MD5 has its weaknesses, but those weaknesses don’t affect my application.”

Microsoft using MD5 in 2009

People continued to use MD5 for a number of purposes that they didn’t see as critical. The intermediate certification authority that Microsoft set up for signing Terminal Services licenses was, presumably, thought to be not critical enough to worry about. The worst (they thought) that could happen is that customers could create a few extra licenses for themselves if they went through the effort to create collisions. So as even as late as 2009 Microsoft created CAs that used MD5 as its hashing algorithm. They turned out to be wrong about “the worst that could happen”. Other blunders allowed a weak CA created in 2007 to sign certificates for other CAs.

Using the same trick that Patty pulled on Molly, an attacker could get a signature for an acceptable CA to work on a malicious one. The malicious one could then be used to sign false updates to Windows. Thus, the attacker would have the ability to sign things that would look like legitimate updates to Windows.

Microsoft has recently (June 6) provided more details, which confirm that what I’ve described above did play a role in the creation of the signatures for the bogus updates.

Lessons

Follow early warnings

When we, the technology community, were advised in 1995 to stop relying on MD5 we should have been quicker to make the transition to SHA-1, even though at the time it wasn’t clear how those weaknesses could be exploited. Likewise, we should follow the same warnings about SHA-1 today. We shouldn’t panic every time a theoretical weakness is found, but we should remember that these can often be leveraged in ways we can’t predict.

Hash algorithms are hard. MD5, as you’ve seen deteriorated rapidly. SHA-1 has been shown to be less than ideal (not yet exploitable in practice), and even its replacement, SHA-2 isn’t all we would hope it to be. Cryptographers and the NIST are working on finding a system to become SHA3. If you have a high tolerance for geeks singing out of key, you may wish to sing along to the SHA-3 song.
The refrain is

SHA-2 will soon retire
Because NIST is learning and SHA-1 is burning
SHA-2 will soon retire
No we didn’t light it but tried to fight it.

The trust infrastructure is fragile

There are a large number of Root Certification Authorities and an even larger number of intermediate CAs. As we’ve seen here, the chain of trust can be compromised through some poorly designed CAs. As we saw last summer, in the case of DigiNotar, a Root CA can have their computer systems compromised allowing an attacker to gain accesses to the secret key needed to sign certificates. A third possibility is that the operators of CA may simply go rogue.

Charles Dudley once noted that “everybody complains about the weather, but nobody does anything about it.” The same can be said about the trust infrastructure that so much of our security relies on. Fortunately there are a few talented people who are working on serious proposals that, while they won’t completely fix the system, will mitigate some of the problems. I won’t review them here.

Are governments friends or foes of computer security?

Flame is almost certainly the creation of a “Western intelligence agency”. A few weeks ago, I would have guessed that it was an Israeli creation, but recent news about the creation of Stuxnet leads me to suspect that Flame, like Stuxnet, was primarily created by the United States. Parts of the US government have done and continue to do a great deal of important work in promoting good security. But we now have an example where the government has undermined a crucial part of computer security.

So far the victims of state-sponsored attacks on the certificate trust infrastructure have been in Iran. Last summer the government of Iran used certificates from a compromised Root Certification Authority, DigiNotar, to interest “secure” connections between individuals in Iran and Gmail. Now it looks like like the US (or possibly Israeli) government has been using poorly constructed Microsoft Intermediate CAs to target specific entities in Iran.

The good news, I suppose, is that these governments still had to exploit vulnerabilities instead of, say, compelling Microsoft to sign bogus updates. Of course, we cannot rule out that government agencies asked Microsoft to include those vulnerabilities. I suspect that Microsoft’s errors were innocent because they are the kinds of errors that people and organizations do make, but I can’t entirely rule out the more paranoid interpretation.

Security has a lot of moving parts

Each of the mistakes that Microsoft made were probably harmless on their own. No doubt they were made by different people who weren’t aware of the decisions that others had made. In combination, the mistakes add up to several ways of generating “good” signatures for bad Microsoft Updates, something with potentially catastrophic consequences.

A few closing remarks

Truth and speculation

I have speculated about how Microsoft made the errors that they did, and I have speculated about what the creators of Flame have done with that. What we do know is the bogus certificates for signing Windows Updates were created, and we know what Microsoft has said about it. We know that CAs using MD5 in their digital signatures are vulnerable in the way discussed, and we know that that the bogus certificates were signed by those weak CAs.

As it turns out, some of my initial speculation has been confirmed since I first drafted this. Microsoft has published an outstanding and detailed report of their analysis.

We also have a rapidly changing situation. It shouldn’t be too surprising if much of what I say about the specifics of the case today is out of date by tomorrow. Still, it gave me the chance to talk about hash functions and attacks based on collisions, along with the problems of using MD5 for digital signatures. Furthermore, Patty and Molly are always seeking attention, so they appreciate that I’ve mentioned them here.

Continuing the discussion

If you would like to comment or continue this discussion, please do so in the Flames and Collisions forum topic I’ve created for the purpose.

Updates

This article was out of date before it was finished, and I expected that there would be more news to come. And so to avoid a continues series of updates, I will be adding updates to the forum thread set up for the purpose.

However, the news that came out right after this article was posted is so jaw dropping that I will repeat it here.

The cryptanalytic technique used to create the MD5 collision is new. It isn’t radically different than previous known techniques, but this is using a technique that would have taken a great deal of expertise to develop.

People have always speculated about how far ahead in cryptanalysis agencies like the NSA or GCHQ are compared to what is known by the academic community. The assumption is that the gap has been narrowing over the decades as there is more open work in cryptography done outside of intelligence agencies. We don’t often get data to help pin anything down with our speculation, but this definitely is interesting.

The Ars Technica article linked to above has a terrific diagram outlining the nature of the general technique used to create MD5 collisions.

 

Other posts in this series

  1. Guess why we're moving to 256-bit AES keys (March 9, 2013)
  2. Authenticated Encryption and how not to get caught chasing a coyote (January 18, 2013)
  3. Doing the two-step until the end of time (December 20, 2012)
  4. Alan Turing's contribution can't be computed (December 8, 2012)
  5. Hashing fast and slow: GPUs and 1Password (December 5, 2012)
  6. Credit card numbers, checksums, and hashes. The story of a robocall scam (October 18, 2012)
  7. Flames and collisions (June 7, 2012)
  8. A salt-free diet is bad for your security (June 6, 2012)
  9. Cipher of Advanced Encryption Substitution And Rotation (April 1, 2012)
  10. Do you know where your software comes from? Gatekeeper will help (March 1, 2012)
  11. AES Encryption isn't Cracked (August 18, 2011)