Awesome MD5 Collisions

   10 June 2005, late morning

Perhaps awesome isn’t the word, but these researchers present two different meaningful documents that share the same MD5 checksum. Usually MD5 isn’t used to sign documents like this, but it is quite common to use MD5 to verify binaries on the Internet.

Briefly, MD5 is a hash function, a program that takes a big string of 1s and 0s (which is what everything on your computer is), and outputs a much smaller string of 1s and 0s. This smaller string is usually called a fingerprint or checksum. MD5 was thought to be secure, but was recently broken. For a hash function to be considered secure:

  1. given the value of a hash, it should be infeasible to find the input that produced the hash;
  2. given any input x, it should be infeasible to find another input x' such that the hashes of x and x' match;
  3. it should be infeasible to find two different inputs that have the same hash value.
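
To make the difference between properties (2) and (3) concrete, here is a minimal Python sketch. It uses a toy hash made by keeping only the first two bytes of an MD5 digest, so that brute force finishes quickly. The truncation and the helper names (`toy_hash`, `find_collision`, `find_second_preimage`) are assumptions for illustration only and say nothing about how the real MD5 break works.

```python
import hashlib
from itertools import count


def md5_hex(data: bytes) -> str:
    """Return the 128-bit MD5 fingerprint of data as 32 hex characters."""
    return hashlib.md5(data).hexdigest()


def toy_hash(data: bytes, nbytes: int = 2) -> bytes:
    """Toy hash (illustrative assumption): the first nbytes of the MD5 digest."""
    return hashlib.md5(data).digest()[:nbytes]


def find_collision(nbytes: int = 2):
    """Property (3): find ANY two different inputs with the same toy hash."""
    seen = {}  # maps toy hash -> first input that produced it
    for i in count():
        msg = str(i).encode()
        h = toy_hash(msg, nbytes)
        if h in seen:
            return seen[h], msg
        seen[h] = msg


def find_second_preimage(target: bytes, nbytes: int = 2):
    """Property (2): given a FIXED input, find a different input with the same toy hash."""
    goal = toy_hash(target, nbytes)
    for i in count():
        msg = str(i).encode()
        if msg != target and toy_hash(msg, nbytes) == goal:
            return msg


if __name__ == "__main__":
    print(md5_hex(b"hello"))
    print(find_collision())
    print(find_second_preimage(b"hello"))
```

With a 16-bit toy hash, the collision search typically needs a few hundred attempts, while the second-preimage search needs tens of thousands. The same gap, at a vastly larger scale, is why breaking property (3) does not automatically break property (2).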

One attack I could envision is creating an evil Trojan distribution of a popular open-source program. You could tell people you are mirroring a popular program; when you download from SourceForge, for example, there are countless mirrors to choose from. No one would assume anything is amiss, since the checksum for your application would match the checksum generated by the real program hosted by all the other mirrors. Then, when people run your evil program, it would do its evil things. (Update: not quite right, see comments.)
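
For reference, the check a downloader would run against a mirror can be sketched in Python roughly as follows. The function names and the chunk size are assumptions for illustration; in practice the published checksum would be copied from the project's official site.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the MD5 checksum of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (end of file)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str, published_hex: str) -> bool:
    """Compare a downloaded file's MD5 with the checksum the project published."""
    return md5_of_file(path) == published_hex.strip().lower()
```

A passing check only shows that the file hashes to the published value. It says nothing more than that, which is why the properties listed earlier matter.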


Comments

  1. Just to clarify: this attack, and all the others published so far, only demonstrate (3) – that is, finding two different inputs with the same hash value.

    Many common uses of MD5 don’t rely on this property. The open source trojan attack you describe actually relies on breaking property (2), which hasn’t yet been demonstrated.

  2. Alex, you’re correct: these researchers created both documents, so what I’m describing doesn’t make sense.

  3. Are you sure that MD5 isn’t commonly used to sign stuff? e.g. GMail’s certificate has the “Certificate Signature Algorithm” listed as “PKCS #1 MD5 With RSA Encryption”. [Of course, PKCS has issues too…]
