@natmchugh Image suggestion for the third hash collision: White, Brown, ...and Black -- http://t.co/yYEawVjPVa
— Neil K. (@kneil_) November 4, 2014
So I set to work.
After a couple of false starts, where I began with the wrong image file, I managed to achieve a three-way collision. Here are the images.
If you want to check:
$ curl -s http://www.fishtrap.co.uk/black.jpg.coll | md5
b69dd1fd1254868b6e0bb8ed9fe7ecad
$ curl -s http://www.fishtrap.co.uk/brown.jpg.coll | md5
b69dd1fd1254868b6e0bb8ed9fe7ecad
$ curl -s http://www.fishtrap.co.uk/white.jpg.coll | md5
b69dd1fd1254868b6e0bb8ed9fe7ecad
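The same check can be scripted once the three files have been downloaded. The following is a minimal Python sketch using only the standard library `hashlib` module. The helper names `md5_hex` and `is_multi_collision` are illustrative and not part of any tool mentioned in this post.

```python
import hashlib
from typing import Sequence


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def is_multi_collision(blobs: Sequence[bytes]) -> bool:
    """True if there are at least two inputs, every input has different
    content, and all of them share a single MD5 digest."""
    all_distinct = len(set(blobs)) == len(blobs)
    digests = {md5_hex(blob) for blob in blobs}
    return len(blobs) > 1 and all_distinct and len(digests) == 1
```

To use it, read each downloaded file in binary mode (for example with `pathlib.Path(name).read_bytes()`) and pass the three byte strings in a list. Checking that the contents differ matters: three copies of the same file would trivially share a hash without being a collision.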
A new hash value
This isn't the same hash as before. Instead, the three images now collide with a new hash value, b69dd1fd1254868b6e0bb8ed9fe7ecad. This is because I had to add near-collision blocks to all three images. In the case of the first two (white and brown), the blocks added are the same. This is probably best illustrated with a diagram.
Again, I created the files with HashClash, using the white.jpg and black.jpg images as inputs. To make brown.jpg.coll, I just had to append the same extra collision blocks to brown.jpg, which already collided with white.jpg.
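The reason identical blocks can be appended to two files that already collide is that MD5 processes its input incrementally: the final digest depends only on the internal state reached so far and on the bytes that follow. The sketch below illustrates that property with `hashlib`'s `copy()` method. It does not produce a collision; the `prefix` and `suffix` values are placeholder bytes, and `digest_after_suffix` is a hypothetical helper name.

```python
import hashlib


def digest_after_suffix(prefix_hasher, suffix: bytes) -> str:
    """Continue hashing from a saved state without re-reading the prefix."""
    h = prefix_hasher.copy()  # copy so the saved state can be reused
    h.update(suffix)
    return h.hexdigest()


# Placeholder data: two full 64-byte MD5 blocks standing in for an image,
# and some stand-in bytes for the appended collision blocks.
prefix = b"A" * 128
suffix = b"extra collision blocks (placeholder bytes)"

state = hashlib.md5(prefix)

# Hashing prefix + suffix in one go gives the same digest as resuming from
# the state reached after the prefix alone.
assert digest_after_suffix(state, suffix) == hashlib.md5(prefix + suffix).hexdigest()
```

Two colliding files of equal length leave MD5 in the same state, so feeding both the same suffix keeps their digests equal.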
I could go on adding more and more files in a tree structure to get many documents to collide. The number of collisions needed is n-1, where n is the number of files. It was this kind of tree of collisions that allowed Marc Stevens to "predict" the 2008 US presidential election.
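The n-1 count can be seen by planning the merges as a tree: each collision step joins two groups of already-colliding files into one, so n single files need n-1 steps to become one group. The following Python sketch only plans the order of steps and does not compute any collisions. The function name `plan_collisions` and the pairing strategy are assumptions made for illustration.

```python
def plan_collisions(files):
    """Return the merge steps needed to make every file share one hash.

    Each step is a pair of groups. The same collision blocks would be
    appended to every file inside a group, so files that already collide
    keep colliding with each other.
    """
    groups = [[f] for f in files]
    steps = []
    while len(groups) > 1:
        next_round = []
        # Pair up neighbouring groups; one collision per pair.
        for i in range(0, len(groups) - 1, 2):
            left, right = groups[i], groups[i + 1]
            steps.append((tuple(left), tuple(right)))
            next_round.append(left + right)
        # An unpaired group is carried over to the next round unchanged.
        if len(groups) % 2:
            next_round.append(groups[-1])
        groups = next_round
    return steps


steps = plan_collisions(["white.jpg", "brown.jpg", "black.jpg"])
assert len(steps) == 2  # n - 1 for n = 3
```

For the three images here, the plan is a white/brown collision first, followed by one collision between that pair and black, which matches the order described above.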
A word about file sizes
The files started out with different sizes. Before each collision was generated between two files, however, padding had to be added to one of them to make it the same size as the other. Without this step it would be impossible to extend a collision found on the unpadded data to the full MD5 algorithm, because MD5's own final padding includes the length of the data processed.
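To see why the sizes have to match, it helps to look at the padding MD5 appends before processing the last block: a single 0x80 byte, enough zero bytes to reach 56 bytes modulo 64, and then the message length in bits as a 64-bit little-endian integer. A small Python sketch of that rule follows. The function name `md5_padding` is illustrative, and the lengths 1000 and 1024 are arbitrary example values.

```python
import struct


def md5_padding(message_len: int) -> bytes:
    """Return the bytes MD5 appends to a message of message_len bytes."""
    # Zero bytes needed so that length + 1 + zeros is 56 modulo 64.
    zero_count = (55 - message_len) % 64
    # The length field holds the bit length modulo 2**64, little-endian.
    bit_len = (message_len * 8) % (1 << 64)
    return b"\x80" + b"\x00" * zero_count + struct.pack("<Q", bit_len)


# The padded message is always a whole number of 64-byte blocks.
assert (1000 + len(md5_padding(1000))) % 64 == 0

# Files of different lengths get different final padding, so a collision in
# the internal state would be undone by the last block.
assert md5_padding(1000) != md5_padding(1024)
```

Because the length is baked into the final block, two files only keep colliding through the end of the algorithm if they are the same size and therefore receive identical padding.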