tag:blogger.com,1999:blog-74529797860170183112024-03-28T08:10:53.506+00:00Nat McHughTransient Random-Noise Bursts with Announcements
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.comBlogger14125tag:blogger.com,1999:blog-7452979786017018311.post-6624967214108979712017-04-19T14:02:00.000+01:002017-09-18T11:03:10.168+01:00Bug Bounty - Remote Code Execution in Magento 1.xMagento is a popular ecommerce solution written in PHP. It is widely used for web shops both large and small. The most current product is Magento 2 however, Magento 1.x is still supported and widely used since the upgrade path for a heavily customised sites is largely unclear.<br />
<br />
Both version 2 and version 1 of magento make use of the Zend framework for some functionality including the sending of email. Recently issues were found in multiple PHP frameworks which wrap PHPs native mail function.<br />
<br />
The attacks on mail() rely on an attacker being able to set the from address on an email which then gets passed to sendmail as the envelope sender of -f argument. In magento, one area where a user can set the address from which an email originates is the Send a Product to a Friend functionality.<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZSiAFW20gWaDGaIURpbr-ghGy-9HZYmMP23lJ2JJ1IF2PFRcGeCYHft-EafVjgXLTqk-7skUZgx6Nl6mVaVWpdUnfvCzPQ8uo7fcZ89Vdo2VizsIxZo9jb34MPDcBmfRd4QJcc7Ki7AE/s1600/Screen+Shot+2017-01-16+at+10.10.31.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="238" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZSiAFW20gWaDGaIURpbr-ghGy-9HZYmMP23lJ2JJ1IF2PFRcGeCYHft-EafVjgXLTqk-7skUZgx6Nl6mVaVWpdUnfvCzPQ8uo7fcZ89Vdo2VizsIxZo9jb34MPDcBmfRd4QJcc7Ki7AE/s320/Screen+Shot+2017-01-16+at+10.10.31.png" width="320" /></a></div>
This is meant to allow logged in customers to send an email to a friend about a product they think they may be interested in. To prevent abuse it is normally only for logged in customers and the admin can set limits on the number of emails sent per hour etc. This functionality is on by default but often custom designs will remove the link from the product page itself in favour of social media buttons.<br />
<br />
By default magento uses the supplied email address to set the From: header on the sent email only. However, within the magento admin backend is a setting labelled 'Set Return-Path' this does indeed set the return path header in email but it also sets the sender via a -f flag passed directly to sendmail.<br />
<br />
Code from app/code/core/Mage/Core/Model/Email/Template.php<br />
<br />
if ($returnPathEmail !== null) {<br />
$mailTransport = new Zend_Mail_Transport_Sendmail("-f".$returnPathEmail);<br />
Zend_Mail::setDefaultTransport($mailTransport);<br />
}<br />
<br />
Dawid Golunski disclosed vulnerabilities in <a href="https://legalhackers.com/advisories/PHPMailer-Exploit-Remote-Code-Exec-CVE-2016-10033-Vuln.html" target="_blank">several wrapper scripts for PHP's mail function</a> which used the from address to set the sender address on sendmail. The problem comes that several valid forms of email address can be used to escape the sendmail command and set additional flags on the sendmail operation. One such flag is the -X log flag which writes out to a log file at a path specified as an argument. It is worth noting that the flag used in the attack will only work if the target server is running sendmail and not postfix's sendmail compatibility interface, which is more common. This will <a href="http://www.postfix.org/sendmail.1.html" target="_blank">accept the -X log flag to maintain compatbility but just ignores it</a>.<br />
<br />
Both the from email addresses and that of the recipient are validated as being of a valid format.<br />
However the local part of an email address in particular can contain more than is normally expected and still be valid according to the spec. This is because of the general rule that only the final email server should parse the local part of an email address. There are several valid email address formats that are surprising such as<br />
<br />
"dave"@example.com<br />
<br />
This allows PHP code from the user input to be included in the email and hence the log file. If the log file is named with a PHP extension and placed somewhere in the web root then the attacker can navigate to the log file and the PHP code will be executed.<br />
<br />
e.g. "dave\" -oQ/tmp/ -Xmedia/log.php some"@test.com<br />
<br />
One thing that makes this attack, marginally, more difficult but much more fun is that the users message is escaped before being included in the email and hence log file. This turns any "<" characters, which are necessary for PHP code tags, into their html equivalents &lt; This means it is not possible to include executable PHP code as part of the message. But the sendmail log file format helpfully quotes the email address in "<" and also removes the double quotes, so starting an email address ?php will create a valid PHP tag when shown in the log file!<br />
<br />
For example an email address of "?php echo 'hello'; "@example.com will get converted into <?php echo 'hello'; @example.com><br />
<br />
<br />
A PHP closing tag cannot be created by the same method however. The email address validation will not accept a domain ending in a "?" character and so the logfile quoting will not form a ?> tag for us.<br />
The last piece of the puzzle is the __halt_compiler() PHP function. This means that PHP code parsing will end and the rest of the log file does not need to be valid PHP code. This is important since complete parsing of syntax happens before any code is executed.<br />
<br />
Here is a short python script to execute the entire attack.<br />
<br />
<script src="https://gist.github.com/natmchugh/efaeef35f28034cfb2f7e0e80fb67771.js"></script>
<br />
The fix applied by Magento as patch <a href="https://magento.com/security/patches/supee-9652" target="_blank">SUPEE-9652</a> was to add email validation to the parameters before calling PHP's mail() function this prevents email address of the form "dave"@example.com being used since the -f flag is already prefixed and -f"dave"@example.com fails validation.<br />
<br />
Magento very generously paid $3000 for this bug. While possibly not the most creative attack the high value reward reflects the seriousness of RCE and the value Magento place on customers security.<br />
<br />
Timeline:<br />
<br />
<table border="1px">
<tbody>
</tbody><colgroup><col width="80"></col>
<col width="500"></col>
</colgroup><tbody>
<tr><th>Date</th><th>Activity</th></tr>
<tr><td>2017-01-09</td> <td>Issue Reported via Bugcrowd</td></tr>
<tr><td>2017-01-13</td> <td>Issue triaged at Bugcrowd and sent to Magento</td></tr>
<tr><td>2017-01-13</td> <td>Magento Issues security advice to disable Set Return-Path</td></tr>
<tr><td>2017-02-7</td> <td>Magento Issues security patch supee-9652 https://magento.com/security/patches/supee-9652</td></tr>
<tr><td>2017-02-13</td> <td>Magento marks issue as resolved pays bug bounty $3000</td></tr>
</tbody></table>
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com0tag:blogger.com,1999:blog-7452979786017018311.post-8865497193861717962015-09-07T17:08:00.000+01:002017-02-02T09:51:41.776+00:00MD5 collisions in ssh keys<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTtLPBXoZ5f1R_HA6oi8iJoJ0KR7MAGjM3Iv61eG5ev5wZkaxqvSk_qkJNHUU4Vsgb4cIVRJRLaoMNlNwMzLhUnaTE-kTnNQ0rt-ih_ZOtfqWe8JhZikWlOICqr-0O-WpiUKZQi5ePtl8/s1600/Fingerprintforcriminologystubs2.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTtLPBXoZ5f1R_HA6oi8iJoJ0KR7MAGjM3Iv61eG5ev5wZkaxqvSk_qkJNHUU4Vsgb4cIVRJRLaoMNlNwMzLhUnaTE-kTnNQ0rt-ih_ZOtfqWe8JhZikWlOICqr-0O-WpiUKZQi5ePtl8/s320/Fingerprintforcriminologystubs2.png" width="215" /></a><span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span><span style="font-family: "arial" , "helvetica" , sans-serif;">You can insert MD5 collision blocks in many data formats and if you do it right the result will be 2 objects that differ but which when passed through MD5 have the same hash value.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span><span style="font-family: "arial" , "helvetica" , sans-serif;">MD5 hashes were used as a unique fingerprints in openssh right up until <a href="http://www.openssh.com/txt/release-6.8" target="_blank">version 6.8</a> released 2015-03-18. The idea behind public key fingerprints is that as a client you should be able to uniquely identify the public key of a server you are connecting to. Your client will show you the fingerprint of the public key and hopefully you will recognise it. To aid this you can set your ssh client to draw you a pretty piece of <a href="https://github.com/natmchugh/drunken-bishop" target="_blank">random art based on the hash</a> giving you a visual clue if something has changed.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<pre>Host key fingerprint is 16:27:ac:a5:76:28:2d:36:63:1b:56:4d:eb:df:a6:48
+--[ RSA 2048]----+
| . |
| + . |
| . B . |
| o * + |
| X * S |
| + O o . . |
| . E . o |
| . . o |
| . . |
+-----------------+
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">That is an example of connecting to github. You can check that fingerprint <a href="https://help.github.com/articles/what-are-github-s-ssh-key-fingerprints/" target="_blank">here</a>.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span><span style="font-family: "arial" , "helvetica" , sans-serif;">MD5 is no longer considered collision resistant and in fact it is relatively simple to insert collisions in any binary format. This means you can create two ssh public keys that will have the same fingerprint in MD5 and same random art image. All we need to do is take a pre generated ssh public key cut out 128 bytes out of the modulus and replace it with a pair of collision blocks</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">Here are a couple of 2048 bit public key files to which this technique has been applied:</span><br />
<pre>ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQD74ICRUoaAXkRwqGW51botLrFygSC1SN3C7NR6cfFmpSdRuXZt6d20+GvAzOQKJpPTItK3YRjds1izapAqSM6o/BYWOikTKLHgzJ2s5Lwm6MZnPsnRJ9o9KbnXCytCrj3wFCcay0Re+TJrlPvXggFZlane2yMit2rWx/Fbly44wNzsJLIGTIhsz6UnfmR7Wi7GfiMZ2p48xn3rSHJ3lNPTOzWfg3PFgc2YftzUow1TQ5xaamyPRD/UOXBOj2pfCp75v3TWDwwq3qliBV6pxUhe/ou8ut1zaRET5VTN4kFNziQZoxY4WVmCnCJxjBPt71hM9VzoujZggRNXct8uPdri hostname
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQD74ICRUoaAXkRwqGW51botLrFygSC1SN3C7NR6cfFmpSdRuXZt6d20+GvAzOQKJpPTItK3YRjds1izapCqSM6o/BYWOikTKLHgzJ2s5Lwm6MZnPsnRJ1o+KbnXCytCrj3wFCcaS0Re+TJrlPvXggFZlane2yMit2rWx/FbFy44wNzsJLIGTIhsz6UnfmR7Wi7GfiMZ2p68xX3rSHJ3lNPTOzWfg/PFgc2YftzUow1TQ5xaamyPRD/UOXBOj2pfCp75v3TWDwwq3qliBV6pxUhe/ou8ut1zaRET5VTN4kFNziQZoxY4WVmCnCJxjBPt71hM9VzoujZggRNXct8uPdri hostname
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;">Despite looking pretty much identical they do in fact have different moduli. To check the fingerprint you can use ssh-keygen.</span><br />
<pre>$ ssh-keygen -l -E MD5 -v -f 1.pub
2048 MD5:9e:a0:13:86:e0:40:06:59:70:84:b3:7e:57:1a:5d:1d hostname (RSA)
+---[RSA 2048]----+
|+O+ .E. |
|*. . . |
|oo . . |
|+. . . o |
|... o = S |
| . o = o . |
| . + o |
| . |
| |
+------[MD5]------+
$ ssh-keygen -l -E MD5 -v -f 2.pub
2048 MD5:9e:a0:13:86:e0:40:06:59:70:84:b3:7e:57:1a:5d:1d hostname (RSA)
+---[RSA 2048]----+
|+O+ .E. |
|*. . . |
|oo . . |
|+. . . o |
|... o = S |
| . o = o . |
| . + o |
| . |
| |
+------[MD5]------+
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span><span style="font-family: "arial" , "helvetica" , sans-serif;">To prove they are not the same try replacing MD5 in the above command with anther hash algorithm like SHA1 or SHA256.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">The problem is after messing with the moduli they almost certainly don't have two prime factors of length ~ root n and even if they did I have no idea what it might be without trying to factor them which is hard. So effectively they now have no private key which makes them useless. They do in fact have trival factors which makes them doubly useless.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<h4>
<span style="font-family: "arial" , "helvetica" , sans-serif;">Is there a way to insert a collision into a key and still maintain basic RSA functionality?</span></h4>
<br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">Well it turns out this at least partially quite simple too. You see following our collision blocks we can put what ever additional value of the modulus we like as long as it's the same in both keys. So if we imagine the modulus being made up of three parts concatenated together </span><span style="font-family: "arial" , "helvetica" , sans-serif;">A|B|C</span><span style="font-family: "arial" , "helvetica" , sans-serif;">, A and B are essentially fixed but we can adjust C until we find an n which meets the RSA requirement that </span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">n = A|B|C = p .q </span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">Where p and q are prime and of length ~ half that of n. The best way I found to do this is to generate a random C, generate a random prime p of the correct length, divide our n made up of A|B|C by p to get the nearest integer q. Then test q to see is prime. If is we can move on however it is very unlikely that A|B|C will divide exactly by p and q without remainder. But if we multiply our p and q together they will quite likely give us a new n which can be written in the form </span><span style="font-family: "arial" , "helvetica" , sans-serif;">A|B|D</span><span style="font-family: "arial" , "helvetica" , sans-serif;"> where D is a new number. This n still has the collision blocks B in it and we have the two prime factors p and q. The whole process takes a few seconds to run on a normal laptop. It helps to have a larger public key and so a longer C to mess around with so I have used 4096 bit public keys.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">With p and q we can reconstruct the ssh private key to match public key we just created.</span><br />
<br />
<br />
<br />
<span style="font-family: "arial" , "helvetica" , sans-serif;">What about the other private key?</span><span style="font-family: "arial" , "helvetica" , sans-serif;">Here we face a pretty much insurmountable problem. The value at the end of the modulus following the collision blocks needs to be the same in both public keys. However, in order to be able to build a private key for the modulus we would need to know two prime factors the likelihood of finding two is extremely small. </span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<h4>
<span style="font-family: "arial" , "helvetica" , sans-serif;">So this attack is impractical?</span></h4>
<div class="separator" style="clear: both;">
<span style="font-family: "arial" , "helvetica" , sans-serif;">Yes as far as I can tell collisions in ssh keys of this sort don't throw up any security concerns. Hopefully you are all running newer versions of openssh and now comparing keys with sha256 anyway.</span></div>
<div>
<br /></div>
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com0tag:blogger.com,1999:blog-7452979786017018311.post-90973680546647081352015-07-07T13:02:00.002+01:002017-02-03T14:55:59.871+00:00Talk Talk<span style="font-family: "arial" , "helvetica" , sans-serif;">In June I gave talks at both BSides London and the Dutch PHP Conference on the subject of hash functions. I really enjoyed both experences and they both helped me improve my public talking.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">At Bsides London I was part of the rookie track which meant I got an excellent mentor in the shape of Yiannis Chrysanthou. It also meant I only had 15 minutes to cram something interesting into which was good as it made me concentrate on the important parts</span>
<br />
<script async="" class="speakerdeck-embed" data-id="294543cf2f85463790acdf01cd5f811c" data-ratio="1.29456384323641" src="//speakerdeck.com/assets/embed.js"></script>
<div style="font-family: Arial, Helvetica, sans-serif;">
The Dutch PHP Conference or DPC is a conference I have attended twice before. This year there was no way I could go not as a speaker as they covered travel and accommodation as well as entry to the conference. It was again an excellent conference with great organisation and venue.
</div>
<div style="font-family: Arial, Helvetica, sans-serif;">
<script async="" class="speakerdeck-embed" data-id="b4d9cba08af74ec3b75af1b912a1fae2" data-ratio="1.33333333333333" src="//speakerdeck.com/assets/embed.js"></script>
</div>
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com0tag:blogger.com,1999:blog-7452979786017018311.post-59135055832545489772015-07-01T13:47:00.002+01:002017-08-16T14:15:34.821+01:00Github ssh keys some experiments<span style="font-family: "arial" , "helvetica" , sans-serif;">About a month ago I read this excellent piece of work <a href="https://blog.benjojo.co.uk/post/auditing-github-users-keys" target="_blank">https://blog.benjojo.co.uk/post/auditing-github-users-keys</a> . My first reaction was kick myself for not thinking of it before. It reminded me of this paper <a href="https://factorable.net/paper.html">https://factorable.net/paper.html</a> and associated the presentation which is pretty special. One of the major tools used in that paper was use of a batch version of the Greatest Common Divisor algorithm that can efficiently find common factors in large numbers of semi primes. Common primes should never happen in correctly functioning ssh key generation. However they are known to happen when things go wrong. </span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">The batchGCD algorithm didn't seem to have been used to examine the keys in the database. I was in good company in spotting this possible extension.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<blockquote class="twitter-tweet" lang="en">
<div dir="ltr" lang="en">
Lots of bad SSH keys on Github, and nobody has even run a batch GCD yet. This is a big deal. <a href="https://t.co/HcEcHgES9V">https://t.co/HcEcHgES9V</a></div>
— Matthew Green (@matthew_d_green) <a href="https://twitter.com/matthew_d_green/status/605851292670394368">June 2, 2015</a></blockquote>
<span style="font-family: "arial" , "helvetica" , sans-serif;">At the time I expected to see people reporting the discovery of further bad keys discovered using batchGCD over the next few days. A few days latter I came across <a href="http://cryptosense.com/batch-gcding-github-ssh-keys/">http://cryptosense.com/batch-gcding-github-ssh-keys/</a> which looks like exactly that just in a more restrained writing style. </span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">In the mean time I have been having a poke around myself.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;">Here are some observations I have made from my experiments with the keys on github and uploading new keys.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<ul>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">There is now a minimum key length of 1024 bits enforced when submitting keys with a suggestion of 2048 bits</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">There are no keys below 1024 bits still in the database all shorted ones have been removed.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">You cannot add a public key from a pair generated with the debian bug. If you do you will see this message</span></li>
</ul>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4ng7QvLKhuuIpsFlzB-mL1EyCMZYNCLCgVt13ER8pZIVxg54C8nPh2XnbK8C28AbjSMcj4CsftxLQs69NXwp6b5v8eG3ZgXol6Iler_OXhmYJ-8cIQ-Mgx8EI4J9ZnPpLJQt4OtW5JLc/s1600/Screen+Shot+2015-07-02+at+11.25.49.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4ng7QvLKhuuIpsFlzB-mL1EyCMZYNCLCgVt13ER8pZIVxg54C8nPh2XnbK8C28AbjSMcj4CsftxLQs69NXwp6b5v8eG3ZgXol6Iler_OXhmYJ-8cIQ-Mgx8EI4J9ZnPpLJQt4OtW5JLc/s1600/Screen+Shot+2015-07-02+at+11.25.49.png" /></a></div>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span></div>
<ul>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">There are very few obviously bad keys (e.g. with trivial moduls factorisation) on github. I found a handful all very old.</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">I could not find any keys with common divisors in the database. It looks like somebody has been through with batchGCD notified github and they have revoked them</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">New keys added are not checked to see if they have a common divisor but presumably will eventually get discovered</span></li>
<li><span style="font-family: "arial" , "helvetica" , sans-serif;">Github seems to receive a lot of spam signup these days where users are listed in the api but deleted for viewing</span><span style="font-family: "arial" , "helvetica" , sans-serif;"> </span></li>
</ul>
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.comtag:blogger.com,1999:blog-7452979786017018311.post-45965163607595877702015-06-16T09:13:00.000+01:002015-06-16T11:22:10.697+01:00Batch GCD ssh key challenge<span style="font-family: Arial, Helvetica, sans-serif;">Here is a little challenge I have had some fun with recently.</span><br />
<br />
<span style="font-family: Arial, Helvetica, sans-serif;">The <a href="ttps://en.wikipedia.org/wiki/Greatest_common_divisor" target="_blank">Greatest Common Divisor</a> (GCD) method can quickly find any common denominator between two numbers. This even works on very large numbers such as those used by RSA. If you want to try it you can <span id="goog_170750250"></span>write a method<span id="goog_170750251"></span> in a <a href="http://rosettacode.org/wiki/Greatest_common_divisor" target="_blank">few lines of code</a> to find the factors between these two 1024 bit numbers :</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<div class="p1">
<span class="s1" style="font-family: Arial, Helvetica, sans-serif;">154562462709421668048811029574429176339967582770033927576824633187099552180626760770554766161885657196312518658221149688394800043151559510354818420232102676069730902564150494088745313126769777823443395885737730234868902723737890922926976276986298764777507242621301740051336797119228435868695641797422930394301</span></div>
<div class="p1">
<span class="s1" style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div class="p1">
<span class="s1" style="font-family: Arial, Helvetica, sans-serif;">
</span></div>
<div class="p1">
<span class="s1" style="font-family: Arial, Helvetica, sans-serif;">141870892963640483906191484688359999906901497888195706215571887802641521733841862353130672719272343785182876822221422152533930744933871548118719685110452659915157122801280912706475082870290416497225380769325061108597080567715885948417156740160002970573069043853048947718197839111694567700297700024292552486341</span></div>
<div class="p1">
<span class="s1" style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div class="p1">
<span style="font-family: Arial, Helvetica, sans-serif;">If you can find a common factor you can of course find the other factors of each number by simple division.</span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">There is such a thing as the batch GCD algorithm <a href="http://facthacks.cr.yp.to/batchgcd.html.">http://facthacks.cr.yp.to/batchgcd.html.</a> This was used in 2012 to test millions of deployed RSA keys. That obviously took some time and computing hardware. You should be able to run this challenge in a few minutes on a laptop.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">This file <a href="https://s3-eu-west-1.amazonaws.com/batchgcd/10000_ssh_pub_keys.gz" target="_blank">https://s3-eu-west-1.amazonaws.com/batchgcd/10000_ssh_pub_keys.gz</a> has a list of 10^4 ssh public keys of various lengths. Hidden amongst them are two which share a common factor. Your task is find them with the batch GCD algorithm. You can either implement your own batchGCD from the algorithm as described in the article or there are a few implementations out there. Once you have identified the two public keys and their prime factors you have all the information you need to reconstruct the ssh private keys.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">With one of the private keys sign the unzipped file. This is not something you normally do with a ssh key but is possible with.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">openssl dgst -sha256 -hex -sign reconstructed-private-key 10000_ssh_pub_keys</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">If you want you can email the output to: batchGCD at fishtrap dot co dot uk for a final check that you have constructed the keys correctly.</span>Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com1Sheffield Sheffield53.361616 -1.511748tag:blogger.com,1999:blog-7452979786017018311.post-86371328484741471282015-05-06T13:38:00.004+01:002015-05-07T16:56:24.090+01:00How to make two binaries with the same MD5 hash<span style="font-family: Arial, Helvetica, sans-serif;">One question I was asked when I <a href="http://natmchugh.blogspot.co.uk/2014/10/how-i-made-two-php-files-with-same-md5.html">demo'd creating two PHP files with the same hash</a> is; does it work on compiled binaries?</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Well the answer is yes in fact that is where I first got the idea from, in <a href="http://www.mathstat.dal.ca/~selinger/md5collision/">this demo</a>.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">That example uses a C program as both the target and also to do the binary manipulation, which slightly of obscures the way it works. It also makes use of an old very slow implementation on the Wang attack to generate the collisions. To better and more quickly show how it works for an upcoming talk I have created a really simple example using a PHP script to manipulate a binary once compiled.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">I have put all the code used here on <a href="https://github.com/natmchugh/longEgg" target="_blank">github</a>.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Below is the super simple C program it compares two strings and if they don't match it prints out an ASCII art picture of an angel. If they do match you get a picture of a devil.
</span><br />
<script src="https://gist.github.com/natmchugh/1c1ea2840db134a5d604.js"></script><span style="font-family: Arial, Helvetica, sans-serif;">
It can be compiled with gcc and executed simply by doing</span><br />
<pre class="brush: bash">longEgg$ gcc -o demo ./demo.c
longEgg$ chmod a+x demo
longEgg$ ./demo
</pre>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Executing the program will print out the angel since the two strings differ in the last letter.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Now we have our compiled binary we need to do a bit of mucking about with it. What we are going to do is insert a MD5 collision into the long string of A's of the dummy text. We only need to insert two blocks of 64 bytes but we need to insert it at the beginning of a block i.e. when the byte length is a multiple of 64 bytes.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<script src="https://gist.github.com/natmchugh/e5418f21febbd6a772e5.js"></script>
<span style="font-family: Arial, Helvetica, sans-serif;">When we run the php script over it the first time it finds such a location and calculates the value of the four chaining variables of MD5 at that point in the file. It prints out the hex values concatenated together as a hash. We can now take that value and search for an MD5 collision with that initial state. The best MD5 collision finder is<a href="https://www.win.tue.nl/hashclash/" target="_blank"> Marc Stevens fastcoll</a>. It can typically find collisions in a couple of seconds using a a variant of the Wang attack. After downloading it you will need to compile it. There should be a Makefile for it in the code on <a href="https://github.com/natmchugh/longEgg" target="_blank">github</a>. Running it specifying the initial state and output files is shown below.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<pre class="brush: bash">longEgg$ wget https://www.win.tue.nl/hashclash/fastcoll_v1.0.0.5-1_source.zip
longEgg$ unzip fastcoll_v1.0.0.5-1_source.zip
longEgg$ make
longEgg$ chmod a+x fastcoll
longEgg$./fastcoll -i c15cfe39c40e47f5b8ae31e6658fd1bd -o a b
</pre>
<span style="font-family: Arial, Helvetica, sans-serif;">The -o option specifies the output files and so will create two new files a and b which contain 2 blocks of binary data. These blocks only work as MD5 collisions within the binary at that point. Running the php script for a second time will create two copies of the original compiled binary with the collisions inserted in the appropriate places.
</span><br />
<pre class="brush: bash">longEgg$ I want to replace 128 bytes at position 6528 in colliding_binaries/demo
longEgg$ Chaining variable up to that point is c15cfe39c40e47f5b8ae31e6658fd1bd
longEgg$ Just output new file /Users/nmchugh/longEgg/devil with hash dea9dc288b6c56626997ce86ca8eb6da
longEgg$ Just output new file /Users/nmchugh/longEgg/angel with hash dea9dc288b6c56626997ce86ca8eb6da
</pre>
<span style="font-family: Arial, Helvetica, sans-serif;">So now we have created two more files angel and devil. Running each of those should give different outputs.
</span><br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTALk3FQZ_axNg0JahW2bLwfAF0Ja9O48Ie7EuBMo8l0hyIH1mMVIxUQ3SYnkcYQ4RB2NfZNyhp3a43j-zaFKIzHbOY_6ovEwQ7CULefYaOkgewUEyWuVqrjD1mhq-r3aowH7TEHmo7Ec/s1600/Screen+Shot+2015-05-06+at+08.46.11.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: Arial, Helvetica, sans-serif;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTALk3FQZ_axNg0JahW2bLwfAF0Ja9O48Ie7EuBMo8l0hyIH1mMVIxUQ3SYnkcYQ4RB2NfZNyhp3a43j-zaFKIzHbOY_6ovEwQ7CULefYaOkgewUEyWuVqrjD1mhq-r3aowH7TEHmo7Ec/s1600/Screen+Shot+2015-05-06+at+08.46.11.png" width="338" /></span></a></div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<span style="font-family: Arial, Helvetica, sans-serif;">But they should have the same MD5 value.
</span><br />
<pre class="brush: bash">longEgg$ md5 angel devil
MD5 (angel) = dea9dc288b6c56626997ce86ca8eb6da
MD5 (devil) = dea9dc288b6c56626997ce86ca8eb6da
</pre>
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com6tag:blogger.com,1999:blog-7452979786017018311.post-28334477803414527882015-03-26T15:47:00.000+00:002015-03-27T13:48:23.910+00:00The Magic Words are Squeamish Ossifrage - factoring RSA-129 using CADO-NFS<h3>
</h3>
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;">This thing got long and can basically be summarised as:</span></h3>
<span style="font-family: Arial, Helvetica, sans-serif;">I factored smaller RSA numbers on free cloud computing using excellent open source software.</span><br />
<br />
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;">Whats with the weird title?</span></h3>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<div style="text-align: right;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://upload.wikimedia.org/wikipedia/commons/2/23/Bartgeier_Gypaetus_barbatus_front_Richard_Bartz.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" src="https://upload.wikimedia.org/wikipedia/commons/2/23/Bartgeier_Gypaetus_barbatus_front_Richard_Bartz.jpg" width="218" /></a></div>
<span style="font-family: Arial, Helvetica, sans-serif;">The strange sentence:</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">"THE MAGIC WORDS ARE SQUEAMISH OSSIFRAGE" </span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">is the plain text solution to many cryptographic challenges. I first saw it appear one character at a time from a CBC padding attack, and I started googling it. The origin of the sentence lie with a challenge set by the authors of the RSA encryption algorithm in 1977. </span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif; text-align: center;">RSA was introduced to the world in </span><a href="http://simson.net/ref/1977/Gardner_RSA.pdf" style="font-family: Arial, Helvetica, sans-serif; text-align: center;">this article in Scientific American in 1977</a><span style="font-family: Arial, Helvetica, sans-serif; text-align: center;"> . In that article a message encrypted with a 129 decimal digit public key is given and an $100 prize is offered to anyone who can decrypt it. </span><br />
<span style="font-family: Arial, Helvetica, sans-serif; text-align: center;"><br /></span>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzp4cb3fsHo5Z443jt0ryn7Lg9rWS_N2ARbE6cMh1dKjiawF-8oje8ZOfzVaPiPFLIUZ5xa5EDFKgY3x03Fq-DwzEMIIqAmKlwOBVzw4ntKy6A8J0f1baePNFRM3TG3QT7miz747gcQ4c/s1600/Screen+Shot+2015-03-11+at+16.08.50.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzp4cb3fsHo5Z443jt0ryn7Lg9rWS_N2ARbE6cMh1dKjiawF-8oje8ZOfzVaPiPFLIUZ5xa5EDFKgY3x03Fq-DwzEMIIqAmKlwOBVzw4ntKy6A8J0f1baePNFRM3TG3QT7miz747gcQ4c/s1600/Screen+Shot+2015-03-11+at+16.08.50.png" /></a><span style="font-family: Arial, Helvetica, sans-serif; text-align: center;"><br /></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;">At the time it was estimated that by the methods known at the time it would take millions of years to break. In fact thanks to advances in factoring algorithm and computing power it took only until 1994. A team successfully factored the 129 digit semi prime reconstructed the private key and decoded the text to reveal the sentence. Here is their <span style="color: #0000ee;"><u><a href="http://www.crypto-world.com/announcements/RSA129.txt">announcement</a></u></span>.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span><span style="font-family: Arial, Helvetica, sans-serif;">I have always found the basic proposition at the heart of RSA encryption hugely counter intuitive. How can it be that people much smarter than me using vast computer resources cannot find which two prime numbers I multiplied together when the product reaches 300 decimal digits? </span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">What would it take to answer the 129 digit challenge that seemed near impossible in 1977. How difficult is the problem with modern factoring algorithms, software and cheap cloud computing? </span><br />
<br />
<span style="font-family: Arial, Helvetica, sans-serif;"> The team that claimed the prize for what became known as <a href="https://en.wikipedia.org/wiki/RSA_numbers#RSA-129">RSA-129</a> used the </span><a href="http://boinc.berkeley.edu/index.php" style="font-family: Arial, Helvetica, sans-serif;">Bonic</a><span style="font-family: Arial, Helvetica, sans-serif;"> framework to distribute the computing amongst volunteers. Apparently 1600 machines were used over a period of eight months. The algorithm used was the </span><span style="font-family: Arial, Helvetica, sans-serif;">Multiple Polynomial Quadratic Sieve. It was considered the best understood factoring algorithm at the time.</span><br />
<br />
<span style="font-family: Arial, Helvetica, sans-serif;">Cado-nfs is an implementation of the number field sieve it is made up of several individual programs that will perform each of the steps necessary. I chose this implementation for a couple of reasons: Firstly it was written </span><span style="font-family: Arial, Helvetica, sans-serif;">fairly recently</span><span style="font-family: Arial, Helvetica, sans-serif;"> </span><span style="font-family: Arial, Helvetica, sans-serif;">and is maintained so the build dependancies are easy to satisfy. Secondly the D in it's title stands for distributed as it has been built with distribution across several servers in mind.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Coincidentally to this google offered a great free trial of 2 months and $300 of credit on their <a href="https://cloud.google.com/">cloud platform</a>. Which is a pretty good deal if you like playing around with CPU intensive stuff like me.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">In order to give cado-nfs a spin I started up a high CPU instance on <a href="https://cloud.google.com/">google's cloud </a>compute engine and set it off with RSA-100 a 100 digit semi-prime first factored in 1991. On a single instance it took 148 minutes. or nearly 2.5 hours. This is quite impressive given that it took days across many more machines to achieve the original result admittedly using a slower sieving algorithm. Unfortunately (or fortunately for security) factoring times do not scale linearly with the moduls length. Before tackling anything longer I decided to try out cado-nfs distributed capabilities. Doing this couldn't really be easier. A master process uses SSH to start and control threads on slave machines which it allocates work over an HTTP API. All this is handled automagically all that you have to do is make sure the slave machine has the master's ssh key authorised and then start the master process with the ip-address of the slave machines.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Using this method I managed to get the time for RSA-100 down to 77 minutes across 2 machines and 47 minutes across 4 machines. I stopped at 4 machines or 8 CPUs as I hit the CPU limit for free instances on Googles platform. I later discovered that I could run 8 CPUs per region so with three regions I could use 24 CPUs in total on 12 virtual machines.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">With four machines I felt my set up was probably good enough to tackle some longer problems. I factored RSA-110 and RSA-120 experimenting with parameters as I went. </span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Here are some times I found:</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<table border="1" cellpadding="0" cellspacing="0" dir="ltr" style="border-collapse: collapse; border: 1px solid #ccc; font-family: arial,sans,sans-serif; font-size: 13px; table-layout: fixed;"><colgroup><col width="100"></col><col width="100"></col><col width="100"></col><col width="150"></col><col width="100"></col><col width="100"></col></colgroup><tbody>
<tr style="height: 21px;"><td style="padding: 2px 3px 2px 3px; vertical-align: bottom;"></td><td data-sheets-value="[null,2,"CPU Time (s)"]" style="padding: 2px 3px; vertical-align: bottom;">CPU Time (s)</td><td data-sheets-value="[null,2,"Elapsed Time (s)"]" style="padding: 2px 3px; vertical-align: bottom;">Elapsed Time (s)</td><td data-sheets-value="[null,2,"CPU Time (hrs)"]" style="padding: 2px 3px; vertical-align: bottom;">CPU Time (hrs)</td><td data-sheets-value="[null,2,"Elapsed Time (mins)"]" style="padding: 2px 3px; vertical-align: bottom;">Elapsed Time (mins)</td><td data-sheets-value="[null,2,"Machines"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">Machines</td></tr>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"RSA100"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">RSA100</td><td data-sheets-value="[null,3,null,15466.6]" style="font-size: 90%; padding: 2px 3px; text-align: right; vertical-align: bottom;">15466.6</td><td data-sheets-value="[null,3,null,8883.19]" style="font-size: 90%; padding: 2px 3px; text-align: right; vertical-align: bottom;">8883.19</td><td data-sheets-formula="=R[0]C[-2]/3600" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,4.296277777777778]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">4.296277778</td><td data-sheets-formula="=R[0]C[-2]/60" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,148.05316666666667]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">148.0531667</td><td data-sheets-value="[null,2,"1x"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">1x</td></tr>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"RSA100"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">RSA100</td><td data-sheets-value="[null,3,null,15213.6]" style="font-size: 90%; padding: 2px 3px; text-align: right; vertical-align: bottom;">15213.6</td><td data-sheets-value="[null,3,null,4641.16]" style="background-color: white; font-size: 100%; padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">4641.16</td><td data-sheets-formula="=R[0]C[-2]/3600" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,4.226]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">4.226</td><td data-sheets-formula="=R[0]C[-2]/60" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,77.35266666666666]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">77.35266667</td><td data-sheets-value="[null,2,"2 x "]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">2 x </td></tr>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"RSA110"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">RSA110</td><td data-sheets-value="[null,3,null,55199.6]" style="font-size: 90%; padding: 2px 3px; text-align: right; vertical-align: bottom;">55199.6</td><td data-sheets-value="[null,3,null,16018.8]" style="font-size: 90%; padding: 2px 3px; text-align: right; vertical-align: bottom;">16018.8</td><td data-sheets-formula="=R[0]C[-2]/3600" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,15.333222222222222]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">15.33322222</td><td data-sheets-formula="=R[0]C[-2]/60" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,266.97999999999996]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">266.98</td><td data-sheets-value="[null,2,"2x "]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">2x </td></tr>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"RSA100"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">RSA100</td><td data-sheets-value="[null,3,null,15390.4]" style="font-size: 90%; padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">15390.4</td><td data-sheets-value="[null,3,null,3510.53]" style="font-size: 90%; padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">3510.53</td><td data-sheets-formula="=R[0]C[-2]/3600" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,4.275111111111111]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">4.275111111</td><td data-sheets-formula="=R[0]C[-2]/60" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,58.508833333333335]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">58.50883333</td><td data-sheets-value="[null,2,"3 x"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">3 x</td></tr>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"RSA100"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">RSA100</td><td data-sheets-value="[null,3,null,15344.8]" style="font-size: 90%; padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">15344.8</td><td data-sheets-value="[null,3,null,2829.9]" style="font-size: 90%; padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">2829.9</td><td data-sheets-formula="=R[0]C[-2]/3600" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,4.262444444444444]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">4.262444444</td><td data-sheets-formula="=R[0]C[-2]/60" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,47.165]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">47.165</td><td data-sheets-value="[null,2,"4 x"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">4 x</td></tr>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"RSA120"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">RSA120</td><td data-sheets-value="[null,3,null,184096]" style="font-size: 100%; padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">184096</td><td data-sheets-value="[null,3,null,30134.1]" style="font-size: 100%; padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">30134.1</td><td data-sheets-formula="=R[0]C[-2]/3600" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,51.13777777777778]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">51.13777778</td><td data-sheets-formula="=R[0]C[-2]/60" data-sheets-numberformat="[null,0]" data-sheets-value="[null,3,null,502.23499999999996]" style="padding: 2px 3px 2px 3px; text-align: right; vertical-align: bottom;">502.235</td><td data-sheets-value="[null,2,"4 x"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">4 x</td></tr>
</tbody></table>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">The number field sieve algorithm has several steps. </span><span style="font-family: Arial, Helvetica, sans-serif;">It is sieving</span><span style="font-family: Arial, Helvetica, sans-serif;"> of polynomial relations that can be paralysed</span><span style="font-family: Arial, Helvetica, sans-serif;"> amongst several machines. Following this there are several further steps which need to be performed on a single machine. These steps can be memory intensive and as the numbers got bigger took an increasing amount of time. It is for this reason that the time for factoring numbers is not directly proportional to the number of machines it is run across.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">With a bit of confidence in the setup from RSA110 and RSA120 I decided to tackle the 1977 problem.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif; text-align: center;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif; text-align: center;">Here is the public key modulus as it appears in that article.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif; text-align: center;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif; text-align: center;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiojJxjMJB6T5TDQjB0JdlMcISUVa21EPqk5gHMhrswxZz8tO5bsd7pFxTlpNJzKwSD8dm5GTVOVkBNNSMOkwjD76ddQqHRScnUv6wNrmMIFU9XTiJN1q8O9iiCaJZoCAYbOj9s_O86Jgs/s1600/Screen+Shot+2015-03-26+at+15.42.34.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiojJxjMJB6T5TDQjB0JdlMcISUVa21EPqk5gHMhrswxZz8tO5bsd7pFxTlpNJzKwSD8dm5GTVOVkBNNSMOkwjD76ddQqHRScnUv6wNrmMIFU9XTiJN1q8O9iiCaJZoCAYbOj9s_O86Jgs/s1600/Screen+Shot+2015-03-26+at+15.42.34.png" /></a></div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">I logged onto to the master machine and ran the shell script which wraps a python script </span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;"> ./factor.sh 114381625757888867669235779976146612010218296721242362562561842935706935245733897830597123563958705058989075147599290026879543541 -t 2 -s 3 -h localhost,slave1</span><span style="font-family: Arial, Helvetica, sans-serif;">,slave3,slave4 &> rsa129.4.out</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg47t1UmLPPwvv_w5IkfYZY4JHr4YfmA4xRzVNShcjyoRzlHXek5vXPk5tvXj1oLWC-QZ9E7TenSSa5_fSWC27_EHDJySzVww43RUjiqcLPAoX0g0eGFX4FTygrFKG3qgpqPyCi-EzAmy0/s1600/Screen+Shot+2015-03-12+at+10.29.21.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg47t1UmLPPwvv_w5IkfYZY4JHr4YfmA4xRzVNShcjyoRzlHXek5vXPk5tvXj1oLWC-QZ9E7TenSSa5_fSWC27_EHDJySzVww43RUjiqcLPAoX0g0eGFX4FTygrFKG3qgpqPyCi-EzAmy0/s1600/Screen+Shot+2015-03-12+at+10.29.21.png" height="261" width="640" /></a></div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">After 22 hours I checked in on it and there was both good news and bad news. The good news was that the sieving had finished! The bad news was that there had been a memory error during the next stage. It seemed quite likely that the machine had simply run out of memory as it had only 1.8GB. So I backed up the working directory of cado-nfs from /tmp and took a snapshot. I then used the snapshot to create a memory optimised machine which seemed to have similar CPUs but also 13GB of memory.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Cado-nfs makes restarting a calculation simply a matter of rerunning the original command. I dug this out of the logfile and ran it. The calculation then picked up from where it had crashed and seemed happier with more memory. After a few hours I had the answer and yes I can confirm <a href="https://en.wikipedia.org/wiki/RSA_numbers#RSA-129">wikipedia was correc</a>t. I'd racked up about $30 worth of usage but as I said it was in fact free due to the trial.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWam58cSdGCDPTjh126pt2FfIHoChPk3hgyTIBIxC3Y-Pw2kRs6CK6pLkIm7GwdWSKLz0mlUuC6p4IqMTKNDHalTvQb_qRYB5nsxRlk4aaMOm6aqYwGxnkviKBA-kK7kY9Yyd_EyMM7fM/s1600/Screen+Shot+2015-03-26+at+15.13.46.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWam58cSdGCDPTjh126pt2FfIHoChPk3hgyTIBIxC3Y-Pw2kRs6CK6pLkIm7GwdWSKLz0mlUuC6p4IqMTKNDHalTvQb_qRYB5nsxRlk4aaMOm6aqYwGxnkviKBA-kK7kY9Yyd_EyMM7fM/s1600/Screen+Shot+2015-03-26+at+15.13.46.png" height="148" width="640" /></a></div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Once we have the factors decoding the message is easy RSA maths with the exponent being given in the article. The character encoding scheme used was also given and was a simple substitution with 0 = space and 1 = A, 2 = B etc.</span><br />
<br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<script src="https://gist.github.com/natmchugh/fbbed841ad06c4f9a7f7.js"></script>
<span style="font-family: Arial, Helvetica, sans-serif;"><br />I stole the modular inverse function from http://rosettacode.org/wiki/Modular_inverse#Python . Thanks to Google for the free computing. I found the cloud compute web interface to be pretty self explanatory having used AWS for a while. There are some nice features such as the usage graphs (to check always using 100% of CPUs) and a little activities pop-up which states what actions you've just taken. The in browser ssh client even seemed usable.</span>Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com0Sheffield, South Yorkshire, UK53.381128999999987 -1.4700850000000453.078144999999985 -2.11553200000004 53.684112999999989 -0.82463800000004006tag:blogger.com,1999:blog-7452979786017018311.post-11131848598300531562015-02-05T14:01:00.000+00:002018-02-06T09:28:27.406+00:00Create your own MD5 collisions<div>
<span style="font-family: "arial" , "helvetica" , sans-serif;">A while ago a lot of people visited my site (</span><span style="background-color: #f9f9f9; color: #333333; font-family: "open sans" , "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px; line-height: 20px;">~</span><span style="font-family: "arial" , "helvetica" , sans-serif;"> 90,000 ) with a post about how easy it is to make two images with same MD5 by using a chosen prefix collision. I used <a href="https://marc-stevens.nl/research/">Marc Steven</a>'s <a href="https://code.google.com/p/hashclash/">HashClash</a> on AWS and estimated the the cost of around $0.65 per collision.</span></div>
<div>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span></div>
<div>
<span style="font-family: "arial";">Given the level of interest I expected to see cool MD5 collisions popping up all over the place. Possibly it was enough for most people to know it can be done quite easily and cheaply but also I may have missed out enough details in my original post. </span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">In this further post I’ve made an AWS image available and created a step-by-step guide so that you too can create MD5 chosen prefix collisions and amuse your friends (disclaimer: they not be that amused). All you need to do is create an AWS instance and run a few commands from the command line. There is a explanation of how the chosen prefix collision works in Marc Steven's <a href="https://marc-stevens.nl/research/papers/MTh%20Marc%20Stevens%20-%20On%20Collisions%20for%20MD5.pdf">Masters thesis</a>.</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">Here are the steps to create a collision.</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">1) Log on to AWS console and create a spot request for an instance based on my public Amazon Machine Image (AMI). Spot requests are much cheaper than creating instances directly, typically $0.065 an hour. They can be destroyed, losing your data, if the price spikes but for fun projects they are the way to go.</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNyWToFnYZF0W7g5-wtPrz2v0YIUT51H8WFreZbi8aSqbvCNSpSCTCeiSgmUGjp0GEPy5dU_Fp0-lNNW_wI8rZJOZDBRjugC-MjTzptxxgCJ45QBftqFU7PQmnwWWyLxYFyluB36_khEc/s1600/request_spot.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="291" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNyWToFnYZF0W7g5-wtPrz2v0YIUT51H8WFreZbi8aSqbvCNSpSCTCeiSgmUGjp0GEPy5dU_Fp0-lNNW_wI8rZJOZDBRjugC-MjTzptxxgCJ45QBftqFU7PQmnwWWyLxYFyluB36_khEc/s1600/request_spot.png" width="400" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">I have created a public AMI called hash-clash-demo. It has the id </span><span style="background-color: white; color: #444444; font-family: "helvetica neue" , "roboto" , "arial" , "droid sans" , sans-serif;"><span style="font-size: 15px; line-height: 18.2000007629395px;">ami-dc93d3b4 and is in the US East (North Virginia) region. </span></span><span style="font-family: "arial";">It has all the software necessary to create a collision pre-built. Search for it with </span><span style="background-color: white; color: #444444; font-family: "helvetica neue" , "roboto" , "arial" , "droid sans" , sans-serif; font-size: 15px; line-height: 18.2000007629395px;">ami-dc93d3b4 </span><span style="font-family: "arial";">in community AMIs and then choose a GPU2 instance. I promise it does not mine bitcoins in the background although thinking about it this would be a good scam and I may introduce this functionality. </span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiscvaasewUmQTZlNYwCqEGMSJsa8VFT_t6gbM42v-e8iH3VUxnaqYEDchlciDnAbBQLUJKVHcff4AIpMlqyHvtFP-aK0W3-bTmXlKzur3_wl_b_j1na-jerkM6th5OmRDoQzv8bI9ZfBQ/s1600/AMI.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="301" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiscvaasewUmQTZlNYwCqEGMSJsa8VFT_t6gbM42v-e8iH3VUxnaqYEDchlciDnAbBQLUJKVHcff4AIpMlqyHvtFP-aK0W3-bTmXlKzur3_wl_b_j1na-jerkM6th5OmRDoQzv8bI9ZfBQ/s1600/AMI.png" width="400" /></a></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">2) Once your request has been created and evaluated hopefully you will have a running instance to connect to via SSH. You may need to create a new key pair, follow the instructions on AWS to do this and install on your local machine. Once you have your key installed log onto instance via ssh as ec2-user.</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg80Mu9-VzkcD-eOxzmmyPNKhu6GOr0m18pm-0Uq8FEPGxvAJUrzs-rdOX8T3sGcAqOS2plujvPMYJCB434HLW4QuUQ-GdG2iBoEmD4frbWZKt7bMHRRxXtAA2hsIE035hsyzs3HL4zv6c/s1600/photofun-2043710194.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="105" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg80Mu9-VzkcD-eOxzmmyPNKhu6GOr0m18pm-0Uq8FEPGxvAJUrzs-rdOX8T3sGcAqOS2plujvPMYJCB434HLW4QuUQ-GdG2iBoEmD4frbWZKt7bMHRRxXtAA2hsIE035hsyzs3HL4zv6c/s1600/photofun-2043710194.jpg" width="400" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">3) The shell script for running hash clash is located at </span>/home/ec2-user/hashclash/src/scripts <span style="font-family: "arial";">. Change into that directory and download some data to create a collision. Here I download a couple of jpeg images from tumblr.</span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQ2tt52uXdmllXQSzOWrUo0btAuNhx26z2MIBKSWIQUQB850X1JczFS4ak_WxCB3l8PVADMOdLewTBVX-bfhxeEEkEQkizPqtpFM0FftKFddG3YBUzD6APHa-ht3qTlJ3kD3V5ucYjp-A/s1600/download.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="239" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQ2tt52uXdmllXQSzOWrUo0btAuNhx26z2MIBKSWIQUQB850X1JczFS4ak_WxCB3l8PVADMOdLewTBVX-bfhxeEEkEQkizPqtpFM0FftKFddG3YBUzD6APHa-ht3qTlJ3kD3V5ucYjp-A/s1600/download.png" width="400" /></a></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div>
<br /></div>
<div>
<span style="font-family: "arial";">4) It is best to run the shell script in a screen session so you can detach from it and do other stuff. Start a screen session by typing</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">screen</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">Once you are in the screen session kick off the </span><span style="font-family: "arial";">cpc.sh shell script with your two files. Send the outputs to a log file in this case I called it demo.output.</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiA1hAxd_rHRCJuFj9z0EBPBecuqVjn-VndP22iexvTwB_iMyo8II7PE-phIKVCLTDqtN7DeMu3h0FI30d9joJ_40OQmTKyR4mTVuZZNk2eAeTawOswoJJZepi8A7yshOzMO2Q5gWbLj3o/s1600/Screen+Shot+2015-02-04+at+10.32.33.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiA1hAxd_rHRCJuFj9z0EBPBecuqVjn-VndP22iexvTwB_iMyo8II7PE-phIKVCLTDqtN7DeMu3h0FI30d9joJ_40OQmTKyR4mTVuZZNk2eAeTawOswoJJZepi8A7yshOzMO2Q5gWbLj3o/s1600/Screen+Shot+2015-02-04+at+10.32.33.png" /></a></div>
<div>
<br /></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">Detach from the screen session with Ctrl A + D</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAY_TdCWbloFFG_rlXWpA3vldOkLaXWd5_a5GM2aUEk0V37xsPiW3tzrYds-Sg-zfXqHDYCFwvU_graGtRLBzKrL80_4Up63K5bcMiIosICKXvlPl-WfpLLLFvCbCuWFVYHhEMd9F5cwI/s1600/Screen+Shot+2015-02-04+at+10.33.41.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAY_TdCWbloFFG_rlXWpA3vldOkLaXWd5_a5GM2aUEk0V37xsPiW3tzrYds-Sg-zfXqHDYCFwvU_graGtRLBzKrL80_4Up63K5bcMiIosICKXvlPl-WfpLLLFvCbCuWFVYHhEMd9F5cwI/s1600/Screen+Shot+2015-02-04+at+10.33.41.png" /></a></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<br /></div>
<div>
<span style="font-family: "arial";">5) Tailing the log file you should be ale to see the </span><span style="font-family: "arial";">birthday attack to get the hash differences into the correct locations starting.</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">tail -f demo.output</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgtFqadP-_6ocYIUAHm72Oa97rdCdwFHM_qVREA5otsazHkFYyGfu2eiIsxaL9urrhYOWh_iKtRkP7A0wo6f8V1pJPUMDeZ7apCzcLA7hMac7LkHm8H5lSlmqDCV3CeR82qYuubppPlkfc/s1600/Screen+Shot+2015-02-04+at+10.34.21.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgtFqadP-_6ocYIUAHm72Oa97rdCdwFHM_qVREA5otsazHkFYyGfu2eiIsxaL9urrhYOWh_iKtRkP7A0wo6f8V1pJPUMDeZ7apCzcLA7hMac7LkHm8H5lSlmqDCV3CeR82qYuubppPlkfc/s1600/Screen+Shot+2015-02-04+at+10.34.21.png" /></a></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">6) Leave the birthday search to do it's thing for an hour or so. Hopefully when you come back the attack should have moved on to the next stage, creating the near collision blocks to gradually reduce the hash differences. The best way to check this is to look at files created. The workdir0 contains all the data for the current collision search for the first near collision block. More of these will be created as more near collision blocks are created.</span></div>
<div>
</div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqX0ljnsq9Y9sDX7CnbbJ9PrHIK2zBwHOcSH8_Ll8ZO5So8MVXVzIX7MDD_5O9jWiTGrb0U8id_ns0lep4ZZD2KRgbKcH0oVqmpv7z9vQixn3hzX_0UuRNBqZuMforAQ1wLa0EVzAll9U/s1600/Screen+Shot+2015-02-04+at+12.17.32.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="306" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqX0ljnsq9Y9sDX7CnbbJ9PrHIK2zBwHOcSH8_Ll8ZO5So8MVXVzIX7MDD_5O9jWiTGrb0U8id_ns0lep4ZZD2KRgbKcH0oVqmpv7z9vQixn3hzX_0UuRNBqZuMforAQ1wLa0EVzAll9U/s1600/Screen+Shot+2015-02-04+at+12.17.32.png" width="640" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
<span style="font-family: "arial";"> </span><span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";">7) Go away again, a watched collision pretty much never happens. Check back in ~5 hours that it is still going on. Tailing demo.output and listing the directory should let you know roughly what stage the attack is at.</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpyK9aOf6LwJxjpaZhUrhD8nOJeNkTr7MuBKvvXk8vsNJfgPmLA_axOSwGSRmFnLYmnwvW7foGiLVvyNoCZOtFhhNwVOlJGSv2m0xXyGgalJbsdfAEbzFLVehMuPFf8IOxPFm4p0T3Stk/s1600/Screen+Shot+2015-02-04+at+14.14.20.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpyK9aOf6LwJxjpaZhUrhD8nOJeNkTr7MuBKvvXk8vsNJfgPmLA_axOSwGSRmFnLYmnwvW7foGiLVvyNoCZOtFhhNwVOlJGSv2m0xXyGgalJbsdfAEbzFLVehMuPFf8IOxPFm4p0T3Stk/s1600/Screen+Shot+2015-02-04+at+14.14.20.png" /></a></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial" , "helvetica" , sans-serif;">Here we are only at block number 2 of probably 9.</span><br />
<br />
<br /></div>
<div>
<span style="font-family: "arial";">8) Come back again about 10-12 hours from start and with any luck we have a collision. </span><br />
<span style="font-family: "arial";"><br /></span>
<span style="font-family: "arial";"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgT3rl9iEGkSXCCQ3s44kVG8Z229l6NNWS3ON9Y5EMpWRJwiArS3uYkQgBzM-t_Jp5EJy9tRxF00ttB9dsUDVbsMcwfgeYD8TGWTiAGt1quhYxJjAgHekYLnyQfiw1fGnPCvZki7sd3O0o/s1600/Screen+Shot+2015-02-05+at+10.31.51.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgT3rl9iEGkSXCCQ3s44kVG8Z229l6NNWS3ON9Y5EMpWRJwiArS3uYkQgBzM-t_Jp5EJy9tRxF00ttB9dsUDVbsMcwfgeYD8TGWTiAGt1quhYxJjAgHekYLnyQfiw1fGnPCvZki7sd3O0o/s1600/Screen+Shot+2015-02-05+at+10.31.51.png" width="555" /></a></div>
<span style="font-family: "arial";"><br /></span>
<span style="font-family: "arial";">This one finished at 02:45 in the morning having been started at 10:30 the previous morning. You can tell when it finished as that was the last point the log was written to. If the log log file is still being updated the collision search is still going on. It took 9 near collision blocks to finally eliminate all the differences which is normal. 16 hours is a bit longer than average.</span><br />
<span style="font-family: "arial";"><br /></span><span style="font-family: "arial";">The collisions have been created in files named plane.jpg.coll and ship.jpg.coll. You can verify they do indeed have the same md5 hash with md5sum.</span><br />
<span style="font-family: "arial";"><br /></span>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5OfO8JUFhmBbo7NjF5PWRyRTKGHSuLmZ-agFIv2q3T6zGAKh39OVZfe9of_I51nU3jueg2lojE1tz7s7F56IDhub2zJzNCdqdxotyMXGxLo4Fr84JxoBHCS5DwsQsFtEPPnkbPcahspM/s1600/Screen+Shot+2015-02-05+at+10.37.47.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="51" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5OfO8JUFhmBbo7NjF5PWRyRTKGHSuLmZ-agFIv2q3T6zGAKh39OVZfe9of_I51nU3jueg2lojE1tz7s7F56IDhub2zJzNCdqdxotyMXGxLo4Fr84JxoBHCS5DwsQsFtEPPnkbPcahspM/s1600/Screen+Shot+2015-02-05+at+10.37.47.png" width="640" /></a><span style="font-family: "arial";"><br /></span><br />
<span style="font-family: "arial";"><br /></span>
<span style="font-family: "arial";">Here are the images with collision blocks added.</span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<img src="https://s3-eu-west-1.amazonaws.com/md5collisions/ship.jpg" width="800" />
<img src="https://s3-eu-west-1.amazonaws.com/md5collisions/plane.jpg" width="800" />
<br />
<div>
<span style="font-family: "arial";"><br /></span>
<span style="font-family: "arial";">I downloaded them to my local machine with scp</span></div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghS_P8eGvoXun-jfkhrmqtL7JXb_WMRyzkKQnUXltti1fYnSB5d8Zl1IYiSxwbDHYHGHdlefaCXcijvcsb2D8uJfGRP1i0trpI3Ig-_qHeNIkWtQxZYqu035sOQqsfgBTr6-7hKpaCRS4/s1600/Screen+Shot+2015-02-05+at+11.46.21.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghS_P8eGvoXun-jfkhrmqtL7JXb_WMRyzkKQnUXltti1fYnSB5d8Zl1IYiSxwbDHYHGHdlefaCXcijvcsb2D8uJfGRP1i0trpI3Ig-_qHeNIkWtQxZYqu035sOQqsfgBTr6-7hKpaCRS4/s1600/Screen+Shot+2015-02-05+at+11.46.21.png" /></a></div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<div>
<span style="font-family: "arial";"><br /></span></div>
<br />
<div>
</div>
<br />
<div>
<span style="font-family: "arial";"><br /></span></div>
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.comtag:blogger.com,1999:blog-7452979786017018311.post-32351961004815090192015-01-07T15:10:00.000+00:002015-01-13T22:31:33.321+00:00Hash Collisions Reading List<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiNGvebNBZM3qOVdOTnhx-s_-oMVLc2X4sNozh28bZlJ8WkoNqiCMp3_UGgTsOzqatIz2ljIYjoUrs0MAhyI-63x2BQK65X0hxHGNoH0roSIb4wwy0FYGSGBssq7PZsOQGobSUCL3vbe4/s1600/7955335706_926d621244_m.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiNGvebNBZM3qOVdOTnhx-s_-oMVLc2X4sNozh28bZlJ8WkoNqiCMp3_UGgTsOzqatIz2ljIYjoUrs0MAhyI-63x2BQK65X0hxHGNoH0roSIb4wwy0FYGSGBssq7PZsOQGobSUCL3vbe4/s1600/7955335706_926d621244_m.jpg" /></a><span style="font-family: Arial, Helvetica, sans-serif;">Lately in an effort to code up and properly understand the Wang attack on the MD4 family of hash functions I've been reading a lot of papers on the subject. Many of the papers have very similar names and the same authors. I found myself having to create a quick reference about each paper and it's contents. </span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Here they are with a brief summary of what I got from each:</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<h4>
</h4>
<div>
<br />
<br /></div>
<div>
<br /></div>
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;">
Collision for Hash Functions MD4, MD5 HAVAL-128 and RIPEMD</span></h4>
<i><span style="font-family: Arial, Helvetica, sans-serif;">Xiaoyun Wang, Dengguo Feng, Xuejia Lai and Hongbo Yu</span></i><br />
<i><span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></i>
<a href="https://eprint.iacr.org/2004/199.pdf"><span style="font-family: Arial, Helvetica, sans-serif;">https://eprint.iacr.org/2004/199.pdf</span></a><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">This is the original paper listing out some collisions for each of these functions. This must have been quite a blockbuster at the time.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br />
</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br />
</span><br />
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;">Cryptanalysis of the Hash Functions </span><span style="font-family: Arial, Helvetica, sans-serif;">MD4 and RIPEMD</span></h4>
<i><span style="font-family: Arial, Helvetica, sans-serif;">Xiaoyun Wang, Xuejia Lai, Dengguo Feng, Hui Chen, and Xiuyuan Yu</span></i><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><i><br /></i>
<span style="background-color: white; color: #004b91; line-height: 12px; text-decoration: none;"><a href="https://s3-eu-west-1.amazonaws.com/md5collisions/CryptanalysisOftheHashFunctionsMD4andRIPEMD.pdf" style="background-color: white; color: #004b91; line-height: 12px; text-decoration: none;" title="https://s3-eu-west-1.amazonaws.com/md5collisions/CryptanalysisOftheHashFunctionsMD4andRIPEMD.pdf">https://s3-eu-west-1.amazonaws.com/md5collisions/CryptanalysisOftheHashFunctionsMD4andRIPEMD.pdf</a></span></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">This article details the attack that was used to generate the collisions of the previous paper and should be all you need to write a collision generating script for MD4 and RIPEMD.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;"><br />
</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br />
</span><br />
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;">How to Break MD5 and Other Hash Functions</span></h4>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><i>Xiaoyun Wang and Hongbo Yu</i></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="background-color: white; color: #004b91; font-family: Arial, Helvetica, sans-serif; line-height: 12px; text-decoration: none;"><a href="https://s3-eu-west-1.amazonaws.com/md5collisions/HowtoBreakMD5andOtherHashFunctions.pdf" style="background-color: white; color: #004b91; line-height: 12px; text-decoration: none;" title="https://s3-eu-west-1.amazonaws.com/md5collisions/HowtoBreakMD5andOtherHashFunctions.pdf">https://s3-eu-west-1.amazonaws.com/md5collisions/HowtoBreakMD5andOtherHashFunctions.pdf</a></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">MD5 is slightly harder to break than MD4 requiring 2 blocks and more muli-step message modifications. This article details the method used to generate MD5 collisions in the first.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></h4>
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;">Searching for Differential Paths in MD4</span></h4>
<span style="font-family: Arial, Helvetica, sans-serif;"><i>Martin Schälffer and Elisabeth Oswald</i></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><i><br /></i>
<span style="background-color: white; color: #004b91; line-height: 12px; text-decoration: none;"><a href="https://s3-eu-west-1.amazonaws.com/md5collisions/SearchingforDifferentialPathsInMD4.pdf" style="background-color: white; color: #004b91; line-height: 12px; text-decoration: none;" title="https://s3-eu-west-1.amazonaws.com/md5collisions/SearchingforDifferentialPathsInMD4.pdf">https://s3-eu-west-1.amazonaws.com/md5collisions/SearchingforDifferentialPathsInMD4.pdf</a></span></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">More detail on how the attacks work with a good description of how paths are calculated and an algorithm for finding them. Also contains a new path with fewer stage 2 required requirements.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;">Improved Collision Attack on MD5</span></h4>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><i>Yu Sasaki, Yusuke Naito, Noboru Kunihiro and Kazuo Ohta</i></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><i><br /></i></span></div>
<span style="background-color: white; color: #004b91; font-family: Arial, Helvetica, sans-serif; line-height: 12px; text-decoration: none;"><a href="https://s3-eu-west-1.amazonaws.com/md5collisions/ImprovedCollisionAttackonMD5.pdf" style="background-color: white; color: #004b91; line-height: 12px; text-decoration: none;" title="https://s3-eu-west-1.amazonaws.com/md5collisions/ImprovedCollisionAttackonMD5.pdf">https://s3-eu-west-1.amazonaws.com/md5collisions/ImprovedCollisionAttackonMD5.pdf</a></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">The paper where I finally understood how the correction of second round collisions worked</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;">Improved Collision Attack on MD4</span></h4>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">Yusuke Naito, Yu Sasaki, Noboru Kunihiro, and Kazuo Ohta</span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><a href="https://eprint.iacr.org/2005/151.pdf">https://eprint.iacr.org/2005/151.pdf</a></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">Some corrections to the Wang collision on MD4 speeds things up with good explanation.</span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;">Automatic Search of Differential Path in MD4</span></h4>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">Pierre-Alain Fouque, Gaëtan Leurent, Phong Nguyen</span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><a href="http://www.di.ens.fr/~fouque/pub/md4.pdf">http://www.di.ens.fr/~fouque/pub/md4.pdf</a></span></div>
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></h4>
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;">New Message Difference for MD4</span></h4>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><i>Yu Sasaki, Lei Wang, Kazuo Ohta and Noboru Kunihiro</i></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><i><br /></i></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"><a href="https://www.iacr.org/archive/fse2007/45930331/45930331.pdf">https://www.iacr.org/archive/fse2007/45930331/45930331.pdf</a></span><br />
<br />
<span style="font-family: Arial, Helvetica, sans-serif;">The best path I know of with a totally different message difference and explanation of the local collisions underlying the collisions.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<h4>
<span style="font-family: Arial, Helvetica, sans-serif;">Herding Hash Functions and the Nostradamus </span><span style="font-family: Arial, Helvetica, sans-serif;">Attack</span></h4>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">John Kelsey and Tadayoshi Kohno</span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><a href="http://homes.cs.washington.edu/~yoshi/papers/EC06/herding.pdf">http://homes.cs.washington.edu/~yoshi/papers/EC06/herding.pdf</a></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;">
<br />
<br />
</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com1Sheffield, South Yorkshire, UK53.381128999999987 -1.4700850000000453.078144999999985 -2.11553200000004 53.684112999999989 -0.82463800000004006tag:blogger.com,1999:blog-7452979786017018311.post-68103163814550783052014-11-11T22:20:00.000+00:002014-11-12T12:26:22.907+00:00Three way MD5 collision<span style="font-family: Arial, Helvetica, sans-serif;">Previously I explained how I created two images one of James Brown the other of Barry White with the same MD5 hash. At the end of the post I said I was going to try and create a three way collision where three images have the same MD5 hash. Neil K made a suggestion about the image</span><br />
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<blockquote class="twitter-tweet" data-partner="tweetdeck">
<span style="font-family: Arial, Helvetica, sans-serif;"><a href="https://twitter.com/natmchugh">@natmchugh</a> Image suggestion for the third hash collision: White, Brown, ...and Black -- <a href="http://t.co/yYEawVjPVa">http://t.co/yYEawVjPVa</a></span><br />
<span style="font-family: Arial, Helvetica, sans-serif;">— Neil K. (@kneil_) <a href="https://twitter.com/kneil_/status/529530436888711169">November 4, 2014</a></span></blockquote>
<script async="" charset="utf-8" src="//platform.twitter.com/widgets.js"></script><span style="font-family: Arial, Helvetica, sans-serif;">
So I set to work.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">After a couple of false starts where I started with the wrong image file I managed to achieve a three way collision. Here are the images.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<img src="http://www.fishtrap.co.uk/black.jpg.coll" width="500" />
<img src="http://www.fishtrap.co.uk/brown.jpg.coll" width="500" />
<img src="http://www.fishtrap.co.uk/white.jpg.coll" />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
If you want to check
<br />
<pre class="brush: shell">$ curl -s http://www.fishtrap.co.uk/black.jpg.coll | md5
b69dd1fd1254868b6e0bb8ed9fe7ecad
$ curl -s http://www.fishtrap.co.uk/brown.jpg.coll | md5
b69dd1fd1254868b6e0bb8ed9fe7ecad
$ curl -s http://www.fishtrap.co.uk/white.jpg.coll | md5
b69dd1fd1254868b6e0bb8ed9fe7ecad
</pre>
<br />
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></h3>
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;">A new hash value</span></h3>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">This isn't the same hash as before instead the 3 images now collide with a new hash value </span>b69dd1fd1254868b6e0bb8ed9fe7ecad . <span style="font-family: Arial, Helvetica, sans-serif;">This is because I had to add near collision blocks to all three images. In the case of the first two the blocks added are the same. This is probably best illustrated with a diagram.</span><br />
<br /></div>
<div class="mxgraph" style="height: 466px; overflow: hidden; position: relative; width: 791px;">
<div style="height: 1px; overflow: hidden; width: 1px;">
1ZlLc9s4DIB/TaaneCzJz2OdpLuH3ZmdyaG7pw4t0RIbStRSdGz31xewAVmvdNKx7NgXW4RIiPwAASB1Fzyk2z+syJO/TST1nT+MtnfB453v+7ORD38o2R0kk+H8IIitig4i7yh4Vj8kCYckXatIFrWOzhjtVF4XhibLZOhqspXR9UfkImb1R8FzKHRb+lVFLjlIZ/7kKP9Tqjjhx3gTWkzhdqwjkiux1u5+L6LFp4J10aq2w0Pz3psMSbQj0YhU5iKrTeqHMWlNYGVxpEULVjQz0rk0NpK2JtIqe6mSC57AetYYGIhX6fZBarQgG+cw7Msbd0tgVma1R781gIC8Cr2mqbcAFonI8VKle2stXqV1Cmz0l1hK/Y8plFMG0SyNc4ik7PBZqxhvOJODlIY/Js5B8zNOyf+y2WwGK1UkDpx1EJrB+gWES2s22eB7HrcXQ+vDJ0hy6SM18HlpUunsDi3KBh3PyYDk8GMCvzk6lMeypOpM7AiCzBOXyo884YKQduMNrhDvJlFO9od3VsfrM7Yq31EXX44op/AdvYOvs+alDB+wlgUTT7cxxsjBSptNmAjrBikECnUfmXCd7pksekE08uuIpqSjQmg2awOa9MBnfBN8PFo9uxAruQAgyiR9ARJZZpzAN/YbdrTUgL733oymeyKtIOAE1YhTFVqd8YyFp+CatnA9GK1VgWtscoORUBRgSGNeoTZrULRoE12ZzFGt4U2hvY9Rz7kIUbIByH35mj+tv4teQAavhism2ns6ICf+yHQQiZ03h1yQQnvfqYCLXIX45w/n375KVRSD4XTwXYQvSw0/3w5D+koYwYyCElmgbFctwK921QLzHgxAxr/ueOg1Uuqo4w0/VzzkpHz+gNhXPBwGjTd6+r54WApPwsUTvm6HGtUTrDduB72zOVR7h3FbKWPOfsrVyagjXp0tY3jtHcSnTAr7ab/FLjlCGRNsxX7Fw6U2IWwpm2xhvehQgtJECEhwK9rKH6mKIhxzntp31lHadbDjDcNJ6NrF722ha5bFQdvxzsauXRffFLtWxOtICmdj1y6Sb4sdpwIuP7gmuAS7doXcOK74cDxjdiWuXy+Jp12/Ng7LPh7PhM5kylrjcnj4TLUbT/nCXZtH+ROqkcrzj8sF+nK+FWS46bsiPAEXXByPphfE03U+PtFIIcdwrkVx0D75f41H9oscVsPXcBXjf+mDsO1G7YfxuO3mHr+vs/ThHnWWhv+Fzm6PSFwKBB+9slCHEt/gd47MZOgHK6V1Q/R+J+oq7A30xl0TnnBAPwkD+nG25m4y6Ci6fD6Bq51OUL+TvI2K/eqOKIIPYtQ01iUmNpnQT0cpbKnXWSRRA3ywqlpCbpX7l8R4/R9eD8bYymBi5S1s8L23EBZmbff8K1HWCRtL3kyTDKf7S9BWajgAeK1/JvsNZtA8fhzb36t85wyefgI=</div>
</div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Again I created the files with <a href="https://code.google.com/p/hashclash/">HashClash</a>. As inputs I used white.jpg and black.jpg images. To make brown.jpg.coll I just had to append the extra collision blocks to brown.jpg which was already a collision with white.jpg. </span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">I could go on adding more and more files in a tree structure to get many documents to collide. The number of collisions needed is n-1 where n is the number of files. It was this tree of collisions that allowed Marc Stevens to <a href="http://www.win.tue.nl/hashclash/Nostradamus/">predict the 2008 US presidential election</a>.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;">A word about file sizes</span></h3>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">The files started out different sizes to each other, however, before each collision was generated between two files padding had to be added to one of the files to make it the same as the other. Without this step it would be impossible to extend a collision in the unpadded version to the full MD5 algorithm. This is because the padding includes the size of data processed. </span></div>
<div>
</div>
<script src="https://www.draw.io/embed.js?s=flowchart" type="text/javascript"></script>Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com6Sheffield, South Yorkshire, UK53.381128999999987 -1.4700850000000453.078144999999985 -2.11553200000004 53.684112999999989 -0.82463800000004006tag:blogger.com,1999:blog-7452979786017018311.post-1935627411224915782014-11-05T13:26:00.000+00:002014-11-06T14:35:58.620+00:00Colliding MD5 Images go crazy<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibX7ijefNgGlOgsE1oahyYV6fno-1kYzI1fmdpXoj9Ep-mczkZmfe0DtvF1KpXCQ1gr2hOQaR1Lw6SxNFJBitqwY2jYxEs-YYU7ZqkhB5uDt3R4X77c5mDxHHyuX-R9L_r5V2sKlA2mQU/s1600/800px-CMS_Higgs-event.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibX7ijefNgGlOgsE1oahyYV6fno-1kYzI1fmdpXoj9Ep-mczkZmfe0DtvF1KpXCQ1gr2hOQaR1Lw6SxNFJBitqwY2jYxEs-YYU7ZqkhB5uDt3R4X77c5mDxHHyuX-R9L_r5V2sKlA2mQU/s1600/800px-CMS_Higgs-event.jpg" height="295" width="320" /></a></div>
<h3>
<span style="font-family: Arial, Helvetica, sans-serif; font-size: small;"><span style="font-weight: normal;">I nearly chocked on my cornflakes while looking at blog stats yesterday morning to see several thousand page views in the morning. Looking at the referring URLs it seemed a link had been posted on Hacker News. </span></span><span style="font-family: Arial, Helvetica, sans-serif; font-size: small; font-weight: normal;">I browse that site most days</span><span style="font-family: Arial, Helvetica, sans-serif; font-size: small;"><span style="font-weight: normal;">. It was on the front page at number five, anything on the front page is BIG. Later on it got posted to a couple of reddit.com subreddits. I was getting a lot of traffic, all quite unexpected.</span></span></h3>
<h3>
<span style="font-family: Arial, Helvetica, sans-serif; font-size: small; font-weight: normal;">Originally I wrote the blog post for a couple of reasons, firstly to aid my own memory about what I had done and how to repeat it and secondly to explain to a few of my friends how I made the images which I had already put on twitter. I didn't really see it as Hacker News material, anyone who has an interest in hash functions will have done this I thought. </span></h3>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">Here are my answers to a couple of things people have said</span></div>
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;">This isn't new</span></h3>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">Agreed, it certainly is not. All the papers and software I referenced and used are at least four years old. The original block buster attack by Wang is ten years old. Even before that in 1996 just four years after its release MD5 was known to be weak at not recommended. </span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">My guess at what caught peoples attention were two things:</span></div>
<div>
<ol>
<li><span style="font-family: Arial, Helvetica, sans-serif;">Images are much easier to comprehend as collisions</span><span style="font-family: Arial, Helvetica, sans-serif;"> than random byte strings.</span></li>
<li><span style="font-family: Arial, Helvetica, sans-serif;">The fact I was clearly a rank amateur</span><span style="font-family: Arial, Helvetica, sans-serif;"> with nothing more than an AWS account and basic knowledge.</span></li>
</ol>
</div>
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;">Can you do the same thing with SHA1 and SHA2?</span></h3>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">No not really, in the case of SHA1 there are theoretical attacks which work the same way that mean it should not be used either. Marc Stevens has <a href="https://marc-stevens.nl/research/papers/EC13-S.pdf">published an attack</a> which he estimates to have a complexity of 2<sup>61</sup> compression functions for same prefix and 2<sup>77</sup> for chosen prefix collisions. In fact <a href="https://code.google.com/p/hashclash/">HashClash</a> contains some code for finding differential paths and near collision blocks for SHA1. However to come up with a collision would take vast computing effort and (unless you control a botnet) expense. I'm sceptical of the exactness of the numbers in <a href="https://www.schneier.com/blog/archives/2012/10/when_will_we_se.html">here</a> but certainly that order of magnitude. As if to prove this fact there are no published collisions in the full SHA1. </span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">So with those sort of numbers I'm not got to stick it on an AWS instance backed by my credit card. </span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">SHA1 seems to be surprisingly resistant to differential analysis attacks this is one reason pretty much nobody has moved to using SHA3 yet. I like <a href="http://en.wikipedia.org/wiki/SHA-3">SHA3 or Keccak</a> as it is also known.</span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">SHA2 looks like even more of a challenge than SHA1 for differential analysis.</span></div>
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;">Is this practical?</span></h3>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">Hopefully not, no one should be using MD5 for anything. However, old habits die hard and once upon a time MD5 seemed like a fast and secure hash function. You can still find it in use in package managers, download pages and is probably used server side in many applications to verify file uniqueness. I haven't found anywhere still using it in SSL certs following flame.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;">Can you do a third image collision?</span></h3>
</div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">Currently a work in progress, watch this space.</span></div>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com5Sheffield, South Yorkshire, UK53.381128999999987 -1.4700850000000453.078144999999985 -2.11553200000004 53.684112999999989 -0.82463800000004006tag:blogger.com,1999:blog-7452979786017018311.post-54691419679688643242014-10-31T17:20:00.002+00:002023-04-26T12:10:56.574+01:00How I created two images with the same MD5 hash<div class="separator" style="clear: both; text-align: center;">
</div>
<span style="font-family: Arial, Helvetica, sans-serif;">I posted the following images the other day which although looking totally different have exactly the same MD5 hash (</span>e06723d4961a0a3f950e7786f3766338) .<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://www.fishtrap.co.uk/james.jpg.coll" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: Arial, Helvetica, sans-serif;"><img border="0" src="https://s3-eu-west-1.amazonaws.com/md5collisions/james.jpg" /></span></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://s3-eu-west-1.amazonaws.com/md5collisions/barry.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: Arial, Helvetica, sans-serif;"><img border="0" src="https://s3-eu-west-1.amazonaws.com/md5collisions/barry.jpg" /></span></a></div>
<div class="separator" style="clear: both; text-align: center;">
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div class="separator" style="clear: both; text-align: left;">
<span style="font-family: Arial, Helvetica, sans-serif;">The images were just two I lifted from the web in fact I could have chosen any image or indeed any arbitrary data and created a collision with it.</span></div>
<div class="separator" style="clear: both; text-align: left;">
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<h3 style="clear: both; text-align: left;">
<span style="font-family: Arial, Helvetica, sans-serif;">Why is this surprising?</span></h3>
<div class="separator" style="clear: both; text-align: left;">
<span style="font-family: Arial, Helvetica, sans-serif;">MD5 was designed as a cryptographic hash function. These are supposed to possess the following 3 properties as per wikipedia (</span><a href="http://en.wikipedia.org/wiki/Cryptographic_hash_function#Properties" style="font-family: Arial, Helvetica, sans-serif;">http://en.wikipedia.org/wiki/Cryptographic_hash_function#Properties</a><span style="font-family: Arial, Helvetica, sans-serif;">)</span></div>
<div class="separator" style="clear: both; text-align: left;">
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<ul style="background-color: white; color: #252525; font-family: sans-serif; font-size: 14px; line-height: 22.3999996185303px; list-style-image: url(data:image/png; margin: 0.3em 0px 0px 1.6em; padding: 0px;">
<li style="margin-bottom: 0.1em;"><i>Pre-image resistance</i>
<dl style="margin-bottom: 0.5em; margin-top: 0.2em;"><dd style="line-height: 1.6; margin-bottom: 0.1em; margin-left: 1.6em; margin-right: 0px;">Given a hash <i>h</i> it should be difficult to find any message <i>m</i> such that <i>h = hash(m)</i>.</dd></dl>
</li>
<li style="margin-bottom: 0.1em;"><i>Second pre-image resistance</i>
<dl style="margin-bottom: 0.5em; margin-top: 0.2em;"><dd style="line-height: 1.6; margin-bottom: 0.1em; margin-left: 1.6em; margin-right: 0px;">Given an input <i>m</i><sub style="font-size: 11px; line-height: 1;">1</sub> it should be difficult to find another input <i>m</i><sub style="font-size: 11px; line-height: 1;">2</sub> such that <span class="nowrap" style="white-space: nowrap;"><i>m</i><sub style="font-size: 11px; line-height: 1;">1</sub> ≠ <i>m</i><sub style="font-size: 11px; line-height: 1;">2</sub></span> and <span class="nowrap" style="white-space: nowrap;"><i>hash</i>(<i>m</i><sub style="font-size: 11px; line-height: 1;">1</sub>) = <i>hash</i>(<i>m</i><sub style="font-size: 11px; line-height: 1;">2</sub>)</span>.</dd></dl>
</li>
<li style="margin-bottom: 0.1em;"><i><a href="http://en.wikipedia.org/wiki/Collision_resistance" style="background: none; color: #0b0080; text-decoration: none;" title="Collision resistance">Collision resistance</a></i>
<dl style="margin-bottom: 0.5em; margin-top: 0.2em;"><dd style="line-height: 1.6; margin-bottom: 0.1em; margin-left: 1.6em; margin-right: 0px;">It should be difficult to find two different messages <i>m</i><sub style="font-size: 11px; line-height: 1;">1</sub> and <i>m</i><sub style="font-size: 11px; line-height: 1;">2</sub> such that <span class="nowrap" style="white-space: nowrap;"><i>hash</i>(<i>m</i><sub style="font-size: 11px; line-height: 1;">1</sub>) = <i>hash</i>(<i>m</i><sub style="font-size: 11px; line-height: 1;">2</sub>)</span>. Such a pair is called a cryptographic <a class="mw-redirect" href="http://en.wikipedia.org/wiki/Hash_collision" style="background: none; color: #0b0080; text-decoration: none;" title="Hash collision">hash collision</a>. </dd></dl>
</li>
</ul>
<div class="separator" style="clear: both; text-align: left;">
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div class="separator" style="clear: both; text-align: left;">
<span style="font-family: Arial, Helvetica, sans-serif;">The two images above clearly demonstrate that MD5 lacks the final property. MD5 is broken as a </span><span style="font-family: Arial, Helvetica, sans-serif;">cryptographic hash function.</span></div>
<div class="separator" style="clear: both; text-align: left;">
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div class="separator" style="clear: both; text-align: left;">
<span style="font-family: Arial, Helvetica, sans-serif;">To search though all possible MD5 values is 2<sup>128</sup> operations which is massive. To be in with a good chance of finding a collision would take ~ 2<sup>64</sup> operations which is again far too big for normal computing.</span></div>
<h3 style="clear: both; text-align: left;">
<span style="font-family: Arial, Helvetica, sans-serif;">How did I create the images?</span></h3>
<span style="font-family: Arial, Helvetica, sans-serif;">This type of collision is has been termed a chosen prefix collision. In this case the image data is the prefix or to be more exact the internal state of the MD5 algorithm after processing the image is. You can't see the added binary data at the end of jpeg images as it is preceded with an End Of Image JPEG marker.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Chosen prefix collisions for MD5 were first successfully shown in 2007 in this paper <a href="http://www.win.tue.nl/hashclash/ChosenPrefixCollisions/">http://www.win.tue.nl/hashclash/ChosenPrefixCollisions/</a> . </span><span style="font-family: Arial, Helvetica, sans-serif;">The attack uses iterations of differential</span><span style="font-family: Arial, Helvetica, sans-serif;"> analysis of MD5. The first successful differential analysis was demonstrated by Xiaoyun Wang in her 2005 paper </span><a href="http://merlot.usc.edu/csac-f06/papers/Wang05a.pdf" style="font-family: Arial, Helvetica, sans-serif;">How to Break MD5 and Other Hash Functions</a><span style="font-family: Arial, Helvetica, sans-serif;">.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">In a </span><a href="http://natmchugh.blogspot.co.uk/2014/10/how-i-made-two-php-files-with-same-md5.html" style="font-family: Arial, Helvetica, sans-serif;">previous post</a><span style="font-family: Arial, Helvetica, sans-serif;"> I used an improved version of this </span><span style="font-family: Arial, Helvetica, sans-serif;">differential </span><span style="font-family: Arial, Helvetica, sans-serif;">path to create two PHP files with the same MD5 hash. Where those files differed by only a few bits and the data before the collision had to be identical. In a chosen prefix collision the data preceding the specially crafted collision blocks can be completely different.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">The chosen prefix collision attack works by repeatedly adding 'near collision' blocks which gradually work to eliminate the differences in the internal MD5 state until they are the same. Before this can be done the files must be of equal length and the bit differences must be of a particular form. This requires a brute force 'birthday' attack which tries random values until two are found that work, it does however have a much lower complexity than a complete brute force attack.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">If the attack sounds complicated to do in practice fortunately <a href="https://marc-stevens.nl/research/">Marc Stevens</a> has created framework for automated finding of differential paths and using them to create chosen pre-fix collisions. It is available at <a href="https://code.google.com/p/hashclash/">https://code.google.com/p/hashclash/</a> . It is intended mainly as a research tool but there is a</span><span style="font-family: Arial, Helvetica, sans-serif;"> GUI and pre-built binaries for windows available. I chose to run it on linux, however, using a bash script which can automate the repetitive steps needed. </span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Here are the MD5 states following each successive block (these are unpadded versions of MD5 algorithm). </span><br />
<center>
<table style="text-align: left;">
<tbody>
<tr><td>601034f03377d68d68a74f71b0d76bf4</td><td>c924b00ad433ccc979b8e79e6925f28e</td></tr>
<tr>
</tr>
<tr><td>0e5453c5c7deabc5e23331c415780ecf</td><td>0e5453c521968d3df17653c224bb30cd</td></tr>
<tr>
</tr>
<tr><td>8d43bcaea7f738cbbbae37b1a5ec4f3c</td><td>8d43bcae018f1a43cad159afb40f723a</td></tr>
<tr>
</tr>
<tr><td>ea12c8b8645e161d589c69dcf845dd60</td><td>ea12c8b8bef6f79467c08bda076aff5e</td></tr>
<tr>
</tr>
<tr><td>8ec47a1d73c2267dfb7686b83a3554fb</td><td>8ec47a1dce5608750a97a8b6495576f9</td></tr>
<tr>
</tr>
<tr><td>daa7508daa16fe67f09ede251abd83a2</td><td>daa7508dc5aadd5fffbefe2329dda3a0</td></tr>
<tr>
</tr>
<tr><td>c570b0b7b22e3a2545432f2a96baf83a</td><td>c570b0b7cec2591d55634f28a6da1839</td></tr>
<tr>
</tr>
<tr><td>00d3111f51f505e81d5f5db537ed64a4</td><td>00d3111f6d0926e22d7f7db5470d85a4</td></tr>
<tr>
</tr>
<tr><td>73e0e4d069fdac380cfe2cf0e7ba47fb</td><td>73e0e4d0851dad321c1e2df0f7da47fb</td></tr>
<tr>
</tr>
<tr><td>583a08b75fa22174307a24df6109b9ec</td><td>583a08b76bc22174309a24df6129b9ec</td></tr>
<tr>
</tr>
<tr><td>e2bdd99bb0bcc66557192eec1f36ff44</td><td>e2bdd99bb0bcc66557192eec1f36ff44</td></tr>
<tr>
</tr>
</tbody></table>
</center>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">The first line is the state after processing the original images where the two hashes are unrelated. The second line shows the state after padding to equal length and the addition of the 'birthday' bits. As you can see the first four bytes of the hash are the same. Each of the next nine lines shows the hashes gradually converging until they are there are no differences</span><span style="font-family: Arial, Helvetica, sans-serif;">. The last hash is different from the one calculated on the whole image as that one includes padding.</span><br />
<div class="p1">
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<span style="font-family: Arial, Helvetica, sans-serif;">The image below shows the bit differences between the above hashes a '.' indicates they are the same a '1' indicates a difference.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhotOf92ecFcj_j0iWygAXpsK4Ob8AsQtVRdiYrc9k3174JyUtvt94cY6kYFsohi-qX5tUrQj3_4yngJgWExZ-9uV54AkmSyrKqGTesd_6UGPYkvXD_u1e5gH-6AIaNpKlQ7nBB7D1nSgo/s1600/Screen+Shot+2014-10-31+at+10.20.08.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhotOf92ecFcj_j0iWygAXpsK4Ob8AsQtVRdiYrc9k3174JyUtvt94cY6kYFsohi-qX5tUrQj3_4yngJgWExZ-9uV54AkmSyrKqGTesd_6UGPYkvXD_u1e5gH-6AIaNpKlQ7nBB7D1nSgo/s1600/Screen+Shot+2014-10-31+at+10.20.08.png" height="113" width="640" /></a></div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">I ran HashClash on an AWS GPU instance. I cannot say with any certainty that this is the most efficient or cheapest option but it seemed to work reasonably quickly. In particular the 'birthdaying' step took much less time than I had expected. It finished in roughly 1hr originally this step had an estimated complexity of 2<sup>49</sup> compression function calls. In his article Marc Stevens gives an estimate of 2 days for creating a complete collision on a PS3 in 2007</span><span style="font-family: Arial, Helvetica, sans-serif;">. I found that I was able to run the algorithm in about 10 hours on an AWS large GPU instance bringing it is at about $0.65 plus tax.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">I faced a few problems with the code as published and had to make some changes to the bash script and some of the C++ code related to saving the collisions. I don't know if the windows binaries that are published work any better as this seems to be where effort has most recently been expended. If anyone wants too know the changes I made let me know.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">So I guess the message to take away here is that MD5 is well and truly broken. Whilst the two images</span><span style="font-family: Arial, Helvetica, sans-serif;"> have not shown a break in the pre-image resistance</span><span style="font-family: Arial, Helvetica, sans-serif;"> or second pre-image resistance, I cannot think of a single case where the use of a broken cryptographic hash function is an appropriate choice.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">It was a chosen prefix collision attack similar to this that was used to produce a </span><span style="font-family: Arial, Helvetica, sans-serif;">counterfeit ssl certificate used to sign the Flame malware as Microsoft and pass itself off as a Windows update.</span><br />
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;">What Next?</span></h3>
<span style="font-family: Arial, Helvetica, sans-serif;">Well once you have two colliding files you can easily create a third. You just take the output of the first files and a third with any data you want file and use it as the starting point for another chosen prefix collision. I may do this and would welcome image suggestions.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<span style="font-family: Arial, Helvetica, sans-serif;">Marc Stevens has also published a <a href="https://marc-stevens.nl/research/software/download.php?file=libdetectcoll-0.2.zip">tool</a> for detecting files which have been processed in this way. Running this tool over either of the files can detect that collision blocks have been added.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<h3>
<span style="font-family: Arial, Helvetica, sans-serif;">Update</span></h3>
<div>
<span style="font-family: Arial, Helvetica, sans-serif;">I successfully created a <a href="http://natmchugh.blogspot.co.uk/2014/11/three-way-md5-collision.html">third collision</a>.</span></div>
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com189Sheffield, South Yorkshire, UK53.36202615413913 -1.508903503417968853.352550654139129 -1.5290735034179688 53.371501654139131 -1.4887335034179687tag:blogger.com,1999:blog-7452979786017018311.post-67914821338080322502014-10-28T20:50:00.005+00:002023-04-26T12:07:20.639+01:00Images with colliding MD5 hash<span style="font-family: Arial, Helvetica, sans-serif;">These images of James Brown and Barry White have the same MD5 hash e06723d4961a0a3f950e7786f3766338.</span><br />
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://s3-eu-west-1.amazonaws.com/md5collisions/james.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: Arial, Helvetica, sans-serif;"><img border="0" src="https://s3-eu-west-1.amazonaws.com/md5collisions/james.jpg" /></span></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="http://www.fishtrap.co.uk/barry.jpg.coll" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="font-family: Arial, Helvetica, sans-serif;"><img border="0" src="https://s3-eu-west-1.amazonaws.com/md5collisions/barry.jpg" /></span></a></div>
<div class="separator" style="clear: both; text-align: center;">
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div class="separator" style="clear: both;">
<span style="font-family: Arial, Helvetica, sans-serif;">Don't believe me. You can easily check it.</span></div>
<div class="separator" style="clear: both;">
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
<div class="separator" style="clear: both;">
</div>
<pre class="brush: shell">$ curl -s https://s3-eu-west-1.amazonaws.com/md5collisions/james.jpg | md5
e06723d4961a0a3f950e7786f3766338
$ curl -s https://s3-eu-west-1.amazonaws.com/md5collisions/barry.jpg | md5
e06723d4961a0a3f950e7786f3766338
</pre>
<span style="font-family: Arial, Helvetica, sans-serif;">You might need to use md5sum instead of md5 on the command line depending on your system.</span><br />
<div>
<span style="font-family: Arial, Helvetica, sans-serif;"><br /></span></div>
Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com10tag:blogger.com,1999:blog-7452979786017018311.post-8126073783968430222014-10-15T20:49:00.000+01:002017-05-03T09:29:09.061+01:00How I made two PHP files with the same MD5 hash<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">I recently posted a link on twitter to two PHP scripts which have different behaviours but the same MD5 hash. To verify this download the files:</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<pre class="brush: shell">$ wget https://s3-eu-west-1.amazonaws.com/md5collisions/a.php
$ wget https://s3-eu-west-1.amazonaws.com/md5collisions/b.php
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span>
<span style="font-size: 13px; font-stretch: normal;">Then check their MD5 hash:</span></span><br />
<pre class="brush: shell">$ md5 a.php b.php
MD5 (a.php) = 62e1d0d1620581693435aa75f5b6c964
MD5 (b.php) = 62e1d0d1620581693435aa75f5b6c964
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">Now run each file:</span><br />
<pre class="brush: shell">$ php a.php
Behaviour A
$ php b.php
Behaviour B
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">The files behave differently and yet hash to the same hex string using MD5. A quick check with another hash algorithm (SHA1 in this case) confirms the two files are in fact different:</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<br />
<pre class="brush: shell">$ shasum a.php b.php
6ef81f904b84d6ea53048004a26157816a85cfa3 a.php
d191a5eb7557231f461c09cb4dc565a23336778a b.php
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">So how were these files made? Well if you look at the source of file you will find there is very little difference between the two files.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<br />
<pre class="brush: bash">cat a.php</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<pre class="brush: php"><?php
$space = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
if ('?G.???-)?????+?B?8????A???P?;"??P?ZU?5?$q4F?1c?-ŏC?S
?e?MIM???~?2?:?]b??\??/5???^?gn~3D7' == '?G.???-)?????+?B?8????A???P?;"??P?ZU?5?$q4F?1c?-ŏC?S
?e?MIM???~?2?:?]b??\??/5???^?gn~3D7' ) {
echo 'Behaviour A',PHP_EOL;
} else {
echo 'Behaviour B',PHP_EOL;
}
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<pre class="brush: bash">$ cat b.php</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span>
<br />
<pre class="brush: php"><?php
$space = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
if ('?G.???-)??????+?B?8????A???P?;"??Py[U?5?$q4F?1??-ŏC?S
?e?MIM???~?2?:?]b?p?\??/5???^?g?~3D7' == '?G.???-)?????+?B?8????A???P?;"??P?ZU?5?$q4F?1c?-ŏC?S
?e?MIM???~?2?:?]b??\??/5???^?gn~3D7' ) {
echo 'Behaviour A',PHP_EOL;
} else {
echo 'Behaviour B',PHP_EOL;
}
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">Spot any difference? It may be hard to spot but there are actually subtle differences in the first of the chunks of binary data. </span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">So whats with all that binary data? The binary data represents a collision in the MD5 algorithm. Collisions are surprisingly easy to create for MD5 following an attack on MD5 first published by Wang in 2004. What’s more you can choose the starting value for the collision you want to create so it does not have to be at the start of the file. This is because MD5 in common with most hash algorithms operates on one block of data (64 bytes in MD5s case) at a time and then uses this value as the starting point for the next block. The next two blocks differ between the two files, however, they have been specially created to collide when inserted at this location. All subsequent blocks are the same in the two files. This is the method I used to generate the files. </span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">1) First I created a file which is exactly 64 bytes long and contains all of the content up to the first single quote. The A’s are just there to make it exactly 64 bytes long</span><br />
<pre class="brush: php"><?php
$space = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
if (‘
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">2) Run this content though the MD5 algorithm but a slightly modified version which does not add the length or any padding to the message, I used my own implementation of MD5 in PHP and commented out the parts where padding is appended. You can get the code for this here <a href="https://gist.github.com/natmchugh/fbea8efeced195a2acf2">https://gist.github.com/natmchugh/fbea8efeced195a2acf2</a>.This produces the hash 32f6df28e4a110970fb7f9938bee1686.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;">3) Use this hash as the initial value to find an MD5 collision. The fastest collision script I have found is Marc Stevens fastcoll See </span><span style="color: #042eee; font-size: 13px; font-stretch: normal;"><u>https://marc-stevens.nl/research/</u></span><span style="font-size: 13px; font-stretch: normal;"> . If creating collisions in MD5 sounds like it will take a long time - it doesn’t. It varies how it takes but in this case it took about 15 seconds to produce 2 messages that collide with the initial value of 32f6df28e4a110970fb7f9938bee1686.</span></span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<br />
<pre class="brush: bash">$ time ./fastcoll -i 32f6df28e4a110970fb7f9938bee1686 -o a b
MD5 collision generator v1.5
by Marc Stevens (http://www.win.tue.nl/hashclash/)
Using output filenames: 'a' and 'b'
Using initial value: 32f6df28e4a110970fb7f9938bee1686
Generating first block: .....................................
Generating second block: S10.............................................
Running time: 14.9727 s
real 0m14.981s
user 0m14.969s
sys 0m0.006s
</pre>
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">4) I then appended the two messages produced to file created in 1 to give two files.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">5). The two files should now be of equal length and have the same MD5 however they will not run as PHP. So I added the statement between the two lots of binary data. </span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">‘ == ‘</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">6) Next I appended the collision a to both files. This produces a statement that will be true in file a (since the 2 binary blobs are indeed the same) but false in file b.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">7) I appended the same PHP to both messages to complete the if statement and produce the output you see. So effectively we have two files where the first 64 bytes are identical. The next 128 bytes are a collision in MD5 and then all other text following that is identical. Not surprisingly this produces the same MD5 hash when passed though the whole MD5 algorithm including padding.</span><br />
<span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 13px; font-stretch: normal;"></span><br /></span>
<span style="font-family: "arial" , "helvetica" , sans-serif; font-size: 13px; font-stretch: normal;">While this attack may appear useless since the files are filled with gobbledygook and rely on a fairly obvious if statement there are methods to create files with the same MD5 hash which do not rely on a switch and where the rest of the file contents can be completely different</span>Nathttp://www.blogger.com/profile/12454723595745755348noreply@blogger.com0Sheffield, South Yorkshire, UK53.381128999999987 -1.4700850000000453.078144999999985 -2.11553200000004 53.684112999999989 -0.82463800000004006