Pages

Monday, September 7, 2015

MD5 collisions in ssh keys


You can insert MD5 collision blocks in many data formats and if you do it right the result will be 2 objects that differ but which when passed through MD5 have the same hash value.

MD5 hashes were used as a unique fingerprints in openssh right up until version 6.8  released 2015-03-18. The idea behind public key fingerprints is that as a client you should be able to uniquely identify the public key of a server you are connecting to. Your client will show you the fingerprint of the public key and hopefully you will recognise it. To aid this you can set your ssh client to draw you a pretty piece of random art based on the hash giving you a visual clue if something has changed.


Host key fingerprint is 16:27:ac:a5:76:28:2d:36:63:1b:56:4d:eb:df:a6:48
+--[ RSA 2048]----+
|        .        |
|       + .       |
|      . B .      |
|     o * +       |
|    X * S        |
|   + O o . .     |
|    .   E . o    |
|       . . o     |
|        . .      |
+-----------------+

That is an example of connecting to github. You can check that fingerprint here.

MD5 is no longer considered collision resistant and in fact it is relatively simple to insert collisions in any binary format. This means you can create two ssh public keys that will have the same fingerprint in MD5 and same random art image. All we need to do is take a pre generated ssh public key cut out 128 bytes out of the modulus and replace it with a pair of collision blocks

Here are a couple of 2048 bit public key files to which this technique has been applied:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQD74ICRUoaAXkRwqGW51botLrFygSC1SN3C7NR6cfFmpSdRuXZt6d20+GvAzOQKJpPTItK3YRjds1izapAqSM6o/BYWOikTKLHgzJ2s5Lwm6MZnPsnRJ9o9KbnXCytCrj3wFCcay0Re+TJrlPvXggFZlane2yMit2rWx/Fbly44wNzsJLIGTIhsz6UnfmR7Wi7GfiMZ2p48xn3rSHJ3lNPTOzWfg3PFgc2YftzUow1TQ5xaamyPRD/UOXBOj2pfCp75v3TWDwwq3qliBV6pxUhe/ou8ut1zaRET5VTN4kFNziQZoxY4WVmCnCJxjBPt71hM9VzoujZggRNXct8uPdri hostname

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQD74ICRUoaAXkRwqGW51botLrFygSC1SN3C7NR6cfFmpSdRuXZt6d20+GvAzOQKJpPTItK3YRjds1izapCqSM6o/BYWOikTKLHgzJ2s5Lwm6MZnPsnRJ1o+KbnXCytCrj3wFCcaS0Re+TJrlPvXggFZlane2yMit2rWx/FbFy44wNzsJLIGTIhsz6UnfmR7Wi7GfiMZ2p68xX3rSHJ3lNPTOzWfg/PFgc2YftzUow1TQ5xaamyPRD/UOXBOj2pfCp75v3TWDwwq3qliBV6pxUhe/ou8ut1zaRET5VTN4kFNziQZoxY4WVmCnCJxjBPt71hM9VzoujZggRNXct8uPdri hostname
Despite looking pretty much identical they do in fact have different moduli. To check the fingerprint you can use ssh-keygen.
$ ssh-keygen -l -E MD5 -v -f 1.pub
2048 MD5:9e:a0:13:86:e0:40:06:59:70:84:b3:7e:57:1a:5d:1d hostname (RSA)
+---[RSA 2048]----+
|+O+       .E.    |
|*.       . .     |
|oo    . .        |
|+. . . o         |
|... o = S        |
| . o = o .       |
|  . +   o        |
|     .           |
|                 |
+------[MD5]------+

$ ssh-keygen -l -E MD5 -v -f 2.pub
2048 MD5:9e:a0:13:86:e0:40:06:59:70:84:b3:7e:57:1a:5d:1d hostname (RSA)
+---[RSA 2048]----+
|+O+       .E.    |
|*.       . .     |
|oo    . .        |
|+. . . o         |
|... o = S        |
| . o = o .       |
|  . +   o        |
|     .           |
|                 |
+------[MD5]------+

To prove they are not the same try replacing MD5 in the above command with anther hash algorithm like SHA1 or SHA256.

The problem is after messing with the moduli they almost certainly don't have two prime factors of length ~ root n and even if they did I have no idea what it might be without trying to factor them which is hard. So effectively they now have no private key which makes them useless. They do in fact have trival factors which makes them doubly useless.


Is there a way to insert a collision into a key and still maintain basic RSA functionality?



Well it turns out this at least partially quite simple too. You see following our collision blocks we can put what ever additional value of the modulus we like as long as it's the same in both keys. So if we imagine the modulus being made up of three parts concatenated together A|B|C,  A and B are essentially fixed but we can adjust C until we find an n which meets the RSA requirement that 

n = A|B|C = p .q 

Where p and q are prime and of length ~ half that of n. The best way I found to do this is to generate a random C, generate a random prime p of the correct length, divide our n made up of A|B|C by p to get the nearest integer q. Then test q to see is prime. If is we can move on  however it is very unlikely that A|B|C will divide exactly by p and q without remainder. But if we multiply our p and q together they will quite likely give us a new n which can be written in the form A|B|D where D is a new number. This n still has the collision blocks B in it and we have the two prime factors p and q. The whole process takes a few seconds to run on a normal laptop. It helps to have a larger public key and so a longer C to mess around with so I have used 4096 bit public keys.

With p and q we can reconstruct the ssh private key to match public key we just created.



What about the other private key?Here we face a pretty much insurmountable problem. The value at the end of the modulus following the collision blocks needs to be the same in both public keys. However, in order to be able to build a private key for the modulus we would need to know two prime factors the likelihood of finding two is extremely small. 


So this attack is impractical?

Yes as far as I can tell collisions in ssh keys of this sort don't throw up any security concerns. Hopefully you are all running newer versions of openssh and now comparing keys with sha256 anyway.

Tuesday, July 7, 2015

Talk Talk

In June I gave talks at both BSides London and the Dutch PHP Conference on the subject of hash functions. I really enjoyed both experences and they both helped me improve my public talking.

At Bsides London I was part of the rookie track which meant I got an excellent mentor in the shape of Yiannis Chrysanthou. It also meant I only had 15 minutes to cram something interesting into which was good as it made me concentrate on the important parts

The Dutch PHP Conference or DPC is a conference I have attended twice before. This year there was no way I could go not as a speaker as they covered travel and accommodation as well as entry to the conference. It was again an excellent conference with great organisation and venue.

Wednesday, July 1, 2015

Github ssh keys some experiments

About a month ago I read this excellent piece of work https://blog.benjojo.co.uk/post/auditing-github-users-keys  . My first reaction was kick myself for not thinking of it before. It reminded me of this paper https://factorable.net/paper.html and associated the presentation which is pretty special. One of the major tools used in that paper was use of a batch version of the Greatest Common Divisor algorithm that can efficiently find common factors in large numbers of semi primes. Common primes should never happen in correctly functioning ssh key generation. However they are known to happen when things go wrong. 

The batchGCD algorithm didn't seem to have been used to examine the keys in the database. I was in good company in spotting this possible extension.


At the time I expected to see people reporting the discovery of further bad keys discovered using batchGCD over the next few days. A few days latter I came across http://cryptosense.com/batch-gcding-github-ssh-keys/ which looks like exactly that just in a more restrained writing style. 

In the mean time I have been having a poke around myself.

Here are some observations I have made from my experiments with the keys on github and uploading new keys.


  • There is now a minimum key length of 1024 bits enforced when submitting keys with a suggestion of 2048 bits
  • There are no keys below 1024 bits still in the database all shorted ones have been removed.
  • You cannot add a public key from a pair generated with the debian bug. If you do you will see this message

  • There are very few obviously bad keys (e.g. with trivial moduls factorisation) on github. I found a handful all very old.
  • I could not find any keys with common divisors in the database. It looks like somebody has been through with batchGCD notified github and they have revoked them
  • New keys added are not checked to see if they have a common divisor but presumably will eventually get discovered
  • Github seems to receive a lot of spam signup these days where users are listed in the api but deleted for viewing