The Debian OpenSSL disaster

I’ve read some more about this and especially had a look at some of the source code, so I’ve completely revised this blog post.

There is no doubt that the Debian Maintainer who added this patch screwed up, seriously. But he’s not the only one to blame:

OpenSSL isn’t valgrind-clean. It should be made, since valgrind is a very important debugging tool; the bad patch was introduced to make OpenSSL better in this respect due to the request by the users.
The OpenSSL source code could be better documented in these places.
There are two instances of this line in OpenSSL which can both generate the ‘using uninitialized data’ warning. One is safe to remove (it’s supposed to fill the buffer with random data, and just uses the existing contents as additional source of randomness), the other is not (it’s used to feed randomness coming from e.g. /dev/random into the pool).
The Debian maintainer received the reply “if it helps with debugging, I’m in favour of removing them” by one of the current OpenSSL devs on the openssl-dev mailing list. Probably just referring to the ‘safe one’ of the two locations where this occurs, though?
Nobody noticed the severity of this change for more than 2 years. We’re all to blame.

I’m really sick of hearing comments like “Still, whoever took out the entire initialization should not be trusted with security intensive code.” (guest comment on LWN). Yes, he screwed up there. But you bet he’s going to be a lot more careful with any change in the future: he has learned his lesson. Better than having someone else screw up in a similar way again. And actually he didn’t do this change half as easy-hearted as many people suggest, if you look at the discussions on the bug report and mailing lists. He was trying to fix the valgrind bug, and he talked to several people on how to do it properly.

The best way the bug could have been avoided was if the OpenSSL upstream developers had cared more about the valgrind issue themselves. (E.g. by teaching valgrind to ignore the issue, documenting in the source code why this is intentional and not an issue, …)

P.S. in the previous version of this blog post I had asked why OpenSSL apparently relies on uninitialized data for security. It doesn’t; the same lines exist in two places, and it’s the other change that caused the problems. One place will ‘usually’ generate the warning, and it’s not important (that’s the place where you could just remove the line). The other place will generate the same warning when someone passes uninitialized data into the RNG. As long as the RNG is sufficiently seeded with other random data that isn’t much of a problem. So if you take a buffer of 1024 bytes and fill it with 1000 bytes of good random data (e.g. from a hardware randomness source) and feed the whole buffer to the RNG, it will be seeded quite well, but the warning will be generated. The 24 uninitizialized bytes won’t take the entropy away.

(This blog does intentionally not have a comment function. Sorry.)