Oh, cool! Have you considered making changes to the interface that's presented t...

tytso · on Oct 15, 2013

Why do you need so much random numbers such that /dev/urandom is too slow for you?

There are plenty of uses of randomness. Let's look at some of them:

1. Monte Carlo Simulations

2. Wiping disk files that might contain sensitive information.

3. Padding in crypto applications

4. Initialization Vectors (IV's)

5. Session keys / D-H key agreement

6. Long-term keys for users

7. Country-level TLD keys for things like DNSSEC

These uses have been rank ordered roughly in order of cryptographic strength that might required. Very often, I've found that people who are complaining about the speed of /dev/urandom are using it for something that it wasn't designed for; for example, "dd if=/dev/urandom of=/dev/hda" to wipe a disk.

And similarly, for people who insist that there is no good difference/distinction between /dev/urandom and /dev/random, they are only thinking of their use case, and/or they aren't considering that there might be a difference between generating a long-term public key for GPG (which today tends to use /dev/random for this purpose, and I think rightly so), and generating a one-time session key when encrypting a message using GPG. Since you only do the first once, a bit more inconvenience might be tolerated by the user. (Today this takes about ten minutes on a laptop, and part of the reason for this is that Chrome is positively profligate with how it uses randomness, including a 32k read from /dev/urandom[!] at startup, and this is draining the entropy available for /dev/random, which slows it down significantly. I have a change pending upstream which time-limits the amount of entropy that can be drawn from the input pool to the nonblocking pool to protect our entropy from abusive users of /dev/urandom, and I have on my todo list tracking down some chrome developers to figure out what the heck they are doing with all of their demands on /dev/urandom.)

The reality is that strong random numbers is a precious resource, and you can't use it willy-nilly. Furthermore, the strength of random numbers is a continuum; it is not a binary "strong" vs. "not strong" distinction. I'm sure it would be very convenient for userspace programmers to be able to assume that randomness is always freely available, and that it only comes in two flavors, "insecure" and "secure", but that's not the way the world works.

Even in the case of hardware random number generators, there are tradeoffs. This is why future Intel processors will be making a distinction between RDRAND (for which you can't assume that two 128 bit RDRAND outputs, when put together, will give you 256 bits of entropy), and RDSEED (which does have this guarantee). And of course, there is also the tradeoff between the auditability of /dev/random and the non-auditability of something which is locked in a complex piece of Silicon.

TL;DR. I'd like to have a Personal Gulfstream Jet, and enough money that it's not a big deal to consider a $12 million dollar home in La Jolla to be teardwon fodder. But sometimes reality intrudes on your dreams/fantasies....

tptacek · on Oct 15, 2013

This is a great comment, but I disagree with it.

The hierarchy of secrets you've come up with sounds useful, but in practice:

* An attack based on entropy on a CSPRNG that ran months or even days ago, barring blatant flaws like zero cold start entropy, isn't realistic†. And, if you're paranoid about your "long term keys", generate them on a different machine than the one on which you serve up disk-wiping entropy. That's what you're supposed to do anyways. But I'm a little concerned that such a suggestion dignifies an unrealistic risk.

* Meanwhile, the notion that "lesser" secrets, like IV and padding, can deal with less randomness doesn't seem to bear empirical scrutiny. The attacks that have tended to matter against cryptosystems in the last few years have, with the exception of the P's and Q's zero-cold-start-entropy issue, not targeted long-term secrets, but rather have hinged on details like padding (random padding in the OAEP/PKCS#1 case, I guess; most crypto padding is deterministic).

* The mentality of "long term secrets" and "short term crypto parameters" that you laid out already missed a doozy: DSA and ECDSA nonces, which people who think like your comment did thought of as "undemanding", but for which it turns out that just being able to predict a couple of bits of that parameter can get you with a few thousand samples all the way back to private keys.

I think you'd be better off, cryptographically and from a software design perspective, with a single CSPRNG that was suitable for all cryptographic use cases, and, if users are paranoid, have them tap a physically different CSPRNG for their "long term keys" --- which they need to do anyways on the presumption that the systems that they use routinely are also probably owned up.

Also, I have to again point out that the Linux CSPRNG design doesn't handle any of these cases well. Its /dev/random is unusable by demanding applications in steady state, with the possible (terrible) workaround of reimplementing CSPRNGs seeded from small amounts of /dev/random data in userspace (and all the attendant flaws); its /dev/urandom is unsafe at cold start. Something's gotta give in that design: it's doing extra work internally and inflicting extra work on developers, all for a worse result than FreeBSD.

† Re: realism: it's always easy to argue "attacks always get better, we should account for every attack, realistic or not". But engineering is about tradeoffs, and here the tradeoff Linux is making is to service an attack unknown to the literature that is likely precluded by the basic design of a modern refreshing CSPRNG, and doing so at the expense of protecting against very realistic attacks that have been discovered in the wild; it doesn't help that Linux (in general) seems to respond to that point by suggesting that userland devs should have done a better job.

Oh, dubiously-informed CSPRNG opinions; I sure do have them.

tytso · on Oct 15, 2013

It's true that many of the attacks (such as BEAST) rely on attacks related to the IV. However, look at underlying mechanism of that attack; it requires that the attacker be able to predict the entire IV (which it can do since it can use the CBC residue from the previous packet). If the attacker can predict 64 out of the 128 bits of an IV used for AES, that's less of a disaster than if the attacker can predict 64 out of the 128 bits for an AES key. So there is a difference, although I might agree with you that relying on application developers to understand these distinctions might be unrealistic.

You're right that for DSA nonces, even a partial failure of the RNG is a disaster; to me, that's more of an indictment of DSA as being really extremely fragile.

I agree that making /dev/urandom to be as secure as is possible, as quickly as possible after a cold start, is a good thing, and indeed most of my efforts have been focused in that area. I'm not convinced the FreeBSD approach is a panacea, since that presumes FreeBSD's entropy estimator is sufficiently accurate. And I'll note that even with the Fortuna design, which completely gives up on the entropy estimator, it's still subject to the cold start problem --- you just don't know when Fortuna has accumulated enough entropy such that it is secure. In that regard, it's no better than Linux's /dev/random approach.

I'm not opposed to some adding a cold-start block to /dev/urandom, and ultimately, I might go there just because it's easier than making recommendations that userspace wait until the last possible moment to lazily generate keying materials. However, it's a bit of a bandaid, so I'm more interested in trying to collect as much entropy as possible right after boot, to narrow the window after a cold start.

FreeBSD's approach certainly won't help with the criticism leveled by the "/dev/random is not robust" paper, since it assumes that fundamentally the entropy estimator is broken. But if it's broken, then FreeBSD Yarrow will also not be robust in the face of the cold start problem. So it's certainly not going to help people who are going to kvetch about entropy estimators being hopeless.

dlitz · on Oct 17, 2013

> The reality is that strong random numbers is a precious resource, and you can't use it willy-nilly. Furthermore, the strength of random numbers is a continuum; it is not a binary "strong" vs. "not strong" distinction. I'm sure it would be very convenient for userspace programmers to be able to assume that randomness is always freely available, and that it only comes in two flavors, "insecure" and "secure", but that's not the way the world works.

Entropy may be a scarce resource on the input side, but it's a big mistake to export that scarcity to userspace, rather than abstracting it away. It's the sort of intuitive decision that made sense in the late 1990s---when we also thought that the "error propagation properties" of CBC mode were useful rather than a liability, that CTR mode was scary, and that compressing plaintext was a good idea---but it was wrong and it's due to be revisited. The idea that developers---as a group---can reliably distinguish between multiple types of strong, random numbers, and then correctly implement PRNGs in userspace that take into account those differences, is nothing but pure fantasy.

Applications don't actually need scarce "entropy" (of the information-theoretic kind that you seem so obsessed about). What they need is an abundance of unpredictable bytes that pass the next-bit test, as quickly and straightforwardly as possible.

There is an entire class of vulnerabilities that only exists because of the artificial scarcity of unpredictable bytes: Since applications can't just get all of their unpredictable bytes from /dev/urandom, they use crypto libraries that implement their own PRNGs in userspace (where it's more complex to do, especially in a library, thanks to forking, threading, signals, etc.). Inevitably, these libraries get something wrong, but nobody notices for years.

Quoting a random link: http://comments.gmane.org/gmane.comp.emulators.libvirt/73841

    [libvirt] [PATCHv5] virtio-rng: Add rate limiting options for virtio-RNG

    Qemu's implementation of virtio RNG supports rate limiting of the
    entropy used. This patch exposes the option to tune this functionality.

Rate limiting?? So, if we have Apache running on a VM, we end up with something like this:

environment -> host kernel -> egd -> host kernel -> qemu master -> fork -> rate limiter -> qemu slaves -> guest kernel -> egd -> guest kernel -> apache+openssl -> fork -> apache+openssl

All of this complexity is madness, and completely unnecessary. Cryptographic protocols don't actually distinguish between varying "quality" of random numbers, and neither does the literature, so it makes no sense to export an interface that expects people to make that distinction. /dev/random is nearly useless and it doesn't even make up for its lack of availability by being academically rigorous. As tptacek mentioned, key generation actually isn't anything special (DSA needs unpredictable numbers every time, for example). Even where /dev/urandom might be good enough, the lack of rigor in its design makes some people nervous, so they use some userspace PRNG anyway. The result is that random number generation on most machines is a complex mishmash of overcomplicated PRNGs connected in overcomplicated ways, resulting in an endless stream of bugs that are as embarrassing and harmful as all of the buffer overflows caused by the continued misuse of strcat().

Speaking of RNG bugs, here's one that I fixed today: http://lists.dlitz.net/pipermail/pycrypto/2013q4/000702.html

    An application may be affected if, within 100 milliseconds, it performs
    the following steps (which may be summarized as "read-fork-read-read"):

    1. Read from the Crypto.Random PRNG, causing an internal reseed;
    2. Fork the process and invoke Crypto.Random.atfork() in the child;
    3. Read from the Crypto.Random PRNG again, in at least two different
         processes (parent and child, or multiple children).

This bug wouldn't even have the opportunity to exist if I could have just read everything from /dev/urandom---which is what I proposed back in 2008, but some of my users complained, citing the lack of speed and rigor in /dev/urandom, so I implemented the most reliable RNG that I knew about at the time (Fortuna), but it was broken in a case where the state-copying behavior of fork() interacts with the rate-limiting behavior of Fortuna, which---ironically---is designed to protect users against attacks that exploit the scarcity of random numbers. Perhaps I should have ignored them, and been as "profligate" with /dev/urandom as Chrome is, but then they might have individually just implemented something on their own, which wouldn't necesasrily be better for end-users.

The fundamental problem is that you're treating /dev/*random as a device driver that provides scarce "entropy", when what's needed is a cryptographic service that provides unpredictable bytes in abundance. You say that /dev/urandom wasn't designed to do "dd if=/dev/urandom of=/dev/hda", and my point is that it ought to be.

chalst · on Oct 15, 2013

> The distinction between /dev/random and /dev/urandom is silly

That the problems with the two sources are completely different tells you that the distinction is not silly.

Your speed requirement sounds to me like you don't want a /dev at all, but maybe a daemon that seeds itself from /dev/random and frequently refreshes itself from /dev/urandom. This wouldn't be kernel space and it wouldn't have threading issues.

tptacek · on Oct 15, 2013

First, other operating systems lose the distinction, and maintain the two device filesystem names solely for historical purposes. FreeBSD, for instance, uses Yarrow and doesn't distinguish between urandom and random.

Second, he just explained why the distinction isn't useful in practice. The sole difference between random and urandom is that random blocks based on an entropy estimate. But that's silly: the only time you want random to block in practice is during cold start, but instead random blocks unpredictably all the time and is unusable. Meanwhile, urandom is effective virtually all of the time... except on cold start (on solid state devices), where it would be OK for it just once to block.

I'm not sure how a userspace daemon helps performance, since it implies roundtripping through the kernel anyways.

chalst · on Oct 15, 2013

Userspace: Being in userspace obviously doesn't increase speed, but I don't think the overhead is material to dlitz's speed concern. The point is that we can cook up things in userspace that might meet dlitz's requirements: this offers flexibility and the possibility of portability, provided the basic ingredients are available from the OS.

Distinction: FreeBSD's /dev/random device allows the desired behaviour (modulo the performance concern), once you issue `sysctl kern.random.sys.seeded=0`, which ensures /dev/random blocks if there isn't enough entropy (the command, in effect, says the source needs more entropy to be trustworthy). I think this is a better behaviour than Linux's, but Linux's behaviour in turn seems to be better than Darwin/OSX's, which AFAICS (`sysctl kern` there shows no settings that obviously influence /dev/random) gives you no opportunity to block before you have enough entropy.

Being able to block until there is enough entropy is essential to what dlitz wants, and the random/urandom distinction offers that.

tptacek · on Oct 15, 2013

OS X took its /dev/random from FreeBSD. The CSPRNG design is Yarrow. It explicitly unifies urandom/random; they're the same device.

Colin may correct me on this, but I looked it up again to be sure.

/dev/urandom is the Unix standard for crypto randomness; I don't think there's much point in being more portable. Any Unix on which you're going to deploy strong crypto should provide a workable /dev/random.

I'm not sure what a userland daemon could do for you that the kernel couldn't do. The parent commenter's demands are as rigorous as any CSPRNG user's could be: he's building a crypto library and can't assume his users are casual. If a CSPRNG can be made faster for his use case, there's a strong argument that the kernel should provide that performance by default.

chalst · on Oct 15, 2013

FreeBSD's current implementation behind /dev/random is more sophisticated that Darwin's - just look at the manpages. I don't see how you can tell if Darwin doesn't have enough entropy, which you can with both Linux and FreeBSD.

tptacek · on Oct 15, 2013

I might be looking at an older FreeBSD man page. COLIN? Where are you?

It's also worth noting that Yarrow is a Ferguson/Kelsey design, and Ferguson has foresworn entropy estimation; hence Fortuna, his newer design.

The whole notion of the kernel CSPRNG having "enough" entropy is pretty silly for systems in steady state. "enough" entropy is important at cold start, but rarely otherwise.

Also worth noting that "just read urandom" is Daniel J. Bernstein and Peter Schwabe's recommendation, based on the NaCl design papers. It's also Golang's whole CSPRNG approach, which inasmuch as it's owned by anyone is owned by Adam Langley.

chalst · on Oct 15, 2013

The current man page for FreeBSD random (4) is at http://www.freebsd.org/cgi/man.cgi?query=random&apropos=0&se...

whose HISTORY section reads:

     A random device appeared in FreeBSD 2.2.  The early version was taken
     from Theodore Ts'o's entropy driver for Linux.  The current software
     implementation, introduced in FreeBSD 5.0, is a complete rewrite by Mark
     R V Murray, and is an implementation of the Yarrow algorithm by Bruce
     Schneier, et al.  The only hardware implementation currently is for the
     VIA C3 Nehemiah (stepping 3 or greater) CPU.  More will be added in the
     future.

dlitz's worry was about initial entropy, which is why he wanted his ideal random device to block initially - this makes a lot of sense. That said, I guess if you are worried about sophisticated attackers [+] getting to see very long stretches of output from your random device, you might want to ensure you are periodically injecting some fresh entropy in there. This should be ensured in the system, so the luser can just read /dev/urandom, but it should be possible to check that the system has fresh enough entropy. FreeBSD's interface seems better suited for this than Linux's.

[+] Let's think of the topical "sufficiently sophisticated attacker", who might have secret knowledge of problems with the CSPRNG we use.

bungle · on Oct 15, 2013

> /dev/random will still block after reading a few bytes.

Have you tried Haveged? http://www.issihosts.com/haveged/

JoachimS · on Oct 16, 2013

Havege and other CPU-based entropy sources are interesting. But they do implicit assumptions such as that the TSC is somehow giving you real measurements (and is not virtualized for example) and that the CPU has cache memories, do out-of-order issue etc which makes the exection time variable.

We have tested Havege and on modern x86 systems you do get good numbers. But there are embedded SSL stacks that use Havege on such things as AVR32 cpus where the exection time is much more predictable. I have yet to test in those environments, but it does not look like a smart move.

Also, Havege includes an entropy estimator that seems to be broken. We used the Simics system simulator and forced the TSC to be stuck and could see that we got bad entropy from Havege. But the estimator happily said that the entropy was good.

My suggestion is that if you are to use Havege and friends, use them to feed the OS entropy pool, not as a replacement for /dev/random, /dev/uramndom.

And speaking of friends. Jytter is one such friend. I have yet to test it with Dieharder though.

http://jytter.blogspot.se/

JoachimS · on Oct 18, 2013

I have now tried Jytter on OSX (MBA with Ivy Bridge Core i7). I get about 50 kbps from jytter. Testing the data with Dieharder I get about 82% success rate on the test cases.