
Random numbers from CPU execution time jitter


By Jake Edge
April 29, 2015

Availability of entropy, especially for embedded devices early in the kernel boot process, is a commonly discussed problem in the kernel random number community. The usual sources of entropy depend on events like the timing of user input or disk interrupts that may not be present yet—or at all. A patch set from Stephan Müller seeks to remedy that by using entropy that is collected from a source that is always present: CPU execution time jitter.

On a modern processor, there are many different factors that can impact the amount of time it takes to execute the same set of instructions. If one measures that execution time precisely multiple times, it will show variation—jitter. An attacker who doesn't have any special hardware-level access to the CPU cannot predict this jitter, which makes it a good source of entropy, according to Müller's lengthy paper on the technique.

There are numerous CPU-level activities that lead to unpredictability in execution time. The fill level of the instruction pipelines, memory wait states, instruction and data caches, branch prediction, interrupts, power management, frequency scaling, and so on can all contribute to changing the execution time. As Müller's paper shows, a wide variety of today's processors show enough jitter (in a statistical sense) to be used as an entropy source for the kernel's random number pools.

The heart of the algorithm to implement Müller's jitter measurement gathering lives in the jent_measure_jitter() function. It is effectively a "random bit generator", since it returns a single bit that has been calculated based on the jitter measured in that function. The first step is to introduce some noise into the measurement based on memory reads and writes. The jitter entropy module allocates a 2KB buffer during initialization that it loops through and simply adds one to the value stored there (which causes both a load and a store). The buffer is larger than the L1 cache of the processor, which should introduce some unpredictable wait states into the measurement.
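As a rough sketch (names and the buffer-walking details are illustrative, not the actual kernel code), the memory-access noise step might look like this:

```c
/* Hypothetical sketch of the memory-access noise step: walk a buffer
 * larger than the L1 data cache, incrementing each byte (one load plus
 * one store per byte), so cache fills and wait states perturb the
 * subsequent timing measurement. Not the actual kernel code. */
#include <stdint.h>
#include <stdlib.h>

#define MEM_SIZE 2048  /* chosen to exceed many L1 caches */

static unsigned char *jitter_mem;  /* allocated once at init time */

static void jent_memaccess(void)
{
    /* Touch every byte: the resulting cache misses and memory wait
     * states are one of the sources of execution-time jitter. */
    for (size_t i = 0; i < MEM_SIZE; i++)
        jitter_mem[i] += 1;
}
```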

It then reads a timestamp and calculates the delta from the previous timestamp. That delta is used in a "deliberately inefficient" calculation that folds the value down to a single bit. There are faster ways to perform the folding operation, but the slow fold is deliberate: the algorithm's own execution time is part of what is being measured for jitter. To preserve those measurements, compiler optimization for the jitter random number generator (RNG) must be turned off. In fact, a test in the code will cause the build to fail if optimization is enabled, to avoid the possibility of getting bad random numbers from a misconfigured build. The comment for the jent_fold_time() function explains a bit further:

This function is the root cause why the code shall be compiled without optimization. This function not only acts as folding operation, but this function's execution is used to measure the CPU execution time jitter. Any change to the loop in this function implies that careful retesting must be done.
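A minimal sketch of such a bit-by-bit fold, assuming a simple parity-style loop (the real jent_fold_time() differs in its details), could be:

```c
/* Illustrative sketch only: fold a 64-bit time delta down to one bit
 * by XORing every bit, one loop iteration per bit. In the real code
 * this must be compiled without optimization, because the loop's own
 * execution time is part of what is being measured. */
#include <stdint.h>

static uint64_t jent_fold_time(uint64_t delta)
{
    uint64_t folded = 0;

    /* Deliberately bit-by-bit rather than a parity intrinsic: the
     * slow loop is itself a source of measurable jitter. */
    for (int i = 0; i < 64; i++)
        folded ^= (delta >> i) & 1;

    return folded;  /* 0 or 1: the parity of the delta */
}
```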

The jent_gen_entropy() function generates a 64-bit random number by calling jent_measure_jitter() 64 times. Getting larger amounts of random data is done with jent_read_entropy(), which takes a buffer and length and repeatedly calls jent_gen_entropy() to fill the buffer with the requested amount of random data.
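Assembling the 64 bits might be sketched as follows, with jitter_bit() as a hypothetical stand-in for jent_measure_jitter():

```c
/* Hedged sketch of how 64 one-bit jitter measurements could be
 * assembled into a 64-bit random value. jitter_bit() is a placeholder
 * for the real measurement function, not an actual kernel symbol. */
#include <stdint.h>

extern int jitter_bit(void);  /* returns one measured bit, 0 or 1 */

uint64_t jent_gen_entropy_sketch(void)
{
    uint64_t val = 0;

    /* shift in one freshly measured bit per iteration */
    for (int i = 0; i < 64; i++)
        val = (val << 1) | (jitter_bit() & 1);

    return val;
}
```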

Adding the CPU jitter RNG is only part of what the patch set does, however. The kernel's deterministic random bit generator (DRBG) is currently initialized by making a call to get_random_bytes(), which uses the non-blocking random number pool. In certain circumstances (e.g. for some embedded devices or virtual machines), that pool will not have been seeded from enough events to provide the entropy required.

In mid-April, kernel crypto subsystem maintainer Herbert Xu asked Müller whether the current DRBG implementation was compliant with the US National Institute of Standards and Technology (NIST) SP 800-90A specification for DRBGs that specifies seeding DRBGs from non-deterministic sources. Since the worst case for get_random_bytes() is that it is completely deterministic, Xu felt that some other mechanism should be used to seed the kernel's DRBG so that it complied.

Müller had already proposed inclusion of his CPU jitter RNG in October 2013. He used that code as the basis for this new patch set. Instead of reusing the existing blocking pool (i.e. the one that feeds /dev/random), though, his patch creates a new kernel_pool that is only accessible to the kernel itself. That eliminates a kind of denial-of-service attack where a user-space program can continuously read /dev/random to consume all of the entropy being generated by the system.

The DRBG is then seeded early in the boot process from a combination of get_random_bytes() and the jitter RNG. In addition, an asynchronous call is made for the required amount of random data from the new, blocking kernel_pool. It will only return once the required amount of entropy has been gathered by the system and the random data returned will be used to reseed the DRBG. Thus, the DRBG is always seeded with non-deterministic data early on—as long as the jitter RNG is actually producing random numbers.

Back in 2013, kernel RNG maintainer Ted Ts'o expressed skepticism about the jitter technique. He was concerned that the measurements were not as unpredictable as they appear to be—that a sufficiently knowledgeable attacker could determine enough of the state to predict the timing.

It may be that there is some very complex state which is hidden inside the the CPU execution pipeline, the L1 cache, etc., etc. But just because *you* can't figure it out, and just because *I* can't figure it out doesn't mean that it is ipso facto something which a really bright NSA analyst working in Fort Meade can't figure out. (Or heck, a really clever Intel engineer who has full visibility into the internal design of an Intel CPU....)

Effectively, he was worried that the entropy estimation for the jitter measurements was too high, perhaps far too high.

Ts'o has not commented on the latest patches, at least yet. In fact, there haven't really been any technical comments on the patches as yet. Xu seemed to indicate that he is generally in favor of Müller's solution for the DRBG. If the patches do get merged, perhaps other users for the jitter RNG will emerge. It is a fairly straightforward and speedy mechanism for collecting entropy—the question is how much of that entropy is "real".





Random numbers from CPU execution time jitter

Posted Apr 30, 2015 7:25 UTC (Thu) by alonz (subscriber, #815) [Link]

FWIW, I share Ts'o's reservations…

I spent quite a few years designing SoCs and embedded systems based on them, and the system design process actively aims to reduce non-determinism at all levels – including, in particular, CPU timings.

At least during early boot (before the system communicates with external components) the only sources of timing non-determinism are stray capacitances or environmental heat; on a well-designed system, we usually saw a variance of less than 1% in boot process timing (measured at the clock-cycle level).

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 9:00 UTC (Thu) by epa (subscriber, #39769) [Link]

You may be right, but the thing about an entropy pool is that mixing in some new data can never make things worse. Even if this jitter measurement turns out to be totally and trivially predictable, it will not make the random number generator easier to break than it would be without it. So you may as well throw together ten different entropy sources: even if none of them is individually guaranteed to be unpredictable, it is unlikely an attacker could predict or control all ten.

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 9:11 UTC (Thu) by shmget (guest, #58347) [Link]

"The conventional wisdom is that hashing more entropy sources can't hurt: [...]
The conventional wisdom says that hash outputs can't be controlled; the conventional wisdom is simply wrong."

http://blog.cr.yp.to/20140205-entropy.html

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 10:09 UTC (Thu) by epa (subscriber, #39769) [Link]

But what if z comes from a malicious source that can snoop on x and y?
This is an interesting thing to consider but it is not usually that relevant. If my understanding of the article is correct, the assumption is that the attacker cannot snoop on the other entropy sources normally, but can somehow influence the generation of the new entropy source so that it takes into account the others.

So you would have to suppose some means of influencing the CPU jitter measurements that requires knowledge of another entropy source, but at the same time suppose that the other entropy source is not normally predictable by an attacker. This seems very far fetched.

The article goes on to make another argument: that adding more entropy is simply not needed. Once you have enough (say 256 bits) you can generate all the randomness from that. That may or may not be so, but it doesn't in itself add weight to the claim that adding new entropy sources is actively bad because they may be able to snoop on other sources (in some unspecified magical way) and so end up removing randomness from the result.

Random numbers from CPU execution time jitter

Posted May 3, 2015 9:44 UTC (Sun) by alankila (guest, #47141) [Link]

The attack outlined is probably not applicable to an entropy generator input situation. The key point is that the inputs are likely to contain the current seed of the random number generator in some form. E.g. if you have some new data x you want to feed into the pool, a straightforward solution is to update the random number generator state with "state = H(state || x)" where H is a hash function returning a suitably wide result. Since we are going to assume that the attacker is not already in possession of the seed, the attack is not possible.
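As a toy illustration of this update shape (using non-cryptographic FNV-1a purely for demonstration; a real pool would use a cryptographic primitive, and the names here are made up):

```c
/* Toy illustration of state = H(state || x). H here is FNV-1a, which
 * is NOT cryptographically secure; it only shows the shape of the
 * update: the old state is hashed together with the new input, so
 * prior entropy is carried forward rather than replaced. */
#include <stdint.h>
#include <stddef.h>

static uint64_t mix_in(uint64_t state, const unsigned char *x, size_t len)
{
    uint64_t h = 1469598103934665603ULL;        /* FNV offset basis */

    /* hash the old state first... */
    for (int i = 0; i < 8; i++) {
        h ^= (state >> (8 * i)) & 0xff;
        h *= 1099511628211ULL;                  /* FNV prime */
    }
    /* ...then the new input, whatever its quality */
    for (size_t i = 0; i < len; i++) {
        h ^= x[i];
        h *= 1099511628211ULL;
    }
    return h;  /* the new pool state */
}
```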

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 14:26 UTC (Thu) by tpo (subscriber, #25713) [Link]

> You may be right but the thing about an entropy pool is that mixing in some new data can never make things worse.

Your assertion is wrong unless you qualify it more precisely.

Let's say you have some entropy pool and add /dev/null as a further source to it. Now, depending on the size of the pipe that sucks randomness out of that pool it might be that the pool is empty - except for /dev/null.

So if instead of blocking you now continue to feed the pipe from /dev/null, then randomness disappears into complete determinism.

So I think you have to explain how the output of your entropy pool is actually mixed before asserting that "it never can make things worse".

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 15:08 UTC (Thu) by fandingo (guest, #67019) [Link]

Your criticism is predicated on a complete change of how entropy pools are used. Randomness is never "sucked out" of an entropy pool. New random data is folded into the existing data, and the overall pool never decreases in size. The output of an entropy pool is transformed data, too, so you're never giving out the seed data (because that would disclose state).

(This seems to confuse a lot of people when they look at the blocking behavior of /dev/random. The pool never depletes, but a calculation of the quality of the randomness in the pool -- i.e. the entropy -- causes blocking, not a depletion of the actual data.)

That's why adding data doesn't hurt. If you have an entropy pool that you trust at t1, folding in a bunch of low-quality data still leaves you with the original t1 randomness.

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 15:26 UTC (Thu) by tpo (subscriber, #25713) [Link]

Thanks for enlightening me!
*t

Actually still confused

Posted Apr 30, 2015 15:51 UTC (Thu) by tpo (subscriber, #25713) [Link]

On further searching around I find, that I still do not understand the basic mechanism of an entropy pool.

My current mental model of the likes of /dev/random is that one starts with a certain amount of gathered randomness R (the seed). Then, once someone pulls data from /dev/random, you apply some function f(R,t) or fn(R,fn-1) that calculates/generates random bytes from the initially gathered seed either incrementally via reiteration or by including some monotonically increasing input such as a clock.

Now, as you explain, I effectively am confused by the fact that let's say ssh-keygen or openssl keygen will block and wait to gather more entropy even after the machine has been running for months and thus has seen "infinite" amounts of randomness. What is the reason to repeatedly start gathering further entropy at that point if as you seem to imply generating an infinite amount of random bytes from the initial seed does not reduce the random quality of future generated random bytes?

I've checked "entropy" and "entropy pool" on Wikipedia, but either I misunderstand it or Wikipedia is confused when using phrases like "entropy depletion" and similar, which according to what you say is not possible inside an entropy pool.

Is there basic, coherent explanation of the whole mechanism somewhere?
*t

Actually still confused

Posted Apr 30, 2015 18:10 UTC (Thu) by fandingo (guest, #67019) [Link]

> Then, once someone pulls data from /dev/random, you apply some function f(R,t) or fn(R,fn-1) that calculates/generates random bytes from the initially gathered seed

Not exactly, and perhaps this can clear up some of the confusion. There is an entropy pool of data that is filled immediately and always stays full. Over time, this pool has new random data from a variety of sources mixed into it. As data is mixed in, the kernel estimates how much entropy it thinks is now in the pool and sets a counter appropriately. In the background, there is a kernel thread that checks a different output pool. If the pool isn't full, f(epool) is run to populate the output pool.

Both urandom and random are using the same f() and entropy pool, but they do get individual output pools. The only difference between urandom and random is that the background worker to populate random's output pool will block if the estimated entropy is too low.

> I've checked "entropy" and "entropy pool" on Wikipedia, but either I misunderstand it or Wikipedia is confused when using phrases like "entropy depletion" and similar, which according to what you say is not possible inside an entropy pool.

Check out this image: https://i.imgur.com/VIPLO2d.png from this paper: https://eprint.iacr.org/2012/251.pdf *

There is not much information and a lot of confusion about PRNG and CSPRNG. Linux implements a CSPRNG, and the CS stands for cryptographically secure. This means that a partial disclosure of PRNG state should not compromise the output both forwards and backwards. That's the heart of why the kernel says it "consumes" estimated entropy as the entropy pool data is used for output. The state must continually incorporate new entropy data and mix the pool, or else partial state disclosures can make outputted data predictable.

There are a lot of people that disagree with the blocking nature of /dev/random and how much of the CSPRNG operates. In particular, FreeBSD has a nonblocking /dev/random. They also use the very fast arc4 for their output function. Personally, I prefer the Linux CSPRNG because I like the security considerations, even though they come at a high performance cost. It's better to get high quality and secure random data (that includes urandom) from the kernel, and then feed it in as the key for a very fast stream cipher if that's what you need. (For example, `dd if=/dev/urandom of=/dev/sda` is a terrible misuse. Instead, use something like `openssl enc -aes128 -k "shred" < /dev/urandom > /dev/sda`.)

* This is an excellent paper that covers the CSPRNG in both an approachable and mathematical methodology.

Actually still confused

Posted May 1, 2015 10:10 UTC (Fri) by tpo (subscriber, #25713) [Link]

So I've been reading that paper [1] and think it is neither particularly clear nor precise. But my opinions about that paper are beside the issue that I'm interested in, which is how the random generator actually works.

The one thing that has become clearer to me - thank you! - is that there exists a mechanism to add input data to the entropy pool, which has the property of not reducing the existing entropy in the pool no matter what the entropy quality of the new input data is. I've not verified that claim, but assume it true, it being a long standing mathematical finding. That's good news to me.

However you write:

> There is an entropy pool of data that is filled immediately and always stays full. Over time, this pool has new random data from a variety of sources mixed into it. As data is mixed in, the kernel estimates how much entropy it thinks is now in the pool and sets a counter appropriately. In the background, there is a kernel thread that checks a different output pool. If the pool isn't full, f(epool) is run to populate the output pool.

I think the contentious claim here is "the entropy pool ... always stays full". If you mean "stays full" in the sense of "a stack that never gets an element popped out from it" then I agree with that, since the pool is a fixed-size structure that, even if it were "empty", still contains "something", even if it's "only all zeros". However, that is not what is relevant in this discussion. The relevant thing is that by generating random data from that pool you transfer entropy out of the entropy pool. I quote the paper:

"When k bytes need to be generated, ... k output bytes are generated
from this pool and the entropy counter is decreased by k bytes."

Thus if we measure "fullness" by the here relevant metric of "amount of entropy" contained in the entropy pool, then the pool is *not* always full and in fact sometimes even empty as in the case where you have ssh-keygen pulling random data out of /dev/random and blocking because the kernel is unable to refill the entropy pool from its entropy sources.

All this said, the above is only my understanding acquired by reading what I have been referred to and what I could find. My understanding may well still be insufficient and wrong. If you've put up enough with an ignorant of my likeness then I can fully understand that. Otherwise I'll be happy to hear more and try to improve my understanding of the matter.

Thanks,
*t

[1] https://eprint.iacr.org/2012/251.pdf

Actually still confused

Posted May 1, 2015 13:52 UTC (Fri) by cesarb (subscriber, #6266) [Link]

Think of the pool's entropy as a "measure of unpredictability". If the entropy is ten bits, for instance, you'd need at most 1024 guesses to find the pool state.

You should think of the random generator as having two separate and independent parts: the pool itself and the output function.

Inputs are mixed into the pool using a cryptographic function which takes two values: the previous pool state and the input value, and outputs the new pool state. This function thoroughly mixes its inputs, such that a one-bit difference on any of them would result on average to half of the output bits changing, and other than by guessing it's not possible to get from the output back to the inputs.

Suppose you start with the pool containing only zeros. You add to it an input containing one bit of entropy. Around half of the pool bits will flip, and you can't easily reverse the function to get back the input, but since it's only one bit of entropy you can make two guesses and find one which matches the new pool state. Each new bit of entropy added doubles the number of guesses you need to make; but due to the pigeonhole principle, you can't have more entropy than the number of bits in the pool.

To read from the pool, you use the second part, the output function. It again is a cryptographic function which takes the whole pool as its input, mixes it together, and outputs a number. Other than by guessing, it's not possible to get from this output to the pool state.

Now let's go back to the one-bit example. The pool started with zero entropy (a fixed initial state), and got one bit of entropy added. It can now be in one of two possible states. It goes through the output function, which prevents one reading the output from getting back to the pool state. However, since there were only two possible states (one bit of entropy), you can try both and see which one would generate the output you got... and now the pool state is completely predictable, that is, it now has zero bits of entropy again! By reading from the pool, even with the protection of the output function, you reduced its entropy. Not only that, but there were only two possible outputs, so the output itself had only one bit of entropy, no matter how many bits you had asked for.

Now if you read a 32-bit number from a pool with 33 bits of entropy, you can make many guesses and find out a possible pool state. However, again due to the pigeonhole principle, there's on average two possible pool states which will generate the same 32-bit output. Two pool states = one bit. So by reading 32 bits, you reduced the remaining entropy to one bit.

This is important because if you can predict the pool state, you can predict what it will output next, which is obviously bad.
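This one-bit example can be sketched concretely; mix() and out() below are toy stand-ins for the real cryptographic functions, and the names are made up for illustration:

```c
/* Sketch of the one-bit example above: the pool starts in a known
 * state, one secret bit is mixed in, and an attacker who sees the
 * output recovers the pool state by trying both possibilities.
 * mix() and out() are toy (non-cryptographic) stand-ins. */
#include <stdint.h>

static uint64_t mix(uint64_t pool, uint64_t input)
{
    /* splitmix64-style scrambling step, for demonstration only */
    uint64_t z = pool + input + 0x9e3779b97f4a7c15ULL;
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    return z ^ (z >> 27);
}

static uint64_t out(uint64_t pool)
{
    /* hides the pool state, but cannot add entropy to it */
    return mix(pool, 0xdeadbeefULL);
}

/* attacker: the initial state (0) is public, so with one bit of
 * entropy only two pool states are possible; try both */
static uint64_t recover_pool(uint64_t observed)
{
    for (uint64_t bit = 0; bit <= 1; bit++) {
        uint64_t guess = mix(0, bit);
        if (out(guess) == observed)
            return guess;
    }
    return 0;  /* not reached for outputs of this generator */
}
```

After the attacker checks both candidates, the pool state is fully known again: reading the output reduced its entropy back to zero, exactly as described above.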

----

Now, why isn't the situation that bad in practice? First, the input entropy counter tends to underestimate the entropy being added (by design, since it's better to underestimate than to overestimate). Second, "by guessing" can take a long enough time to be impractical. Suppose you have a 1024-bit pool, and read 1023 bits from it. In theory, the pool state should be almost completely predictable: there should be only two possible states. In practice, you would have to do more than 2^1000 guesses (a ridiculously large number) before you could actually make it that predictable.

However, that only applies after the pool got unpredictable enough. That's why the new getrandom() system call (which you should use instead of reading /dev/random or /dev/urandom) will always block (or return failure in non-blocking mode) until the pool has gotten enough entropy at least once.

Actually still confused

Posted May 7, 2015 16:23 UTC (Thu) by itvirta (guest, #49997) [Link]

> (For example, `dd if=/dev/urandom of=/dev/sda` is a terrible misuse.
> Instead, use something like
> `openssl enc -aes128 -k "shred" < /dev/urandom > /dev/sda`.)

Doesn't that still read from urandom as much as the dd since urandom
is used as the input data?

Maybe you mean something like
openssl enc -aes128 -pass file:/dev/urandom < /dev/zero > /dev/sda

(Or even with the -nosalt flag added, because otherwise the
output always starts with the string "Salted__".)

The idea is good, however. I've used shred(1) for wiping disks, and
in random mode it uses urandom directly to get the randomness. It
makes it hideously slow. Perhaps someone(tm) should patch it to support
a faster generator or just make a smarter dedicated (simple-to-use) tool. :)

Actually still confused

Posted Apr 30, 2015 18:43 UTC (Thu) by HIGHGuY (subscriber, #62277) [Link]

> ... will block and wait to gather more entropy even after the machine has been
> running for months and thus has seen "infinite" amounts of randomness.

Now in this case you would assume that the entropy pool is a piece of memory with random content that grows as it gathers more entropy. This would be impractical as it would deplete memory for the "infinite" amount of randomness.

Actually, you should think of it as a fixed size memory buffer where feeding data into it is a transformation function ent_new = f(ent_old, new_data).

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 16:08 UTC (Thu) by epa (subscriber, #39769) [Link]

Right, you may do some accounting of how much entropy you have or whether the data you have added is 'good'. In that case, you had better be sure of the quality of what you're putting in before labelling it 'good'. However, it remains true (despite, IMHO, what djb wrote) that if you add an additional source of entropy and don't let it affect the accounting of 'good enough', it can at worst do nothing.

Lowering entropy

Posted May 13, 2015 9:36 UTC (Wed) by DigitalBrains (subscriber, #60188) [Link]

> if you add an additional source of entropy and don't let it affect the accounting of 'good enough', it can at worst do nothing.

But that's the whole problem, is it not? You can't lower the actual entropy of the pool by mixing in /dev/zero. This is however beside the point! What is actually affected is the kernel's estimate of the quality of the randomness. Say you're down to an estimate of 8 bits of entropy. The kernel starts mixing in something completely deterministic like /dev/zero, but it thinks it is increasing entropy in the pool and is now under the impression it has a whopping 256 bits of entropy to give out. Too bad it still only has 8 bits of actual entropy, which gets used as a cryptographic key!

I'm using overly dramatic numbers, and the kernel purposely underestimates its availability of entropy. But entropy is a well-defined, if difficult to measure, concept. If I need 128 shannons of entropy for my crypto, I will not get there with 96 shannons and something deterministic mixed in.

Lowering entropy

Posted May 13, 2015 19:25 UTC (Wed) by dlang (guest, #313) [Link]

> If I need 128 shannons of entropy for my crypto, I will not get there with 96 shannons and something deterministic mixed in.

how do you decide that you need "128 shannons of entropy for my crypto"?

and even if you only have 96 shannons of entropy, unless the attacker knows/controls the deterministic data that was mixed in, it's still effectively random as far as the attacker is concerned. This only becomes a problem when the deterministic factor can be known by the attacker, and even different amounts of deterministic data will result in different output.

Lowering entropy

Posted May 14, 2015 15:07 UTC (Thu) by DigitalBrains (subscriber, #60188) [Link]

> how do you decide that you need "128 shannons of entropy for my crypto"?

I'm going by the principle that the only secret thing about my crypto application is its key. I'm assuming for the moment that my attacker is able to exhaustively search the 96 shannons, or reduce the search space far enough to do an exhaustive search on what remains.

Because only my key is unknown, I'm assuming the attacker can reproduce the deterministic portions.

When you argue that mixing in bad quality randomness is not a problem because there's still plenty left, this seems like the bald man's paradox. If you have a full set of hair (good quality randomness), and some individual hairs fall out (bad randomness), you still have a full set of hair. Fine, so mixing in some bad sources doesn't make you go bald. But at some point, if enough individual hairs fall out, you are going bald: producing bad quality randomness. So I think that if you suspect that this cpu execution time jitter produces only 0.2 shannons per bit, you should not use it as if it has a full shannon per bit. You can still use it, but you shouldn't pretend it has more information content than it has. And if you don't feel confident giving a reliable lower bound on the amount of entropy delivered by the method, you might even be better off not using it. Better be safe than sorry.

It's about delivering an application what it expects, about quantifying what you mean when you say you are using a certain amount of randomness. A crypto application requesting 128 shannons of entropy does that because its designers decided this is a good amount of entropy to use. There are always margins built in, so it might be safe to give it only 96 shannons. But you're eating away at the built in margins, and at some point you're going past them.

The main point I tried to make is that I agree with commenters saying that you can't lower the entropy by mixing in determinism, but that that is not the point. Other than that, I think this is a really complicated subject and I'm not an expert at all, just an interested hobbyist.

> [...] unless the attacker knows/controls the deterministic data that was mixed in, it's still effectively random as far as the attacker is concerned.

I think that when you say that something is not a good quality source of randomness, you're effectively saying that you suspect someone could predict a part of its output. So yes, then there are attackers that might know the deterministic data to some extent. They can use this knowledge to reduce their search space. It's still only a small reduction; they still have a long way to go.

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 9:17 UTC (Thu) by intgr (subscriber, #39733) [Link]

> on a well-designed system, we usually saw a variance of less than 1% in boot process timing (measured at the clock-cycle level).

In terms of percentages, sure, that seems like a small number. But in absolute terms, if you're executing billions of instructions per second, the number of non-deterministic clock cycles in that "less than 1%" is still enormous. In a large sample like yours, such random events will tend to even out, so on a per-instruction basis the non-determinism may be even greater.

To sufficiently seed a random number generator, all you need is 128 random bits -- 128 unpredictable clock cycles.

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 14:22 UTC (Thu) by dgm (subscriber, #49227) [Link]

At 1% it means waiting for 12,800 cycles, which is not much.

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 11:39 UTC (Thu) by ortalo (guest, #4654) [Link]

And I suppose any sane reader has a duty to experience similar reservations, and I certainly share your prudent posture.
However, this work starts to be convincing - at least convincing enough to be a serious competitor with other sources of randomness (esp. given the recent increase in doubts with respect to potentially flawed hardware generators...).
Plus, the overall thing sounds so absurdly appealing: using the non-determinism of supposedly deterministic processors as fuel for the non-deterministic functions of the system especially to compensate for the potentially malicious determinism of non-deterministic generators. It rocks when you say it!

Admittedly, this is not a fully reasonable argument, but I wonder if that's not the point at the moment: oppose somehow to what all the "very reasonable people" do in the name of security. A minimum of madness, as a precaution.

Random numbers from CPU execution time jitter

Posted May 2, 2015 8:14 UTC (Sat) by jzbiciak (guest, #5246) [Link]

I agree. In the SoCs I've been involved with, the main source of indeterminism involves crossing asynchronous clock domains, where each clock domain is driven by different crystal (or other independent oscillator). Otherwise, the SoC is pretty darn deterministic.

As Ted Ts'o says, just because you can't work it out, it doesn't mean I (or a sufficiently motivated attacker) can't work it out.

That even applies to caches with so-called random replacement policies. They're really pseudo-random, and in principle you can work out whatever state is in that PRNG eventually.

I've spent way too much skull sweat staring at waveforms and whatnot to think of cache behavior as truly random. Sure, it's unpredictable from the perspective of a given application that doesn't know the full state of the machine. But if you know the actual state of the cache, the sequence of requests coming from the application, and so on, the whole memory hierarchy is pretty much deterministic.

(Now that said, it's quite common in the SoCs I've worked with that the external memory is driven by a distinct clock from the processor and memory hierarchy. That will affect the timing of cache misses to external memory by a couple of cycles here or there, so there is indeterminism in that domain crossing. But the amount of entropy you should expect to extract from it is very low.)

Random numbers from CPU execution time jitter

Posted May 4, 2015 12:21 UTC (Mon) by robbe (guest, #16131) [Link]

I find it a bit disheartening that the linked paper only tested one embedded CPU (MIPS) ... the rest were x86-compatible CPUs, which, with their looong pipelines, deep cache hierarchies, turbo-mode et cetera, are very prone to indeterminism.

So this serves the "my (virtual) x86 server needs entropy for HTTPS" case pretty well ... but embedded?

Random numbers from CPU execution time jitter

Posted Jul 6, 2015 8:59 UTC (Mon) by roblucid (guest, #48964) [Link]

Also, how does a kernel-only pool make things worse?
Entropy can be depleted on some systems, and consuming it from user space is a potential DoS.

Adding another hurdle, even if imperfect, provides practical benefit, and an attacker able to control the machine environment so precisely has probably already won the game.

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 7:31 UTC (Thu) by alankila (guest, #47141) [Link]

> It may be that there is some very complex state which is hidden inside the CPU execution pipeline, the L1 cache, etc., etc. But just because *you* can't figure it out, and just because *I* can't figure it out doesn't mean that it is ipso facto something which a really bright NSA analyst working in Fort Meade can't figure out. (Or heck, a really clever Intel engineer who has full visibility into the internal design of an Intel CPU....)

This is all perfectly theoretical anyway, because it will be very hard to attack a random number generator that gets data from multiple sources. Saving the seed to disk and merging it into the random pool during the next boot is the most important thing, I think. Any source not perfectly controlled by the attacker from the start of time will input at least some unpredictable bits sometimes, and unless the attacker can gain access to the PRNG state, the problem is completely intractable.

Since there is no practical usage for "real" entropy, I don't see why Linux bothers with /dev/random.

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 9:32 UTC (Thu) by matthias (subscriber, #94967) [Link]

Getting real entropy is a big problem if you want cryptography on embedded devices. Cracking a key by working all the way back through an RNG is not very practical, but if you do not use enough entropy, then you will, for example, generate RSA keys that share a common factor with RSA keys produced on similar systems. Such keys provide no security at all.

The following is just the first reference I found:

http://arstechnica.com/business/2012/02/15/crypto-shocker...

The systems did not have enough real entropy; otherwise these collisions would not occur. And saving and reloading a seed is no help if these devices need to create cryptographic keys on first boot.

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 13:56 UTC (Thu) by dkg (subscriber, #55359) [Link]

Saving and reloading a seed also has other potential risks:
  • the non-volatile storage itself may not be under the tight control of the processor -- it represents a possible risk for both leakage ("I know your seed") and tampering ("I can force your seed to be whatever I want")
  • if the saved seed is somehow (accidentally? due to system failure?) reused across multiple boots and there is no other source of entropy, then the boots that share the seed will have the exact same stream of "randomness", potentially leading to symmetric key reuse, predictable values, and all other kinds of nastiness.
It's not that these risks are impossible to avoid, but avoiding them requires thoughtful system engineering, and might not be possible to do generically. The approach proposed in this article (if it is actually measuring real, non-predictable entropy) seems more robust.

Random numbers from CPU execution time jitter

Posted May 3, 2015 10:08 UTC (Sun) by alankila (guest, #47141) [Link]

The seed on disk won't make things worse, even if it is revealed to an attacker or reused. I think technically what is stored as the seed is some amount of data from the current entropy pool, and it is fed back in as entropy using some userspace random-injection API.

So, even if the seeding entropy is known to the attacker, there's still the entropy the system accumulated up to that point, so we are no worse off than before; if the seed is shared between multiple systems or reused at boot, the situation is the same as well. It would be good to periodically rewrite the entropy seed while the system is running, though, to limit the risk of reusing it.

In my opinion, it is not difficult to come up with lots of low-quality entropy; the issue is that Linux counts only extremely high-quality bits as entropy. Those bits can be made arbitrarily scarce by raising the requirements on what qualifies as random, to the point that the random subsystem is starved of entropy until relatively late in boot and therefore can't function properly. I think this is a case of making the requirements too strict.

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 17:02 UTC (Thu) by xxiao (guest, #9631) [Link]

For embedded systems, we just choose CPUs with a built-in hardware RNG, or an external hardware RNG will do.

Random numbers from CPU execution time jitter

Posted May 1, 2015 11:35 UTC (Fri) by boog (subscriber, #30882) [Link]

But can you trust black-box HW generators not to have been influenced by the NSA or the Chinese government via their manufacturers?

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 20:39 UTC (Thu) by flussence (subscriber, #85566) [Link]

This sounds conceptually identical to HAVEGE[1], though I suppose there's some good reason for not using that algorithm here.

Looking forward to having it by default; my router box has started taking its sweet time to restore connectivity between reboots, because newer hostapd versions seem to be stricter about the state of the /dev/random pool.

[1]: http://www.issihosts.com/haveged/

Random numbers from CPU execution time jitter

Posted May 2, 2015 7:29 UTC (Sat) by vstinner (subscriber, #42675) [Link]

I have already seen HAVEGE used in cloud virtual machines, because /proc/sys/kernel/random/entropy_avail was too low. Virtual machines get no keystrokes and no hardware interrupts. The problem is not the quality of the entropy, but the quantity. Without HAVEGE, SSH quickly hangs at connection time after a few tries...

A better fix is to configure virtio-rng for the virtual machine.
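As an illustration of that fix (the exact flags can vary between QEMU versions; this is the commonly documented form), a guest can be given a virtio-rng device fed from the host's /dev/urandom:

```shell
# Attach a virtio-rng device to a QEMU guest, backed by the host RNG.
# rng0 is just a local object id chosen for this example.
qemu-system-x86_64 \
    -object rng-random,filename=/dev/urandom,id=rng0 \
    -device virtio-rng-pci,rng=rng0 \
    ... # remaining machine options as before
```

Inside the guest, the kernel's virtio-rng driver then feeds the device into the input pool, so entropy_avail no longer starves.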

Random numbers from CPU execution time jitter

Posted May 7, 2015 13:06 UTC (Thu) by ksandstr (guest, #60862) [Link]

> The jitter entropy module allocates a 2KB buffer during initialization that it loops through and simply adds one to the value stored there (which causes both a load and a store). The buffer is larger than the L1 cache of the processor, which should introduce some unpredictable wait states into the measurement.

A 2 KiB buffer has fit entirely into any L1 cache since 1998 or so. The only mainstream exceptions are the earliest Pentium 4 "NetBurst" models, with their (even at the time) tiny L1d setups -- and even those were 2 KiB twice over.

Granted, it's almost guaranteed that a rarely-touched 2 KiB buffer would be at least partially cold if the entropy-mixing code is also cold. However: 1) that's not what the article says, and 2) an entropy mixer's behaviour should remain consistent regardless of how hot its own code path and/or accessory buffer is.

These are characteristics of poorly-understood code that merely appears to do the right thing, rather than provably doing so even in the face of attempted compromise. Experience shows that poorly-understood but established code (i.e. black magic), such as what this has the potential to become, is very difficult to remove even in the face of grave failure[0]. Considering that the cache and/or hardware counter behaviour of future architectures may change arbitrarily, there's little value in poorly-defined cache latency shenanigans such as this over their humble-but-obvious "read counter, stir pool w/ low 2 bits" counterpart.

[0] say, predictable crypto keys as from Debian's valgrind olympics: a black-magic issue because of valgrind's status as monolithic and incomprehensible making it an object of authority.

Random numbers from CPU execution time jitter

Posted May 7, 2015 16:29 UTC (Thu) by itvirta (guest, #49997) [Link]

> [0] say, predictable crypto keys as from Debian's valgrind olympics: a
> black-magic issue because of valgrind's status as monolithic and
> incomprehensible making it an object of authority.

valgrind or openssl?

Random numbers from CPU execution time jitter

Posted May 8, 2015 5:53 UTC (Fri) by toyotabedzrock (guest, #88005) [Link]

Given that he is looping over and over, the CPU will be in a more consistent state for this "jitter".

Ask yourself what the best way to lower jitter would be, and it would be code like this. At best it would provide a very predictable level of jitter, because it would heat up the chip in a predefined way.


Copyright © 2015, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds