[147861] in cryptography@c2.net mail archive
Re: [Cryptography] [RNG] /dev/random initialisation
daemon@ATHENA.MIT.EDU (John Denker)
Mon Oct 28 13:39:10 2013
X-Original-To: cryptography@metzdowd.com
Date: Sun, 27 Oct 2013 15:17:14 -0700
From: John Denker <jsd@av8n.com>
To: Cryptography <cryptography@metzdowd.com>
In-Reply-To: <CACXcFm=ChuoPdeNRxhZUzW5naUugKZu4bQZqx7Sm8r4sb39L8Q@mail.gmail.com>
Errors-To: cryptography-bounces+crypto.discuss=bloom-picayune.mit.edu@metzdowd.com
On 10/24/2013 02:42 PM, Sandy Harris wrote:
> What I want to model here is the harder case where none of the above
   [sources for a good high-entropy seed]
> is in place or urandom starts producing output without waiting for
> them. In an ideal world, this case would never arise but in the real
> world it has and likely will again. 
This is excellent.  This is exactly the sort of discussion we
ought to be having.
> The problem involves avoiding
> duplicate outputs, so the birthday paradox works against us. If we
> need a low risk of collision for 2^n cases, then we need about 2n bits
> of input entropy.
I agree with the sentiment, but having lots of entropy isn't the 
only way of solving the problem;  we can make do with a lot less 
entropy if it is /computationally infeasible/ for the attacker to 
untangle what we have done.
> There are two separate cases to consider: avoiding duplicate outputs
> when a single device is rebooted many times and avoiding it when many
> identical devices are deployed.
OK, good.
> Multiple reboots of the same device are the easy case. One threat
> involves a computer that is rebooted often (perhaps several times a
> day) and whose outputs are monitored continuously over a fairly long
> period (perhaps a year). Another involves an enemy that can force many
> reboots of a network device without the administrators noticing and
> blocking the attack. Neither threat looks at all likely to involve
> more than a few thousand reboots, so a dozen or so bits of timing
> information per reboot should be enough to block them.
It may help to add a little bit more detail to that analysis.  
There are two subcases:
  a) The PRNG starts out in a good state, with a seed that is
   large enough, random enough, and unknown to the attacker.  If 
   we now use the real-time clock (RTC) to /stir/ the seed, that
   should be sufficient to guarantee no collisions.  That should
   be enough to stop replay attacks.
  b) If the PRNG starts out in a bad state, because the seed has
   been compromised, then the PRNG cannot quickly recover, and in
   particular it cannot recover while it is under attack.  The RTC
   is nowhere near sufficient to /substitute/ for a proper seed.
To say the same thing in slightly different words:  There are two
separate objectives that we need to consider:
  A) Resistance to attack, and
  B) Recovery from compromise.
In some rock-candy world it would be nice to do both at the same 
time, but in this world, there are situations where a practical
PRNG cannot do both at the same time, and that's OK.
The moral of the story is that it is super-important to make sure
the PRNG is properly seeded.  The seed needs to be big enough,
random enough, and unknown to the attacker.
Note that the RTC does not add real entropy to the PRNG.  All it
does is stir the PRNG.  Assuming the PRNG's hash function is working
properly, (seed + RTC1) should be effectively very different from
(seed + RTC2).  The attacker could try to mount a related-plaintext
attack against the hash, but for any decent cryptologic hash this
should be computationally infeasible.
There is a fundamental conceptual distinction between (a) reseeding
the PRNG with honest-to-goodness entropy and (b) merely stirring
the existing seed.  However, there are some situations in which
stirring suffices.
> Timing data appears to be the only thing we can use against this
> threat; more-or-less everything else is constant across reboots. Linux
> random already mixes in some timing information, but I am not certain
> how much.
It shouldn't take much stirring, assuming the hash function is working
properly.  We absolutely require that the RTC produce non-repeating 
values, but that's about all we really require.  Again, this is all 
predicated on starting with a high-quality seed.
> Many identical devices are a harder case.
I hate to sound like a broken record, but the solution is the same in
all cases:  You need a proper seed:  big enough, random enough, and
unknown to the attacker.
>  A manufacturer might produce
> tens or hundreds of thousands of the same device
The manufacturer reeeeally needs to provision each instance of the
device with a proper seed.  The cost of doing this is ridiculously
small.  The manufacturer already needs to provision things like the
MAC address, hostname, IMEI, et cetera ... and the addition burden
of provisioning the seed for the PRNG is negligible.
>  a Linux distro
> might send out some huge number of installation images;
Again (!) the solution is the same:  If there is a proper seed, you're
fine.  If there is not a proper seed, you're screwed.
Specific suggestion:
  a) Download the .iso image onto disk.
  b) Provision a proper seed onto the image.
  c) Burn the modified image to CD if desired.
A few years ago I cooked up some tools to do exactly this.  I tried to
get support for it installed into the distributions, but the proposal
was shot down by some "expert" who decided that the RTC was sufficient
security, even if every attacker on earth knew what was in the distributed
seed-file ... and even if every attacker on earth knew what time it was.
More general suggestion:  We ought to produce some sort of "Best Practices"
document that explains to manufacturers, software distributors, VM providers,
et cetera how important it is to provision a proper seed, and explains how
to do it.
We should also provide tools and infrastructure that makes it easy to do
the right thing and hard to do the wrong thing.  For example, in the
general case, unpacking and repacking an .iso image is a pain in the neck,
but if the image is engineered to facilitate rewriting the seed-file the
procedure is much, much simpler.
==================================
Following up on a proposal by Jerry Leichter, on 10/25/2013 05:15 AM, 
John Kelsey wrote:
> I like this idea.  If my PRNG is in a secure state, I can give out
> random numbers to anyone who asks. 
I like this idea so much that I implemented it many years ago.  The
command line looks roughly like:
lynx -auth=harpo:swordfish -source 'https://example.com/pvt/random?bytes=512' | randomize kernel from -
> At first startup, it won't be
> possible to establish a secure connection yet (no entropy), 
I disagree; see below.
> but by
> asking some hosts for a random number, we ensure that if those
> messages aren't recorded, the attacker can't possibly guess our PRNG
> starting state.
Again (!) the key is to have a proper seed stored on the machine
in question.  It's like the proverbial seed-corn:  If you have
none, you're screwed.  If you have some, you can grow more.
On 10/24/2013 07:59 AM, John Kelsey wrote:
> a.  Get 256 bits of entropy from the OS.
> b.  Get 256 bits of entropy from the hardware entropy source.
> c.  Ping several hosts on the internet and measure the response time, and fold that into your seed.
> d.  Fold your ethernet address, IP address, and serial number into the seed.
> e.  Fold the installed-at-birth secret 128 bit value from your device into the seed.
> 
> Initialize a PRNG with all that, and the attacker is in an extremely
> hard place, as he has to be able to guess all five elements. (d)
> isn't all that hard to guess, but the rest will in general be very
> hard to guess.
Some comments:
a) If you have 256 bits of honest-to-goodness entropy from
 /any/ source, OS or otherwise, then that's more than enough
 to seed the PRNG, and nothing on the list matters.
b) Ditto.
c) Not so good. Ping times are not guaranteed to be always
 different, and not guaranteed to be unknown to the attacker.
 In the crypto business it is conventional to worry about
 MITM attacks.  A scenario where the MITM doesn't need to
 mess with the packet contents, just the packet timing,
 is extra-easy for the attacker.
d) Not so good.  All that stuff is constant, and relatively
 easy for the attacker to know or guess.  The RTC wasn't 
 mentioned, but it is better than the items that were
 mentioned, in that it is at least non-constant and non-
 repeating.  The RTC cannot /substitute/ for a proper
 stored seed, but it can /stir/ the seed.
e) That's basically the right idea, except that we should
 call it the "stored" seed rather than the "at-birth" seed.
 The stored seed should be changed at frequent intervals.
 This helps with the recovery from compromise.  Perhaps
 more importantly, it reduces the window of vulnerability
 and reduces the value the attacker gets from peeking at 
 the stored seed.
_______________________________________________
The cryptography mailing list
cryptography@metzdowd.com
http://www.metzdowd.com/mailman/listinfo/cryptography