[onerng talk] OneRNG as a RNG vs entropy generator

Paul Campbell paul at taniwha.com
Sat Jun 27 05:56:16 BST 2015


On Sat, 27 Jun 2015 15:22:10 phred53 wrote:
> Over in the OneRNG KickStarter Comments you said:
> 
> "I do want to caution people about worrying too much about OneRNG as a
> random number generator (sadly we chose the cute device name) rather than
> as it's primary purpose: an entropy generator - we expect you to take the
> output and feed it into the kernel RNG (or some other cryptographically
> appropriate software RNG) before use - however we also think it's important
> to give people access to the raw RNG output so you can independently test
> it - you shouldn't just trust us"
> 
> Could you elaborate a bit more on the subject of using captured OneRNG as a
> _source_ of random numbers, pitfalls of doing so and specifically
> cryptographically why that wouldn't be sound?

there are probably people here who can explain better than I can .... 

Basically there are two sorts of problems one might try to solve: 

- making random numbers - something we can do quite well in software
- making entropy - something that's hard to do in software 

The way I look at it: if I'm running a system that uses a software random 
number generator for cryptography (think of the code that backs /dev/[u]random 
in the linux kernel), then if an attacker knows the internal state of the 
kernel's software RNG they can predict its output - and every time the 
software RNG gives out some data (for example TCP connection sequence numbers) 
it gives away a little bit about its internal state. We need to add entropy 
(essentially noise) to the kernel's RNG faster than it leaks information to a 
potential attacker.
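To make the state-leak point concrete, here's a toy sketch (nothing like the kernel's actual CSPRNG, which is designed to be hard to invert) - a linear congruential generator whose full internal state is exposed by a single output, so an observer who sees one value can predict every value that follows:

```python
# Toy illustration (NOT the kernel's RNG): a linear congruential
# generator that leaks its entire internal state in each output.
class ToyLCG:
    # glibc-style LCG parameters, chosen purely for illustration
    A, C, M = 1103515245, 12345, 2**31

    def __init__(self, seed):
        self.state = seed % self.M

    def next(self):
        self.state = (self.A * self.state + self.C) % self.M
        return self.state  # the output IS the internal state

victim = ToyLCG(seed=20150627)
leaked = victim.next()        # attacker observes a single output...

attacker = ToyLCG(seed=0)
attacker.state = leaked       # ...and now shares the victim's state
assert all(attacker.next() == victim.next() for _ in range(1000))
```

A real cryptographic RNG leaks far less per output, but the principle is the same: outputs reveal something about state, and fresh entropy is what keeps the state unpredictable.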

The kernel makes a small amount of entropy internally by looking at the 
randomness of network events, disk seek times (maybe not so much on SSDs), 
etc, but it can need a lot more if, say, it has to make a lot of SSL keys per 
second. Ideally you put as much entropy into the process as you leak 
information out of it.
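For the curious: on Linux, feeding entropy into the kernel pool *and* crediting it (just writing bytes to /dev/random mixes them in but credits nothing) is done with the RNDADDENTROPY ioctl - this is what tools like rngd rely on. A minimal sketch, assuming raw device bytes are credited conservatively at ~7 bits each:

```python
# Sketch: credit entropy to the Linux kernel pool via RNDADDENTROPY.
# Requires root. Writing to /dev/random without this ioctl mixes the
# bytes in but does NOT increase the kernel's entropy estimate.
import fcntl, os, struct

RNDADDENTROPY = 0x40085203  # _IOW('R', 0x03, int[2]) on Linux

def rand_pool_info(entropy_bits, data):
    # struct rand_pool_info { int entropy_count; int buf_size; __u32 buf[]; }
    return struct.pack("ii", entropy_bits, len(data)) + data

def credit_entropy(data, bits_per_byte=7):
    # conservative credit, e.g. ~7 of 8 bits per raw device byte
    buf = rand_pool_info(bits_per_byte * len(data), data)
    with open("/dev/random", "wb") as rnd:
        fcntl.ioctl(rnd, RNDADDENTROPY, buf)

if __name__ == "__main__" and os.geteuid() == 0:
    credit_entropy(os.urandom(32))  # stand-in for raw OneRNG bytes
```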

So OneRNG is a source of a lot of entropy (about 64k bytes/second). The 
source is a thermal/quantum (depending on how you look at it) process that 
makes random noise. The noise spectrum isn't as "perfectly random" as a 
software RNG might provide - ours has a small DC offset, and occasionally I 
suspect the avalanche diode makes big avalanches that hit more than one bit in 
time. We've been claiming > 7.5 bits of entropy per byte of data provided (we 
underestimated it on purpose; in fact new sampling code in the latest firmware 
trades throughput slightly for more samples per byte and has increased the 
measured 'mode 1' raw data to ~7.9 bits/byte).

We "whiten" [1] the noise on board using a simple process (CRC16) which 
removes much of the DC bias and makes the various programs that test 
randomness and entropy think it is better .... but you can't really make 
entropy that way. The fact that the various programs we use for testing (ent, 
and the fips tests) think OneRNG is 'better' in the whitened modes is, I 
think, more an indication of just how hard it is to measure entropy well - 
though I do believe their estimates of the entropy of the raw bitstreams are 
probably close.
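The bits-per-byte figures above are the kind of estimate ent reports; a quick Shannon-entropy-over-the-byte-histogram calculation shows both what's being measured and why it's so easy to fool:

```python
# Shannon entropy of a byte stream's histogram, in bits per byte -
# roughly the statistic that ent reports. Note this is exactly the
# measure that whitening inflates: a CRC-whitened stream can score
# close to 8.0 here even when the physical source has less true
# entropy, because the test only sees the output distribution.
import math
import os
from collections import Counter

def bits_per_byte(data):
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

print(round(bits_per_byte(os.urandom(1 << 20)), 2))  # ~8.0 for urandom
print(bits_per_byte(b"\x00" * 1024))                 # 0.0: no variety at all
```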

I'll add that you can create better random numbers from hardware (certainly 
with less of a DC offset, for a start) - but that requires more expensive 
hardware, perhaps hand-tweaked resistors during manufacturing - that's why 
some hardware RNGs cost 1000s of dollars. Making something cheap that everyone 
can use to provide entropy to their system means tradeoffs - 7.9 bits/byte is 
pretty good, and feeding it in volume to the kernel is a great solution for a 
cheap board.

So that's the argument for OneRNG providing entropy for a software RNG.

As far as using the data directly: you can, with the understanding that it's 
not a 100% ideal RNG. With minimal whitening (mode 0) it passes the fips 
tests (or rather it has a 0.1% false-negative rate - less than one failure 
per 1000 tests - which is an acceptable number for those tests), but for 
example it may have some spectral failures we haven't spent enough time 
looking for (we have done some basic FFT tests looking for any obvious 
issues). I'd recommend passing the output through a further whitening stage - 
have a look in the latest onerng.sh startup script, in particular at what 
we're doing to run the data through openssl AES to further 'whiten' the 
output.
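For reference, the fips tests mentioned are the FIPS 140-2 power-on statistical tests (monobit, poker, runs, long run), each applied to 20,000-bit blocks - rngtest from rng-tools runs them. The simplest, the monobit test, can be sketched like this (an illustration, not the rngtest source):

```python
# FIPS 140-2 monobit test: a 20,000-bit (2,500-byte) block passes if
# its count of one-bits lies strictly between 9,725 and 10,275. A
# good source fails a given block only rarely, which is where the
# "less than one failure per 1000 tests" figure comes from.
def fips_monobit(block):
    assert len(block) == 2500, "FIPS tests run on 20,000-bit blocks"
    ones = sum(bin(b).count("1") for b in block)
    return 9725 < ones < 10275

print(fips_monobit(b"\x00" * 2500))  # False: zero one-bits
print(fips_monobit(b"\x55" * 2500))  # True: exactly 10,000 one-bits
```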

	Paul

[1] "whitening" is the processing of random noise to make it look more like 
'white' (or some other ideal) noise by algorithmic processing - OneRNG does 
minimal whitening of its noise by running it through a CRC16 LFSR, we also 
offer the option of the kernel further whitening the noise by feeding it 
through AES128
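A sketch of what a CRC16-LFSR whitener looks like - the polynomial, initial value, and how many raw samples get folded into each output byte in the OneRNG firmware are details I'm not reproducing here; this uses the common CCITT polynomial 0x1021 purely for illustration:

```python
# Illustration of CRC16-based whitening: clock raw noise bytes
# through a CRC16 LFSR (CCITT polynomial 0x1021, for illustration
# only) and emit the low byte of the register per input byte.
def crc16_step(crc, byte, poly=0x1021):
    crc ^= byte << 8
    for _ in range(8):
        if crc & 0x8000:
            crc = ((crc << 1) ^ poly) & 0xFFFF
        else:
            crc = (crc << 1) & 0xFFFF
    return crc

def crc16(data, crc=0xFFFF):
    for b in data:
        crc = crc16_step(crc, b)
    return crc

def whiten(raw):
    # Smears out DC bias, but cannot add entropy: the output has at
    # most as much entropy as the raw input it was computed from.
    out, crc = bytearray(), 0xFFFF
    for b in raw:
        crc = crc16_step(crc, b)
        out.append(crc & 0xFF)
    return bytes(out)
```

Even a heavily biased input stream comes out of `whiten` looking far more uniform, which is exactly why histogram-based entropy tools rate the whitened modes 'better' without any real entropy having been added.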


More information about the Discuss mailing list