Pseudorandom number generator |
A pseudorandom number generator (PRNG) is an algorithm which generates a sequence of numbers, the elements of which are approximately independent of each
other.
The outputs of pseudorandom number generators are not truly random—they only approximate some of the properties of random numbers. John von Neumann emphasized this with the remark "Anyone who considers arithmetical methods of
producing random digits is, of course, in a state of sin". While there have been attempts to generate "truly random" numbers,
using hardware random number
generators, pseudorandom numbers are a critical part of modern computing, from cryptography to the Monte Carlo method for
simulating physical systems. Careful mathematical analysis is required to ensure that the generated numbers are sufficiently
"random;" as Robert R. Coveyou of Oak Ridge
National Laboratory once remarked, "The generation of random numbers is too important to be left to chance."
Most such algorithms attempt to produce samples that are uniformly distributed. Common classes of algorithms are linear congruential generators, lagged Fibonacci generators, linear feedback shift registers and generalised feedback shift registers. Recent instances of algorithms include Blum Blum Shub, Fortuna, and the
Mersenne Twister.
Inherent nonrandomness
Because any PRNG run on a deterministic computer (contrast quantum computer) is a deterministic algorithm, its output will inevitably
have certain properties that a true random sequence would not
exhibit. One of these is guaranteed periodicity—it is
certain that if the generator uses only a fixed amount of memory then, given a sufficient number of iterations, the generator
will revisit the same internal state twice, after which it will repeat forever. A generator that isn't periodic can be designed,
but its memory requirements would grow as it ran. In addition, a PRNG can be started from an arbitrary starting point, or seed
state, and will always produce an identical sequence from that point on.
In practice, many PRNGs exhibit artifacts
which can cause them to fail statistically significant tests. These include, but are certainly not limited to:
- Shorter than expected periods for some seed states (not full period)
- Poor dimensional distribution
- Successive values are not independent
- Some bits may be 'more random' than others
- Lack of uniformity
Defects exhibited by a flawed PRNG may range from unnoticeable to ridiculous. The RANDU random number algorithm used for decades on mainframe computers was flawed and much research work is less reliable than it should have been as a
result. Sometimes, but not always, modeling problems are noticed and lead to improvements in PRNGs.
The recent invention of the Mersenne Twister algorithm, by Makoto Matsumoto and Takuji Nishimura in 1997, avoids most of these problems. It has a colossal period of 219937-1 iterations, is proven to be
equidistributed in 623
dimensions (for 32-bit values), and runs faster than all but the least statistically desirable generators. It is now increasingly
becoming the "random number generator of choice" for statistical simulations and generative modeling.
However, it is possible to efficiently analyze the output of the Mersenne Twister and recognize the numbers as being
non-random (as with the Berlekamp-Massey
algorithm or an extension from it, such as the Reed-Sloane algorithm).
Cryptographically secure pseudorandom number generators
PRNGs that appear to evade statistical analysis are called cryptographically secure PRNGs (CSPRNGs). There are a number
of examples of CSPRNGs. Blum Blum Shub has the strongest security
proofs, though it is slow. Most stream ciphers work by generating a
pseudorandom stream of bits that are XORed with the message; if the stream cypher is of high quality, this stream can be used as
a CSPRNG (though not always: see RC4). A high quality (ie, 'secure') block cipher can also be converted into a CSPRNG by running it in counter mode. This is done by choosing an arbitrary key and encrypting a block of
zeros, then encrypting a 1, then encrypting a 2, etc. The counter can also be started at an arbitrary number other than zero.
Obviously, the period will be 2n for an n-bit block cipher; equally obviously, the initial values
(ie, key and 'plaintext') must not become known to an attacker lest, however good the CSPRNG might be, all security be lost. A
cryptographically secure hash of a counter might also act as a good
CSPRNG in some cases. If the counter is a bignum, then the CSPRNG could have an
extremely long period. This method is emphatically not suitable for a one time pad message of any length, as each term depends completely on the previous term. Finally, there are
PRNGs that have been designed specifically to be cryptographically secure. One example is ISAAC (based on a variant of RC4) which is fast and has an expected
cycle length of 28295 and for which no successful attack in a reasonable time has yet been found.
See also
References
- Donald E. Knuth, The Art of Computer Programming, Volume 2:
Seminumerical Algorithms, 3rd edition (Addison-Wesley, Boston, 1998).
- The GNU Scientific Library, http://www.gnu.org/software/gsl/. A free (GPL)
(C library that includes a number of PRNG
algorithms.
|