How to use random numbers?

Albert Y. C. Lai, trebla [at] vex [dot] net

Random numbers (of course I mean pseudo-random numbers) come from generators with implicit or explicit state. This means the use of random numbers in Haskell (through the System.Random library) takes a bit of getting used to, as it involves state passing. The problem is standard and the solutions are also standard.

Most of you who need help with random numbers come from an imperative background, so I start with an imperative way with the IO monad. Afterwards I will also add functional ways, and a way with a custom state monad.

Task

Throughout all the methods, I exemplify with the task of generating a random list of this form: its length is 1 to 7 inclusive (chosen uniformly at random), and each item in the list is a Float between 0.0 and 1.0.

Imperatively

There is a global generator in the IO monad. You can initialize it and get random numbers from it. Here are some common functions you can use:

setStdGen :: StdGen -> IO ()

Initializes, or sets, the global generator to the given one. Usually you give it one created from a given seed, i.e., using mkStdGen. So, a silly usage looks like:

setStdGen (mkStdGen 42)

Of course you replace 42 by an Int acquired from input, the command line, ...

You have the choice of calling setStdGen or not. If you don't, the global generator is still usable, since the runtime initializes it with an arbitrary seed at startup, and every time it is a different seed.
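To see what setting the seed buys you, here is a minimal sketch. It resets the global generator to the same seed twice and checks that the same rolls come out both times. The helper name fiveRolls is my own, and it uses randomRIO, which is described next.

```haskell
import Control.Monad (replicateM)
import System.Random (mkStdGen, randomRIO, setStdGen)

-- Hypothetical helper: five die rolls drawn from the global generator.
fiveRolls :: IO [Int]
fiveRolls = replicateM 5 (randomRIO (1, 6))

main :: IO ()
main = do
  setStdGen (mkStdGen 42)   -- fix the seed
  a <- fiveRolls
  setStdGen (mkStdGen 42)   -- reset to the same seed
  b <- fiveRolls
  print (a == b)            -- same seed, same sequence of rolls
```

If you delete both setStdGen lines, the two lists will normally differ, because the second call continues from wherever the first left the global generator.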

randomRIO :: (Random a) => (a,a) -> IO a

Returns a random number of type a in the given range. The global generator is also updated. You specify the range (inclusive) by the tuple parameter. This example gets a random letter between "a" and "z" inclusive:

c <- randomRIO ('a','z')

Can a be any type you like? Not really. In Haskell 98 the Random library supports Bool, Char, Int, Integer, Float, and Double. (The support is extensible - you can write or obtain code to support more types - but that's another story...) In general types of class Random can be used.
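As a quick illustration of the supported types, this sketch calls randomRIO once for each of several of them. The ranges are arbitrary choices of mine; the type annotations are there only so the compiler knows which type you want.

```haskell
import System.Random (randomRIO)

main :: IO ()
main = do
  b <- randomRIO (False, True)            -- Bool
  c <- randomRIO ('a', 'z')               -- Char
  i <- randomRIO (1, 6 :: Int)            -- Int
  n <- randomRIO (1, 10 ^ 30 :: Integer)  -- Integer, beyond the Int range
  d <- randomRIO (0, 1 :: Double)         -- Double
  print (b, c, i, n, d)
```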

randomIO :: (Random a) => IO a

Returns a random number of type a. (Can a be any type you like? See above.) The global generator is also updated. This example gets a random Double:

x <- randomIO :: IO Double

The range of the returned random number varies by the type, and by Murphy's Law is invariably different from what you expect. Don't assume; check the docs or ask around to find out. Or just use randomRIO.

Note that these are IO functions, so you can use them in your own IO functions only; equivalently, if you write a function to use them, its type becomes an IO function too. So for example, the above example snippets are intended to be in a do-block for IO. This is not a problem, just a reminder, because we plan to be imperative.

Here is how to perform the example task in the IO monad.

import System.Random   -- this module was called Random in Haskell 98

main =
    do { setStdGen (mkStdGen 42)   -- optional
       ; s <- randomStuff
       ; print s
       }

randomStuff :: IO [Float]
randomStuff =
    do { n <- randomRIO (1,7)
       ; sequence (replicate n (randomRIO (0,1)))
       }
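A small variation, if you prefer: sequence (replicate n action) is what replicateM from Control.Monad does, so the same task can be sketched like this. The layout style and the extra length check in main are my own choices, not part of the program above.

```haskell
import Control.Monad (replicateM)
import System.Random (randomRIO)

randomStuff :: IO [Float]
randomStuff = do
  n <- randomRIO (1, 7)              -- n is an Int because replicateM wants one
  replicateM n (randomRIO (0, 1))    -- run the action n times, collect results

main :: IO ()
main = do
  s <- randomStuff
  print (length s)   -- somewhere from 1 to 7
  print s
```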

Pure-Functionally

There are several reasons why you may like to know how to use random numbers functionally: you are curious; you want to escape from the IO monad; for concurrency or other reasons you want several generators in co-existence, and a shared global one is problematic.

There are two ways to use random numbers functionally. One is to work off a stream (infinite list) of random numbers. Another is to pass around generators as part of function arguments and return values.
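The rest of this section concentrates on streams, so here is a minimal sketch of the second way for comparison. It uses randomR from System.Random, which returns a value together with the next generator. The helper floats and the names g0, g1 are my own, and this is just one way to arrange the plumbing.

```haskell
import System.Random (StdGen, mkStdGen, randomR)

-- Take a generator; return the answer and the generator to use next.
randomStuff :: StdGen -> ([Float], StdGen)
randomStuff g0 =
  let (n, g1) = randomR (1, 7 :: Int) g0   -- random length, new generator
  in floats n g1

-- Hypothetical helper: draw k Floats, threading the generator through.
floats :: Int -> StdGen -> ([Float], StdGen)
floats 0 g = ([], g)
floats k g =
  let (x, g')   = randomR (0.0, 1.0) g
      (xs, g'') = floats (k - 1) g'
  in (x : xs, g'')

main :: IO ()
main = do
  let (s1, g1) = randomStuff (mkStdGen 42)
      (s2, _)  = randomStuff g1   -- continue from g1, never reuse the seed
  print s1
  print s2
```

The rule of thumb is that every generator is used exactly once: whatever function consumes g must hand back the generator that replaces it.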

Pure-Functionally: Stream of Random Numbers

Here are some common functions for creating a generator and infinite lists of random numbers:

mkStdGen :: Int -> StdGen

Makes a generator from the given seed.

randomRs :: (Random a, RandomGen g) => (a, a) -> g -> [a]

Infinite list of random numbers from the given generator in the given range. Example: letters between "a" and "z" inclusive, from seed 42:

randomRs ('a', 'z') (mkStdGen 42)

The type a is the type of the random numbers. The type g looks general, but in practice is always StdGen.

randoms :: (Random a, RandomGen g) => g -> [a]

Infinite list of random numbers from the given generator. Example: Doubles from seed 42:

randoms (mkStdGen 42) :: [Double]

The range is determined by the type, and you should always check the docs to find out what it is. Or just use randomRs.

Note that these are functional - there is no in-place, destructive update. In particular the generator is not updated. If you use a generator to make a first list, then use the same generator to make a second list...

g = mkStdGen 42
a = randoms g :: [Double]
b = randoms g :: [Double]

guess what, referential transparency dictates that the two lists are the same! (If you want two different lists, but you only have one seed, I will show you a way soon.)
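If you want to convince yourself, this small check compares a prefix of two lists made from the same generator. The prefix length 5 is an arbitrary choice; the lists are infinite, so comparing them whole would never finish.

```haskell
import System.Random (mkStdGen, randoms)

main :: IO ()
main = do
  let g = mkStdGen 42
      a = randoms g :: [Double]
      b = randoms g :: [Double]
  -- Same generator in, same list out.
  print (take 5 a == take 5 b)   -- True
```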

Here is one way to do the task of creating a random list of 1 to 7 Floats.

import System.Random   -- this module was called Random in Haskell 98

main =
    do { let g = mkStdGen 42
       ; let [s] = take 1 (randomStuff g)
       ; print s
       }

randomStuff :: RandomGen g => g -> [[Float]]
randomStuff g = work (randomRs (0.0, 1.0) g)

work :: [Float] -> [[Float]]
work (r:rs) =
    let n = min 7 (truncate (r * 7.0) + 1)   -- clamp in case r is exactly 1.0
        (xs,ys) = splitAt n rs
    in xs : work ys

Apart from the necessary I/O of printing the answer, it is purely functional. From the generator, it produces an infinite list of random numbers, and then from that it produces an infinite list of answers. (And then the consumer just takes one.) I do this because even though our task today asks for just one, in reality you need many and I hope this code sets an example.

Here is how the code works. From one generator, it creates an infinite list of random Floats in the right range. One number is taken off the front and scaled up to determine the random length 1 to 7, then that many numbers are taken from the rest to yield an answer, and this procedure is repeated over what still remains for more answers. In other words, the input list is broken into one number r and the rest rs, r determines the desired length n, and rs is further split into xs, the answer of length n, and ys, the yet unused part. Then ys is used for more answers.
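To trace the splitting by hand, it helps to feed work a known list instead of a random one. In this sketch the input sampleInput is made up by me, and I have added a defensive min 7 clamp as my own assumption, in case r is ever exactly 1.0. No random library is needed for the trace.

```haskell
work :: [Float] -> [[Float]]
work (r:rs) =
    let n = min 7 (truncate (r * 7.0) + 1)   -- length 1 to 7 from r
        (xs,ys) = splitAt n rs               -- xs: one answer, ys: unused rest
    in xs : work ys
work [] = []   -- only reachable with a finite input

-- Made-up "random" stream, repeated forever so work never runs dry.
sampleInput :: [Float]
sampleInput = cycle [0.2, 0.9, 0.8, 0.5, 0.1, 0.3, 0.6, 0.7]

main :: IO ()
main = print (take 2 (work sampleInput))
-- prints [[0.9,0.8],[0.1,0.3,0.6,0.7]]
```

The first r is 0.2, so n is 2 and the answer is the next two numbers. The following r is 0.5, so n is 4 and the answer is the four numbers after that.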

Now here is a variation. I have used the same random stream for both the length and the content. I have found a way to do it, but in general the same trick may be inapplicable to other tasks. You may like to use two separate streams. Here is how. First, I introduce the function that creates two generators from one:

split :: (RandomGen g) => g -> (g, g)

Creates two different generators from one source. It is not advisable to re-use the original source for other purposes. Example:

g = mkStdGen 42
(ga, gb) = split g
-- do not use g elsewhere

If you want more than two, you can pick one of the new two and split again. Example:

g = mkStdGen 42
(ga, g') = split g
(gb, gc) = split g'
-- do not use g, g' elsewhere

We can apply split to obtain two generators, from which we can produce two random streams. Here is the idea applied to our task:

randomStuff :: RandomGen g => g -> [[Float]]
randomStuff g = work (randomRs (1, 7) ga) (randomRs (0.0, 1.0) gb)
    where (ga,gb) = split g

work :: [Int] -> [Float] -> [[Float]]
work (n:ns) rs =
    let (xs,ys) = splitAt n rs
    in xs : work ns ys

It splits the generator into two, and produces two streams accordingly. Since the two streams are created to be of just the right types and ranges, their use in the code is much more apparent. This code is more generalizable.
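The two-stream work is even easier to trace by hand. In this sketch both input lists are made up by me and stand in for the two random streams, so no generator is involved at all.

```haskell
work :: [Int] -> [Float] -> [[Float]]
work (n:ns) rs =
    let (xs,ys) = splitAt n rs   -- take n items for this answer
    in xs : work ns ys           -- next length, remaining items
work [] _ = []   -- only reachable with a finite list of lengths

main :: IO ()
main = print (take 3 (work (cycle [2, 1, 3]) (cycle [0.1, 0.2, 0.3, 0.4])))
-- prints [[0.1,0.2],[0.3],[0.4,0.1,0.2]]
```

Each length comes straight from the first list and each item straight from the second, with no scaling trick in between.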

I have hardcoded the seed in the main program. Normally you would want to obtain the seed elsewhere - from some input, from a file, from the clock, from some device - these are all very do-able in the main program, since it runs in the IO monad and has access to all those. You can also fetch the global generator and pass it along:

main =
    do { g <- getStdGen
       ; let [s] = take 1 (randomStuff g)
       ; print s
       }
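For the seed-from-elsewhere case, here is a minimal sketch that takes the seed from the first command-line argument and falls back to the global generator when there is none or it is not a number. The helper name chooseGen is my own, and the program just prints three Floats so that it stays self-contained.

```haskell
import System.Environment (getArgs)
import System.Random (StdGen, getStdGen, mkStdGen, randomRs)
import Text.Read (readMaybe)

-- Hypothetical helper: pick a generator based on the arguments.
chooseGen :: [String] -> IO StdGen
chooseGen (a:_) | Just seed <- readMaybe a = return (mkStdGen seed)
chooseGen _ = getStdGen   -- no usable seed given

main :: IO ()
main = do
  args <- getArgs
  g <- chooseGen args
  print (take 3 (randomRs (0.0, 1.0 :: Float) g))
```

Running the program with the same number twice should print the same three Floats, which is handy when you need to reproduce a bug.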

State Monad

To do.
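In the meantime, here is one possible sketch of the idea, assuming a hand-rolled state monad whose state is the generator. Everything in it (the type Rand, runRand, getRandomR) is a name I made up for illustration, built on randomR from System.Random; treat it as a sketch, not a finished design.

```haskell
import Control.Monad (replicateM)
import System.Random (Random, StdGen, mkStdGen, randomR)

-- A computation that consumes a generator and returns the next one.
newtype Rand a = Rand { runRand :: StdGen -> (a, StdGen) }

instance Functor Rand where
  fmap f (Rand h) = Rand $ \g -> let (x, g') = h g in (f x, g')

instance Applicative Rand where
  pure x = Rand $ \g -> (x, g)
  Rand hf <*> Rand hx = Rand $ \g ->
    let (f, g')  = hf g
        (x, g'') = hx g'
    in (f x, g'')

instance Monad Rand where
  Rand h >>= k = Rand $ \g ->
    let (x, g') = h g      -- run the first step
    in runRand (k x) g'    -- feed its generator to the next step

-- One random number in the given range; the monad threads the generator.
getRandomR :: Random a => (a, a) -> Rand a
getRandomR range = Rand (randomR range)

randomStuff :: Rand [Float]
randomStuff = do
  n <- getRandomR (1, 7)
  replicateM n (getRandomR (0.0, 1.0))

main :: IO ()
main = do
  let (s, _) = runRand randomStuff (mkStdGen 42)
  print s
```

Notice that randomStuff reads almost like the IO version, yet it is a pure function of the generator you pass to runRand.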
