Skip to main content

Unique random numbers

·3 mins

Sometimes it’s useful to have a method for generating unique numbers that are uniformly distributes over some set. For example, we may have a collection of objects that need to be identified uniquely, and serve as a key to an associative container.

It isn’t always desired to use monotonically-increasing counter - for example, if we were to store our objects in an unbalanced tree, we would end up with a degenerate, inefficient, tree. In small or embedded applications, it is sometimes easier to implement a simple unbalanced tree and have the keys be uniformaly distributed than implement a red-black or AVL tree.

A simple and efficient method that guarentees uniqueness, as well as randomness, is to use an LFSR. There are many known maximal-length LFSRs, which guarantee that if an LFSR has a state of n bits (and the initial state is not all-zeros) we can advance the LFSR 2n-1 times before we get any state we already seen. In other words, if an LFSR describes a chain of states, a maximal-length LFSR gurantees there is a single loop for all possible states.

There are a few standartized maxima-length LFSRs being used by the industry: the PRBS sequences. These sequnces are commonly used in communcation equipment testing, because they allow testing of all possible bit-patterns as quickly as possible (and it’s also very easy to detect a defective PRBS sequence, indicating transmission or reception errors).

Let’s assume that we need to choose unique and randomly distributed 32-bit integers. A leading candidate would be PRBS31 (x31+x28+1). Implementation is failry simple:

public int prbs31(int state) {
    int feedback = ((state >> 30) ^ (state >> 27)) & 1;
    return ((state << 1) | feedback) & 0xffffffff;
}

We simply XOR the LFSR’s taps, shift the state and push the feedback:

Note that due to the LFSR’s nature, adjancent values will be very similar (they differ by one shift and one bit, so for every state Nk we have: Nk+1∈{2Nk mod 2n, 2Nk+1 mod 2n}).

To get “non-similar” values, we can just call prbs31 successively, like so:

// initial seed, doesn't really matter which value you use
// as long as it's not zero (remember that the sequence goes over
// all the possible 31-bits values anyway)
private static final int SEED = 0x55555555;

private int state = SEED;

public int nextInt() throws SequenceCycleException {
    for (int i = 0; i < 31; ++i) {
        state = prbs31(state);
        if (state == SEED) {
            throw new SequenceCycleException();
        }
    }
    return state;
}

DISCLAIMER: naive usage of a single LFSR is highly predictable. In this case, knowing one value gives knowledge of all future values - which makes this method good for simulations (such as games) but also very bad for anything with cryptographic or secret significance!

I originally gave this method as an answer in a Stack Overflow question.