Home » Java » Generate random numbers from 1-100 from a generator of 1-50

Generate random numbers from 1-100 from a generator of 1-50

Posted by: admin December 28, 2021 Leave a comment

Questions:

in a recent interview i was asked the following question:

Print random numbers from 1-100 using the given getrnd50() method
which generates the random numbers from 1-50. Each random number
should be printed only once and in random order. Use of no other random number generator
is allowed and i was not allowed to change the definition of
getrnd50().

I came up with the following code which gives the correct output.

import java.util.Random;

public class Test {

public static void main(String[] args) {
    int[] rs = new int[100];
    int count = 0;
    int k;
    while (count != 100) {

        // I decided to simply multiply the result of `getrnd50()` by 2. 
        // But since that would generate only even numbers,

        k = getrnd50() * 2;

        // I decided to randomly subtract 1 from the numbers. 
        // Which i accomlished as follows.          

        if (getrnd50() <= 25) // 25 is to half the possibilities.
            k--;

        // Every number is to be stored in its own unique location 
        // in the array `rs`, the location is `(number-1)`. 
        // Every time a number is generated it is checked whether it has 
        // already been generated. If not, it is saved in its position, printed and 
        // `count` is incremented.

        if (rs[k-1] == 0) {
            rs[k-1] = k;
            count++;
            System.out.print(k + " ");
        }
    }
}
// This is the method and i am not supposed to touch it.
static int getrnd50() {
    Random rand = new Random();
    return (1 + rand.nextInt(50));
}

}

While it was accepted in that round, in the next round the interviewer tells me that getrnd50() is a costly method and even in best case scenario i have to call it twice for every number generated. i.e. 200 times for 1-100. In worst case scenario it would be infinity and tens of thousand in average case. He asks me to optimize the code so as to significantly improve the average case.

He gave me a hint when i expressed my inability to do it, he said:

To consider the number of numbers generated while generating a new
number. For ex. if count becomes 99 i don’t have to call getrnd50() I
can simply find the remaining number and print it.

While i understood his drift i had no idea how it would help me, so obviously i got rejected. Now i am curious to know the answer. Help me! Thanx in advance!

Note: if anyone is feeling lazy to to write a lengthy code just point out the numer generation part, the rest is easy. Also we are not bound to follow the hint.

Answers:

The key is to not check if you have generated the number before, which gets very expensive when looking for only 1 remaining number, but to generate the numbers 1-100 in order, and then shuffle.

In your code, when you have generated 99 out of the 100 numbers, you will loop around, generating random numbers, until you find that 1 remaining number. That’s why the average case in your version is so bad.

If instead you just shuffle an array, you only need to have as many random numbers as you have shuffle operations, and only as many shuffle operations as you need numbers output.

(For full details on shuffling, look up the Fisher-Yates shuffle, specifically the inside-out variant which can generate a shuffled array in place)

To generate the random numbers, you need a variable generator, rather than a fixed 1-50 one. You can approach this in a variety of ways, but be very careful of introducing skew into the results, if you really want the output to have a good distribution across the possible states.

For example, I would recommend using an integral number of bits, with shifting, rather than attempting to use a modulo. This does involve a certain amount of looping if the values are outside of the desired range, but without being able to modify the original random number generation, your hands are somewhat tied.

static int bits = 0;
static int r100 = 0;

static int randomInt(int range)
{
    int ret;

    int bitsneeded = 32 - Integer.numberOfLeadingZeros(range - 1);

    do {
            while(bits < bitsneeded)
            {
                    int r = (getrnd50()-1) * 50 + getrnd50()-1;
                    if(r < 2048)
                    {
                            r100 <<= 11;
                            r100 |= r;
                            bits += 11;
                    }
            }
            ret = r100 & ((1 << bitsneeded) - 1);
            bits -= bitsneeded;
            r100 >>=  bitsneeded;
    } while(ret >= range); 

        return ret + 1;
}

This implementation will use something in the region of 150 random numbers for your 100 value shuffled array. This is worse than the modulo version, but better than 2x the input range, which was the best case of the original version. There is, if the random generation was truly random, still a worst-case scenario of infinity, but random generation doesn’t typically work like that. If it did, I’m not sure unskewed results are realistic given the constraints.

For illustration, as the results are subtle, here’s a graph of my suggested random routine, versus a modulo version:

Graph of random generators

So in summary, I think that while your random generation is a bit inefficient, and could be improved, the really big win that interviewer was looking for, is in not needing so many random numbers in the first place, by doing a shuffle rather than repeated searching with an ever decreasing probability.

###

Since 100 / 50 is an integer, this is quite easy. Since 50 / (100 / 50) is an integer, it’s even easier.

If you didn’t quite get that, here is some sample code:

int rnd1 = getrnd50();
int rnd2 = getrnd50();
if (rnd1 % 2 == 0)
{
    rnd2 += 50;
}
return rnd2;

Here is an outline:

  • Two numbers, chosen randomly between 1 and 50, called a and b.
  • If a is even, add 50 to b.
  • Return b.

You can make this a one-liner if you want:

return getrnd50() + getrnd50() % 2 * 50;

That’s a little too obfuscated though.

Edit: I see the question was really asking for a shuffled list, not a sequence of random integers.

This can be done by creating a list from 1 to 100, and doing 100 random swaps, like a Fisher-Yates shuffle. I imagine that with a Fisher-Yates shuffle, the absolute minimum number of calls is 93 (given with the formula ceil(log50(100!))), but with a much simpler algorithm you can use 200.

The simple algorithm would involve swapping each of the 100 elements with a random element from the 100. The number to choose would be generated from 1-100 with the above generator.

For example:

for (int i = 0; i < 100; i++)
{
    swap(i, getrnd100() - 1); // - 1 for zero base!
}

Here is some complete code:

int[] result = new int[100];
for (int i = 0; i < 100; i++)
{
    result[i] = i + 1;
}
for (int i = 0; i < 100; i++)
{
    int j = (getrnd50() + getrnd50() % 2 * 50) - 1;
    int tmp = result[i];
    result[i] = result[j];
    result[j] = tmp;
}
return result;

(Disclaimer: I don’t know Java, and I haven’t tested it.)

Best case 200, worst case 200, average case 200.

###

Here is how you could answer it. It exploits the fact that,

  • assuming you are using shuffle to get an O(n) swapping of “cards”, the modulus decreases in a shuffle. i.e. start with an int[] of every values and shuffle it like Collections.shuffle() does.
  • you have more randomness than you need if you call getrnd50() twice, esp when you have less than 50 values left to swap with.

EDIT: For those not familar with how shuffle works, I have added the code for shuffling

import java.util.*;
import java.lang.*;

class Main {
    public static void main(String... args) {
        int samples = 100;

        // all the numbers [1, 100]
        int[] nums = new int[samples];
        for (int i = 0; i < samples; i++) nums[i] = i + 1;

        for (int i = samples - 1; i > 0; i--) {
            int swapWith = nextInt(i + 1);

            // swap nums[i] and nums[swapWith]
            if (swapWith == i) continue;
            int tmp = nums[swapWith];
            nums[swapWith] = nums[i];
            nums[i] = tmp;
        }
        System.out.println("calls/sample " + (double) calls / samples);
        System.out.println(Arrays.toString(nums));

        int[] count49 = new int[49];
        for (int i = 0; i < 49 * 10000; i++)
            count49[nextInt(49) - 1]++;
        int[] count54 = new int[54];
        for (int i = 0; i < 54 * 10000; i++)
            count54[nextInt(54) - 1]++;
        System.out.println("Histogram check (49): " + Arrays.toString(count49));
        System.out.println("Histogram check (54): " + Arrays.toString(count54));

    }

    // keep track of the range of values.
    static int maxRandom = 1;
    // some random value [0, maxRandom)
    static int rand100 = 0;

    static int nextInt(int n) {
        while (maxRandom < 10 * n * n) {
            maxRandom *= 50;
            rand100 = rand100 * 50 + getrnd50() - 1;
        }
        int ret = rand100 % n;
        maxRandom = (maxRandom + n - 1) / n;
        rand100 /= n;
        return ret + 1;
    }

    static final Random rand = new Random();
    static int calls = 0;

    static int getrnd50() {
        calls++;
        return (1 + rand.nextInt(50));
    }
}

prints

calls/sample 0.94

[1, 37, 4, 98, 76, 53, 26, 55, 9, 78, 57, 58, 47, 12, 44, 25, 82, 2, 42, 30, 88, 81, 64, 99, 16, 28, 34, 29, 51, 36, 13, 94, 80, 66, 19, 38, 20, 8, 40, 89, 72, 56, 75, 96, 35, 100, 95, 17, 74, 69, 11, 31, 86, 92, 6, 27, 22, 70, 63, 32, 93, 84, 71, 15, 23, 5, 14, 62, 49, 43, 87, 65, 83, 33, 45, 52, 39, 91, 60, 73, 68, 24, 97, 46, 50, 18, 79, 48, 77, 67, 59, 10, 7, 54, 90, 85, 21, 61, 41, 3]

Histogram check (49): [10117, 10158, 10059, 10188, 10338, 9959, 10313, 10278, 10166, 9828, 10105, 10159, 10250, 10152, 9949, 9855, 10026, 10040, 9982, 10112, 10021, 10082, 10029, 10052, 9996, 10057, 9849, 9990, 9914, 9835, 10029, 9738, 9953, 9828, 9896, 9931, 9995, 10034, 10067, 9745, 9873, 9903, 9913, 9841, 9823, 9859, 9941, 10007, 9765]

Histogram check (54): [10124, 10251, 10071, 10020, 10196, 10170, 10123, 10096, 9966, 10225, 10262, 10036, 10029, 9862, 9994, 9960, 10070, 10127, 10021, 10166, 10077, 9983, 10118, 10163, 9986, 9988, 10008, 9965, 9967, 9950, 9965, 9870, 10172, 9952, 9972, 9828, 9754, 10152, 9943, 9996, 9779, 10014, 9937, 9931, 9794, 9708, 9978, 9894, 9803, 9904, 9915, 9927, 10000, 9838]

In this case, 100 numbers need less than 100 calls to getrnd50

If you have 1000 values to shuffle

calls/sample 1.509

###

Ok, so you are allowed to print the last missing number of anny set of n numbers without it being generated by the random number generator?

If so you could make use of recursion and decrease the size of the set with each call, until you only have n=2 and then call getrnd50() once. when you go back recursively, simply print the missing number on each set.

###

(1) Create an array A initialized with {1,…,100}. Keep a variable ‘length’ of this array.

(2) Create a random method to randomly generate a number from 1 to length. Each call of this method will call getrnd50() for no more than 2. Call the returned value as ‘index’.

(3) Output A[index], swap A[length] to A[index] and length–.

(4) Repeat (1)-(3) until the array is empty.

###

The performance penalty of your code is in that line

if (getrnd50() <= 25)

You need to find a way to get more information out of that single generated random number, otherwise you are wasting those costly generated resources. Here is my proposal for that:

First imagine that we would have a random number generator for the numbers 0-15. Every number can be represented as a path in a binary tree where the leafs represent the numbers. So we can say that we evaluate that if condition to true every time we walk left in the tree when starting at the root.

The problem is that the random number generator generates numbers in an interval that doesn’t end at a power of two. So we need to expand that tree. This is done like so:
If the random number is in the range 0-31 we are fine with a tree for those numbers. If it is in the range 32-47 we use the tree from 0-15 for those and in the case 48-49 we use a tree for the numbers 0-1.

So in the worst case, we aren’t using much more information from that random number, but in the most cases we are. So this should significantly improve the average case.

###

 List<Integer> lint ; 

  public void init(){
      random = new Random();
      lint = new LinkedList<>();
      for(int i = 1 ; i < 101; ++i) {
          lint.add(i); // for truly random results, this needs to be randomized.
      }
  }

  Random random ; 
  public int getRnd50() {
      return random.nextInt(50) + 1;
  }

  public int getRnd100(){

      int value = 0;
      if (lint.size() > 1) {
          int index = getRnd50()%lint.size();
          value = lint.remove(index); 
      } else if (lint.size() == 1 ) {
          value = lint.remove(0);
      }      

      return value;      
  }

Call getRnd50() exactly 99 times. Its not truly random though since numbers stored in List of 100 integers are in sequence.