Home » Php » php – Returning a random value from array with probability proportional to it's value

php – Returning a random value from array with probability proportional to it's value

Posted by: admin July 12, 2020 Leave a comment

Questions:

I have an array like

$keywords = array('apple'=>10,'orange'=>2,'grape'=>12); 

I want to randomly pick one of the “Key” from the array. However the probability distribution should be such that probability of picking an element should be proportional to it’s value.

How to&Answers:

Add all values (10+2+12 is 24); get a random number in the range [0, 24), and pick the corresponding element depending on whether the number lies in [0, 10), [10, 12), or [12, 24).

Answer:

I’d do it like this:

    $probabilities = array('apple'=>50, 'orange'=>20, 'banana'=>10);

    function random_probability($probabilities) {
      $rand = rand(0, array_sum($probabilities));
      do {
        $sum = array_sum($probabilities);
        if($rand <= $sum && $rand >= $sum - end($probabilities)) {
          return key($probabilities);
        }
      } while(array_pop($probabilities));
    }

Answer:

An O(log(n)) approach (this is ripped directly from an answer to a very similar question):

The usual technique is to transform the array into an array of cumulative sums:

 [10 60 5 25]  --> [10 70 75 100]

Pick a random number in the range from zero up to the cumulative total (in the example: 0 <= x < 100). Then, use bisection on the cumulative array to locate the index into the original array:

Random variable x      Index in the Cumulative Array      Value in Original Array
-----------------      -----------------------------      ----------------------
 0 <= x < 10                      0                            10
10 <= x < 70                      1                            60
70 <= x < 75                      2                             5
75 <= x < 100                     3                            25 

For example, if the random variable x is 4, bisecting the cumulative array gives a position index of 0 which corresponds to 10 in the original array.

And, if the random variable x is 72, bisecting the cumulative array gives a position index of 2 which corresponds to 5 in the original array.