Weighted Random Selection

Posted by JH

Random Selection Based on Weights

Random Selection LinkedIn Interview Question

Q1: How to uniformly random-select a number from an array A?
Follow-up: If an additional array F is given, where F[i] is the frequency of occurrence for A[i], how to randomly select a number from A based on the frequency? The probability of A[i] being chosen should be greater if F[i] is larger.

e.g.
A = [A1, A2, A3]
F = [0, 5, 15]

P(A1 chosen) = 0%
P(A2 chosen) = 25%
P(A3 chosen) = 75%

Solution

Two-pass solution:
The probability of each A[i] gets selected is
F[i] / sum(F[0] , F[1], ..., F[n - 1])

Let SUM be sum(F[0] , F[1], ..., F[n - 1]), randomly generate a number X between 0 to SUM - 1.
For the given example,
if X falls between [0, 0), choose A1;
if X falls between [0, 5), choose A2;
if X falls between [5, 20), choose A3;

The solution needs to know SUM beforehand to generate the random number, so the first pass is needed to get the SUM.

Greedy single-pass solution:
An optimal way is to build up the selection incrementally.

Iterate i through 0 to N - 1.
At the ith step, we only worry about whether A[i] is chosen or not.
If A[i] is chosen, set chosen number C = A[i].
Otherwise, keep the number chosen before.

To decide whether or not to choose A[i], up to the current step i, the total frequency of all the numbers before A[i], is SUM(F[0: i -1]), which represents the weight of A[i] not being chosen, while the weight of A[i] being chosen is F[i].

Therefore,
P(A[i] not chosen) = SUM(F[0: i - 1]) / SUM(F[0: i])
P(A[i] chosen) = F[i] / SUM(F[0: i])

Generate a random number between 0 and SUM(F[0: i]), if it falls in 0 to SUM(F[0: i - 1]), keep whatever has been chosen from before. Otherwise it falls in SUM(F[0: i - 1]) to SUM(F[0: i]), choose A[i].

No worries about whatever is after i at the moment, since the greedy solution only solves the subproblem up to the current i.

Variants

There are a few variants of this problem that the same model will apply.
Google Variant
Randomly select a point in a bunch of rectangles. (Here the frequency is like the areas of rectangles)
Solution: Randomly select a rectangle based on the areas. Then it will be easy to get a random point from the one selected.

Linked List Variant
Random select a node in a linked list based on the weight of the nodes.

Implementation

public Integer random(int[] array, int[] freq) {
    int totalFreq = 0;
    Integer selected = null;
    Random rand = new Random();

    for(int i = 0; i < array.length; i++) {
        int r = rand.nextInt() * (totalFreq + freq[i]);
        if(r >= totalFreq) {
            selected = array[i];
            totalFreq += freq[i];
        }
    }
    return selected;
}


Get one-to-one training from Google Facebook engineers

Top-notch Professionals

Learn from Facebook and Google senior engineers interviewed 100+ candidates.
Most recent interview questions and system design topics gathered from aonecode alumnus.
One-to-one online classes. Get feedbacks from real interviewers.

Customized Private Class

Already a coding expert? - Advance straight to hard interview topics of your interest.
New to the ground? - Develop basic coding skills with your own designated mentor.
Days before interview? - Focus on most important problems in target company question bank.

Image placeholder
One-on-One Algorithm and Coding Training
Image placeholder
One-on-One System Design Training
Image placeholder
One-on-One Mock Interview