An efficient algorithm to sample non-duplicate random elements from an array

Issue

This Content is from Stack Overflow. Question asked by Kaiyakha

I’m looking for an algorithm to pick M random elements from a given array. The prerequisites are:

  • the sampled elements must be unique,
  • the array to sample from may contain duplicates.

This is what I’ve managed to come up with. Here I’m also making an assumption that the amount of unique elements in the array is greater (or equal) than M.

#include <random>
#include <vector>
#include <algorithm>
#include <iostream>


const std::vector<int> sample(const std::vector<int>& input, size_t n) {
    std::random_device rd;
    std::mt19937 engine(rd());
    std::uniform_int_distribution<int> dist(0, input.size() - 1);
    
    std::vector<int> result;
    result.reserve(n);

    size_t id;
    do {
        id = dist(engine);
        if (std::find(result.begin(), result.end(), input[id]) == result.end())
            result.push_back(input[id]);
    } while (result.size() < n);

    return result;
}


int main() {
    std::vector<int> input{0, 0, 1, 1, 2, 2, 3, 3, 4, 4};

    std::vector<int> result = sample(input, 3);

    for (const auto& item : result)
        std::cout << item << ' ';
    std::cout << std::endl;
}

This algorithm does not seem to be the best. Is there a more efficient algorithm to solve this task? It would be good if this algorithm could also assert the amount of unique elements in the input array is not less than M.



Solution

This question is not yet answered, be the first one who answer using the comment. Later the confirmed answer will be published as the solution.

This Question and Answer are collected from stackoverflow and tested by JTuto community, is licensed under the terms of CC BY-SA 2.5. - CC BY-SA 3.0. - CC BY-SA 4.0.

people found this article helpful. What about you?