Issue
This Content is from Stack Overflow. Question asked by Kaiyakha
I’m looking for an algorithm to pick M random elements from a given array. The prerequisites are:
- the sampled elements must be unique,
- the array to sample from may contain duplicates.
This is what I’ve managed to come up with. Here I’m also making an assumption that the amount of unique elements in the array is greater (or equal) than M.
#include <random>
#include <vector>
#include <algorithm>
#include <iostream>
const std::vector<int> sample(const std::vector<int>& input, size_t n) {
std::random_device rd;
std::mt19937 engine(rd());
std::uniform_int_distribution<int> dist(0, input.size() - 1);
std::vector<int> result;
result.reserve(n);
size_t id;
do {
id = dist(engine);
if (std::find(result.begin(), result.end(), input[id]) == result.end())
result.push_back(input[id]);
} while (result.size() < n);
return result;
}
int main() {
std::vector<int> input{0, 0, 1, 1, 2, 2, 3, 3, 4, 4};
std::vector<int> result = sample(input, 3);
for (const auto& item : result)
std::cout << item << ' ';
std::cout << std::endl;
}
This algorithm does not seem to be the best. Is there a more efficient algorithm to solve this task? It would be good if this algorithm could also assert the amount of unique elements in the input array is not less than M.
Solution
The comments already note the use of std::set
. The additional request to check for M unique elements in the input make that a bit more complicated. Here’s an alternative implementation:
- Put all inputs in a
std::set
orstd::unordered_set
. This removes duplicates. - Copy all elements to the return vector
- If that has more than M elements,
std::shuffle
it andresize
it to M elements. - Return it.
This Question was asked in StackOverflow by Kaiyakha and Answered by MSalters It is licensed under the terms of CC BY-SA 2.5. - CC BY-SA 3.0. - CC BY-SA 4.0.