Add support for repetitions

Open TheIronBorn opened this issue 3 years ago • 3 comments

k-combinations with repetition would be useful.

See https://en.wikipedia.org/wiki/Combination#Number_of_combinations_with_repetition
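For reference, the linked article gives the number of k-combinations with repetition from n items as C(n + k - 1, k). A minimal std-only sketch of what such an enumeration could look like follows; `combinations_with_repetition` is a hypothetical helper written for illustration, not an existing permutator function.

```rust
/// Hypothetical helper (not part of permutator): every way to choose `k`
/// items from `items` where the same position may be chosen more than once
/// and order does not matter.
fn combinations_with_repetition<T: Clone>(items: &[T], k: usize) -> Vec<Vec<T>> {
    let n = items.len();
    let mut out = Vec::new();
    if n == 0 {
        // With nothing to choose from, only the empty selection exists.
        if k == 0 {
            out.push(Vec::new());
        }
        return out;
    }
    // `idx` is kept as a non-decreasing sequence of indices into `items`.
    let mut idx = vec![0usize; k];
    loop {
        out.push(idx.iter().map(|&i| items[i].clone()).collect());
        // Find the rightmost index that can still be incremented.
        let mut pos = k;
        while pos > 0 && idx[pos - 1] == n - 1 {
            pos -= 1;
        }
        if pos == 0 {
            break;
        }
        // Increment it and reset everything to its right to the same value.
        let next = idx[pos - 1] + 1;
        for slot in &mut idx[pos - 1..] {
            *slot = next;
        }
    }
    out
}
```

For `[1, 2, 3]` and `k = 2` this yields six results, matching C(3 + 2 - 1, 2) = 6.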

TheIronBorn avatar May 06 '21 06:05 TheIronBorn

@TheIronBorn Are you referring to something like this algorithm?

NattapongSiri avatar May 22 '21 08:05 NattapongSiri

I think this is the same issue I've hit, where equal items are treated as distinct.

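        use permutator::{Combination, Permutation};

        // Two of the three entries are equal on purpose.
        let data = &mut [1, 1, 3];
        let mut counter = 1;
        // Permutes `data` in place and prints each arrangement.
        data.permutation().for_each(|p| {
            println!("k-permutation@{}={:?}", counter, p);
            counter += 1;
        });

        counter = 1;
        // Prints every 2-combination of the (now permuted) data.
        data.combination(2).for_each(|c| {
            println!("k-combination@{}={:?}", counter, c);
            counter += 1;
        });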

This doesn't generate a unique set of combinations or permutations. The results repeat because multiple entries with the same value are not recognized as interchangeable.

k-permutation@1=[1, 1, 3]
k-permutation@2=[1, 1, 3]
k-permutation@3=[3, 1, 1]
k-permutation@4=[1, 3, 1]
k-permutation@5=[1, 3, 1]
k-permutation@6=[3, 1, 1]
k-combination@1=[3, 1]
k-combination@2=[3, 1]
k-combination@3=[1, 1]
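For comparison, one std-only way to visit each distinct arrangement exactly once is to sort the data and step through it in lexicographic order. The sketch below assumes `T: Ord`; both `next_permutation` and `distinct_permutations` are hypothetical helpers for illustration and are not how permutator works internally.

```rust
/// Hypothetical helper: rearranges `a` into the next lexicographically
/// greater arrangement. Returns false when `a` is already the last one.
fn next_permutation<T: Ord>(a: &mut [T]) -> bool {
    if a.len() < 2 {
        return false;
    }
    // Find the start of the longest non-increasing suffix.
    let mut i = a.len() - 1;
    while i > 0 && a[i - 1] >= a[i] {
        i -= 1;
    }
    if i == 0 {
        return false;
    }
    // Find the rightmost element strictly greater than the pivot a[i - 1].
    let mut j = a.len() - 1;
    while a[j] <= a[i - 1] {
        j -= 1;
    }
    a.swap(i - 1, j);
    a[i..].reverse();
    true
}

/// Hypothetical helper: all distinct arrangements of `items`, equal values
/// being treated as interchangeable.
fn distinct_permutations<T: Ord + Clone>(items: &[T]) -> Vec<Vec<T>> {
    let mut cur = items.to_vec();
    cur.sort();
    let mut out = vec![cur.clone()];
    while next_permutation(&mut cur) {
        out.push(cur.clone());
    }
    out
}
```

For `[1, 1, 3]` this produces three arrangements instead of the six listed above.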

skyphyr avatar Apr 15 '23 21:04 skyphyr

I think that would be better handled by deduplicating the vec before using it. If I were to implement it, I'd do the same. Otherwise, checking whether each permutation had already been generated would degrade performance significantly.

Another point: a user might intentionally include duplicate data before shuffling to add bias to the sample set.

I have done that several times in my data science projects.
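The dedup-first approach mentioned above could look roughly like this with only the standard library. `distinct_sorted` is a hypothetical helper name; it assumes `T: Ord` so the data can be sorted first, since `Vec::dedup` only removes consecutive duplicates.

```rust
/// Hypothetical helper: the distinct values of `items`, in sorted order.
/// The input is left untouched so the caller keeps any intentional duplicates.
fn distinct_sorted<T: Ord + Clone>(items: &[T]) -> Vec<T> {
    let mut v = items.to_vec();
    // `dedup` only collapses adjacent equal elements, so sort first.
    v.sort_unstable();
    v.dedup();
    v
}
```

Applied to `[1, 1, 3]` this leaves `[1, 3]`, so results that use the same value twice, such as `[1, 1]`, are no longer produced from the deduplicated data.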

NattapongSiri avatar Apr 17 '23 08:04 NattapongSiri