pagmo icon indicating copy to clipboard operation
pagmo copied to clipboard

Sampling without replacement in jDE

Open krzysztof opened this issue 10 years ago • 11 comments

Replace https://github.com/esa/pagmo/blob/master/src/algorithm/jde.cpp#L192-L223 with something more efficient:

a) Implement Dario Izzo's Swapping Algorithm®:

  N =  [0,1,2,3,4, ...]
  M = 5
  for i in range(M):
    rnd_i = randint(0, N - i)
    swap(N[rnd_i], N[len(N) - i - 1])
  rand_5 = N[:-M]

(Supposedly there's a threshold where |N|=21 where this stops being efficient and you should fall back to current method)

http://stackoverflow.com/questions/311703/algorithm-for-sampling-without-replacement http://rosettacode.org/wiki/Knuth's_algorithm_S

krzysztof avatar Jan 26 '15 10:01 krzysztof

lol

darioizzo avatar Jan 26 '15 11:01 darioizzo

I think that in that snippet Dario achieved the impossible feat of making C++ look like javascript :)

bluescarni avatar Jan 26 '15 11:01 bluescarni

See https://github.com/python-git/python/blob/master/Lib/random.py#L320-L322 for threshold if k>5.

dhennes avatar Jan 26 '15 11:01 dhennes

I have a question regarding the suggested solution a) (Dario Izzo's Swapping Algorithm®). When I compare the suggestion a) with the code in: https://github.com/esa/pagmo/blob/master/src/algorithm/jde.cpp#L192-L223 I see some functional differences What I mean is that the current implementation for each i in range(NP) assigns r1,r2,...,r7 to numbers in the set {0,1,...,NP-1} \ {i} (without repetition and every 7-tuple being equally likely selected). On the other hand, the suggestion a) assign r1,r2,...,r7 to numbers in the set {0,1,...,NP-1} in each iteration. Are these differences acceptable or not?

AVCZ avatar Mar 06 '15 16:03 AVCZ

Hi I'm new here and this is my first bug, How do I begin ?

AnirudhGP avatar Mar 07 '15 12:03 AnirudhGP

@AVCZ , notice that we sample from a diminishing range of [0, N-i], for increasing i, and swap the elements with the tail.

krzysztof avatar Mar 08 '15 13:03 krzysztof

@kiryx Yes, I understand that part. I had a bit different question in mind and I probably just didn't make myself clear enough. (I assume that in the algorithm a) above the last line of the code should actualy be -- rand_5 = N[-M:] -- and that it was just a typo). The algorithm a) basically selects 5 elements from N = [0,1,2,3,4,....] at random, or more precisely said, each 5-tuple has exactly the same probability of being selected. On the other hand, the code in https://github.com/esa/pagmo/blob/master/src/algorithm/jde.cpp#L192-L223 in each iteration (iteration over the variable i from 0 to NP-1) selects a random 7-tuple from the range {0,1,...,NP-1} \ {i} which is not the same as selecting a 7-tuple from range {0,1,...,NP-1}.

AVCZ avatar Mar 08 '15 13:03 AVCZ

@AVCZ, Sorry, I misunderstood you. Yes, you should sample from {0, 1, ..., NP-1} \ {i}. It would be nice to not have to instantiate the N for every i though... I'm not sure if its possible without at least linear scan through the vector N...

EDIT: Its possible at the cost of extra sampling round: Sample from {0, 1, .. NP-1} 7 times + 1 extra if you sampled the i-th element somewhere along the way. Alternatively you can maintain the position of each element (e.g. in a separate vector), and always swap the i-th element with the end of vector before the sampling from {0, 1, ..., NP-2} (this saves you a linear search for i).

krzysztof avatar Mar 08 '15 13:03 krzysztof

@kiryx OK, I guess I understand now and I will go through it and send a PR later today.

AVCZ avatar Mar 08 '15 14:03 AVCZ

@AVCZ I've already worked a bit on this issue (sorry about that, didn't know whether or not you were still interested it resolving it). My current branch is at https://github.com/neduard/pagmo/tree/issue95_sampling_without_replacement_jde

Currently however I'm instantiating the N for every i. It shouldn't take too long though to fix that. I'm thinking if you haven't started implementing the solution yet, leaving this issue for me? Of course I can't force you to do that, so you it is up to you whether to continue working on this issue or on something different :)

neduard avatar Mar 08 '15 14:03 neduard

@neduard Well, I guess I don't have to continue with my branch now, because I looked into your code and it seems to me that you have solved this issue.

AVCZ avatar Mar 08 '15 16:03 AVCZ