nupic.core-legacy icon indicating copy to clipboard operation
nupic.core-legacy copied to clipboard

Handle RNG seeds consistently

Open scottpurdy opened this issue 9 years ago • 20 comments

Currently some algorithms default to seed=-1, which will use a random seed each time, while others default to a fixed seed. We should handle this consistently.

This issue can track everyone's opinion on it. @subutai @chetan51 @breznak

Some current fixed seed defaults:

Some current random seed defaults:

scottpurdy avatar Oct 07 '15 23:10 scottpurdy

Thanks, on a quick thought..

FIXED rng seed:

  • Pros:
    • 100% reproducible tests
    • better to spot platform numeric instability
    • one suspect off when "something misbehaves"
    • max determinism
  • Cons:
    • no meaning in "restarted local search", aka. no improvement in running twice the same and observing results
    • will not spot possible perf. breakage due to "bad luck" in rng seed selection, aka we can just be lucky with this seed, but other seed might screw up.

In conclusion, I'd go with a fixed seed (for testing, 99% users), and in special case one can always set to random seed -1

breznak avatar Oct 08 '15 00:10 breznak

@breznak Thanks for the great list. I agree, for the same reasons you stated. You can always use -1 if you need non-determinism for some reason.

subutai avatar Oct 08 '15 17:10 subutai

To play Devil's advocate - we almost always specify the seeds explicitly when instantiating the algorithms so we would still have deterministic tests, ability to spot platform differences, etc.

One advantage of having the default be the special value "-1" is that we can introduce a function to change the default across the board. By default, "-1" would choose a random seed but you could call Random.setDefaultSeed(42) and any time an algorithm instantiates Random(-1) after that it uses the seed "42" rather than a random seed.

Again, just playing devil's advocate.

scottpurdy avatar Oct 08 '15 18:10 scottpurdy

@numenta/nupic-reviewers - please chime in if you have an opinion

scottpurdy avatar Oct 08 '15 18:10 scottpurdy

:+1: for fixed default seeds

oxtopus avatar Oct 08 '15 18:10 oxtopus

I agree with @breznak. Only thing I want to add is that it would be good to put the fixed default seed in one place, to keep the code DRY.

chetan51 avatar Oct 08 '15 19:10 chetan51

+1 on DRY

vitaly-krugl avatar Oct 08 '15 19:10 vitaly-krugl

Only thing I want to add is that it would be good to put the fixed default seed in one place, to keep the code DRY.

Can we then introduce a (header) file with nupic CONSTANTS? (not sure if it makes sense at a global level though..)

breznak avatar Oct 08 '15 20:10 breznak

Can I take this as an agreement on fixed default rng seed?

Also, on the

Only thing I want to add is that it would be good to put the fixed default seed in one place, to keep the code DRY.

this goes back to my "interface" proposals. We could have a class "Block" (can't come up with a better name, just something with columns); the Block would have param seed, TP,SP,TM inherit from Block and pass kwargs; like that? But let's keep that for other PR, here I just want same params for SP implementations, this 'rng seed' is just a blocker, so I'd manually change that on the 3 places..

breznak avatar Oct 09 '15 12:10 breznak

Based on the comments here I propose that we set the seed keyword-argument default value to 0 everywhere in both C++ and Python. Please follow up soon if you disagree with this conclusion.

There was some idea of defining this in a variable somewhere but using that variable consistently is not any more DRY than using 0 consistently. Not to mention that the seed has no meaningful value across algorithm implementations. So to avoid a bunch of unnecessary imports (and thus code dependencies) I am excluding that from the proposal. I like the idea of using interfaces where they benefit but objects that have seeds don't actually share any functions. A lot of the time they take the seed as a constructor parameter but that isn't something you can enforce with a parent class.

scottpurdy avatar Oct 28 '15 23:10 scottpurdy

@scottpurdy why was this issue closed? The agreement seemed to be on fixed rng seed, but no PR has resolved the matter as I've checked now.

breznak avatar Nov 25 '15 11:11 breznak

@scottpurdy Please respond to Mark above, because there is a chain of issues and PRs that are blocked by this.

rhyolight avatar Dec 05 '15 00:12 rhyolight

I must have been thinking of this issue as a proposal as opposed to implementation issue. Fine to track the implementation here though since it has all the context.

scottpurdy avatar Dec 05 '15 00:12 scottpurdy

I'll try to resolve this soon. So in https://github.com/numenta/nupic.core/issues/649#issuecomment-152034312 we agreed on default fixed seed (with value 0), -1 would mean random. I'll make a PR to C++, python would follow.

breznak avatar Mar 30 '16 23:03 breznak

@breznak pointed out an issue with the proposed solution to random seeds here. Here are a few options:

  1. Change the Random class to treat 0 as a fixed seed and have a different value (accessed via final static variable, probably set to the max UInt value) that results in a random seed. This way we can stick with current plan without changing the signature of Random.
  2. Change the Random class to take an signed int for the seed (it currently takes an unsigned int) and change the default, "random" value from 0 to -1. This way we can stick with plan but have to change the signature of Random.

There are other options as well but these seem to be the two that deviate the least. I think 1 makes a little more sense but would like to hear from @numenta/nupic-reviewers (in particular @subutai )

scottpurdy avatar Mar 31 '16 21:03 scottpurdy

Just to throw in my 2 pennies' worth, if the Random seeding API looks and sounds like the standard C++ or C counterpart, don't change the semantics to differ from the standard behavior. People don't expect that and will tend to use the API in the same way as the standard seeding API that they are familiar with. This always result in subtle, difficult to debug bugs.

vitaly-krugl avatar Apr 01 '16 06:04 vitaly-krugl

@vitaly-krugl does that mean you are in support of 1? The builtin implementation has a separate srand function for seeding that takes an unsigned int.

scottpurdy avatar Apr 01 '16 19:04 scottpurdy

@scottpurdy, I'm not sure yet. Would it be wrong to overload the method signature with an additional argument just for the purpose of signaling the default. This way, everyone who uses it in the standard way will get the expected behavior (no matter the value), while the users of the additional arg will make their intention explicit, too.

vitaly-krugl avatar Apr 01 '16 22:04 vitaly-krugl

@scottpurdy @vitaly-krugl Seems like this discussion needs to continue?

rhyolight avatar Jun 23 '16 20:06 rhyolight

@rhyolight: need to revisit when @scottpurdy returns.

vitaly-krugl avatar Jun 23 '16 20:06 vitaly-krugl