proper icon indicating copy to clipboard operation
proper copied to clipboard

seed: introduce seed user option

Open fenollp opened this issue 6 years ago • 22 comments

Any reason this wasn't added earlier? This is an otherwise pretty common option for a QuickCheck implementation.

Note: not sure how much type-checking I need to do on the option's value as not much checking is done on the other options. Plus this might prove useful for RNGs that aren't seeded with the usual triple.

fenollp avatar May 17 '18 04:05 fenollp

Seeing CI failures I expect they are due to older OTP releases using different RNGs. If that’s true and there’s no backport of the RNG used in >=19 I should probably update the test to be otp-vsn dependent. Thoughts?

fenollp avatar May 17 '18 06:05 fenollp

I am a bit skeptical about introducing a test that relies on the RNG producing the same random number based on a given seed. Looks to me that this test is a perfect candidate to break not only in some past releases but also in some future ones. I would suggest another test instead; perhaps a weaker one that just checks that PropEr accepts this option.

kostis avatar May 17 '18 08:05 kostis

In your PR the seed get's only parsed from the options but it does not actually get set as seed for the RNG. The reason for failing the test case is that >=19 PropEr uses random and afterwards rand (see proper_internal.hrl the macro ?RANDOM_MOD and ?SEED_NAME)

You have to set the seed using proper_arith:rand_start(Seed) for it to have any effect.

Furthermore, I think that the test is not good as it depends on the implementation of the RNG. What you want to test instead is that if you set the seed by hand it is set to exactly that value (probably using the ?SETUP macro since that gets evaluated before any input value is generated)

(see also Kostis comments)

TheGeorge avatar May 17 '18 08:05 TheGeorge

Well I’d rather test what comes out of the RNG that way I can be sure I have all the inputs needed for deterministic/reproducible runs. Maybe along with the seed I should add the necessary RNG functions as parameters too?

I’m not sure where start_rand should be called if at all since I can reproduce my test’s output with and with the call. Also I tried reading ?SEED_RAND from ?SETUP but that does not seem to have much to do with the triplet seed. Should I be comparing ?SEED_RAND during cleanup with what was in setup?

Thanks!

fenollp avatar May 17 '18 10:05 fenollp

Thinking about it a little bit, what you want to achieve with setting the seed is that runs of the same property with the same seed produce the same input as you say. So I think that running a property twice with the same seed and then comparing that the generated input is the same (e.g. fails on the same value) should be a good test. I still don't think that the test should rely on a specific implementation of a RNG.

In proper:global_state_init() proper uses the seed option already to set the seed. Sorry for the confusion earlier.

TheGeorge avatar May 17 '18 12:05 TheGeorge

Disregard commit that just went through... WIP!

Maybe along with the seed I should add the necessary RNG functions as parameters too?

What do you think about me also adding the following user options:

  • {rng_seed, fun rng_seed/1}
    • default: fun ?RANDOM_MOD:seed/1
  • {rng_uniform, fun rng_uniform/0}
    • default: fun ?RANDOM_MOD:uniform/0
  • {rng_uniform, fun rng_uniform/1}
    • default: fun ?RANDOM_MOD:uniform/1

There are still a few calls to os:timestamp() in proper_arith:rand_reseed/0: shouldn't these be replaced by some function of Seed?

Something like fun ({A,B,C}) -> {B,2*C,A} end? Something deterministic I mean.

Thanks a lot for your help!

fenollp avatar May 17 '18 13:05 fenollp

I am working on a fix for cleaning up the targeted part from the state.

TheGeorge avatar May 18 '18 09:05 TheGeorge

This might work but I really think that proper should clean up after itself without the user needing to call proper_target:cleanup_strategy manually.

proper:quickcheck should not leave this stuff in the process dictionary.

On Sun, May 20, 2018, 15:08 Pierre Fenoll [email protected] wrote:

@fenollp commented on this pull request.

In test/proper_tests.erl https://github.com/proper-testing/proper/pull/165#discussion_r189459423:

@@ -1040,6 +1042,23 @@ options_test_() -> ?FORALL(_,?SIZED(Size,integer(Size,Size)),false), [{start_size,12}])].

+seeded_test_() ->

  • Seed = os:timestamp(),
  • Opts = [{seed,Seed}, noshrink, {start_size,65536}],
  • QC = fun (Prop) ->
  •             R = proper:quickcheck(Prop, Opts),
    
  •             proper:clean_garbage(),
    
  •             catch proper_target:cleanup_strategy(),
    

@TheGeorge https://github.com/TheGeorge this fixed the issue of cleaning up TPBT.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/proper-testing/proper/pull/165#pullrequestreview-121644047, or mute the thread https://github.com/notifications/unsubscribe-auth/AAR1j5vmXmHR9WFxbXdp4MfkBm-hFl88ks5t0WrggaJpZM4UCZ2s .

TheGeorge avatar May 20 '18 14:05 TheGeorge

Codecov Report

Merging #165 into master will increase coverage by 0.29%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #165      +/-   ##
==========================================
+ Coverage   83.23%   83.53%   +0.29%     
==========================================
  Files          13       13              
  Lines        3132     3140       +8     
==========================================
+ Hits         2607     2623      +16     
+ Misses        525      517       -8     
Impacted Files Coverage Δ
src/proper.erl 86.15% <100.00%> (+0.48%) :arrow_up:
src/proper_arith.erl 86.11% <100.00%> (-0.30%) :arrow_down:
src/proper_target.erl 90.62% <100.00%> (+0.30%) :arrow_up:
src/proper_gen_next.erl 77.34% <0.00%> (+0.27%) :arrow_up:
src/proper_statem.erl 94.34% <0.00%> (+2.17%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update b0fbe32...81d9968. Read the comment docs.

codecov-io avatar Jun 26 '18 20:06 codecov-io

@TheGeorge Any progress on proper_target:cleanup_strategy?

fenollp avatar Jun 27 '18 11:06 fenollp

I'm sorry for being so slow. I have a patch that fixes the issue and will upload it in a minute.

TheGeorge avatar Jun 27 '18 11:06 TheGeorge

#174 should now clean up the strategy correctly.

TheGeorge avatar Jun 27 '18 12:06 TheGeorge

Amazing! Will amend once #174 is merged :) Thanks a lot

fenollp avatar Jun 27 '18 12:06 fenollp

Ready!

fenollp avatar Jun 27 '18 13:06 fenollp

Ready!

No, it's not.

For starters, I do not "buy" the reasoning to introduce all these unnecessary ?assert() diffs (see how the current diff looks). They have absolutely nothing to do with the primary goal of this pull request.

Also, please use the current style for the test -- i.e., no commas in the beginning of lines, etc.

Finally, and most importantly, I do not really understand what the test is actually doing. It cannot be setting the seed to some thing that is changing (os:timestamp/0 call) but it needs to be setting it to something stable and be checking that the behavior we get is

  1. the "expected" one and
  2. different than that that one would get without specifying a seed.

kostis avatar Jun 27 '18 15:06 kostis

Ready!

No, it's not.

Yes it is: for review.

WRT the asserts commit (it is a separate commit, no need to show a full diff...):

The advantage is to show what’s in the get() when it’s not empty and to report the line where the macro is called.

That helped a lot debugging failing tests.

WRT style: what's in the "etc"? you mention no leading commas, I'm guessing ] shouldn't be on a line on their own I guess? what else? Is there a coding style document I missed somewhere?

WRT the test:

  • yes the seed changes for each run of the test. That is to ensure the property is valid not just for a specific seed.
  • I believe running the same property with the same seed twice and comparing the counterexample is enough to ensure the seeding works. I set a very high size & disabled shrinking so that the generated counterexamples "have very unexpected values", not the common values that any seed could find such as false, 0, 4, .... Also it is trickier to test the property against a specific value since the RNG can be changed and has changed over OTP releases.
  • Now I should probably test seeded shrinking of a hard-to-shrink property. What do you think?
  • I like the idea of testing that the counterexample is different without specifying a seed! I'll probably also test that same property using a different seed.

Thanks for reviewing, I'll push an update soon.

fenollp avatar Jun 27 '18 15:06 fenollp

Hi there, I've replied and pushed a commit 20 days ago now. Anything more I need to do?

fenollp avatar Jul 17 '18 08:07 fenollp

Is this PR still under consideration? This feature might be useful in PropCheck (issue) to check if a persisted counterexample is still valid, or if the generator used to produce that example no longer exist.

evnu avatar Mar 18 '19 20:03 evnu

@kostis @TheGeorge is there something that still needs to be finished with this? How can I help to get this merged?

x4lldux avatar Mar 29 '20 14:03 x4lldux

Rebased ;)

fenollp avatar Mar 31 '20 18:03 fenollp

Rebased again & enhanced the tests:

  • now there's a flakyness flag that allows some failures (for now)
  • the test failure reports which "property runner" failed Right now the reproducibility test is failing with and only with the targeted runner. I haven't run a git-bisect yet but I know test passed when I opened the PR 2 years ago. I'd say there's a good chance this comes from the somewhat recent changes to TPBT but of course this is not finger pointing.

I'm currently investigating whether reproducibility can be achieved by making proper_arith:rand_reseed/0 deterministic.

fenollp avatar Apr 02 '20 19:04 fenollp

Rebased again. Is CI disabled?

fenollp avatar Nov 08 '20 22:11 fenollp