proper
proper copied to clipboard
seed: introduce seed user option
Any reason this wasn't added earlier? This is an otherwise pretty common option for a QuickCheck implementation.
Note: not sure how much type-checking I need to do on the option's value as not much checking is done on the other options. Plus this might prove useful for RNGs that aren't seeded with the usual triple.
Seeing CI failures I expect they are due to older OTP releases using different RNGs. If that’s true and there’s no backport of the RNG used in >=19 I should probably update the test to be otp-vsn dependent. Thoughts?
I am a bit skeptical about introducing a test that relies on the RNG producing the same random number based on a given seed. Looks to me that this test is a perfect candidate to break not only in some past releases but also in some future ones. I would suggest another test instead; perhaps a weaker one that just checks that PropEr accepts this option.
In your PR the seed get's only parsed from the options but it does not actually get set as seed for the RNG. The reason for failing the test case is that >=19 PropEr uses random
and afterwards rand
(see proper_internal.hrl
the macro ?RANDOM_MOD
and ?SEED_NAME
)
You have to set the seed using proper_arith:rand_start(Seed)
for it to have any effect.
Furthermore, I think that the test is not good as it depends on the implementation of the RNG. What you want to test instead is that if you set the seed by hand it is set to exactly that value (probably using the ?SETUP
macro since that gets evaluated before any input value is generated)
(see also Kostis comments)
Well I’d rather test what comes out of the RNG that way I can be sure I have all the inputs needed for deterministic/reproducible runs. Maybe along with the seed I should add the necessary RNG functions as parameters too?
I’m not sure where start_rand should be called if at all since I can reproduce my test’s output with and with the call. Also I tried reading ?SEED_RAND from ?SETUP but that does not seem to have much to do with the triplet seed. Should I be comparing ?SEED_RAND during cleanup with what was in setup?
Thanks!
Thinking about it a little bit, what you want to achieve with setting the seed is that runs of the same property with the same seed produce the same input as you say. So I think that running a property twice with the same seed and then comparing that the generated input is the same (e.g. fails on the same value) should be a good test. I still don't think that the test should rely on a specific implementation of a RNG.
In proper:global_state_init()
proper uses the seed option already to set the seed. Sorry for the confusion earlier.
Disregard commit that just went through... WIP!
Maybe along with the seed I should add the necessary RNG functions as parameters too?
What do you think about me also adding the following user options:
-
{rng_seed, fun rng_seed/1}
- default:
fun ?RANDOM_MOD:seed/1
- default:
-
{rng_uniform, fun rng_uniform/0}
- default:
fun ?RANDOM_MOD:uniform/0
- default:
-
{rng_uniform, fun rng_uniform/1}
- default:
fun ?RANDOM_MOD:uniform/1
- default:
There are still a few calls to os:timestamp()
in proper_arith:rand_reseed/0
: shouldn't these be replaced by some function of Seed
?
Something like fun ({A,B,C}) -> {B,2*C,A} end
? Something deterministic I mean.
Thanks a lot for your help!
I am working on a fix for cleaning up the targeted part from the state.
This might work but I really think that proper should clean up after itself
without the user needing to call proper_target:cleanup_strategy
manually.
proper:quickcheck
should not leave this stuff in the process dictionary.
On Sun, May 20, 2018, 15:08 Pierre Fenoll [email protected] wrote:
@fenollp commented on this pull request.
In test/proper_tests.erl https://github.com/proper-testing/proper/pull/165#discussion_r189459423:
@@ -1040,6 +1042,23 @@ options_test_() -> ?FORALL(_,?SIZED(Size,integer(Size,Size)),false), [{start_size,12}])].
+seeded_test_() ->
- Seed = os:timestamp(),
- Opts = [{seed,Seed}, noshrink, {start_size,65536}],
- QC = fun (Prop) ->
R = proper:quickcheck(Prop, Opts),
proper:clean_garbage(),
catch proper_target:cleanup_strategy(),
@TheGeorge https://github.com/TheGeorge this fixed the issue of cleaning up TPBT.
— You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub https://github.com/proper-testing/proper/pull/165#pullrequestreview-121644047, or mute the thread https://github.com/notifications/unsubscribe-auth/AAR1j5vmXmHR9WFxbXdp4MfkBm-hFl88ks5t0WrggaJpZM4UCZ2s .
Codecov Report
Merging #165 into master will increase coverage by
0.29%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #165 +/- ##
==========================================
+ Coverage 83.23% 83.53% +0.29%
==========================================
Files 13 13
Lines 3132 3140 +8
==========================================
+ Hits 2607 2623 +16
+ Misses 525 517 -8
Impacted Files | Coverage Δ | |
---|---|---|
src/proper.erl | 86.15% <100.00%> (+0.48%) |
:arrow_up: |
src/proper_arith.erl | 86.11% <100.00%> (-0.30%) |
:arrow_down: |
src/proper_target.erl | 90.62% <100.00%> (+0.30%) |
:arrow_up: |
src/proper_gen_next.erl | 77.34% <0.00%> (+0.27%) |
:arrow_up: |
src/proper_statem.erl | 94.34% <0.00%> (+2.17%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update b0fbe32...81d9968. Read the comment docs.
@TheGeorge Any progress on proper_target:cleanup_strategy
?
I'm sorry for being so slow. I have a patch that fixes the issue and will upload it in a minute.
#174 should now clean up the strategy correctly.
Amazing! Will amend once #174 is merged :) Thanks a lot
Ready!
Ready!
No, it's not.
For starters, I do not "buy" the reasoning to introduce all these unnecessary ?assert()
diffs (see how the current diff looks). They have absolutely nothing to do with the primary goal of this pull request.
Also, please use the current style for the test -- i.e., no commas in the beginning of lines, etc.
Finally, and most importantly, I do not really understand what the test is actually doing. It cannot be setting the seed to some thing that is changing (os:timestamp/0
call) but it needs to be setting it to something stable and be checking that the behavior we get is
- the "expected" one and
- different than that that one would get without specifying a seed.
Ready!
No, it's not.
Yes it is: for review.
WRT the asserts commit (it is a separate commit, no need to show a full diff...):
The advantage is to show what’s in the get() when it’s not empty and to report the line where the macro is called.
That helped a lot debugging failing tests.
WRT style: what's in the "etc"? you mention no leading commas, I'm guessing ]
shouldn't be on a line on their own I guess? what else?
Is there a coding style document I missed somewhere?
WRT the test:
- yes the seed changes for each run of the test. That is to ensure the property is valid not just for a specific seed.
- I believe running the same property with the same seed twice and comparing the counterexample is enough to ensure the seeding works. I set a very high size & disabled shrinking so that the generated counterexamples "have very unexpected values", not the common values that any seed could find such as
false
,0
,4
, .... Also it is trickier to test the property against a specific value since the RNG can be changed and has changed over OTP releases. - Now I should probably test seeded shrinking of a hard-to-shrink property. What do you think?
- I like the idea of testing that the counterexample is different without specifying a seed! I'll probably also test that same property using a different seed.
Thanks for reviewing, I'll push an update soon.
Hi there, I've replied and pushed a commit 20 days ago now. Anything more I need to do?
Is this PR still under consideration? This feature might be useful in PropCheck (issue) to check if a persisted counterexample is still valid, or if the generator used to produce that example no longer exist.
@kostis @TheGeorge is there something that still needs to be finished with this? How can I help to get this merged?
Rebased ;)
Rebased again & enhanced the tests:
- now there's a flakyness flag that allows some failures (for now)
- the test failure reports which "property runner" failed Right now the reproducibility test is failing with and only with the targeted runner. I haven't run a git-bisect yet but I know test passed when I opened the PR 2 years ago. I'd say there's a good chance this comes from the somewhat recent changes to TPBT but of course this is not finger pointing.
I'm currently investigating whether reproducibility can be achieved by making proper_arith:rand_reseed/0
deterministic.
Rebased again. Is CI disabled?