haskell-hedgehog
Document the magic number 100
I'm puzzled by the behavior of linear:
"Construct a range which scales the second bound relative to the size parameter."
The first example in the doc is bounds 50 $ linear 0 10. I'd expect 10 to be scaled by 50, giving (0,500), but instead you get (0,5).
The other examples seem to confirm that there's an implicit factor of 1/100 multiplied in there somewhere.
And then when the size parameter is above 100, it stops scaling entirely and becomes a constant.
λ> Range.bounds 5000 $ Range.linear 0 10
(0,10)
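My best guess at a model that reproduces this (just a sketch with made-up names, not the actual Range implementation):

-- Sketch: scale the distance between the bounds by size/99 (or /100;
-- these examples don't distinguish the two), then clamp the result
-- back into the original bounds.
scaleLinearSketch :: Int -> Int -> Int -> Int
scaleLinearSketch size lo hi =
  let scaled = lo + ((hi - lo) * size) `quot` 99
  in  max lo (min hi scaled)  -- assumes lo <= hi

-- scaleLinearSketch 50   0 10 == 5    (matches bounds 50 $ linear 0 10)
-- scaleLinearSketch 5000 0 10 == 10   (clamped, so it stops growing)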
So what's the concept here? Is the size parameter in general supposed to be thought of as a percentage, and should it never go above 100?
I also just noticed that the size parameter used by sample is 30. Is there a particular reason for that?
This concept is inherited from QuickCheck and I haven't thought much about whether it could be made better. It does make sense to think of the size parameter as a percentage; it ranges from 0 to 99, so if you run 200 tests it will cycle back to 0, then ramp up to 99 again.
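Roughly, as a sketch (not the real test-runner code):

-- Sketch only: the size used for the n-th test when sizes cycle
-- through 0..99 and then wrap around.
sizeForTest :: Int -> Int
sizeForTest n = n `mod` 100

-- map sizeForTest [0, 99, 100, 199] == [0, 99, 0, 99]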
I think the general idea is that if the test fails with an input which is initially smaller, then shrinking it is easier, so we should start with inputs that are small. Also, smaller inputs tend to run faster, so only generating a few big inputs helps test performance.
I'm not sure any of these are great reasons for having size as a thing. If we had perfect shrinking every time then I think the concept of size could potentially be removed altogether.
Ah. That makes sense. I always thought size was an underdocumented concept in QuickCheck.
Does it make sense to make this explicit and have the Size type constrain its value to the valid range?
If it's a percentage, wouldn't it go to 100?
Is there a reason for it to be integral rather than fractional? A fractional value between 0 and 1 would make more sense to me, rather than hard-coding a maximum precision of "99 increments".
Does it make sense to make this explicit and have the Size type constrain its value to the valid range?
Yes, but it needs some thought to make sure that doesn't break anything.
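As a sketch of what that could look like (self-contained here; mkSize is a made-up name, not existing API):

-- Sketch: a smart constructor that clamps a requested size into the
-- valid 0..99 window, so out-of-range values never reach the scaling code.
newtype Size = Size Int
  deriving (Eq, Ord, Show)

mkSize :: Int -> Size
mkSize n = Size (max 0 (min 99 n))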
If it's a percentage, wouldn't it go to 100?
I guess 0-99 makes more sense as that is exactly 100 levels and the default number of tests is 100.
Is there a reason for it to be integral rather than fractional?
Only that it matches what QuickCheck does; maybe a fractional value between 0 and 1 would be better, I'm not sure.
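For comparison, a fractional size in the 0 to 1 range could look roughly like this (sketch only, made-up names):

-- Sketch: a fractional size factor clamped to [0, 1] and applied to a
-- linear range, with no 99/100 constant involved.
scaleLinearFrac :: Double -> Int -> Int -> Int
scaleLinearFrac f lo hi =
  let f' = max 0 (min 1 f)
  in  lo + round (fromIntegral (hi - lo) * f')

-- scaleLinearFrac 0.5 0 10 == 5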
This concept is inherited from QuickCheck and I haven't thought much about whether it could be made better.
Size can be confusing because there are some generators which totally ignore it. I'm wondering if the range combinators could help remove size entirely at some point.
I just noticed that int64 on an exponential range produces very bad results for sizes 100 and up.
λ> bounds 100 (exponentialBounded :: Range Int64)
(-9223372036854775808,-9223372036854775808)
λ> sample (resize 100 (int64 (exponential 0 maxBound)))
0
So there's another motivation to constrain Size to its expected range, I think.
Apart from the idea of constraining Size, which indeed looks interesting and useful, this looks weird:
λ> sample (resize 100 (int64 (exponential 0 maxBound)))
0
I think I would expect a number close to maxBound :: Int64, but not 0.
And so this also looks weird:
λ (\x -> bounds x $ exponential 0 (maxBound :: Int64)) <$> [97 .. 101]
[(0,3817338227552192512),(0,5933694520551410688),(0,0),(0,0),(0,0)]
as I would expect to see something like
[(0,3817338227552192512),(0,5933694520551410688),(0,9223372036854775807),(0,9223372036854775807),(0,9223372036854775807)]
because
λ (\x -> bounds x $ exponential 0 (maxBound :: Int8)) <$> [97 .. 101]
[(0,115),(0,121),(0,127),(0,127),(0,127)]
λ (\x -> bounds x $ exponential 0 (maxBound :: Int16)) <$> [97 .. 101]
[(0,26559),(0,29500),(0,32767),(0,32767),(0,32767)]
λ (\x -> bounds x $ exponential 0 (maxBound :: Int32)) <$> [97 .. 101]
[(0,1391252730),(0,1728494283),(0,2147483647),(0,2147483647),(0,2147483647)]
this looks weird
It is weird because when I wrote the code for exponential ranges, I didn't consider sizes above 99 at all. My understanding was that size 100 was an invalid input, and so its behavior could be undefined. I'm still not entirely sure whether this understanding was correct.
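A guess at the failure mode (a sketch, not the actual exponential-range code): if the scaling goes through Double and raises the range width to the power size/99, then for sizes up to 99 the exponent stays at or below 1 and the result stays within the bound, but at size 100 the intermediate value can exceed what the target integer type can represent, and converting it back is where things go off the rails.

-- Sketch: Double-based exponential scaling.  It happens to reproduce the
-- Int8 numbers above for sizes 97-99, but for sizes >= 100 the intermediate
-- Double can exceed maxBound of the target type.
scaleExpSketch :: Integral a => Int -> a -> a -> a
scaleExpSketch size lo hi =
  let power = fromIntegral size / 99     :: Double
      width = fromIntegral (hi - lo) + 1 :: Double
  in  lo + round (width ** power - 1)

-- For Int64 at size 100 the intermediate is roughly 1.4e19, well past
-- maxBound :: Int64, so the final conversion overflows.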
Another reason for size to be constrained: a negative size produces confusing results for anyone who has not figured out that it is only valid for 0-99. Documentation could help in the meantime.
Range.bounds (-2) (Range.linear (-10) 10)
(-10,-10)
is the same as:
Range.bounds (-1) (Range.linear (-10) 10)
(-10,-10)
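If the scaling works anything like the scaleLinearSketch model earlier in this thread, the collapse is easy to trace (sketch arithmetic only, not the actual implementation):

-- Using the scaleLinearSketch model from above:
--   scaleLinearSketch (-2) (-10) 10
--     = max (-10) (min 10 ((-10) + (20 * (-2)) `quot` 99))
--     = max (-10) (min 10 (-10))    -- (-40) `quot` 99 truncates to 0
--     = -10
-- so the upper bound never moves off -10, and sizes -1 and -2 both
-- give (-10,-10).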