scalacheck icon indicating copy to clipboard operation
scalacheck copied to clipboard

Arbitrary, Gen and Choose instances for primitive types should respect size

Open jbracker opened this issue 6 years ago • 4 comments

Currently all generators, arbitrary and choose instances for primitives types such as Short, Int, Long, Float, Double and String do not respect the size afaik. Is there a reason for this?

After a quick glance I also have the feeling that the buildable instances (e.g. List, etc) do not respect size either.

If there is no good reason, I think this should be changed. The QuickCheck implementation in Haskell provides a good starting point for properly sized instances of these types. I would certainly appreciate them, because it would allow scaling down numbers and array/list sizes for tests if necessary (e.g. numeric calculations where errors become to big when using huge floats or doubles).

I'd be happy to implement this if it will become incorporated.

jbracker avatar Jan 17 '19 11:01 jbracker

What's the benefit of having generators and arbitrary for primitive types respect Gen.size and Gen.resize? Seems like these types have scales and limits that are too diverse to be parameterized by an Int. Not sure how it could be composed with Gen.choose.

Can you give more detail on what exactly you're looking for here in ScalaCheck that you already have in QuickCheck?

The size test parameter is respected in ScalaCheck for collections like List. For example:

property("Gen[List[Int]].size <= 100") = {
  Prop.forAll(Arbitrary.arbitrary[List[Int]]) { xs: List[Int] =>
    100 >= xs.size && xs.size >= 0
  }
}

property("Gen[List[Int]].size <= 10") = {
  Prop.forAll(Gen.resize(10, Arbitrary.arbitrary[List[Int]])) { xs: List[Int] =>
    10 >= xs.size && xs.size >= 0
  }
}

property("Gen[List[Int]].size <= 1000") = {
  Prop.forAll(Gen.resize(1000, Arbitrary.arbitrary[List[Int]])) { xs: List[Int] =>
    1000 >= xs.size && xs.size >= 0
  }
}

Since String is a sequence, it also respects the test size parameter. For example:

property("Gen[String].size <= 100") = {
  Prop.forAll(Arbitrary.arbitrary[String]) { s: String =>
    100 >= s.size && s.size >= 0
  }
}

property("Gen[String].size <= 10") = {
  Prop.forAll(Gen.resize(10, Arbitrary.arbitrary[String])) { s: String =>
    10 >= s.size && s.size >= 0
  }
}

property("Gen[String].size <= 1000") = {
  Prop.forAll(Gen.resize(1000, Arbitrary.arbitrary[String])) { s: String =>
    1000 >= s.size && s.size >= 0
  }
}

ashawley avatar Jan 18 '19 05:01 ashawley

You are probably right about the size not being to useful for choose, since there you expect to get something from within the full range.

For primitive types like Int and Float I think the main advantage is that the generated test cases are much more readable and useful. I would argue that if we generate a List of Ints or Floats the following test cases [0,10,-23,5] [0.1,-0.24,0.003,23.4] would be way more useful and give a lot more insight to what failed then for example: [23243242,31231954,-234852829,403424,0] [2e-45,34e102,-3e66,3000.332424]

There is also the Argument made in the QuickCheck paper: "We do vary the size between different test cases: we begin testing each property on small test cases, and then gradually increase the size bound as testing progresses. This makes for a greater variety of test cases, which both makes testing more effective, and improves our chances of finding enough test cases satisfying the precondition of conditional properties. It also makes it more likely that we will find a small counter example to a property, if there is one." -- https://www.cs.tufts.edu/~nr/cs257/archive/john-hughes/quick.pdf , P. 5

Of course, shrinking will also help find a smallest example, but I would still argue that it is more useful to find small examples that fail first before looking for large ones.

In QuickCheck for example the Int and Float generators deliver numbers in range from -size to size. Where the complete randomness of the ScalaCheck default generators is really unhelpful is when I try to check some calculations involving floats, because sometime I cannot even add the delivered floats without getting infinity as a result.

jbracker avatar Jan 22 '19 22:01 jbracker

I agree. ScalaCheck does start picking large values sooner rather than later.

For example,

property("arbitrary[List[Int]]") = {
  Prop.forAll { xs: List[Int] =>
    xs.forall(_ > 0)
  }
}

Shrinking gets it right, but the original argument is a lot of noise and isn't always helpful:

[info] ! Arbitrary.arbitrary[List[Int]]: Falsified after 6 passed tests.
[info] > ARG_0: List("0")
[info] > ARG_0_ORIGINAL: List("2147483647", "728865839", "-1402066210", "2147483647", "-917775453")

We do vary the size between different test cases: we begin testing each property on small test cases, and then gradually increase the size bound as testing progresses.

The search heuristic that you cited from the QuickCheck paper would be more useful, but I think it is in the context of generating recursive trees. Constraining size is a very necessary requirement for trees. If such a heuristic was used generally, it would probably risk making the tests too predictable. A search heuristic that favors randomness produces better coverage of inputs.

ashawley avatar Jan 23 '19 00:01 ashawley

Here's a fun little example. Consider this (incorrect) property:

scala> Prop.forAll((x: UUID, y: UUID) => x != y).check(_.withMinSuccessfulTests(10000))
+ OK, passed 10000 tests.

Arbitrary[UUID] doesn't respect size, so finding a collision is highly unlikely. Even if one goes hunting for collisions by micromanaging size, it is not respected:

scala> Prop.forAll((x: UUID, y: UUID) => x != y).check(_.withMinSuccessfulTests(10000).withMaxSize(0))
+ OK, passed 10000 tests.

There's a class of properties similar to this one, that actually benefit from checking them with low entropy, because that "flushes out" possible conflicts.

By the way, a similar property is easily falsified for Ints and Longs - because those arbitraries have a few special values with artificially increased frequences, while Arbitrary[UUID] is uniform.

nigredo-tori avatar Mar 04 '21 08:03 nigredo-tori