scalacheck

Generating a list of size N where each element is distinct.

Open wjlow opened this issue 7 years ago • 7 comments

I'm proposing something like:

def sizedDistinctListOf[T](size: Int, gen: Gen[T], eq: (T, T) => Boolean): Gen[List[T]]

In my own codebase, I have something similar where T is constrained by the Eq type class from Cats, which lets me define what distinctness means. As far as I understand, ScalaCheck has no dependency on Cats.

For instance, given a case class Person(name: String, age: Int), in one scenario you might want to generate a List[Person] in which every element has a different (name, age) pair, while in another you might want a List[Person] in which every element has a different age.
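To make the idea concrete, here is a minimal sketch of how such a helper might look using only ScalaCheck. It is an illustration under stated assumptions, not an existing ScalaCheck API: the object name DistinctListSketch, the genPerson generator, and the retry budget of 100 are all made up for this example.

```scala
import org.scalacheck.Gen

object DistinctListSketch {
  final case class Person(name: String, age: Int)

  // Hypothetical helper matching the proposed signature; not part of ScalaCheck.
  // Candidates are drawn one at a time, and any candidate that `eq` considers
  // equal to an already accepted element is skipped. The retry budget (100) is
  // an arbitrary assumption: once it is used up the generator fails rather
  // than looping forever.
  def sizedDistinctListOf[T](size: Int, gen: Gen[T], eq: (T, T) => Boolean): Gen[List[T]] = {
    def loop(acc: List[T], retriesLeft: Int): Gen[List[T]] =
      if (acc.size >= size) Gen.const(acc.reverse)
      else if (retriesLeft <= 0) Gen.fail
      else gen.flatMap { candidate =>
        if (acc.exists(eq(_, candidate))) loop(acc, retriesLeft - 1)
        else loop(candidate :: acc, retriesLeft)
      }
    loop(Nil, 100)
  }

  // Illustrative generator for Person (an assumption of this sketch).
  val genPerson: Gen[Person] = for {
    name <- Gen.alphaStr
    age  <- Gen.choose(0, 120)
  } yield Person(name, age)

  // Distinct by the full (name, age) pair.
  val distinctPeople: Gen[List[Person]] =
    sizedDistinctListOf[Person](5, genPerson, _ == _)

  // Distinct by age only.
  val distinctAges: Gen[List[Person]] =
    sizedDistinctListOf[Person](5, genPerson, _.age == _.age)
}
```

The same generator is reused in both cases; only the equality function changes, which is the point of the proposal.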

Happy to submit a PR if people find this useful.

wjlow avatar Aug 10 '17 04:08 wjlow

How about this?

val genPersonSameName: Gen[Person] = arbitrary[Person].map(_.copy(name = "Same Name"))
val genPersonSameAge: Gen[Person]  = arbitrary[Person].map(_.copy(age  = 42))

Then you can generate distinct lists with Set and containerOfN:

val genPersonDiffName: Gen[List[Person]] = containerOfN[Set, Person](n, genPersonSameAge).map(_.toList)
val genPersonDiffAge: Gen[List[Person]]  = containerOfN[Set, Person](n, genPersonSameName).map(_.toList)

I don't know enough about Cats to know whether Eq cooperates well with Scala collections.

ashawley avatar Aug 10 '17 23:08 ashawley

Looks like there is an effort to make cats work with scala collections, including Set:

https://github.com/non/alleycats

There is also an initiative to build Cats extensions to ScalaCheck:

https://github.com/non/cats-check

ashawley avatar Aug 10 '17 23:08 ashawley

Thanks for the suggestion @ashawley. I feel like this approach isn't as obvious though. Can we provide a simple function that generates a collection of T, given a way to define distinctness?

Your approach works, and I like how it works by composing Gens. However, I feel like it's easier to provide a function that determines equality, (A, A) => Boolean, than to have to write a Gen for it instead.

Not sure if this makes sense.

wjlow avatar Aug 11 '17 11:08 wjlow

I've found this sort of thing useful in the past, and have opened #394 with a potential implementation.

morgen-peschke avatar Apr 06 '18 20:04 morgen-peschke

It could be useful to have a simpler syntax for this kind of thing. Unfortunately, it is difficult to reliably generate distinct sets of values. This is why it was relaxed in #89:

https://github.com/rickynils/scalacheck/blob/0a2f1c5/src/main/scala/org/scalacheck/Gen.scala#L609

The property tests for containerOfN for Set and mapOfN are relaxed to be only _.size <= n rather than _.size == n:

https://github.com/rickynils/scalacheck/blob/0a2f1c5/jvm/src/test/scala/org/scalacheck/GenSpecification.scala#L172-L178
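A small sketch of why only the weaker bound can hold: when the element generator has fewer distinct values than the requested size, a Set built from it cannot reach that size. The object and value names below are illustrative assumptions, not taken from the ScalaCheck sources.

```scala
import org.scalacheck.Gen

object SetSizeSketch {
  // Only three distinct values exist in this domain.
  val smallDomain: Gen[Int] = Gen.choose(1, 3)

  // Ten draws are requested, but duplicates collapse in the Set,
  // so the result has at most three elements.
  val atMostTen: Gen[Set[Int]] = Gen.containerOfN[Set, Int](10, smallDomain)

  // The relaxed invariant: size <= n rather than size == n.
  def relaxedInvariant(s: Set[Int]): Boolean = s.size <= 10
}
```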

ashawley avatar Apr 10 '18 13:04 ashawley

@ashawley - that's definitely an important point.

I think the main point of this feature request isn't to try to guarantee a particular size (though a best effort there is always appreciated), as much as it is to have some way of customizing what "distinct" means for a particular data set.

Best-effort and fail-fast semantics are both nice alternatives to have, as either can be useful when tweaking our Gens.
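As a rough sketch of what those two semantics could look like (this is not the implementation in #394; the names dedupe, distinctBestEffort, and distinctFailFast are hypothetical):

```scala
import org.scalacheck.Gen

object DistinctSemanticsSketch {
  // Keeps the first occurrence of each element under the caller's
  // notion of equality.
  def dedupe[T](xs: List[T], eq: (T, T) => Boolean): List[T] =
    xs.foldLeft(List.empty[T]) { (acc, x) =>
      if (acc.exists(eq(_, x))) acc else x :: acc
    }.reverse

  // Best effort: draws n elements and drops duplicates, so the result
  // may contain fewer than n elements.
  def distinctBestEffort[T](n: Int, gen: Gen[T], eq: (T, T) => Boolean): Gen[List[T]] =
    Gen.listOfN(n, gen).map(dedupe(_, eq))

  // Fail fast: discards any draw that did not yield exactly n distinct
  // elements instead of returning a shorter list.
  def distinctFailFast[T](n: Int, gen: Gen[T], eq: (T, T) => Boolean): Gen[List[T]] =
    distinctBestEffort(n, gen, eq).suchThat(_.size == n)
}
```

With the fail-fast variant, a generator whose domain is too small shows up as discarded test cases rather than as silently shorter lists, which is the kind of feedback that helps when tuning a Gen.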

morgen-peschke avatar Apr 17 '18 18:04 morgen-peschke

there is a PR at #394 that made it almost all the way through review, but the author doesn't currently have time to finish it off. perhaps another volunteer would like to pick it up?

SethTisue avatar Feb 05 '21 18:02 SethTisue