scalacheck
scalacheck copied to clipboard
trees in scalacheck
Trees are a pain to generate with scalacheck because scalacheck makes them too big. This means that folks who want to experiment with large trees have to build an ad-hoc depth-limiting function into their tree generators. Scalacheck should provide tooling so that we don't have to constantly reimplement this.
A possible solution (and perhaps to illustrate my complaint better):
sealed trait Tree
case object Leaf extends Tree
case class Branch(left: Tree, right: Tree) extends Tree
def tree2[T](maxDepth: Int, g: Gen[(T, T) => T], leaves: Gen[T]): Gen[T] = {
def go(depth: Int): Gen[T] = {
if (maxDepth) == 0 leaves
else {
Gen.oneOf(
for {
fn <- g
next = go(maxDepth - 1)
left <- next
right <- next
} yield fn(left, right),
leaves
)
}
}
recurse(maxDepth)
}
tree2(50, Gen.const((left, right) => Branch(left, right)), Gen.const(Leaf))
FYI, scalacheck-shapeless allows to (almost) explicitly limit the size of the generated trees, since its version 1.1.4
, like
import org.scalacheck._, Shapeless._
sealed trait Tree
case object Leaf extends Tree
case class Branch(left: Tree, right: Tree) extends Tree
implicit val treeRecursive = derive.Recursive[Tree](Gen.const(Leaf)) // argument is the generator used (at the leaves) if the generated tree gets too big
The size
from the Gen.Parameters
used during the generation then makes the tree generation stop after some depth.
~~Currently, a small glitch in the case class generation makes this not put a hard limit on the depth of the generated trees, but it's about to be fixed.~~ (https://github.com/alexarchambault/scalacheck-shapeless/pull/52 fixes that)
Very cool! Would love to be able to do it without adding a dependency on shapeless though.
@mosesn Just do like scalacheck-shapeless does: use Gen.sized
/ Gen.resize
, making sure the Tree
arbitrary makes the size decrease.
@alexarchambault how do I make the size decrease?
in scalaz (Rose Tree)
- https://github.com/scalaz/scalaz/blob/v7.3.0-M7/scalacheck-binding/src/main/scala/scalaz/scalacheck/ScalazArbitrary.scala#L311-L336
- https://github.com/scalaz/scalaz/blob/v7.3.0-M7/tests/jvm/src/test/scala/scalaz/TreeTestJVM.scala#L13-L16
@alexarchambault I don't think using Gen.resize like you do in scalacheck-shapeless doesn't seem like it would make ad-hoc trees simpler.
Clearly this is a common use case, since there are at least two in finagle, there's one in scalaz, and one in scalacheck-shapeless. Maybe it's time for one of the generic implementations to graduate to scalacheck? What do you think, @rickynils @non?
actually, @tixxit even took a stab at generating DAGs in a generic way.
@oscar-stripe is that open source? might it get into scalacheck?
@rickynils have you had a chance to take a look at this? what do you think?
@mosesn It is not open source (currently just a prototype in a gist somewhere). I'll take a stab at cleaning it up and "releasing" it this weekend though!
@tixxit oh, cool! have you had a chance to open source it yet?
On the topic of generating test data for recursive data structures, this answer clarified how to use Gen.sized and Gen.resized:
http://stackoverflow.com/a/42855840/409976.
I ran into this and couldn't get scalacheck-shapeless
to generate instances (likely due to the fact that the recursion was indirect, the ASTs were wide, and there was some mutual recursion).
Thinking about the problem, I realized that a big part of the issue is that the applicative combinators (mostly thinking of resultOf
) pass through the same size to the generators for the subtrees. This tends not to matter too much if your types aren't too big, but it becomes a showstopper in the presence of recursion. Borrowing from the ideas of the SO question above, I came up with these additional combinators:
/** Split the current generator size into the specified number of sub-groups.
*
* The sum of the sizes should equal 1 less than the initial generator size. The length of the
* list returned is equal to the requested number of groups.
*
* @param n how many sub-groups to split into?
* @return size of sub-groups
*/
private[this] def partitionReducedSize(n: Int): Gen[Seq[Int]] =
for {
size <- Gen.size
decrementedSize = size - 1
if decrementedSize >= 0
groupSize = decrementedSize / n
remainder = decrementedSize % n
groups = List.tabulate(n)(i => if (i < remainder) 1 + groupSize else groupSize)
shuffledGroups <- Gen.pick(n, groups)
} yield shuffledGroups.toList
def reducedResultOf[T1: Arbitrary, R](f: T1 => R): Gen[R] =
for {
Seq(s1) <- partitionSize(1)
t1 <- Gen.resize(s1, arbitrary[T1])
} yield f(t1)
def reducedResultOf[T1: Arbitrary, T2: Arbitrary, R](f: (T1, T2) => R): Gen[R] =
for {
Seq(s1, s2) <- partitionSize(2)
t1 <- Gen.resize(s1, arbitrary[T1])
t2 <- Gen.resize(s2, arbitrary[T2])
} yield f(t1, t2)
def reducedResultOf[T1: Arbitrary, T2: Arbitrary, T3: Arbitrary, R](f: (T1, T2, T3) => R): Gen[R] =
for {
Seq(s1, s2, s3) <- partitionSize(3)
t1 <- Gen.resize(s1, arbitrary[T1])
t2 <- Gen.resize(s2, arbitrary[T2])
t3 <- Gen.resize(s3, arbitrary[T3])
} yield f(t1, t2, t3)
// and so on
With this, I was able to just use Gen.lzy
+ Gen.oneOf
for sealed class
/sealed trait
and reducedResultOf
for case class
. The "size" of the generator bounds the total number of nodes in the generated AST. The only other gotcha I worked around is to add a biased variant of Gen.oneOf
that short-circuits to a base case Gen
if the size is too small (since otherwise the tree generator risks failing due to too many skipped test cases).
I do wish that utilities like these came with ScalaCheck (OTOH, I see there isn't much for this in Haskell's QuickCheck either...). If there is interest I can open a PR to upstream these.