qcheck icon indicating copy to clipboard operation
qcheck copied to clipboard

Provide unfold in Gen and/or in Tree

Open sir4ur0n opened this issue 4 years ago • 6 comments

I had this idea for an alternative constructor (to be discussed! I'm not sure):

  let pure x = Tree (x, Seq.empty)

  let rec make_unfold (f: 't -> 'a * 't Seq.t) (t0: 't) : 'a t =
    let root, subs = f t0 in
    Tree (root, Seq.map (fun sub_t -> make_unfold f sub_t) subs)

which does the recursion directly. It's directly linked to what @aspiwack was saying about the abstract encoding of trees.

in the .mli:

  val make_unfold : ('t -> 'a * 't Seq.t) -> init:'t -> 'a t
  (** [make_unfold f ~init] uses [f] on [init] to compute the root of the tree
      and a list of sub-states of type ['t].
      Each of theses sub-states is then lazily turned into a tree by
      a recursive call to {!make_unfold}.
  *)

Originally posted by @c-cube in https://github.com/c-cube/qcheck/pull/109#discussion_r625457909

sir4ur0n avatar May 27 '21 15:05 sir4ur0n

My 2 cents:

  • I am not convinced by exposing Tree.unfold because Tree.t is currently only exposed in read-only. I don't see any use case for users to build a value of type Tree.t. They should only be interested in Gen.t. (But it could still be part of the implementation)
  • Exposing Gen.unfold makes sense, and I would even go as far as saying "We can remove Gen.make_primitive" if we add such a Gen.unfold. One thing I can't decide is whether the initial state should be a 'state or a 'state Gen.t
    • 'state: this initial state needs to get a form of randomness somewhere, right? Otherwise it would be a generator that always returns exactly the same thing. Typical usage would be some_gen >>= fun input -> Gen.unfold f ~init:input (but this would still first use some_gen shrinks before shrinking on output of f) or Gen.prune some_gen >>= fun input -> Gen.unfold f ~init:input to get a clean, f-only shrinks.
    • 'state Gen.t: should we reuse its shrinks? While I don't have any example popping in my mind, I suspect both could be useful. Does that mean we should provide 2 functions?

Actually by putting it into words, I would go for solution 1, but abundantly document both with and without Gen.prune depending on if the user wants to shrink on the input, or only get an initial value and then only rely on f to build shrinks

What do you think?

sir4ur0n avatar May 27 '21 15:05 sir4ur0n

I think it's important to provide escape hatches if people need to write custom generators; why is it bad to expose a builder for Tree.t if we keep the type hidden?

edit: and yes, let the state management be handled by a monad or applicative, it's more orthogonal this way imho. This way one can even cleanly interleave the shrinking of the state and the shrinking of each unfold…

c-cube avatar May 27 '21 16:05 c-cube

Why is it useful for users to manually craft a tree? They can't build a Gen.t from that :thinking:

sir4ur0n avatar May 27 '21 16:05 sir4ur0n

Well I'd expose a make_primitive : (random_state -> 'a tree) -> 'a gen :p

c-cube avatar May 27 '21 16:05 c-cube

Well I think it's not a good idea to expose Random.State.t (I suspect this is what you meant by random_state?) if it's not necessary, because it will add even more breaking changes or backward-compatibility nightmares when moving to another Random library (e.g. one that supports splitting)... And I think if there is Gen.unfold then there is no need to manually craft a Tree.t nor a need to expose make_primitive to users.

Since all functions of Random.State.t are indirectly exposed in Gen, the solution I proposed would be enough to conveniently create all possible generators, wouldn't it?

sir4ur0n avatar May 27 '21 16:05 sir4ur0n

Ok, so:

  • for a start, I'm not sure we'll ever move away from Random.State.t, not if we want to keep some compatibility. Here again we may have some internal sum type with RS of Random.State.t | Splittable of whatever | …. So it could be that random_state -> 'a tree comes in several flavors depending on what state we wa,t
  • why is Gen.unfold better (and we haven't discussed it before afaik, what is the signature?). I'm pretty sure it's sometimes simpler to be able to use "old style" separated generator and shrinker, even if it's a building block in a more complicated property. In this case being able to generate the first value, and then turn it into a tree by calling shrink_as_seq first_value is convenient.

c-cube avatar May 27 '21 16:05 c-cube