QuickChick icon indicating copy to clipboard operation
QuickChick copied to clipboard

Combinator library × eager evaluation = bad performance?

Open Lysxia opened this issue 6 years ago • 8 comments

Consider the following generator of positive. That is more or less what Derive Arbitrary for positive. outputs and that is also affected by this issue.

Although one might expect this to run in time linear in n, this actually has exponential complexity. Under eager evaluation, we pay for the two recursive calls g n' even though only one is needed.

Require Import QuickChick.QuickChick.

Require Import String.
Require Import List.
Import ListNotations.

Require Import PArith.

Instance Gen_pos : Gen positive :=
  { arbitrary := sized (fix g n :=
      match n with 
      | O => ret xH
      | S n' =>
        frequency (ret xH)
                  [(1, ret xH);
                   (1, liftGen xI (g n'));
                   (1, liftGen xO (g n'))]
      end)
  }.

Instance Show_pos : Show positive := { show t :=
  match t with xH => "1" | _ => "_" end }.

Time Sample (resize 17 arbitrary : G positive).
(* 2.5s *)

Time Sample (resize 18 arbitrary : G positive).
(* 5s *)

More generally, eager evaluation often causes us to evaluate the "spine" of the generator before actually running it. Thus:

  • we have to "traverse" the generator twice;
  • as the above example shows, that spine can be exponentially larger than the actual output of the generator.

A workaround is to separate the frequency choice from the recursive calls g n' with a monadic >>=, delaying evaluation under a lambda, but that still assumes one is aware of the problem in the first place.

Is there a good way to improve the API to avoid this problem?

Instance Gen_pos : Gen positive :=
  { arbitrary := sized (fix g n :=
      match n with 
      | O => ret xH
      | S n' =>
        x <- frequency (ret 0) [(1, ret 0), (1, ret 1), (1, ret 2)]
        match x with
        | 1 => liftGen xI (g n')
        | 2 => liftGen xO (g n')
        | _ => ret xH
        end
      end)
  }.

Probably explains #22

Lysxia avatar May 28 '18 21:05 Lysxia

One solution could be to make frequency take thunks fun (_ : unit) => ... and use the freq notation already in place to ensure the user never actually sees that. But that would break a lot of the proofs...

lemonidas avatar May 29 '18 07:05 lemonidas

We could also use notations that eta expand all combinators, that would also be more efficient in general.

Lysxia avatar Jun 27 '18 20:06 Lysxia

I added a new combinator so it's at least possible to work around this manually.

I'm still not sure what the best way to fix the notations is. Full laziness seems ideal for performance, but we would lose the generic monadic interface from ext-lib.

Lysxia avatar Jun 30 '18 22:06 Lysxia

Is it really more efficient to use thunkGen everywhere instead of bind? A good compromise would be to have thunked generators in all combinators that take multiple generators as arguments (oneOf, frequency, backtrack, etc.) We already use notations to hide the default elements...

lemonidas avatar Jun 30 '18 22:06 lemonidas

It may be alright. I can't think of a good example where we pay for more than a constant factor.

I applied your solution in fc3bc8e. It doesn't seem possible (nor a good idea) to steal the notation for pairs so freq will have to use a new %: combinator. It's also more convenient by removing one set of parentheses compared to the pair notation.

Derive Arbitrary needs to be updated right?

Lysxia avatar Jul 01 '18 01:07 Lysxia

I'm noticing similar problems elsewhere in QC too while testing vellvm.

A simple example is whenFail. Coming from Haskell-land I didn't think about the fact that the string parameter would be evaluated first, and on every test case. In the case of vellvm, this means a lot of wasted effort serializing large LLVM programs for every test case, unless you work around the issue.

In fact, I think it's even worse. I could be mistaken, but it seems like with forAll, every time you generate a value it actually calls show, which could be really expensive! This should be easy to thunk behind the scenes, though?

Definition forAll {A prop : Type} {_ : Checkable prop} `{Show A}
           (gen : G A)  (pf : A -> prop) : Checker :=
  bindGen gen (fun x =>
                 printTestCase (show x ++ newline) (pf x)).

Chobbes avatar Jul 27 '22 14:07 Chobbes

printTestCase (show x ++ newline) (pf x)).

Oh god. Good catch!

Lysxia avatar Jul 28 '22 09:07 Lysxia

sigh

Great catch. Maybe redesign the main API functions for 2.0? I'm gearing up towards its release soon.

lemonidas avatar Jul 28 '22 09:07 lemonidas