derive icon indicating copy to clipboard operation
derive copied to clipboard

quickcheck's arbitrary derivation is bad for data types mostly with recursive constructors

Open ndmitchell opened this issue 9 years ago • 2 comments

From https://code.google.com/p/ndmitchell/issues/detail?id=561

Let's define a datatype for expressions

data Expr = Plus Expr Expr | Minus Expr Expr | Times Expr Expr | Divide Expr Expr | Number Integer

The derived instance has the same probabilty of every constructor.

So the probability for a recursive constructor is 4/5 and the probability of a nonrecursive one is 1/5.

When using quickcheck it generates random data structures using this function. When the probability is like that I noticed that most checks loop, because they build a super big tree, with so many nodes, that the probability of finishing it is near zero.

I was thinking of a solution and a fast hackish solution would be to just distribute the sum of 1/2 probability to all nonrecursive constructors, and 1/2 among all other constructors (or just check the constants using a few thousand checks on some tree size counting function).

A better solution would need a piece of paper and a pen and a probability wikipedia page.

ndmitchell avatar Jun 05 '15 20:06 ndmitchell

use smallcheck instead?

jwaldmann avatar Feb 09 '16 14:02 jwaldmann

This is a library for deriving instances, and it can derive both SmallCheck and QuickCheck instances. The comment above refers to the quality of the generated QuickCheck instances. Certainly someone could just use SmallCheck, but there are many reasons to prefer QuickCheck in a lot of situations (I often use both).

ndmitchell avatar Feb 11 '16 20:02 ndmitchell