
Statistics for targeted properties

Open alfert opened this issue 5 years ago • 3 comments

The heat map is an interesting idea. It is a visualisation of the statistics usually gathered by collect() and its friends, which cannot be used with PropEr's TPBT implementation. The examples from Fred Hebert's book would also be much nicer, because such a statistic could replace the printout of all generated data.

Ideally, such a statistic would be part of PropEr. If we do it in PropCheck only, then I assume we need to start some GenServer that collects the data and generates a ref for each run. In contrast to regular properties, we would have to send the data to the server process explicitly. Perhaps we can hide this in the forall_targeted etc. macros.
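A minimal sketch of what such a collector could look like. Everything here is an assumption for illustration: the `StatsCollector` module and its functions are hypothetical and exist in neither PropCheck nor PropEr. It only shows the shape of the idea: one server process, one ref per run, and explicit sends from the property.

```elixir
defmodule StatsCollector do
  # Hypothetical module, not part of PropCheck or PropEr.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, %{}, opts)

  # Register a new run and return the reference that identifies it.
  def new_run(server) do
    ref = make_ref()
    :ok = GenServer.call(server, {:new_run, ref})
    ref
  end

  # Record one generated value (or a category derived from it) for a run.
  def record(server, ref, value), do: GenServer.cast(server, {:record, ref, value})

  # Return a map of value => count for the given run.
  def frequencies(server, ref), do: GenServer.call(server, {:frequencies, ref})

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:new_run, ref}, _from, state) do
    {:reply, :ok, Map.put(state, ref, %{})}
  end

  def handle_call({:frequencies, ref}, _from, state) do
    {:reply, Map.get(state, ref, %{}), state}
  end

  @impl true
  def handle_cast({:record, ref, value}, state) do
    counts = state |> Map.get(ref, %{}) |> Map.update(value, 1, &(&1 + 1))
    {:noreply, Map.put(state, ref, counts)}
  end
end

# Usage: a plain loop stands in for the values a property would generate.
{:ok, pid} = StatsCollector.start_link()
ref = StatsCollector.new_run(pid)
Enum.each([1, 2, 2, 3], &StatsCollector.record(pid, ref, &1))
stats = StatsCollector.frequencies(pid, ref)
```

The `record/3` call is the part that a macro wrapper would have to hide; how that would hook into the targeted macros is left open here.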

Originally posted by @alfert in https://github.com/alfert/propcheck/issues/87#issuecomment-462049972

alfert, Feb 09 '19 19:02

A heat map of the generated values is a very interesting idea, and while there is one in the PropEr TPBT tutorial, I think it is possible to get one with random testing too. The issue, though, is finding a generic way of generating these heat maps: what do we put on the x and y axes? I guess tools should take these axes as inputs. Has this idea already been explored in other QuickCheck/Hypothesis implementations? Anyway, I am interested and can probably help get this into PropEr :)

fenollp, Feb 10 '19 15:02

> The issue, though, is finding a generic way of generating these heat maps: what do we put on the x and y axes?

I was wondering about that as well. The heat map works great for the TPBT example because the search space is two-dimensional over the natural numbers. This, of course, does not generalize well. I find it hard to imagine what a possible visualization might look like, possibly because of missing examples.
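To illustrate why the two-dimensional case is the easy one, here is a small sketch, assuming the generated values are already `{x, y}` pairs of natural numbers. The `HeatMap` module is hypothetical and not taken from the tutorial. Each pair maps directly to a grid cell, so counting is all that is needed.

```elixir
defmodule HeatMap do
  # Hypothetical helper for illustration only.

  # Count how often each {x, y} cell was generated.
  def counts(points), do: Enum.frequencies(points)

  # Render the grid as lines of digits, one line per y value.
  # Counts are capped at 9 so that every cell is a single character.
  def render(points, width, height) do
    freq = counts(points)

    rows =
      for y <- 0..(height - 1) do
        for x <- 0..(width - 1), into: "" do
          freq |> Map.get({x, y}, 0) |> min(9) |> Integer.to_string()
        end
      end

    Enum.join(rows, "\n")
  end
end

grid = HeatMap.render([{0, 0}, {1, 0}, {1, 0}, {2, 1}], 3, 2)
# grid == "120\n001"
```

For any other kind of generated term there is no such direct mapping to cells, which is exactly the open question.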

I wonder if principal component analysis (PCA) could be used for something like this, but the resulting visualizations are certainly not easy to interpret if one doesn't already have some understanding of how PCA works. The problem with PCA is that we cannot easily map the generated terms for a variable to a corresponding axis. Some metric might have to be applied. Alas, as the bindings also allow a variable to accept completely different types, such a metric is not easy to find.

evnu, Mar 18 '19 22:03

Aren't these two things that should be handled independently?

  • a way of streaming the generated values out of the property
  • a way of applying some kind of statistics, visualisation or similar to that stream of data and reporting on it

My impression is that the appropriate visualisation depends on the use case and could be used for general properties as well, whereas the streaming part is something new that is not available in PropEr.
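A rough sketch of that separation, under the assumption that values are streamed as plain messages to a sink process. `ValueStream` and the `{:generated, value}` message shape are made up for illustration; PropEr and PropCheck provide nothing like this today.

```elixir
defmodule ValueStream do
  # Hypothetical names throughout.

  # Part 1: emit one generated value to the process that owns the stream.
  def emit(sink, value), do: send(sink, {:generated, value})

  # Collect all values currently waiting in the caller's mailbox,
  # in the order in which they were sent.
  def drain(acc \\ []) do
    receive do
      {:generated, value} -> drain([value | acc])
    after
      0 -> Enum.reverse(acc)
    end
  end
end

# Streaming: a plain loop stands in for a property emitting its values.
Enum.each([3, 7, 12, 15, 21], &ValueStream.emit(self(), &1))
values = ValueStream.drain()

# Part 2: any report can consume the same stream, here buckets of width 10.
histogram = Enum.frequencies_by(values, &div(&1, 10))
# histogram == %{0 => 2, 1 => 2, 2 => 1}
```

The streaming side knows nothing about the report, so a histogram, a heat map or a simple printout could be swapped in per use case.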

alfert, Mar 19 '19 07:03