Comparing distributions

Open mitchelloharawild opened this issue 4 years ago • 7 comments

Comparison is a reasonable operation to perform on a vector, so we need a sensible way to compare distributions with numbers and with other distributions.

This is especially problematic when inverting a Box-Cox transformation with lambda < 0, which has an exception for certain values: https://github.com/tidyverts/fabletools/blob/9812804afd6602ed5ddd5dcb262bba9728c8c2de/R/box_cox.R#L28-L30

library(distributional)
fabletools::inv_box_cox(dist_normal(), -0.3)
#> Registered S3 methods overwritten by 'fabletools':
#>   method                  from          
#>   guide_geom.guide_level  distributional
#>   guide_train.level_guide distributional
#> Error: Can't compare lists with `vctrs_compare()`

Created on 2020-04-20 by the reprex package (v0.3.0)

To support https://github.com/tidyverts/fabletools/issues/126, we now pass the distribution itself into the transformation function (so x is some dist_normal()). This allows us to make simplifications where possible, so that dist_normal(0,1)+1 gives N(1,1) rather than t(N(0,1)), which resolves #126 and produces simpler data objects.

The issue is how to compare a distribution with a numeric. What should dist_normal() < 0 mean, and how does that compare with what the box_cox() code wants it to return? Some ideas include:

  • Return a transformed distribution whose quantile method returns TRUE or FALSE depending on whether the base quantile is < 0 or > 0.
  • Return the probability P(x < 0).
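
To make the two ideas concrete, here is a rough base R sketch for a single normal distribution. It does not use distributional's own methods: `compare_quantile()` and `prob_less_than()` are hypothetical helper names, and the sketch assumes x ~ N(mean, sd).

```r
# Hypothetical helpers, not part of distributional.
# Sketch only, assuming x ~ N(mean, sd).

# Idea 1: represent `x < threshold` by a quantile function that reports
# whether the p-th quantile of x lies below the threshold.
compare_quantile <- function(p, threshold = 0, mean = 0, sd = 1) {
  qnorm(p, mean = mean, sd = sd) < threshold
}

# Idea 2: collapse the comparison to a single number, P(x < threshold).
prob_less_than <- function(threshold = 0, mean = 0, sd = 1) {
  pnorm(threshold, mean = mean, sd = sd)
}

compare_quantile(c(0.1, 0.9))  # TRUE at the 10% quantile, FALSE at the 90% quantile
prob_less_than(0)              # 0.5 for N(0, 1)
```

Neither result is a plain TRUE/FALSE for the distribution as a whole: the first gives a logical value per quantile level and the second gives a probability. That is why code that subsets on a comparison, as the Box-Cox inversion does, cannot use either one unchanged.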

Following on from this, should the choice here extend to comparing two distributions, e.g. distA < distB?
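
If the probability interpretation were extended to two distributions, one special case has a closed form. For independent normals A and B, the difference A - B is itself normal, so P(A < B) = P(A - B < 0). The helper below is a hypothetical illustration in base R, and independence is an assumption that would not hold in general.

```r
# Hypothetical helper, not part of distributional.
# Assumes A ~ N(mean_a, sd_a) and B ~ N(mean_b, sd_b) are independent, so
# A - B ~ N(mean_a - mean_b, sqrt(sd_a^2 + sd_b^2)) and P(A < B) = P(A - B < 0).
prob_a_less_than_b <- function(mean_a, sd_a, mean_b, sd_b) {
  pnorm(0, mean = mean_a - mean_b, sd = sqrt(sd_a^2 + sd_b^2))
}

prob_a_less_than_b(0, 1, 0, 1)  # 0.5 by symmetry
prob_a_less_than_b(0, 1, 3, 1)  # above 0.5, since B is centred higher than A
```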

mitchelloharawild, Apr 20 '20 08:04