ProbabilityBoundsAnalysis.jl

Use any copula from `Copulas.jl`

Open lrnv opened this issue 1 year ago • 3 comments

Hey,

What would be needed for your code to use any copula from Copulas.jl? It might reduce the maintenance burden of your copulas.jl file and allow for many more parametric (or non-parametric) copula assumptions, since Copulas.jl supports many more models than what you have here.

If you are interested in making this move, I might try to sketch a PR, but I'll need your help ;)

It would also allow for multivariate pboxes.

lrnv, Aug 08 '22 15:08

Hi!

Thanks for opening this. Yes, I think this would be beneficial. We originally had BivariateCopulas.jl as a dependency, but we decided to re-implement it here as we were playing with some new ideas. I think moving to your more general package would be great.

I don't think it's very difficult, but there are a couple of features I'd like to keep.

  • We should allow for parametric imprecise copulas, which are sets of copulas given by point-wise upper and lower bounds (e.g., [W, M] for bounds on all copulas). We currently build these by giving interval parameters to a copula family and working out the bounds, which for the 2-copulas we currently have is just the envelope of the interval endpoints. I'm not sure how this generalises to higher dimensions. For the first PR, we could leave this out for copulas of dimension higher than 2.

  • We use an (experimental) outer approximation to copulas in the same way we do for p-boxes. P-boxes and distributions are represented using a finite vector of inverse cdfs evaluated on a grid in [0,1]. Bounds on all probabilistic quantities (samples, cdf, measure) are then derived in terms of this representation. We experimented with doing the same with copulas: https://github.com/AnderGray/ProbabilityBoundsAnalysis.jl/blob/319712bd5a7bc58bbab546b6f311a4efab306419/src/pbox/copulas.jl#L50-L53 where cdfU and cdfD are the 'up' and 'down' values of the copula on the same probability grid as the marginals. This also allows for a nice representation of imprecise copulas. Porting this should be quite easy, though. Here's how independence is defined: https://github.com/AnderGray/ProbabilityBoundsAnalysis.jl/blob/319712bd5a7bc58bbab546b6f311a4efab306419/src/pbox/copulas.jl#L394-L402

  • C- or H-volume: currently we only have the 2D version, which we call mass (for probability mass; measure is a more exact name): https://github.com/AnderGray/ProbabilityBoundsAnalysis.jl/blob/319712bd5a7bc58bbab546b6f311a4efab306419/src/pbox/copulas.jl#L190 We actually have an N-dimensional Julia implementation of this which we haven't added yet. But maybe this should be added to Copulas.jl? :)
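To make the "envelope of interval endpoints" idea in the first bullet concrete, here is a minimal sketch in plain Julia. It is not the code from ProbabilityBoundsAnalysis.jl: `clayton_cdf` and `imprecise_cdf` are hypothetical names, and it assumes a one-parameter bivariate family whose cdf is monotone in the parameter (true for Clayton with θ > 0), so that the point-wise bounds are attained at the endpoints of the parameter interval.

```julia
# Bivariate Clayton copula cdf for θ > 0 (standard closed form).
clayton_cdf(u, v, θ) = (u^(-θ) + v^(-θ) - 1)^(-1 / θ)

# Fréchet-Hoeffding bounds: W is the lower bound and M the upper bound
# on every 2-copula.
W(u, v) = max(u + v - 1, 0.0)
M(u, v) = min(u, v)

# Hypothetical helper: point-wise (lower, upper) bounds of a family whose
# parameter lies in [θlo, θhi], taken as the envelope of the two endpoints.
# Assumption: the cdf is monotone in θ, so the endpoints give the extremes.
function imprecise_cdf(cdf, u, v, θlo, θhi)
    a, b = cdf(u, v, θlo), cdf(u, v, θhi)
    return (min(a, b), max(a, b))
end

lo, hi = imprecise_cdf(clayton_cdf, 0.5, 0.5, 1.0, 2.0)
```

Any such pair of bounds must sit inside [W, M], which gives a cheap sanity check for whatever family is plugged in.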

In short, I think a copula type should be kept here (maybe under another name), and all the cdf and sampling calls should move to your package.

AnderGray, Aug 09 '22 13:08

Thanks for all this precise information.

  • For imprecise copulas, assuming the family exists in Copulas.jl (which is not the case for W and M, for example, as they are not yet in the same family), you should be able to do exactly the same trick using Distributions.cdf on the copula object with an interval parameter. We'll also have to check that Copulas.jl's code is compliant with interval parameters.

Hmm... Intervals does not seem to comply with the AbstractFloat interface. What about using Measurements?

julia> using Copulas, Distributions, Measurements, Intervals

julia> cdf(ClaytonCopula(3,Interval(1,2)),[0.5,0.5,0.5])
ERROR: MethodError: no method matching sign(::Interval{Int64, Closed, Closed})
Closest candidates are:
  sign(::Unsigned) at C:\Users\lrnv\AppData\Local\Programs\Julia-1.7.2\share\julia\base\number.jl:163
  sign(::Rational) at C:\Users\lrnv\AppData\Local\Programs\Julia-1.7.2\share\julia\base\rational.jl:254
  sign(::Dates.Period) at C:\Users\lrnv\AppData\Local\Programs\Julia-1.7.2\share\julia\stdlib\v1.7\Dates\src\periods.jl:103  
  ...
Stacktrace:
 [1] ϕ⁻¹(C::ClaytonCopula{3, Interval{Int64, Closed, Closed}}, t::Float64)
   @ Copulas c:\Users\lrnv\.julia\dev\Copulas\src\ArchimedeanCopulas\ClaytonCopula.jl:22
 [2] cdf(C::ClaytonCopula{3, Interval{Int64, Closed, Closed}}, u::Vector{Float64})
   @ Copulas c:\Users\lrnv\.julia\dev\Copulas\src\ArchimedeanCopula.jl:15
 [3] top-level scope
   @ REPL[32]:1

julia> cdf(ClaytonCopula(3,0.5 ± 0.5),[0.5,0.5,0.5])
0.199 ± 0.06

  • cdfU and cdfD could be extracted by calling Distributions.cdf on the (interval-)copula over your grid. This should be straightforward (a simple call to cdf), but I think this extraction should live in ProbabilityBoundsAnalysis.jl and not in Copulas.jl. For me, this logic should stay here, but be "standardized" to accept any copula.

  • The H-volume should definitely go to Copulas.jl, indeed. I am not sure about the naming of the function, but I am sure the implementation of the generic multivariate case should be easy (I already did that in R a few years ago, and have a neat reference on fast implementation of such H-volumes). An interface such as:

function measure(C::Copula{d}, intervals::NTuple{d,Interval}) where d

is probably the way to go.
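For reference, the generic multivariate C-volume can be sketched with plain inclusion-exclusion over the 2^d corners of the box. This is an illustration only, not the Copulas.jl implementation and not the fast algorithm mentioned above; `cvolume` is a hypothetical name, and it takes the cdf as an ordinary function and the box as two endpoint vectors rather than the tuple-of-intervals signature proposed here.

```julia
# C-volume of the box [a, b] by inclusion-exclusion over its 2^d corners.
# `cdf` is any function mapping a length-d vector in [0,1]^d to a probability.
function cvolume(cdf, a::AbstractVector, b::AbstractVector)
    d = length(a)
    length(b) == d || throw(DimensionMismatch("a and b must have the same length"))
    total = 0.0
    for mask in 0:(2^d - 1)
        # Bit i of `mask` set: use the lower endpoint a[i]; otherwise use b[i].
        corner = [(mask >> (i - 1)) & 1 == 1 ? a[i] : b[i] for i in 1:d]
        # Each lower endpoint used flips the sign of the term.
        total += (-1)^count_ones(mask) * cdf(corner)
    end
    return total
end

# Independence copula as a test case: the volume is the product of side lengths.
independence(u) = prod(u)
vol = cvolume(independence, [0.2, 0.1, 0.0], [0.7, 0.5, 1.0])
```

The cost grows as 2^d cdf evaluations, which is fine for small d but is the reason a faster scheme is worth having for the general case.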

I'll try to address some of these once I have some time upfront. Thanks :)

lrnv, Aug 09 '22 14:08

I have now implemented Copulas.measure(C,u,v), which computes the C-volume of the hypercube [u,v] in any dimension. If you think your port of the Matlab code is better than what I did, you might open a PR.

Furthermore, I think that the computation of the grid points should stay in ProbabilityBoundsAnalysis.jl for the first draft of this PR: only the generation of these points would be standardized to a cdf call on any copula from Copulas.jl, and thus the upper/lower part of the mechanism will stay here.
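As a rough sketch of that split, the grid evaluation could look like the following in plain Julia. `grid_bounds` is a hypothetical helper, the copula is passed as an ordinary two-argument function standing in for a cdf call on a Copulas.jl object, and the way cdfU and cdfD are actually laid out in ProbabilityBoundsAnalysis.jl may differ from this cell-corner convention.

```julia
# Hypothetical helper: evaluate a bivariate copula cdf on an n-by-n grid of
# cells and return "down" and "up" values for each cell.
# A copula is nondecreasing in each argument, so on a cell its value lies
# between the lower-left corner and the upper-right corner.
function grid_bounds(cdf, n::Integer)
    grid = range(0, 1; length = n + 1)   # cell edges on [0, 1]
    down = [cdf(grid[i], grid[j]) for i in 1:n, j in 1:n]           # lower-left corners
    up   = [cdf(grid[i + 1], grid[j + 1]) for i in 1:n, j in 1:n]   # upper-right corners
    return down, up
end

# Independence copula as a stand-in for any copula cdf.
independence(u, v) = u * v
down, up = grid_bounds(independence, 4)
```

With this shape, swapping in another copula only changes the function passed as `cdf`; the up/down bookkeeping stays on the ProbabilityBoundsAnalysis.jl side.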

Do you think the tests of this repo are enough to allow me to make these changes without breaking anything you need? If you feel comfortable with this, I'll try to make a PR that passes your tests.

lrnv, Nov 23 '22 13:11