
[Feature req] Convolutions of dependent random vectors.

Open rand5 opened this issue 10 months ago • 27 comments

I'm migrating a problem discussed here that involves computing the convolution of two dependent random variables where their joint distribution is defined by a copula. I believe the issue may stem from the fact that evaluating the pdf of the copula sometimes returns NaN when it should return 0. Here is an example with marginal distributions that should have support over all reals:

X1 = Gumbel(15.,5.)
X2 = MixtureModel([Normal(-6.,0.1),X1],[0.9,0.1])

ρ = 0.4

C = GaussianCopula([1. ρ; ρ 1.])

D1 = SklarDist(C,(X1,X2))

pdf(D1,[-30., -30.]) #Returns NaN

The link above has an attempt at a brute force work-around, but it doesn't seem to be sufficient (see the plot in the link).
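In the meantime, a minimal boundary guard can stand in for such a work-around (a sketch; `safe_pdf` is a hypothetical helper, not part of Copulas.jl, and it assumes the copula density vanishes on the boundary of [0,1]^d, as it does for the Gaussian copula):

```julia
# Hypothetical guard, not the package fix: short-circuit to 0 whenever a
# pseudo-observation lies on the boundary of [0,1]^d, where the Gaussian
# copula density vanishes, instead of letting the evaluation produce NaN.
safe_pdf(pdf_fun, u) = any(ui -> ui <= 0 || ui >= 1, u) ? 0.0 : pdf_fun(u)

# e.g. safe_pdf(u -> pdf(C, u), [0.0, 0.0])  # 0.0 instead of NaN
```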

rand5 avatar Jan 09 '25 19:01 rand5

Looks indeed like a bug, or rather a missed special case somewhere. By the way:

julia> pdf(D1,big.([-30., -30.])) # works with bigfloats
9.800107565129330824766557211886842343262822051986735333987174617133175015382767e-5024

julia> pdf(C, [0,0]) # here is where your nan is coming from
NaN

julia>

so yes, it should definitely return 0. Let me take a look as soon as I have a moment.


ERR: Scratch that, the bug is coming from Distributions.jl:

julia> pdf(MvNormal(C.Σ),[-Inf,-Inf])
NaN

julia>

lrnv avatar Jan 09 '25 19:01 lrnv

I tried a few other copulas (Clayton, Gumbel, Joe) and am getting similar domain errors. Would this need to be addressed for C::ArchimedeanCopula as well?

rand5 avatar Jan 09 '25 19:01 rand5

For Archimedeans, the wanted pdf value at [0,0] is not always 0; it depends on the Archimedean. But yes, if some of them give DomainErrors, that is an issue.

Can you post your other examples? Let's make test cases out of them so that this stays fixed in the future.

lrnv avatar Jan 09 '25 20:01 lrnv

Hum... Taking another look at the Gaussian case, the NaN actually comes from Distributions.jl:

julia> pdf(MvNormal(C.Σ),[-Inf,-Inf])
NaN

julia>

lrnv avatar Jan 09 '25 20:01 lrnv

Here is what I was using to check the Archimedean copulas:

using Copulas, Distributions, QuadGK

X1 = Normal()

d = 2
θ = 1.

# Copulas to check; make sure θ is appropriate for each family.
Cs = [ClaytonCopula(d,θ), FrankCopula(d,θ), GumbelCopula(d,θ), JoeCopula(d,θ)]

for k=1:length(Cs)
    D1 = SklarDist(Cs[k],(X1,X1))

    N = 100
    Zs = collect(range(-10.,10.,N))
    ConvD = map(Zs) do z
        quadgk(x->pdf(D1,[x,z-x]), -100., 100., rtol=1e-6)[1]
    end
end

It is not clear to me at which point the NaN is being produced since

X1 = Normal()

d = 2
θ = 1.

C = ClaytonCopula(d,θ)

D1 = SklarDist(C,(X1,X1))

N = 100
Zs = collect(range(-10.,10.,N))
ConvD = map(Zs) do z
    quadgk(x->pdf(D1,[x,z-x]), -100., 100., rtol=1e-6)[1]
end

gives

DomainError with 0.0:
integrand produced NaN in the interval (-100.0, 100.0)

Stacktrace:
  [1] evalrule(f::var"#87#89"{Float64}, a::Float64, b::Float64, x::Vector{Float64}, w::Vector{Float64}, wg::Vector{Float64}, nrm::typeof(norm))
    @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\evalrule.jl:39
  [2] #8
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:54 [inlined]
  [3] ntuple
    @ .\ntuple.jl:48 [inlined]
  [4] do_quadgk(f::var"#87#89"{Float64}, s::Tuple{Float64, Float64}, n::Int64, atol::Nothing, rtol::Float64, maxevals::Int64, nrm::typeof(norm), _segbuf::Nothing, eval_segbuf::Nothing)
    @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:52
  [5] #50
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:83 [inlined]
  [6] handle_infinities(workfunc::QuadGK.var"#50#51"{Nothing, Float64, Int64, Int64, typeof(norm), Nothing, Nothing}, f::var"#87#89"{Float64}, s::Tuple{Float64, Float64})
    @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:189
  [7] #quadgk#49
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:82 [inlined]
  [8] quadgk
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:80 [inlined]
  [9] #86
    @ .\In[152]:14 [inlined]
 [10] iterate
    @ .\generator.jl:47 [inlined]
 [11] _collect(c::Vector{Float64}, itr::Base.Generator{Vector{Float64}, var"#86#88"}, ::Base.EltypeUnknown, isz::Base.HasShape{1})
    @ Base .\array.jl:854
 [12] collect_similar(cont::Vector{Float64}, itr::Base.Generator{Vector{Float64}, var"#86#88"})
    @ Base .\array.jl:763
 [13] map(f::Function, A::Vector{Float64})
    @ Base .\abstractarray.jl:3285
 [14] top-level scope
    @ In[152]:13

but

pdf(D1,[0.,0.]) # not NaN

rand5 avatar Jan 09 '25 20:01 rand5

Okay, so let's split this up.

For the Gaussian, I reported it upstream here: https://github.com/JuliaStats/Distributions.jl/issues/1937

For the Archimedeans, I'll come back when I know more.

lrnv avatar Jan 09 '25 20:01 lrnv

@rand5 The Archimedean ones should be fixed in the (currently registering) v0.1.27 version of the Copulas package. There was indeed an issue with the computation of the pdf in the corners. On the new version, I have:

using Copulas, Distributions, QuadGK, Plots
X1 = Normal()


d = 2
θ = 1.

Cs = [
    ClaytonCopula(d,θ), 
    FrankCopula(d,θ), 
    GumbelCopula(d,θ), 
    JoeCopula(d,θ)
] #Copulas to check. Make sure θ is appropriate

N = 100
Zs = collect(range(-10.,10.,N))
ConvDs = zeros(length(Cs), length(Zs))
for k=1:length(Cs)
    D1 = SklarDist(Cs[k],(X1,X1))
    ConvDs[k,:] = map(Zs) do z
        quadgk(x->pdf(D1,[x,z-x]), -100., 100., rtol=1e-6)[1]
    end
end

p = plot()
for k in 1:length(Cs)
    plot!(p, Zs, ConvDs[k,:], label=k)
end
p

producing the following densities:

[plot: convolution densities for the four Archimedean copulas]

However, the Gaussian one is not fixed yet because the issue comes from Distributions.jl (and potentially from PDMats.jl directly); it will take a bit more time as it is not up to me alone. :/

Apart from the Gaussian, is this fix enough to cover what you need, or do you have other questions?

lrnv avatar Jan 10 '25 11:01 lrnv

@lrnv I'm using v0.1.27 and still getting domain errors:

X1 = Gumbel(15.,5.)
X2 = MixtureModel([Normal(-6.,0.1),X1],[0.9,0.1])

θ = 2.5
d= 2
C = GumbelCopula(d,θ)

D1 = SklarDist(C,(X1,X2))

N = 100
Zs = collect(range(-10.,10.,N))
ConvD = map(Zs) do z
    quadgk(x->pdf(D1,[x,z-x]), -100., 100., rtol=1e-6)[1]
end

which throws:

DomainError with 0.0:
integrand produced NaN in the interval (-100.0, 100.0)

Stacktrace:
  [1] evalrule(f::var"#2#4"{Float64}, a::Float64, b::Float64, x::Vector{Float64}, w::Vector{Float64}, wg::Vector{Float64}, nrm::typeof(norm))
    @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\evalrule.jl:39
  [2] #8
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:54 [inlined]
  [3] ntuple
    @ .\ntuple.jl:48 [inlined]
  [4] do_quadgk(f::var"#2#4"{Float64}, s::Tuple{Float64, Float64}, n::Int64, atol::Nothing, rtol::Float64, maxevals::Int64, nrm::typeof(norm), _segbuf::Nothing, eval_segbuf::Nothing)
    @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:52
  [5] #50
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:83 [inlined]
  [6] handle_infinities(workfunc::QuadGK.var"#50#51"{Nothing, Float64, Int64, Int64, typeof(norm), Nothing, Nothing}, f::var"#2#4"{Float64}, s::Tuple{Float64, Float64})
    @ QuadGK C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\adapt.jl:189
  [7] #quadgk#49
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:82 [inlined]
  [8] quadgk
    @ C:\Users\Monali\.julia\packages\QuadGK\BjmU0\src\api.jl:80 [inlined]
  [9] #1
    @ .\In[3]:4 [inlined]
 [10] iterate
    @ .\generator.jl:47 [inlined]
 [11] _collect(c::Vector{Float64}, itr::Base.Generator{Vector{Float64}, var"#1#3"}, ::Base.EltypeUnknown, isz::Base.HasShape{1})
    @ Base .\array.jl:854
 [12] collect_similar(cont::Vector{Float64}, itr::Base.Generator{Vector{Float64}, var"#1#3"})
    @ Base .\array.jl:763
 [13] map(f::Function, A::Vector{Float64})
    @ Base .\abstractarray.jl:3285
 [14] top-level scope
    @ In[3]:3

rand5 avatar Jan 10 '25 16:01 rand5

pdf(D1,[-100.,100.]) # = NaN

Edit: this error is being thrown despite the fact that the test code above it works fine.

rand5 avatar Jan 10 '25 16:01 rand5

@rand5 You can replace the pdf call in your loop with

function verbose_pdf(D,x)
    u = cdf.(D.m, x)
    @show u, pdf(D.C, u), pdf.(D.m, x)
    return pdf(D, x)
end
ConvD = map(Zs) do z
    quadgk(x->verbose_pdf(D1,[x,z-x]), -100., 100., rtol=1e-6)[1]
end

With this, we see that in the Gumbel case the problem comes from values of the pdf on the boundary:

# These calls are currently NaN but should be 0:
pdf(GumbelCopula(2,2.5), [0.1,0.0])
pdf(GumbelCopula(2,2.5), [0.0,0.1])

# This one is truly undefined, but for simplicity we'll make it zero too.
pdf(GumbelCopula(2,2.5), [0.0,0.0])

This is fixed in #244 and now released in v0.1.28, if you have time to take a shot?

lrnv avatar Jan 13 '25 11:01 lrnv

@lrnv Thank you for fixing that; I can confirm I'm no longer getting the domain errors that were thrown before.

I think this has opened the door to another issue though. Take a look at these jumps:

X1 = Gumbel(15.,5.)
X2 = MixtureModel([Normal(-6.,0.1),X1],[0.9,0.1])

θ = 2.5
d= 2
C = GumbelCopula(d,θ)

D1 = SklarDist(C,(X1,X2))

N = 1000; zmin = -10.; zmax = 60.
Zs = collect(range(zmin,zmax,N))

ConvD = map(Zs) do z
    quadgk(x->pdf(D1,[x,z-x]), -Inf, Inf)[1]
end

plot(scatter(;x=Zs,y=ConvD))

[plot: convolved density over Zs, showing jumps]

If the standard deviation of the Normal component of X2 increases, the jumps disappear; if it decreases, they get worse. Here is the same computation with

X2 = MixtureModel([Normal(-6.,0.05),X1],[0.9,0.1])

[plot: same computation with σ = 0.05, showing larger jumps]

I see similar behavior when X1 is a Gamma distribution instead of a Gumbel. I don't see the issue when X1 is Normal.

Let me know if you'd like me to migrate this to its own issue.

rand5 avatar Jan 13 '25 19:01 rand5

I guess these are simply integration errors, and this has nothing more to do with Copulas.jl in particular, right?

One "well-converging approximation" would be to decompose your integral and approximate it by Monte Carlo on X1. Using your second, most problematic X2, we obtain:

using Copulas, Distributions, QuadGK, Plots
X1 = Gumbel(15.,5.)
X2 = MixtureModel([Normal(-6.,0.05),X1],[0.9,0.1])

θ = 2.5
d= 2
C = GumbelCopula(d,θ)

D1 = SklarDist(C,(X1,X2))

N = 1000; zmin = -10.; zmax = 60.
Zs = collect(range(zmin,zmax,N))
ConvD = similar(Zs)

x1 = rand(X1, 10000) # you need enough for monte-carlo to kick in.
u1 = cdf.(X1, x1)
x2, u2 = similar(x1), similar(u1)
for i in eachindex(Zs)
    x2 .= Zs[i] .- x1
    u2 .= cdf.(X2, x2)
    p2 = pdf.(X2, x2)
    c = pdf(D1.C, hcat(u1,u2)')
    ConvD[i] = mean(c .* p2)
end

plot(Zs, ConvD)

[plot: Monte Carlo approximation of the convolved density]

You might find other more efficient and reliable integration methods, though.

If you do find something reliable and generic enough, I would be interested to provide the convolve(D::SklarDist) function in the package.


Edit: You could also leverage the symmetry and exchange x1 and x2 to obtain another approximation:

# We could do the same by inverting 1 and 2 : 
ConvD_sym = similar(Zs)
x2 = rand(X2, 10000) # you need enough for monte-carlo to kick in.
u2 = cdf.(X2, x2)
x1, u1 = similar(x2), similar(u2)
for i in eachindex(Zs)
    x1 .= Zs[i] .- x2
    u1 .= cdf.(X1, x1)
    p1 = pdf.(X1, x1)
    c = pdf(D1.C, hcat(u1,u2)')
    ConvD_sym[i] = mean(c .* p1)
end

plot!(Zs, ConvD_sym)

[plot: the symmetric approximation overlaid, visibly smoother]

which is weirdly much more stable... Maybe we could understand why?

lrnv avatar Jan 13 '25 20:01 lrnv

My last idea is to swap the mixture and the convolution, as these operations should commute nicely: X + (Y | Z) should equal (X+Y | X+Z), and the second one is much easier to compute when the supports of Y and Z are separated, as they are in our example. The presence of the copula does not change this.
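Writing the density of the sum out for a mixture second marginal makes the commutation concrete (a sketch, with $X_2 = \sum_k w_k Y_k$ and $c$ the copula density):

$$f_{X_1+X_2}(t) = \int c\big(F_{X_1}(x), F_{X_2}(t-x)\big)\, f_{X_1}(x) \sum_k w_k f_{Y_k}(t-x)\, dx = \sum_k w_k \int c\big(F_{X_1}(x), F_{X_2}(t-x)\big)\, f_{X_1}(x)\, f_{Y_k}(t-x)\, dx$$

so the density of the sum is itself a mixture with the same weights, each term involving only the component density $f_{Y_k}$ (note, though, that the copula is still evaluated at the full mixture cdf $F_{X_2}$).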

lrnv avatar Jan 13 '25 20:01 lrnv

Looking at the Monte Carlo decomposition you described, it seems like the variability between the two approaches (going from X2->X1 versus X1->X2) comes down to whether we evaluate p1 = pdf(X1, x1) or p2 = pdf(X2, x2). The range of p1 is approximately [0, 0.07], while the range of p2 is two orders of magnitude higher at about [0, 7]. So when we go from X1->X2 and are forced to compute p2, we map the variability of the Monte Carlo draws onto a much wider interval, which yields a less stable estimate.

I agree that the fundamental issue is that the supports are very different from each other, and I like the idea of taking advantage of the commutation, but I'm having trouble translating it into an MWE. Let me know if you have something I can pressure-test.

rand5 avatar Jan 13 '25 23:01 rand5

For the commute, you might want to leverage multiple dispatch, since all the information you need is in the type of D1. Something like this:


# Special mixture case (sketch; assumes a two-component MixtureModel marginal):
function convolve(D::SklarDist{CT, Tuple{T1,T2}}) where {CT, T1 <: MixtureModel, T2}
	return MixtureModel(
		[convolve(SklarDist(D.C, (D.m[1].components[1], D.m[2]))),
		 convolve(SklarDist(D.C, (D.m[1].components[2], D.m[2])))],
		probs(D.m[1]) # carry over the mixing weights.
	)
end

# Generic version: 
function convolve(D::SklarDist)
	# do the generic stuff.
end

But I think this requires defining a subtype of Distributions.UnivariateDistribution to hold these convolutions.

lrnv avatar Jan 14 '25 07:01 lrnv

How about the following for a slightly revised implementation of the generic version:

function convolveF(D::SklarDist,Zs::Vector{Float64};N=1000)

    # The primary (sampled) marginal should be the one whose density has the largest peak
    MaxRange = [maximum(x->pdf(X,x),Zs) for X in D.m]
    PrimaryFunID = findmax(MaxRange)[2]
    SecondaryFunID = findmin(MaxRange)[2]

    ConvD = similar(Zs)

    xPrimary = rand(D.m[PrimaryFunID], N)
    uPrimary = cdf.(D.m[PrimaryFunID], xPrimary) 
    xSecondary, uSecondary = similar(xPrimary), similar(uPrimary)

    for i in eachindex(Zs)

        xSecondary .= Zs[i] .- xPrimary
        uSecondary .= cdf.(D.m[SecondaryFunID], xSecondary)
        pSecondary = pdf.(D.m[SecondaryFunID], xSecondary)

        if PrimaryFunID == 1
            c = pdf(D.C, hcat(uPrimary,uSecondary)')
        else
            c = pdf(D.C, hcat(uSecondary,uPrimary)')
        end
        ConvD[i] = mean(c .* pSecondary)
    end
        
    return ConvD
end

Testing it:

X1 = Gamma(3.,8.)
X2 = Normal(-6.,0.01)
X3 = Normal(-6.,1.01)
X4 = Normal(-10.,1.01)

θ = 2.5
d= 2
C = GumbelCopula(d,θ)

D1 = SklarDist(C,(X1,X2))
D2 = SklarDist(C,(X1,X3))
D3 = SklarDist(C,(X1,X4))

N = 1000; zmin = -20.; zmax = 80.
Zs = collect(range(zmin,zmax,N))
CD1 = convolveF(D1,Zs)
CD2 = convolveF(D2,Zs)
CD3 = convolveF(D3,Zs)

plot([scatter(;x=Zs,y=CD1),scatter(;x=Zs,y=CD2),scatter(;x=Zs,y=CD3)])

[plot: convolveF results for D1, D2, D3]

I'm not that familiar with copulas, and one thing I don't understand is why uPrimary = cdf.(D.m[PrimaryFunID], xPrimary) couldn't be replaced with uPrimary = rand(Uniform(),N). Shouldn't they be equivalent due to the probability integral transform?

rand5 avatar Jan 14 '25 18:01 rand5

Yes, they are equivalent: you can definitely replace these two lines with uPrimary = rand(N) and then xPrimary = quantile.(D.m[PrimaryFunID], uPrimary), but it might be a bit slower.
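As a quick stdlib-only check of that equivalence (a sketch; it uses Exponential(1) through its closed-form quantile function -log(1 - u) rather than the package types):

```julia
using Statistics

# Probability integral transform: uniform draws pushed through a quantile
# function reproduce the target distribution; here Exponential(1), whose
# quantile function is -log(1 - u).
N = 100_000
u = rand(N)        # plays the role of uPrimary
x = -log1p.(-u)    # plays the role of xPrimary = quantile.(Exponential(1), u)
mean(x)            # should be close to 1, the Exponential(1) mean
```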

Furthermore, your code will only work for bivariate SklarDists, and not for d-variate ones with d > 2, so maybe you could restrict to the bivariate case only (the dimension is available in the type of the object).

Anyway, it looks like this is working quite nicely on your example? Do you have anything else that you need here? May I ask why you ended up on this problem in the first place?

lrnv avatar Jan 14 '25 18:01 lrnv

Correct, this is just for the bivariate case. I was imagining that if there were 3 components (A, B and C), then we could do some sort of nested evaluation like convolveF(convolveF(A,B),C), but I haven't yet thought about the details.

I think the next key item for the bivariate case, which you highlighted, is to take the function evaluations that convolveF produces and convert them into univariate distributions that are compatible with MixtureModel. I don't have any experience defining custom distributions, but I'm starting to look at how the standard distributions are implemented.

The reason I'm on this topic is I'm trying to model a decision process where you need to assemble a portfolio of "games" that maximizes returns. For each game there are multiple opportunities to trigger small negative returns and an opportunity to trigger a much larger positive return, represented by something like:

Game1:

MixtureModel{Distribution{Univariate, Continuous}}(K = 3)
components[1] (prior = 0.3000): Normal{Float64}(μ=-0.2235, σ=0.012)
components[2] (prior = 0.0630): Normal{Float64}(μ=-0.2635, σ=0.0152)
components[3] (prior = 0.6370): Gamma{Float64}(α=7.0, θ=3.0)

Game2:

MixtureModel{Distribution{Univariate, Continuous}}(K = 5)
components[1] (prior = 0.4600): Normal{Float64}(μ=-0.0335, σ=0.0012)
components[2] (prior = 0.3564): Normal{Float64}(μ=-0.07350000000000001, σ=0.0044)
components[3] (prior = 0.0551): Normal{Float64}(μ=-0.2235, σ=0.0164)
components[4] (prior = 0.0116): Normal{Float64}(μ=-0.2635, σ=0.0196)
components[5] (prior = 0.1170): Gamma{Float64}(α=7.0, θ=3.0)

Game3:

MixtureModel{Distribution{Univariate, Continuous}}(K = 4)
components[1] (prior = 0.6600): Normal{Float64}(μ=-0.07350000000000001, σ=0.0032)
components[2] (prior = 0.1020): Normal{Float64}(μ=-0.2235, σ=0.0152)
components[3] (prior = 0.0214): Normal{Float64}(μ=-0.2635, σ=0.0184)
components[4] (prior = 0.2166): Gamma{Float64}(α=7.0, θ=3.0)

There is some dependence between Game1 and Game2 and between Game1 and Game3, which I'm modeling via copulas. The fundamental question is: is it better to play Game1 and Game2, or Game1 and Game3? The distribution of the total return from playing Game1 and Game2 (or Game1 and Game3) is given by the convolution of those distributions, but, as you can see, in each case the support of the components associated with negative returns is very different from the support of the positive returns, making the convolutions difficult to compute.

rand5 avatar Jan 14 '25 19:01 rand5

For more than 2 dimensions, you will be able to do nested evaluations like convolveF(convolveF(A,B),C) if and only if the dependence structure can be "disassembled", which might not always be the case.

For the mixtures, on the other hand, I think you'd be better off using the "convolution of mixtures <=> mixture of convolutions" identity to recursively simplify your problem, by defining a few methods as I proposed in a previous message (if you have trouble doing it, please come back, I'd gladly help).

Very cool problem indeed! Do not hesitate to forward the results / publications / output of the project, I'd love to read it more in depth.

lrnv avatar Jan 14 '25 19:01 lrnv

@lrnv Any interest in collaborating on a technical paper?

rand5 avatar Jan 14 '25 19:01 rand5

@rand5 Sure, if I can be of any help

lrnv avatar Jan 15 '25 08:01 lrnv

With regards to your suggestion to do something like this:

function convolve(D::SklarDist{CT, Tuple{T1,T2}}) where {CT, T1 <: MixtureModel, T2}
	return MixtureModel(
		[convolve(SklarDist(D.C, (D.m[1].components[1], D.m[2]))),
		 convolve(SklarDist(D.C, (D.m[1].components[2], D.m[2])))],
		probs(D.m[1]) # carry over the mixing weights.
	)
end

# Generic version: 
function convolve(D::SklarDist)
	# do the generic stuff.
end

Thinking through this: it boils down to taking one component from the first mixture, then convolving it with one component from the second mixture, with the joint distribution defined by the copula. The components being convolved are not necessarily of the same distribution type, so even if they were independent, there wouldn't be a clean formula for the result in terms of the parameters of the input distributions. I guess the reason Distributions.jl has implemented the ones it has is that there are tidy formulas for the resulting distribution in terms of the parameters of the distributions being convolved.

I think for what we are envisioning, the convolution isn't really the hardest part: it's then also finding the CDF, quantile function, and six other quantities that need to be defined to make the result compliant with Distributions.jl, so that it can be fed back into a MixtureModel.

One solution would be a general pipeline whose input is a function producing the pdf of a random variable and whose output is a Distributions.jl-compliant distribution (with the internals automatically estimating the CDF, quantile, etc.). Does something like that already exist? I suspect not, since otherwise Distributions.jl would already use it to extend the convolve function beyond the special cases.

rand5 avatar Jan 15 '25 19:01 rand5

Even if there are components that might possess a theoretical convolution, these convolutions usually assume independence, while here we have dependence structures... and thus the "theoretically known" cases are very, very few.

On the other hand providing all the interface of Distributions.jl is not that hard:

  • pdf : we already have it, either through formal integration or through Monte Carlo approximation.
  • cdf : I think the integration formula is very close to the one you had for the pdf, and another Monte Carlo version could be constructed as well. (You have to derive it by hand; it uses the partial derivatives of the copula, IIRC.)
  • quantile function : there might be other relevant algorithms, but in the meantime we can simply invert the cdf.
  • rng : just sample from the original model and sum; that is the easiest one.
  • minimum / maximum / support : easy to deduce from the minimum, maximum, and support of the components.

So maybe we could provide a structure DependentConvolution <: Distributions.UnivariateDistribution that does it. Then the bit of algorithm that simplifies mixtures can be implemented as methods on DependentConvolution's constructor.
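A minimal sketch of what such a structure could look like (hypothetical code, not part of Copulas.jl; it assumes a bivariate SklarDist stored in a field D, and uses plain Monte Carlo for everything):

```julia
using Distributions, Statistics
import Random

# Hypothetical sketch: the sum X1 + X2 of a bivariate SklarDist exposed as a
# Distributions.jl-compliant univariate distribution, following the list above.
struct DependentConvolution{TD} <: ContinuousUnivariateDistribution
    D::TD    # the underlying bivariate SklarDist
    N::Int   # Monte Carlo sample size
end

# pdf by Monte Carlo over the second marginal, as in the loops earlier.
function Distributions.pdf(S::DependentConvolution, t::Real)
    x2 = rand(S.D.m[2], S.N)
    return mean(pdf(S.D.C, [cdf(S.D.m[1], t - x), cdf(S.D.m[2], x)]) * pdf(S.D.m[1], t - x) for x in x2)
end

# cdf and quantile from joint samples of the sum.
Distributions.cdf(S::DependentConvolution, t::Real) = mean(sum(rand(S.D)) <= t for _ in 1:S.N)
Distributions.quantile(S::DependentConvolution, p::Real) = quantile([sum(rand(S.D)) for _ in 1:S.N], p)

# rng: sample the joint model and sum, the easy one.
Distributions.rand(rng::Random.AbstractRNG, S::DependentConvolution) = sum(rand(rng, S.D))

# Support bounds deduced from the marginals.
Base.minimum(S::DependentConvolution) = minimum(S.D.m[1]) + minimum(S.D.m[2])
Base.maximum(S::DependentConvolution) = maximum(S.D.m[1]) + maximum(S.D.m[2])
```

The mixture-simplification methods sketched earlier could then dispatch on the marginal types inside the constructor.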

lrnv avatar Jan 17 '25 13:01 lrnv

Also, other algorithms exist, e.g. using the Fourier transform: https://github.com/bionanoimaging/FourierTools.jl

And a few years ago, at the end of my PhD, there was https://github.com/lrnv/ThorinDistributions.jl to handle the projection of (positive) multivariate distributions onto the space of convolutions of gamma distributions, in which convolutions are very easy.

lrnv avatar Jan 17 '25 14:01 lrnv

Hey @rand5! Now that #194 has been merged, we have a way of creating conditional distributions from copulas. This may help with your convolution issues via the following identity:

$$f_{X_1+X_2}(t) = \int f_{X_1 | X_2 = x_2}(t - x_2)\,f_{X_2}(x_2)\,dx_2 \approx \frac{1}{N}\sum_{i=1}^N f_{X_1 | X_2 = X_{2,i}}(t - X_{2, i})$$

Now, the conditional distribution is provided directly by the package. So you can simply do:

X = SklarDist(ClaytonCopula(2,7), (Normal(),Normal()))
pdf_of_sum(X, t; j=1, N=10000) = mean(pdf(condition(X, j, x2), t - x2) for x2 in rand(X.m[j], N))
cdf_of_sum(X, t; j=1, N=10000) = mean(cdf(condition(X, j, x2), t - x2) for x2 in rand(X.m[j], N)) # same idea

Depending on the copula (we can look at the result of condition(C, j, x2) to decide), the conditionals might not depend on the value of x2, so they could be precomputed to save time. Also, you can choose j=1 to condition on the first one, which might work better depending on the variance of both marginals, as we already saw.

lrnv avatar Sep 09 '25 21:09 lrnv

@lrnv Thank you for following up on this. I'm not too familiar with GitHub; it looks like #194 was merged as v0.1.30, but I'm running into the following after updating to v0.1.30:


pdf_of_sum(X, t; j=1, N=10000) = mean(pdf(condition(X, j, x2), t - x2) for x2 in rand(X.m[j],N))

pdf_of_sum(X,1.)


UndefVarError: `condition` not defined

Stacktrace:
 [1] (::var"#2#3"{Int64, SklarDist{ClaytonCopula{2, Int64}, Tuple{Normal{Float64}, Normal{Float64}}}, Float64})(x2::Float64)
   @ Main .\none:0
 [2] iterate
   @ .\generator.jl:47 [inlined]
 [3] mean(f::typeof(identity), itr::Base.Generator{Vector{Float64}, var"#2#3"{Int64, SklarDist{ClaytonCopula{2, Int64}, Tuple{Normal{Float64}, Normal{Float64}}}, Float64}})
   @ Statistics C:\Users\Monali\AppData\Local\Programs\Julia-1.10.3\share\julia\stdlib\v1.10\Statistics\src\Statistics.jl:62
 [4] mean
   @ C:\Users\Monali\AppData\Local\Programs\Julia-1.10.3\share\julia\stdlib\v1.10\Statistics\src\Statistics.jl:44 [inlined]
 [5] pdf_of_sum(X::SklarDist{ClaytonCopula{2, Int64}, Tuple{Normal{Float64}, Normal{Float64}}}, t::Float64)
   @ Main .\In[7]:1
 [6] top-level scope
   @ In[9]:1

But Copulas.SubsetCopula is defined, and it appears to be part of the same commit...

rand5 avatar Sep 10 '25 17:09 rand5

Hum... sorry for ringing you a bit early; these are indeed changes that we plan for v0.1.31, and they are on master right now. If you want, you can check out master; otherwise nevermind, I'll ping you again when it's released.

Sorry about that :/

lrnv avatar Sep 10 '25 17:09 lrnv