Conditional Distributions
I'm interested in writing a set of functions that condition on a subset of variables and return a new (conditional) distribution. This is roughly how I'd imagine doing it for a multivariate normal distribution:
```julia
# Given a multivariate distribution p(x), returns the new
# distribution p(x | xᵢ = aᵢ), where "a" is a vector of values
# to condition on and "d2" is a vector of indices giving the
# conditioned dimensions (subscript i) in increasing order.
function condition(
    joint::MultivariateDistribution,
    a::Vector{Float64},
    d2::Vector{Int}
)
    if length(d2) != length(a)
        error("list of conditioned values and dimension indices must be the same length")
    end
    max_dim = length(joint.μ)
    if minimum(d2) < 1 || maximum(d2) > max_dim
        error("dimension indices must be between 1 and the dimension of the joint distribution")
    end
    # d1 = remaining dimensions
    # d2 = removed/conditioned dimensions
    d2 = sort(d2)
    d1 = setdiff(1:max_dim, d2)
    # covariance matrix blocks
    Σ = full(joint.Σ)
    Σ11 = Σ[d1, d1]
    Σ12 = Σ[d1, d2]
    Σ22inv = inv(Σ[d2, d2])
    μ_new = joint.μ[d1] + (Σ12 * Σ22inv * (a - joint.μ[d2]))
    Σ_new = Σ11 - (Σ12 * Σ22inv * Σ12')
    return MvNormal(μ_new, Σ_new)
end
```
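As a sanity check on the block formulas, here is a small self-contained example using plain arrays (no Distributions.jl types, and the numbers are made up): condition a 3-dimensional normal on its third coordinate.

```julia
using LinearAlgebra

μ = [1.0, 2.0, 3.0]
Σ = [4.0 1.0 0.5;
     1.0 3.0 0.2;
     0.5 0.2 2.0]

d1 = [1, 2]   # remaining dimensions
d2 = [3]      # conditioned dimensions
a  = [2.5]    # observed value x₃ = 2.5

# Same formulas as above, with `Σ22 \ v` instead of an explicit inverse:
Σ12 = Σ[d1, d2]
Σ22 = Σ[d2, d2]
μ_new = μ[d1] + Σ12 * (Σ22 \ (a - μ[d2]))
Σ_new = Σ[d1, d1] - Σ12 * (Σ22 \ Σ12')
```

Here `a - μ[d2] = [-0.5]`, so the conditional mean shifts down and the conditional covariance shrinks, as expected.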
Questions:

1. I know this is really inefficient, particularly the step `Σ = full(joint.Σ)` that converts a `PDMat` to a plain matrix. Any suggestions on how to do this better, i.e. using the (already computed) Cholesky factorization? I admit I haven't given this much thought at all yet.
2. Would a set of functions like this be useful to add to `Distributions.jl`, or does it not fit within the scope of this package? Where (if anywhere at all) would this live in JuliaStats?
3. Is the above approach completely off base, or am I on the right track at least?
I would be very interested in providing this functionality in Distributions.jl, but I think it'll take some work to get us into a state where it's ready to be merged in. We'll need to make sure the functionality is totally correct (up to numeric stability), the API is what we'd like to finalize on, and that the performance is as good as anything else people might offer.
I suspect @andreasnoack knows more about ways to handle this conditioning via matrix operations. People are also doing very similar things in the GaussianProcesses.jl library, so that may be another source of useful information.
This might be useful for the Gaussian distribution, but I think it doesn't really generalize well, since other distributions don't have such nice properties under conditioning.
Please also take a look inside PDMats. Most of the computations you need are implemented efficiently there, so that you can avoid `full` and `inv`.
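To illustrate the general idea (a sketch using plain `LinearAlgebra` rather than PDMats' actual API, with made-up values): replace the explicit inverse with a factorize-and-solve, which is cheaper and numerically more stable.

```julia
using LinearAlgebra

# Stand-in for the conditioned block Σ[d2,d2]; any SPD matrix works.
Σ22 = [4.0 1.0;
       1.0 3.0]
b = [0.5, -0.2]              # stand-in for (a - μ[d2])

C = cholesky(Symmetric(Σ22)) # factorize once
x = C \ b                    # solves Σ22 * x = b without forming inv(Σ22)
```

The same factorization object `C` can be reused for the covariance update `Σ12 * (C \ Σ12')`.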
The multivariate Student's t distribution has the same property, so the multivariate normal is actually not the only one. See https://arxiv.org/pdf/1604.00561.pdf and https://www.researchgate.net/publication/228613894_A_short_review_of_multivariate_t-distribution for the formulas.
I'm using this for myself for the bivariate case:

```julia
# Conditional distribution of x₁ given x₂ = x2 for a bivariate t.
tcond_x1c2(g, x2) = GeneralizedTDist2(
    g.μ[1] + g.Σ.mat[1,2] / g.Σ.mat[2,2] * (x2 - g.μ[2]),
    g.df + 1,
    sqrt((1 + (x2 - g.μ[2]) / (g.df * g.Σ.mat[2,2]) * (x2 - g.μ[2])) *
         g.df / (g.df + 1) *
         (g.Σ.mat[1,1] - g.Σ.mat[1,2] / g.Σ.mat[2,2] * g.Σ.mat[2,1]))
)
```
not sure this needs to live in Distributions itself?
I would be very interested in this functionality for the Gaussian distribution.
In the meantime: https://github.com/mschauer/GaussianDistributions.jl/blob/8be87a4ccba1b2fb143c9005c38b0e63db53b17a/src/GaussianDistributions.jl#L149
Hey, just in case someone still cares 10 years later: there is now an almost-equivalent interface in Copulas.jl, https://lrnv.github.io/Copulas.jl/dev/manual/conditioning_and_subsetting#Conditioning, which is a bit more generic but can definitely be used for pure-Gaussian or pure-t cases :)
There is a generic AD-based implementation, but it also does the smart parametric thing for Gaussian and Student copulas.