DistributionsAD.jl Quadrature with Dirichlet and Matrix Beta

Hello, I'm trying to use quadrature to compute basic statistics for different multivariate distributions such as Dirichlet and Matrix Beta, such as the sum(x_i), variance(sum(x_i)) and mean of sqrt(sum(x_i^2)) where x_i = 1, 2, ..., N is drawn from the Dirichlet or Matrix Beta distribution with parameters a0[1:N] or a0[1:N] and b0[1:N].

For a univariate distribution, I can do this just fine: i.e.

N =2 

#User Inputs
α0 = 10 .* (1 .- rand(N))
β0 = 50 .* (1 .- rand(N))
reltol_val = 1e-5
abstol_val = 1e-5
alg = HCubatureJL()

for i = 1:N
    prob = QuadratureProblem(f,0.0,1.0,[α0[i]...,β0[i]...])
    sol = Quadrature.solve(prob,alg,reltol=reltol_val,abstol=abstol_val)[1]
    final_sum += sol
end

However, since these are multivariate distributions, I can't just forloop to compute the mean. To compute a similar metric, I try:

N =2 

#User Inputs
α0 = 10 .* (1 .- rand(N))
β0 = 50 .* (1 .- rand(N))
reltol_val = 1e-5
abstol_val = 1e-5
alg = HCubatureJL()

dist_pdf(x,p) = exp(DistributionsAD.logpdf(DistributionsAD.TuringDirichlet(N,p[1]),x[1]))
f(x,p) = dist_pdf(x,p) .* x
prob = QuadratureProblem(f,0.0,1.0,[α0...])
sol = Quadrature.solve(prob,alg,reltol=reltol_val,abstol=abstol_val)[1]

Here I get the error: MethodError: no method matching logpdf(::DistributionsAD.TuringDirichlet{Float64,Array{Float64,1}}, ::Float64)

N =2 

#User Inputs
α0 = 10 .* (1 .- rand(N))
β0 = 50 .* (1 .- rand(N))
reltol_val = 1e-5
abstol_val = 1e-5
alg = HCubatureJL()

dist_pdf(x,p) = exp(DistributionsAD.logpdf(DistributionsAD.MatrixBeta(N,p[1],p[2]),x[1]))
f(x,p) = dist_pdf(x,p) .* x
prob = QuadratureProblem(f,0.0,1.0,[α0...])
sol = Quadrature.solve(prob,alg,reltol=reltol_val,abstol=abstol_val)[1]

Here I get the error: no method matching logpdf(::MatrixBeta{Float64,Wishart{Float64,PDMats.ScalMat{Float64},Int64}}, ::Float64)

Is there simply no way to do this for multivariate distributions? Or am I making a mistake somewhere?

Thank you very much,

Mar 24 '21 20:03 CharlesRSmith44

Hej!

It's not exactly clear to me what you want to do. Do you want to compute the expectation E f(X) for some function f and random variable X, where X is distributed according to a univariate, multi-, or matrixvariate distribution?

For the univariate case Distributions.jl provides a function expectation and Expectations.jl is a whole package dedicated to these computations. It should be possible to extend these approaches to multi- and matrixvariate distributions (AFAIK currently they are not supported) but you would have to study these packages and open issues over there.

Regarding your examples, they fail since you try to evaluate the logpdf of some higher-dimensional distribution at a scalar value which is not possible. You need something in the support of these distributions. More concretely, x[1] is always a scalar but would have to be a vector from the probability simplex in the case of (Turing)Dirichlet and a matrix from the support of the matrixvariate Beta distribution in the case of MatrixBeta (see https://en.wikipedia.org/wiki/Matrix_variate_beta_distribution).

In general, it is not clear to me why you use DistributionsAD.jl, and e.g. TuringDirichlet instead of Distributions.Dirichlet. DistributionsAD also just reexports Distributions.logpdf and Distributions.MatrixBeta. Additionally, you can simplify your code a bit by using Distributions.pdf(distribution, x) which is defined as exp(Distributions.logpdf(distribution, x)).

Mar 24 '21 20:03 devmotion

Hi David! Thanks for your reply. I am working with @smitch151 on the same project.

You are right that we were passing a scalar as the input to the (Turing)Dirichlet pdf when it should have been a vector. Thanks for catching that! In addition, I agree that in the code, as provided, there is no reason to use DistributionsAD rather than Distributions, and little reason to use Quadrature.jl over Expectations.jl. However, in the end, we do want to use DistributionsAD to solve a larger optimization problem where AD over the parameters of distributions is very useful, and integrals are high-dimensional, which is not to my understanding doable in Expectations.jl.

After fixing the scalar vs vector mistake and going a bit further, we realized one source of our issues is that the latest release of Distributions (v0.24.15) always outputs 0.0 for the pdf of the Dirichlet distribution at a point x whenever x is not in the support. In contrast, DistributionsAD (I tried with the latest release v0.6.20 and the master branch) returns either a non-zero value for some points outside of the support, and errors out if any of the elements of x is negative. The behavior of DistributionsAD is the same as in older versions of Distributions (I checked v0.23.8).

So one super useful fix would be to have DistributionsAD be consistent with Distribtions v0.24.15. Is there a way for us to achieve this?

The code that shows what I just wrote:

using Pkg
Pkg.add(Pkg.PackageSpec(name="Distributions", version="0.23.8"))
Pkg.add(Pkg.PackageSpec(name="DistributionsAD", version="0.6.20"))
using Distributions, DistributionsAD

# test pdf of Dirichlet distribution
α = [0.43,0.21]
d = Distributions.Dirichlet(α)
Turing_d = DistributionsAD.TuringDirichlet(α)

x_not_sup = [0.5,0.8] # a point not in the support
x_neg = [-0.5,1.5] # a point with a negative entry

Distributions.insupport(d,x_not_sup)
Distributions.insupport(d,x_neg)

Distributions.pdf(d,x_not_sup) # a non-zero number
Distributions.pdf(d,x_neg) # throws an error

DistributionsAD.pdf(Turing_d,x_not_sup) # a non-zero number equal to Distributions.pdf(d,x_not_sup)
DistributionsAD.pdf(Turing_d,x_neg) # throws an error

Pkg.add(Pkg.PackageSpec(name="Distributions", version="0.24.15"))
Pkg.add(Pkg.PackageSpec(url="https://github.com/TuringLang/DistributionsAD.jl", rev="master"))
using Distributions, DistributionsAD

# test pdf of Dirichlet distribution
α = [0.43,0.21]
d = Distributions.Dirichlet(α)
Turing_d = DistributionsAD.TuringDirichlet(α)

x_not_sup = [0.5,0.8] # a point not in the support
x_neg = [-0.5,1.5] # a point with a negative entry

#insupport works well
Distributions.insupport(d,x_not_sup)
Distributions.insupport(d,x_neg)

# in the new version of Distributions, the issue is solved
Distributions.pdf(d,x_not_sup) # now gives zero! (the desired behavior)
Distributions.pdf(d,x_neg)  # now gives zero! (the desired behavior)

# the master branch of DistributionsAD behaves like Distributions v0.23.8
# would be great if pdf evaluates to 0.0 if evaluated outside its support
DistributionsAD.pdf(Turing_d,x_not_sup) # a non-zero number
DistributionsAD.pdf(Turing_d,x_neg) # throws an error

Mar 26 '21 16:03 fernando-duarte

I realized the issue is now quite different. I will open a new issue and maybe @smitch151 can close this one?

Mar 30 '21 12:03 fernando-duarte

Yes that works for me. I’ll look through all your comments on the code this evening and implement them. I’ll also send results from running the code this evening as well

On Mar 30, 2021, at 08:39, Fernando Duarte @.***> wrote:

I realized the issue is now quite different. I will open a new issue and maybe @smitch151 can close this one?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

Mar 30 '21 15:03 CharlesRSmith44

DistributionsAD.jl DistributionsAD.jl copied to clipboard

Quadrature with Dirichlet and Matrix Beta

DistributionsAD.jl
DistributionsAD.jl copied to clipboard