Integrals.jl icon indicating copy to clipboard operation
Integrals.jl copied to clipboard

Quadrature Time to Integrate as function of N and tolerance

Open CharlesRSmith44 opened this issue 3 years ago • 4 comments

Hi, I'm trying to find the mean of a dirichlet distribution using Quadrature. However, I'm finding that the time to complete increases drastically as a function of N. For example for N =5 it takes less than 20 seconds, but N = 10 takes over an hour. I wonder if I'm doing anything wrong with this code or if these results should be expected.

I'm curious about your intuition with respect to N, the tolerance level, and the algorithm - CubaDivonne vs CubatureJLh().

Thank you, code is below:

Pkg.add(Pkg.PackageSpec(;name="Distributions", version="0.24.15"))
using Quadrature, Cuba, Cubature, Base.Threads
using Distributions, Random
using DataFrames, CSV
using Flux, CUDA

## User Inputs
N = 7
tol = 5
ind_gpu = 0 # indicator whether to use gpu
alg = CubatureJLh() # CubaDivonne() #works for CubaCuhre, CubaDivonne CubaSUAVE, fails for cubavega

# Setting up Variables
if ind_gpu == 1
    α0 = 10 .* (1 .- rand(N)) |> gpu
else
    α0 = 10 .* (1 .- rand(N))
end
reltol_val = 10.0^(-tol)
abstol_val = 10.0^(-tol)

# Setting up function
dist_dirichlet_pdf(x,p) = Distributions.pdf(Distributions.Dirichlet(p),x)
function f_dirichlet(dx,x,p)
    Threads.@threads for i in 1:N
        dx[i] = (dist_dirichlet_pdf([x;1.00-sum(x)],p) .* [x;1.00-sum(x)])[i]
    end
end

# Solving Integral
prob = QuadratureProblem(f_dirichlet,zeros(N-1),ones(N-1), α0, nout = N)
time_start = time()
mem_usage = @allocated sol_mean = Quadrature.solve(prob,alg,reltol=reltol_val,abstol=abstol_val)
total_mem = mem_usage/1000/2^20

# Checking Answer
mean_dirichlet(p) = p./sum(p)
display(mean_dirichlet(α0))
display(sol_mean)

test_passed = sum((abs.(sol_mean .- mean_dirichlet(α0)) .< 1e-4))

## Saving Meta Data
time_end = time()
total_time = time_end-time_start
'''

CharlesRSmith44 avatar Apr 08 '21 19:04 CharlesRSmith44

I find that the code also takes a really long time to run using a multvariate normal distribution


#User Input
N = 7
tol = 2
alg = HCubatureJL()

# Setting up problem
reltol_val = 10.0^(-tol)
abstol_val = 10.0^(-tol)

means = rand(N)
vars = Diagonal(abs.(rand(N,N)))
d = MvNormal(means , vars)
m(x , p) = pdf(d, x) .* x
prob = QuadratureProblem(m, -Inf*ones(N) , Inf*ones(N))
time_start = time()
mem_usage = @allocated sol_mean = Quadrature.solve(prob,alg,reltol=1e-2,abstol=1e-2)
total_mem = mem_usage/1000/2^20

display(sol_mean)
display(means)
'''

CharlesRSmith44 avatar Apr 08 '21 20:04 CharlesRSmith44

Methods like HCubature grow exponentially with dimension. CubaVegas and CubaSUAVE have better properties at higher dimensions.

ChrisRackauckas avatar Apr 09 '21 10:04 ChrisRackauckas

I'm finding that with CubaSUAVE and CubaVegas that for large N (N > 10) the solutions are inaccurate. Is this expected behavior or am I doing something wrong?


#User Input
N = 15
tol = 3 #even if I try tol = 10 or 15 still inaccurate.  
alg = CubaSUAVE()

# Setting up problem
reltol_val = 10.0^(-tol)
abstol_val = 10.0^(-tol)

means = rand(N)
vars = Diagonal(abs.(rand(N,N)))
d = MvNormal(means , vars)
m(x , p) = pdf(d, x) .* x

function f_normal(x,p)
    sum(pdf(d,x) .* x)
end

prob = QuadratureProblem(f_normal, -Inf*ones(N) , Inf*ones(N), nout=1)
time_start = time()
mem_usage = @allocated sol_mean = Quadrature.solve(prob,alg,reltol=reltol_val,abstol=abstol_val)
total_mem = mem_usage/1000/2^20

display(sol_mean.u)
display(sum(means))

CharlesRSmith44 avatar Apr 15 '21 13:04 CharlesRSmith44

You might need to increase maxiters.

ChrisRackauckas avatar Apr 16 '21 20:04 ChrisRackauckas