Cuba.jl icon indicating copy to clipboard operation
Cuba.jl copied to clipboard

add support for userdata and the corresponding tests

Open kunyuan opened this issue 1 year ago • 8 comments

Hi, Thank you for creating this fantastic package! I use it heavily for calculating Feynman diagrams in condensed matter physics.

For my applications, I need to provide the integral with a lot of userdata (parameters, the structure of diagrams, etc.). Using global constants makes the code much less flexible. So I need to add support of userdata to your package.

The implementation is straightforward: I pass a tuple of (func!, userdata) to the C function, then unpack it in the generic_integrand function which is on the julia side.

To keep the package back-compatible. The new userdata APi will be invoked only when the user provides the userdata keyword explicitly. I also didn't touch the existing core functions to avoid any possible regression.

The usage is shown as below:

# the previous API are still there
julia> @time result = vegas((x, f) -> f[1]=x[1])
  0.079806 seconds (1.10 M allocations: 37.326 MiB, 73.86% compilation time)
Component:
 1: 0.49999995061461283 ± 4.9696306524036384e-5 (prob.: 2.5596303730113235e-5)
Integrand evaluations: 232000
Number of subregions:  0
Note: The desired accuracy was reached

# Support userdata keyword.
julia> @time result = vegas((x, f, d) -> f[1]=x[1]+d, userdata=1)
  0.062823 seconds (229.48 k allocations: 10.900 MiB, 98.04% compilation time)
Component:
 1: 1.4999579838501624 ± 8.38583942796998e-5 (prob.: 0.012663461802755724)
Integrand evaluations: 13500
Number of subregions:  0
Note: The desired accuracy was reached

# Userdata can be any julia object.
julia> @time result = vegas((x, f, d) -> f[1]=x[1]+d[:a], userdata=Dict(:a=>1))
  0.062314 seconds (230.40 k allocations: 10.942 MiB, 97.97% compilation time)
Component:
 1: 1.4999579838501624 ± 8.38583942796998e-5 (prob.: 0.012663461802755724)
Integrand evaluations: 13500
Number of subregions:  0
Note: The desired accuracy was reached

An interesting observation of the above examples is that the new API with the userdata keyword significantly reduces the memory allocation. The packing and unpacking the integrand the userdata probably helps the julia compiler to learn more type information. If this is true, it probably makes sense to improve the old API to reduce memory allocation. But this could be for a separate PR.

A couple of new tests are added for the userdata keyword.

If everything looks fine to you, I could update the documentation, too.

kunyuan avatar Aug 27 '22 15:08 kunyuan

Hi, thanks a lot for your contribution! However I'm a bit nervous about adding a field tagged with Any type. Did you see https://giordano.github.io/Cuba.jl/stable/#Passing-data-to-the-integrand-function-1? That should let you pass "user data" to the integrator, but without having to touch the code of the package

giordano avatar Aug 27 '22 15:08 giordano

Hi, I have just made the Any type concrete.

Thank you for sending me this link. I have some concerns about that solution. Julia's official performance tips have mentioned that a "captured" value could cause type instability, which costs performance issue. See the following link. https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-captured.

Moreover, with the userdata keyword, the user will have more choice how to pass parameters to the integrand.

kunyuan avatar Aug 27 '22 16:08 kunyuan

Ok, let's do it. Can you please update the documentation, including an example? Thanks!

giordano avatar Aug 27 '22 19:08 giordano

Ok, let's do it. Can you please update the documentation, including an example? Thanks!

Sure, I will work on it.

kunyuan avatar Aug 28 '22 02:08 kunyuan

Ok, let's do it. Can you please update the documentation, including an example? Thanks!

Hi, I have just updated index.md to include an example for the userdata keyword. Please take a look and feel free to modify and improve it. English is not my native language.

In addition, I need some help from you on the following things:

  1. The julia doctests for the stochastic algorithms (vegas, suave and divonne) failed in my computer. The code generates slightly different answer but within the error bar. I guess the output of these algorithm depends on the random seed, and it is probably not proper to doctest them. Right now, I simply suppress the doctests right now. Is this okay?

  2. I added a compatibility note. I put some reference to the next Cuba.jl release as version X.X.X. You may want to modify it.

Thank you very much!

kunyuan avatar Aug 28 '22 21:08 kunyuan

I guess the output of these algorithm depends on the random seed, and it is probably not proper to doctest them.

The random seed is fixed to 0 by default: https://github.com/giordano/Cuba.jl/blob/d6e4736f9013bbcd6250474bdfc4c5752e84af5d/src/Cuba.jl#L37 I'm not convinced that's the reason.

I'd prefer to keep the doctests, at least they've been consistent for some time in CI.

giordano avatar Aug 28 '22 21:08 giordano

I guess the output of these algorithm depends on the random seed, and it is probably not proper to doctest them.

The random seed is fixed to 0 by default:

https://github.com/giordano/Cuba.jl/blob/d6e4736f9013bbcd6250474bdfc4c5752e84af5d/src/Cuba.jl#L37

I'm not convinced that's the reason. I'd prefer to keep the doctests, at least they've been consistent for some time in CI.

I investigated this problem a little bit. I find that both vegas and divonne give the exactly the same results. However, suave has a small descrepency (but within the errorbar). The below is what I got:

julia> suave((x, f) -> f[1] = cos(x[1]))
Component:
 1: 0.8413748866950277 ± 7.772872640827611e-5 (prob.: 1.0)
Integrand evaluations: 23000
Number of subregions:  23
Note: The desired accuracy was reached

while according to the existing doc (the first example):

julia> suave((x, f) -> f[1] = cos(x[1]))
Component:
 1: 0.8413748866950329 ± 7.772872640815592e-5 (prob.: 1.0)
Integrand evaluations: 23000
Number of subregions:  23
Note: The desired accuracy was reached

The results here are calculated with your code in master branch, so tiny discrepency has nothing to do with the new userdata update. Could it be caused by the hardware difference? I'm using Intel i9 CPU.

kunyuan avatar Aug 28 '22 23:08 kunyuan

I guess the output of these algorithm depends on the random seed, and it is probably not proper to doctest them.

The random seed is fixed to 0 by default:

https://github.com/giordano/Cuba.jl/blob/d6e4736f9013bbcd6250474bdfc4c5752e84af5d/src/Cuba.jl#L37

I'm not convinced that's the reason. I'd prefer to keep the doctests, at least they've been consistent for some time in CI.

Anyway, I have added back the doctests. Although I am still getting tiny discrepency on my local computer, the CI doctests work just fine.

kunyuan avatar Aug 29 '22 00:08 kunyuan

Before going on, in light of #40, would you agree to contribute to this package under the terms of the MIT license, instead of the current LGPL?

Yes, I totally agree. Do you want to leave a comment below the #40 issue?

kunyuan avatar Sep 15 '22 14:09 kunyuan

Yes, please! But before merging this PR I'd like to fix CI (see #41), and that's already proving to be complicated.

giordano avatar Sep 15 '22 20:09 giordano

Thanks a lot!

giordano avatar Sep 22 '22 16:09 giordano

This is included in v2.3.0. Thanks again for your contribution!

giordano avatar Sep 24 '22 16:09 giordano