ComplexityMeasures.jl

Parallelization

Open • kahaaga opened this issue 2 years ago • 4 comments

The issue of parallelization came up in the discussion quoted below. This issue is a reminder of that.

    Probably best to use https://github.com/JuliaSIMD/Polyester.jl  because its loop is rather cheap. I will think about this once the API is super stable and 2.0 is out.

Originally posted by @Datseris in https://github.com/JuliaDynamics/Entropies.jl/pull/213#discussion_r1055714041

Things to think about:

  • How to parallelize? The same way for all methods, or differently for some methods?
  • What about GPU compatibility? Can it be done? For which methods? Generic or backend-specific? I.e., what happens if I'm on macOS vs. Windows?

kahaaga, Dec 22 '22 17:12

Some preliminary results, starting Julia with 8 threads:

using DelayEmbeddings, Entropies
m, τ, N = 7, 1, 1000000
est = SymbolicPermutation(; m, τ)
x = Dataset(rand(N, m)) # timeseries example
πs_ts = zeros(Int, N); # length must match length of `x`;

using BenchmarkTools, Test
probabilities!(πs_ts, est, x);
probabilities_parallel!(πs_ts, est, x);
probabilities_parallel_batch!(πs_ts, est, x);
@btime pn = probabilities!($πs_ts, $est, $x) # No threads
@btime pp = probabilities_parallel!($πs_ts, $est, $x) # Threads.@threads
@btime pb = probabilities_parallel_batch!($πs_ts, $est, $x) # Polyester.@batch, no configuration

> 85.572 ms (7 allocations: 7.71 MiB)
> 44.202 ms (49 allocations: 4.36 KiB)
> 37.254 ms (1 allocation: 48 bytes)

It definitely seems that there are some performance gains to be made here. More sensitivity analyses are needed before settling on anything.

kahaaga, Dec 26 '22 00:12

What's the code, though?

Datseris, Dec 26 '22 07:12

and why does the pure probabilities! allocate...?

Datseris, Dec 26 '22 07:12

Note that you can't thread without care. The method uses the internal pre-allocated perm array stored in the estimator, so all threads would write into the same buffer at once. To parallelize, you would need as many copies of this array as nthreads(). I would guess the results you get would be wrong otherwise.

Datseris, Dec 26 '22 08:12
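
The shared-buffer concern raised above can be illustrated with a minimal sketch. This is not the package's implementation: `perm_to_int`, `encode_serial!` and `encode_threaded!` are hypothetical names, the input is a plain `Matrix` rather than a `Dataset`, and the threading uses `Base.Threads` rather than Polyester. The serial version reuses one scratch `perm` buffer; the threaded version splits the rows into chunks and gives each task its own buffer, so no two tasks ever sort into the same array.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical helper: encode a permutation of 1:m as an integer in 0:(m! - 1)
# via its Lehmer code. Not the encoding used by the package.
function perm_to_int(perm::AbstractVector{Int})
    m = length(perm)
    code = 0
    for i in 1:m-1
        c = 0
        for j in i+1:m
            c += perm[j] < perm[i]   # count later entries smaller than perm[i]
        end
        code = code * (m - i + 1) + c
    end
    return code
end

# Serial reference: one scratch buffer, reused for every row.
function encode_serial!(πs::Vector{Int}, X::Matrix{Float64})
    perm = zeros(Int, size(X, 2))
    for i in axes(X, 1)
        sortperm!(perm, view(X, i, :))
        πs[i] = perm_to_int(perm)
    end
    return πs
end

# Threaded sketch: each task owns a private scratch buffer, and each row index
# belongs to exactly one chunk, so writes to `πs` never overlap.
function encode_threaded!(πs::Vector{Int}, X::Matrix{Float64}; ntasks::Int = nthreads())
    n, m = size(X)
    chunksize = max(1, cld(n, ntasks))
    tasks = map(Iterators.partition(1:n, chunksize)) do idxs
        @spawn begin
            perm = zeros(Int, m)     # private to this task
            for i in idxs
                sortperm!(perm, view(X, i, :))
                πs[i] = perm_to_int(perm)
            end
        end
    end
    foreach(wait, tasks)
    return πs
end

# Usage: the threaded result should match the serial one.
X = rand(10_000, 7)
πs_serial = encode_serial!(zeros(Int, size(X, 1)), X)
πs_threaded = encode_threaded!(zeros(Int, size(X, 1)), X)
```

With the default `ntasks = nthreads()`, this allocates as many buffer copies as there are threads, matching the count suggested above. Allocating one buffer per task, rather than indexing a shared pool by `Threads.threadid()`, avoids relying on a task staying on one thread. Comparing `πs_serial == πs_threaded` is a cheap way to check any threaded variant before benchmarking it.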