MonteCarloSummary for intermediate MonteCarlo results

The idea is that it is often useful to get intermediate Monte Carlo results in the form of a statistical summary (`MonteCarloSummary`). I think it could be implemented in two ways (say we need to obtain a `MonteCarloSummary` every `n` iterations of `num_monte`):

- Via a reduction function that obtains (prints, saves, uploads to a DB) statistical parameters like `MonteCarloSummary` every `n` iterations but does not reduce the solution data, since it will be needed for future `MonteCarloSummary` calculations.
- Via a reduction function that calculates statistical parameters like `MonteCarloSummary` and reduces (deletes) all the solution data every `n` iterations. Since here we delete the solution data and keep only the `MonteCarloSummary`, the reduction function should use "online" algorithms to calculate and approximate the statistical parameters every `n` iterations, using, for example, OnlineStats.jl.

A draft of idea 2:
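To illustrate what such an "online" algorithm looks like, here is a minimal base-Julia sketch (purely illustrative; `OnlineMeanVar` is a hypothetical name, and OnlineStats.jl provides polished versions of this as `Mean()`, `Variance()`, etc.) of Welford's single-pass mean/variance update, which computes statistics in constant memory without storing the samples:

```julia
# Hypothetical minimal online mean/variance accumulator (Welford's algorithm).
# OnlineStats.jl provides production-quality equivalents; this only shows the idea.
mutable struct OnlineMeanVar
    n::Int        # number of observations seen so far
    mean::Float64 # running mean
    m2::Float64   # running sum of squared deviations from the mean
end
OnlineMeanVar() = OnlineMeanVar(0, 0.0, 0.0)

function fit!(o::OnlineMeanVar, x::Real)
    o.n += 1
    delta = x - o.mean
    o.mean += delta / o.n
    o.m2 += delta * (x - o.mean)
    return o
end

# Unbiased sample variance (n - 1 in the denominator).
variance(o::OnlineMeanVar) = o.m2 / (o.n - 1)

o = OnlineMeanVar()
for x in (1.0, 2.0, 3.0, 4.0)
    fit!(o, x)
end
o.mean        # 2.5
variance(o)   # ≈ 1.6667
```

Each sample is discarded right after `fit!`, which is exactly what lets the reduction delete the solution data every `n` iterations.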
```julia
using OnlineStats
...
reduction = function (u, batch, I)
    if I == 1:1
        num_t = length(batch[1].t)
        stats = [Series(batch[1][i, :], OnlineStats.EqualWeight(),
                        MV(num_t, Mean()),
                        MV(num_t, Variance()),
                        MV(num_t, PQuantile(0.05)),
                        MV(num_t, PQuantile(0.95)))
                 for i in 1:size(batch[1], 1)]
        append!(u, [batch[1].t, stats]), false
    else
        [u[1], fit!.(u[2], [batch[1][i, :] for i in 1:size(batch[1], 1)])], false
    end
end

monte_prob = MonteCarloProblem(prob, prob_func = prob_func,
                               reduction = reduction, output_func = output_func)
sim = solve(monte_prob, Tsit5(), saveat = 1.0, num_monte = 200, batch_size = 1)
```
The code can be optimized, of course, but the approach could be a nice option: instead of saving the solutions, reduce them to statistical parameters at each step using OnlineStats.
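For readers unfamiliar with OnlineStats.jl, a minimal standalone example of its `fit!`/`value` workflow (this uses the `Series(stats...)` constructor of a recent OnlineStats release; the older positional form `Series(data, weight, stats...)` appears in the draft above):

```julia
using OnlineStats

# Track mean and variance of a data stream without storing it.
o = Series(Mean(), Variance())
fit!(o, [1.0, 2.0, 3.0])
value(o)  # tuple of the current mean and variance estimates
```

The same `fit!` call can then be repeated for every new batch, which is what the reduction function above does.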
That looks really great. I can't think of a good user interface for it right now though, but this would be super helpful!
My current draft wrapper for `reduction` takes a dict of stats to be updated on each iteration (`batch_size = 1`), initializes `OnlineStats.Series` at `I == 1:1`, and updates the stats with each new batch:
```julia
using OnlineStats
using RecursiveArrayTools  # provides vecarr_to_vectors

function parameterized_reduction(task_stats::Array{Dict{String,V} where V,1})
    function reduction(u, batch, I)
        if I == 1:1
            num_s = size(batch[1], 1)  # number of state variables
            num_t = length(batch[1])   # number of saved time points
            s = initStats.(task_stats, num_s, num_t)
            fitStats.(s, batch)
            append!(u, [batch[1].t, s]), false
        else
            fitStats.(u[2], batch)
            u, false
        end
    end
end

function initStats(stat::Dict, num_s::Int64, num_t::Int64)
    if stat["name"] == :mean
        return Series(Group([num_t * Mean() for i in 1:num_s]))
    elseif stat["name"] == :variance
        return Series(Group([num_t * Variance() for i in 1:num_s]))
    elseif stat["name"] == :quantile
        return Series(Group([num_t * PQuantile(stat["q"]) for i in 1:num_s]))
    end
end

fitStats(s::T, sol::ODESolution) where T <: OnlineStats.Series =
    fit!(s, vecarr_to_vectors(sol))
```
The solver outputs two arrays: `t` and an array of stat `Series`.

For large `num_monte`, the effect is a > 5x speedup in comparison with the "static" `MonteCarloSimulation` approach.

If you have ideas for code optimization, please let me know :)
The main question is: can this approach support multithreading? I've tried to run it with `parallel_type = :threads`, but it failed with

```
Error thrown in threaded loop on thread 0: BoundsError(a=Array{Any, 1}[#<null>], i=(2,))
ERROR: UndefRefError: access to undefined reference
```
I don't think that there's much optimization to be done here. While moving away from a dictionary would technically make it faster, you access the dictionary rarely enough that it probably doesn't affect timings.
As for the multithreading, I'm not sure why it would have an issue, since the batch reductions are called after the threaded loops. I think this may just be related to the other multithreading error that was mentioned before, which I haven't been able to track down. Note that multithreading in Julia is going to get an overhaul:
https://github.com/JuliaLang/julia/pull/22631
so I wonder how much of this should wait on that.