DiffEqBase.jl

MonteCarloSummary for intermediate MonteCarlo results

ivborissov opened this issue Jan 25, 2018 · 4 comments

The idea is that it is often useful to get intermediate MonteCarlo results in the form of a statistical summary (MonteCarloSummary). I think it could be implemented in two ways (let's say we need to obtain a MonteCarloSummary every n iterations of num_monte):

  1. Via a reduction function which obtains (prints, saves, uploads to a DB) statistical parameters like MonteCarloSummary every n iterations but doesn't reduce the solution data, since it will be needed for future MonteCarloSummary calculations.
  2. Via a reduction function which calculates statistical parameters like MonteCarloSummary and reduces (deletes) all the solution data every n iterations. Since we delete the solution data and keep only the MonteCarloSummary, the reduction function has to use "online" algorithms that update and approximate the statistics incrementally every n iterations, using, for example, OnlineStats.jl.
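For readers unfamiliar with "online" statistics, a minimal sketch of the idea behind option 2 is Welford's algorithm, the classic streaming mean/variance update: statistics are updated one observation at a time in O(1) memory, so the raw solutions never need to be stored. This is plain Julia with no packages; `RunningStats`, `update!`, and `variance` are illustrative names, not part of the OnlineStats.jl API.

```julia
# Welford's online algorithm: a running mean and sample variance,
# updated one observation at a time with O(1) memory.
mutable struct RunningStats
    n::Int        # observations seen so far
    mean::Float64 # running mean
    m2::Float64   # sum of squared deviations from the current mean
end
RunningStats() = RunningStats(0, 0.0, 0.0)

function update!(s::RunningStats, x::Real)
    s.n += 1
    delta = x - s.mean
    s.mean += delta / s.n
    s.m2 += delta * (x - s.mean)
    return s
end

variance(s::RunningStats) = s.n < 2 ? NaN : s.m2 / (s.n - 1)

# The statistics converge as data streams in; nothing is retained:
s = RunningStats()
foreach(x -> update!(s, x), [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
(s.mean, variance(s))  # ≈ (5.0, 4.57)
```

OnlineStats.jl's `Mean` and `Variance` do essentially this internally, and its `PQuantile` uses the P² algorithm to approximate quantiles in the same streaming fashion.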

ivborissov avatar Jan 25 '18 14:01 ivborissov

A draft of idea 2:

using OnlineStats
...
reduction = function (u, batch, I)
    if I == 1:1
        # first batch: build one Series per state variable, tracking mean,
        # variance and the 5%/95% quantiles at every saved time point
        num_t = length(batch[1].t)

        stats = [Series(batch[1][i, :], OnlineStats.EqualWeight(),
                        MV(num_t, Mean()),
                        MV(num_t, Variance()),
                        MV(num_t, PQuantile(0.05)),
                        MV(num_t, PQuantile(0.95)))
                 for i in 1:size(batch[1], 1)]

        (append!(u, [batch[1].t, stats]), false)
    else
        # later batches: fold the new solution into the existing Series
        ([u[1], fit!.(u[2], [batch[1][i, :] for i in 1:size(batch[1], 1)])], false)
    end
end

monte_prob = MonteCarloProblem(prob, prob_func=prob_func,reduction=reduction,output_func=output_func)

sim = solve(monte_prob, Tsit5(), saveat=1.0, num_monte=200, batch_size=1)

The code can certainly be optimized, but the approach could be a nice option: instead of saving the solutions, reduce them to statistical parameters at each step using OnlineStats.
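For readers unfamiliar with the reduction interface used above: `reduction(u, batch, I)` is called once per batch with the running accumulator `u`, the batch of solutions, and the index range `I`, and returns `(u, converged::Bool)`. A self-contained toy (plain vectors standing in for solutions, so no DifferentialEquations is required; `toy_reduction` is an illustrative name) shows the shape of the contract:

```julia
# Toy reduction with the same (u, batch, I) -> (u, converged) contract:
# keep only a count and an elementwise running sum, never the solutions.
function toy_reduction(u, batch, I)
    if I == 1:1
        # first batch: initialize the accumulator
        (Any[length(batch), reduce(+, batch)], false)
    else
        # later batches: fold the new solutions into the accumulator
        (Any[u[1] + length(batch), u[2] + reduce(+, batch)], false)
    end
end

# Drive it by hand with three batches of batch_size = 1:
batches = [[Float64[1, 2, 3]], [Float64[3, 2, 1]], [Float64[2, 2, 2]]]
u = let acc = []
    for (k, batch) in enumerate(batches)
        acc, _ = toy_reduction(acc, batch, k:k)
    end
    acc
end
running_mean = u[2] ./ u[1]  # [2.0, 2.0, 2.0]
```

Returning `false` as the second element tells the solver not to stop early; a convergence check could return `true` there instead.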

ivborissov avatar Feb 01 '18 15:02 ivborissov

That looks really great. I can't think of a good user interface for it right now, but this would be super helpful!

ChrisRackauckas avatar Feb 01 '18 18:02 ChrisRackauckas

My current draft wrapper for reduction takes a dict of stats to be updated on each iteration (batch_size = 1), initializes OnlineStats.Series at I == 1:1, and updates the stats with each new batch:

function parameterized_reduction(task_stats::Array{Dict{String,V} where V,1})
    function reduction(u, batch, I)
        if I == 1:1
            num_s = size(batch[1], 1)  # number of state variables
            num_t = length(batch[1])   # number of saved time points

            s = initStats.(task_stats, num_s, num_t)
            fitStats.(s, batch)
            (append!(u, [batch[1].t, s]), false)
        else
            fitStats.(u[2], batch)
            (u, false)
        end
    end
end

function initStats(stat::Dict, num_s::Int64, num_t::Int64)
    if stat["name"] == :mean
        return Series(Group([num_t * Mean() for i in 1:num_s]))
    elseif stat["name"] == :variance
        return Series(Group([num_t * Variance() for i in 1:num_s]))
    elseif stat["name"] == :quantile
        return Series(Group([num_t * PQuantile(stat["q"]) for i in 1:num_s]))
    end
end

fitStats(s::T, sol::ODESolution) where T<:OnlineStats.Series = fit!(s, vecarr_to_vectors(sol))

The solver outputs two arrays: t and an array of stat Series. For large num_monte the speedup is > 5x compared with the "static" MonteCarloSimulation approach. If you have ideas for optimizing the code, please let me know :)

The main question is: can this approach support multithreading? I've tried to run it with parallel_type=:threads but it failed with: Error thrown in threaded loop on thread 0: BoundsError(a=Array{Any, 1}[#<null>], i=(2,)) ERROR: UndefRefError: access to undefined reference
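As a general note on the multithreading question (independent of the MonteCarloProblem internals): the usual way to make online accumulation thread-safe is to give each task a private accumulator and merge the partial results only after all tasks finish, so no statistic is ever mutated from two threads at once. A minimal sketch in plain Julia, using the task-based API from Julia ≥ 1.3; `parallel_mean` and `partial` are illustrative names, not part of any package:

```julia
using Base.Threads

# Merge-after-parallel pattern: each task owns a private (count, sum)
# pair; the partials are combined only once every task has finished.
function parallel_mean(data::Vector{Float64}; ntasks = nthreads())
    partial(c) = (length(c), sum(c))  # per-task private accumulator
    chunks = Iterators.partition(data, cld(length(data), ntasks))
    tasks  = [Threads.@spawn partial(c) for c in chunks]
    parts  = fetch.(tasks)
    sum(last, parts) / sum(first, parts)
end

parallel_mean(collect(1.0:100.0))  # 50.5
```

The same merge step would need a way to combine two partially-fitted statistics (OnlineStats provides `merge!` for many of its stats), which is what makes the pattern applicable to the reduction above.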

ivborissov avatar Mar 02 '18 15:03 ivborissov

I don't think there's much optimization to be done here. While switching away from a dictionary would technically be faster, you access the dictionary rarely enough that it probably doesn't affect timings.

As for the multithreading, I'm not sure why it would have an issue, since the batch reductions are called after the threaded loops. I think this may just be related to the other multithreading error that was mentioned before, which I haven't been able to track down. Note that multithreading in Julia is going to get an overhaul:

https://github.com/JuliaLang/julia/pull/22631

so I wonder how much of this should wait on that.

ChrisRackauckas avatar Mar 06 '18 08:03 ChrisRackauckas