ModelingToolkit.jl icon indicating copy to clipboard operation
ModelingToolkit.jl copied to clipboard

correctly handling random variables

Open isaacsas opened this issue 3 years ago • 7 comments

I'd like to resurrect https://github.com/SciML/ModelingToolkit.jl/pull/942 by @kaandocal, but there are a few issues with general handling of random variables that seemed worth discussing. The idea is that users could write symbolic expressions like

A ~ 2 * Geometric(p)

and ultimately get a sample from a geometric random variable with parameter p. What was done in that PR was to register a distribution like

sample_geom(p) = rand(Geometric(p))

and then replace Geometric(p) with a call to sampled_geom(p) before calling build_function. This raises a few issues, several of which I'm not sure how to handle

  1. I assume such registrations should really be in Symbolics?
  2. We'd really like to only call Geometric(p) one time and keep reusing the generator instance across calls to any functions generated by build_function.
  3. We need a mechanism to pass an explicit rng into the call to rand to maintain consistency with the rng that is being used in any SDE/jump solver.

2 and 3 seem to suggest we need a way of passing additional parameters into the functions built by build_function that either aren't part of the user's parameter vector, or are augmented into it in some way. We'd want this to be done though in a manner that if a user specifies a rng we use theirs. For example, in DiffEqJump the rng is a kwarg to the JumpProblem, so not part of the parameter vector, but we'd still want it to ultimately get passed into the generated functions for use in these sampling calls.

isaacsas avatar Sep 02 '21 13:09 isaacsas

I assume such registrations should really be in Symbolics?

It would need to hook into Symbolics' codegen, so it'll need something special.

ChrisRackauckas avatar Sep 02 '21 13:09 ChrisRackauckas

Yeah, I thought as much. Feel free to move to Symbolics if this makes more sense there.

I could envision how that is just a pre or post processing step though on the symbolic expressions. It seemed like the bigger issue was figuring out how to handle having extra parameters (rng, generator=Geometric(p)) that we want to save and pass around in addition to the user's parameters.

isaacsas avatar Sep 02 '21 13:09 isaacsas

I think it's a two-fold issue. Symbolics needs a system for handling distributions in equations and build_function, and then ModelingToolkit probably needs to treat them correctly at the SciMLProblem construction.

ChrisRackauckas avatar Sep 02 '21 13:09 ChrisRackauckas

Yeah, this is an even more general problem I guess. i.e. probably in DiffEqJump user rate and affect functions should take the rng we're using as a parameter in some way too (in case they are not deterministic)? That would be a breaking change though if we added explicit kwargs to provide such info. Otherwise we could somehow wrap the user's parameters in a ComponentArray or such with our parameters attached at the end, but I worry about what that might break (for example, that could be problematic if a user tries to change parameters in an event).

isaacsas avatar Sep 02 '21 13:09 isaacsas

FWIW I was able to get sampling from distributions to work in https://github.com/SciML/Catalyst.jl/pull/493.

I also checked and registering a two-argument rand that takes a rng works fine too. So the issue is really just how to handle an extra rng parameter that may not be within the normal p vector/tuple from a user for build_function generated functions.

isaacsas avatar Mar 19 '22 02:03 isaacsas

Maybe just add it as another argument to the function, build_function with that argument, and enclose an RNG?

ChrisRackauckas avatar Mar 19 '22 03:03 ChrisRackauckas

You mean generate a function from build_function that takes the rng as the last parameter and then make a closure?

isaacsas avatar Mar 19 '22 18:03 isaacsas