Turing.jl
Automatic lazy broadcasting/optimization for arrays of distributions
This issue follows up on this Discourse discussion. As @mohamed82008 explained, there are multiple performance advantages to using a `LazyArray` inside an `arraydist`. To realize them, however, one must a) know about this lazy-array trick, and b) remember or look up the `LazyArray` syntax (which unfortunately includes a macro, `@~`, that looks similar to but is unrelated to Turing/DynamicPPL's `~`).
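For readers who haven't seen the trick, here is a minimal sketch of what it looks like in practice. The model, the data, and the names `linreg_lazy`, `eager_dist`, and `lazy_dist` are made up for illustration, and whether the lazy version is actually faster will depend on the AD backend and package versions.

```julia
using Turing       # provides @model, arraydist, Normal, logpdf
using LazyArrays   # provides LazyArray and the @~ macro

# Illustrative data (made up for this sketch).
x = collect(0.0:0.5:4.5)
y = 1.0 .+ 2.0 .* x

# Outside a model: the eager and lazy forms describe the same distribution.
mu = 1.0 .+ 2.0 .* x
s = 0.5
eager_dist = arraydist(Normal.(mu, s))               # materializes a Vector of Normals
lazy_dist = arraydist(LazyArray(@~ Normal.(mu, s)))  # wraps a lazy broadcast instead
same_logpdf = isapprox(logpdf(eager_dist, y), logpdf(lazy_dist, y))

# Inside a model: note that @~ is the LazyArrays macro, not the tilde of @model.
@model function linreg_lazy(x, y)
    a ~ Normal(0, 1)
    b ~ Normal(0, 1)
    s ~ truncated(Normal(0, 1), 0, Inf)
    mu = a .+ b .* x
    y ~ arraydist(LazyArray(@~ Normal.(mu, s)))
end

model = linreg_lazy(x, y)
```

The only difference from the usual `y ~ arraydist(Normal.(mu, s))` is the `LazyArray(@~ ...)` wrapper, which is exactly the part that is easy to forget.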
Would it be possible to make the lazy-array trick more automatic? From a user perspective, I can think of a couple of ways this would work:
1. Make `arraydist` convert its argument to a lazy array automatically (maybe based on some heuristic of when it's likely to benefit performance, and/or an optional argument `lazy=true`)
2. Provide a macro like `@lazyarraydist(...broadcasted computation...)` to do the same thing
3. Just make `.~` parse to a lazily broadcasted array distribution
If 3) could work, that would be best, though I don't know if it would come with hidden downsides or implementation challenges. Basically, anything that decreases the number of special-case optimization hacks a user has to remember would be great!
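To make the macro idea concrete, here is a rough user-side sketch. `@lazyarraydist` does not exist in Turing or DynamicPPL; this is just a hypothetical definition that expands to the existing `arraydist(LazyArray(@~ ...))` pattern, and it assumes `arraydist`, `LazyArray`, and `@~` are all in scope at the call site.

```julia
using Turing
using LazyArrays

# Hypothetical helper macro (not part of any package): rewrites
#   @lazyarraydist Normal.(mu, s)
# into
#   arraydist(LazyArray(@~ Normal.(mu, s)))
# The whole expression is escaped, so all names resolve in the caller's scope.
macro lazyarraydist(ex)
    return esc(:(arraydist(LazyArray(@~ $ex))))
end

# Illustrative values (made up for this sketch).
mu = [0.0, 1.0, 2.0]
s = 1.0
obs = [0.1, 0.9, 2.2]

d = @lazyarraydist Normal.(mu, s)
matches_eager = isapprox(logpdf(d, obs), sum(logpdf.(Normal.(mu, s), obs)))
```

This only hides the syntax; it doesn't add any heuristic for when laziness helps, which is what options 1 and 3 would need.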
Just a heads up, my plan is to remove `arraydist` and integrate and generalize it into the product distribution in Distributions.
Cool, good to know. So would that resolve the tracked-array vs. array-of-tracked issue @mohamed82008 was referring to in the Discourse thread?
Hi, following this issue because I'm very interested in what ultimately is going to be the fast way to handle something like this. Currently the most reliable approach (it doesn't cause crashes/backtraces) is for me to do `Turing.@addlogprob!(sum(logpdf(....)))` all manually.
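For anyone landing here, a minimal sketch of that manual approach might look like the following. The model and the name `linreg_manual` are made up for illustration; the `....` above is the commenter's own elision, and this is just one plausible way to fill it in for a simple regression.

```julia
using Turing

# Illustrative data (made up for this sketch).
x = collect(0.0:0.5:4.5)
y = 1.0 .+ 2.0 .* x

@model function linreg_manual(x, y)
    a ~ Normal(0, 1)
    b ~ Normal(0, 1)
    s ~ truncated(Normal(0, 1), 0, Inf)
    mu = a .+ b .* x
    # Add the observation log-likelihood by hand instead of writing
    # `y ~ arraydist(...)`; no array of distributions is handed to `~`.
    Turing.@addlogprob! sum(logpdf.(Normal.(mu, s), y))
end

model = linreg_manual(x, y)
```

The trade-off is that `y` no longer appears on the left of a `~`, so the model itself doesn't treat it as an observed variable; the likelihood is just an extra term in the log density.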
We will likely adopt https://github.com/TuringLang/Turing.jl/issues/1723#issuecomment-954023424.
In addition, it is better to keep the `module` macro transparent so more performance optimisation can be done outside `DynamicPPL` / `Turing`. https://github.com/TuringLang/Turing.jl/pull/1900 should make this type of performance optimisation easier because the internal `SimpleVarInfo` data structure becomes more compatible with the Julia ecosystem.