DynamicPPL.jl
DynamicPPL.jl copied to clipboard
[Proof of concept] `:=` to keep track of generated quantities
Note that this is more of a fun idea rather than something I think we necessarily should do.
But due to a bunch of issues with invlinking (https://github.com/TuringLang/Turing.jl/issues/2195), we're likely going to make it so that we always re-evaluate the model to get the properly linked realizations before passing it to the user (https://github.com/TuringLang/Turing.jl/pull/2202) using values_as_in_model (https://github.com/TuringLang/DynamicPPL.jl/pull/590).
Now, if we indeed decide to always do this re-evaluation, it opens a can of cute worms, one amongst them which is the possibility of the following with very little work :eyes: (note the example is also using https://github.com/TuringLang/Turing.jl/pull/2202)
julia> using Turing
julia> @model function demo()
x ~ Normal()
y := 100 + x
return nothing
end
demo (generic function with 2 methods)
julia> model = demo()
DynamicPPL.Model{typeof(demo), (), (), (), Tuple{}, Tuple{}, DynamicPPL.DefaultContext}(demo, NamedTuple(), NamedTuple(), DynamicPPL.DefaultContext())
julia> chain = sample(model, NUTS(), 1000)
┌ Info: Found initial step size
└ ϵ = 3.2
Sampling 100%|█████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:00
Chains MCMC chain (1000×14×1 Array{Float64, 3}):
Iterations = 501:1:1500
Number of chains = 1
Samples per chain = 1000
Wall duration = 0.78 seconds
Compute duration = 0.78 seconds
parameters = x, y
internals = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size
Summary Statistics
parameters mean std mcse ess_bulk ess_tail rhat ess_per_sec
Symbol Float64 Float64 Float64 Float64 Float64 Float64 Float64
x 0.0138 0.9837 0.0549 323.4336 721.8310 0.9991 417.3337
y 100.0138 0.9837 0.0549 323.4336 721.8310 0.9991 417.3337
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
x -1.8697 -0.6448 0.0171 0.6708 1.8714
y 98.1303 99.3552 100.0171 100.6708 101.8714
That is, we can finally "track" quantities in a nice manner + get easy integration with Chains (generated_quantities lacks this).
This is highly relevant to the age-old issue: https://github.com/TuringLang/DynamicPPL.jl/issues/94
Maybe we can push this a bit further by introducing a block syntax for :=:
julia> @model function demo()
x ~ Normal(0, 1)
generated_quantities := let
# users could put arbitrary code in this block.
c = 5*x
d = 2*x
z = x
(c, z) # variables to track
end
return x
end
julia> model = demo();
julia> model()
julia> generated_quantities(model, (x = 1,))
Then, we can conditionally evaluate this block during MCMC. The variable name generated_quantities inside the model definition is arbitrary, and we could have multiple such blocks (with different names) in general.
This would provide an alternative to https://github.com/TuringLang/DynamicPPL.jl/pull/589, which is a bit unsatisfactory, I think.
Maybe we can push this a bit further by introducing a block syntax for :=:
Not a big fan because it will:
- Require extensive macro writing to achieve in a decent way.
- Will limit what the user can put in this block.
Compared to the simple := outlined in this PR, which is a trivial addition and poses no syntactical restrictions on on what the user does (beyond using left := right for tracking) => very easy to understand and use
Require extensive macro writing to achieve in a decent way.
I'm not sure it will involve a lot of extra work, but I don't think this is a major issue even if that is true. Also, we can do this block-syntax in a separate PR.
Will limit what the user can put in this block.
They are mostly equivalent to if-else blocks. I am not aware of any let-block limitations. Can you clarify?
EDIT: Apart from requiring an additional macro @is_post_processing, another issue with https://github.com/TuringLang/DynamicPPL.jl/pull/589 is the slightly abusive overreliance on the return statement.
Maybe we can push this a bit further by introducing a block syntax for
:=:
If we consider a block syntax, maybe something like
julia> @model function demo()
x ~ Normal(0, 1)
@generated_quantities begin
# users could put arbitrary code in this block.
c = 5*x
d = 2*x
z = x
(c, z) # variables to track
end
return x
end
would be a slightly simpler syntax?
If we consider a block syntax, maybe something like
But won't this make our lives much more difficult? It will be additional macro-related stuff to maintain + the beautify of the current way with return is that it can handle arbitrary Julia code and it's not restricted to what can fit in a VarInfo. := in this PR is meant to be used for simple expressions that people just want to have in their chain; return is a strict super-set of := , but := still seems useful.
I didn't focus on the return component too much actually when copying the example. I was more thinking that perhaps the meaning of := is less clear than a more explicit @generated_quantities block which would work also without return semantics:
@model function demo()
x ~ Normal()
@generated_quantities begin
y = 100 + x
end
return nothing
end
or even
@model function demo()
x ~ Normal()
@generated_quantities begin
y := 100 + x
end
return nothing
end
In principle different such blocks could be used for defining additional computations in other scenarios.
IMO if tell the user "use ~ to specify a random variable, and := to specify determinstic variables that you also want to end up in chains", things should be fairly understandable. AFAIK this is very common in PPLs, e.g. Deterministic in pymc3.
I find the syntax Tor posted pretty cool and intuitive. 😅 Why make it more complicated?
I find the syntax Tor posted pretty cool and intuitive. 😅 Why make it more complicated?
Seconded :eyes:
I'm happy to merge this PR and leave the block syntax for future work.
I find the syntax Tor posted pretty cool and intuitive. 😅 Why make it more complicated?
@trappmartin the let-block based syntax provides a nice alternative to https://github.com/TuringLang/DynamicPPL.jl/pull/589
Thanks for the context.
I'm happy to merge this PR and leave the block syntax for future work.
So is this actually something we want to move forward?
Pull Request Test Coverage Report for Build 8969088761
Details
- 22 of 30 (73.33%) changed or added relevant lines in 3 files are covered.
- No unchanged relevant lines lost coverage.
- Overall coverage decreased (-0.03%) to 81.501%
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/utils.jl | 2 | 5 | 40.0% |
| src/values_as_in_model.jl | 12 | 17 | 70.59% |
| <!-- | Total: | 22 | 30 |
| Totals | |
|---|---|
| Change from base Build 8861691435: | -0.03% |
| Covered Lines: | 2877 |
| Relevant Lines: | 3530 |
💛 - Coveralls
Codecov Report
Attention: Patch coverage is 21.87500% with 25 lines in your changes are missing coverage. Please review.
Project coverage is 76.61%. Comparing base (
4cf395b) to head (4c61bcf).
:exclamation: Current head 4c61bcf differs from pull request most recent head 30e99f6. Consider uploading reports for the commit 30e99f6 to get more accurate results
| Files | Patch % | Lines |
|---|---|---|
| src/values_as_in_model.jl | 18.75% | 13 Missing :warning: |
| src/compiler.jl | 18.18% | 9 Missing :warning: |
| src/utils.jl | 40.00% | 3 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #594 +/- ##
==========================================
- Coverage 81.06% 76.61% -4.45%
==========================================
Files 30 30
Lines 3517 3545 +28
==========================================
- Hits 2851 2716 -135
- Misses 666 829 +163
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I don't think it's a good idea but I want to bring it up anyways: What about the opposite behaviour and making everything that is defined with = instead of := a tracked quantity (so := would be an opt-out instead of opt-in)?
What about aliasing?
For instance, given
@model function m()
x := 1
y ~ Normal(x, 1)
x := y
end
The current implementation will be tracking the value of the second x in the end?
Also the use of tilde pipeline for regular assignment is not quite natural, but understandable.
What about aliasing?
It's probably sensible to simply throw an error for such scenarios.
What about aliasing?
We actually have DynamicPPL.TestUtiles.check_model for this now!:) Plan is to make that automatically run when the user calls sample, and it will either warn the user if something like this happens.
I don't think it's a good idea but I want to bring it up anyways: What about the opposite behaviour and making everything that is defined with = instead of := a tracked quantity (so := would be an opt-out instead of opt-in)?
Naaaaaah, that's going to be too crazy I think. Too many gotchas and annoyances. Also, we've had tons of requests to "filter out variables that I need for inference but am not actually interested in to save mem footprint", and this would make those people way less happy :sweat_smile:
Let’s try to get this PR merged soon-ish.
Let's make a final push to get this merged and released -- it looks like some GSoC projects might benefit from this new feature.
Added tests + marked ready for review.
Combining this PR with https://github.com/TuringLang/Turing.jl/pull/2202, automatic inclusion of tracked variables will be included.
Looks good to me. Tiny concern on a technicality: we are using the tilde functions to implement the whole values_as_in_model. I am a little worried that this might be confusing for future contributors as := is more of a = then ~. Not suggesting a rewrite, as I do think the current implementation is the most intuitive approach.
Looks good to me. Tiny concern on a technicality: we are using the
tildefunctions to implement the wholevalues_as_in_model. I am a little worried that this might be confusing for future contributors as:=is more of a=then~. Not suggesting a rewrite, as I do think the current implementation is the most intuitive approach.
So there are two aspects:
values_as_in_modelis already implemented using the tilde pipeline, as it should, because it is meant to extract values primarily from~statements.- I think the concern regarding also implementing
:=using this approach rather than introducing another way of doing things is very fair. I think for now it makes sense to use the tilde pipeline because it a) it simplifies internals quite a bit vs. introducing specific handling of:=, and b) it's meant to work when usingvalues_as_in_model, which this then trivially does.
I'm definitively open to maybe make a more explicit approach to (2), i.e. introduce some coloneq_* methods or something, but I think maybe that's worth delaying until we see explicit use-cases for this.
@yebai Please don't merge PRs opened by others without consulting with the author (unless they're super unrespnosive ofc)