Added `check_model` and sub-module `DebugUtils`
As we're getting more users coming from other languages, who are not necessarily so familiar with Julia, it's becoming more and more important that we avoid end users hitting potentially confusing snags in Turing.jl / DynamicPPL.jl.
Examples are:
- Usage of `missing` and the differences between the different ways of conditioning variables.
- Repeated variables, which will silently lead to incorrect behavior.
- Etc.
Because of this, I was wondering if it might be a good idea to add some utilities for catching exactly these kinds of issues.
This PR adds a method `check_model` (which is also exported) which executes the model using a special context (`DebugUtils.DebugContext`) and performs some checks before, throughout, and after evaluation of the model. See the added tests for some examples.
Moreover, once we improve the documentation (I'm going to add a section on "Syntax"), we can link to it in the warnings/errors to point the user to where they can figure things out.
Here's an example:
```julia
julia> using DynamicPPL, Distributions

julia> model = DynamicPPL.TestUtils.DEMO_MODELS[1];

julia> issuccess, (trace, _) = check_model(model);

julia> issuccess
true

julia> trace
3-element Vector{DynamicPPL.DebugUtils.Stmt}:
 assume: VarName[s[1], s[2]] = [5.58585, 0.73498] .~ InverseGamma{Float64}(invd=Gamma{Float64}(α=2.0, θ=0.333333), θ=3.0) ⟼ [5.58585, 0.73498] (logprob = -4.46134)
 assume: VarName[m[1], m[2]] = [0.264206, 0.781137] .~ Normal{Float64}[Normal{Float64}(μ=0.0, σ=2.36344), Normal{Float64}(μ=0.0, σ=0.85731)] ⟼ [0.264206, 0.781137] (logprob = -2.96538)
 observe: [1.5, 2.0] ~ DiagNormal(μ=[0.264206, 0.781137], Σ=[5.58585 0.0; 0.0 0.73498]) (logprob = -3.6914)

julia> trace[1].varname
2-element Vector{VarName{:s, Setfield.IndexLens{Tuple{Int64}}}}:
 s[1]
 s[2]
```
Thoughts?
As an additional thing (though I'm uncertain whether we want this or not), I've added the option of saving the tilde-statements as we're going through the model, plus optionally saving the varinfo after every statement. This can be quite useful both for debugging "remotely" (as in, just ask the user to run the model with this and send us the output) and because it makes it easy to perform visual or programmatic checks of the "trace" of a model, which I think will be quite handy.
Btw, throughout this I've discovered a few bugs, which have subsequently been fixed :) If we decide against something like this PR, then I'll make separate PRs for those fixes.
Pull Request Test Coverage Report for Build 8750716198
Details
- 109 of 220 (49.55%) changed or added relevant lines in 5 files are covered.
- 17 unchanged lines in 2 files lost coverage.
- Overall coverage decreased (-2.9%) to 78.921%
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/test_utils.jl | 0 | 5 | 0.0% |
| src/utils.jl | 1 | 7 | 14.29% |
| src/debug_utils.jl | 102 | 202 | 50.5% |
| Totals | 109 | 220 | 49.55% |
| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/model.jl | 1 | 89.22% |
| src/threadsafe.jl | 16 | 48.25% |
| Totals | 17 | |
| Totals | |
|---|---|
| Change from base Build 8745687397: | -2.9% |
| Covered Lines: | 2707 |
| Relevant Lines: | 3430 |
💛 - Coveralls
Codecov Report
Attention: Patch coverage is 49.08257% with 111 lines in your changes missing coverage. Please review.
Project coverage is 79.16%. Comparing base (816e962) to head (8c47fad). Report is 3 commits behind head on master.
:exclamation: Current head 8c47fad differs from pull request most recent head c1aa5ea. Consider uploading reports for the commit c1aa5ea to get more accurate results.
| Files | Patch % | Lines |
|---|---|---|
| src/debug_utils.jl | 50.00% | 100 Missing :warning: |
| src/utils.jl | 14.28% | 6 Missing :warning: |
| src/test_utils.jl | 0.00% | 5 Missing :warning: |
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           master     #540      +/-   ##
==========================================
- Coverage   83.65%   79.16%   -4.50%
==========================================
  Files          28       26       -2
  Lines        3219     3197      -22
==========================================
- Hits         2693     2531     -162
- Misses        526      666     +140
```
:umbrella: View full report in Codecov by Sentry.
I like this a lot. I wonder though if it would be better to let `check_model` only return a Boolean. I think this might be simpler and less confusing for users who really only care about these common checks and warnings but are not interested in the tracing output.
> I like this a lot.
Awesome:)
> I wonder though if it would be better to let `check_model` only return a Boolean. I think this might be simpler and less confusing for users who really only care about these common checks and warnings but are not interested in the tracing output.
Yeah, I was thinking the same.
But it's really nice to also have the option of returning the "trace" of the model. People can sometimes find it very confusing how Turing.jl "sees" these tilde-statements, so I wanted to make this more transparent.
Maybe do a `check_model_and_extras` (not the best name) or something? And then have `check_model` just wrap this?
> Maybe do a `check_model_and_extras` (not the best name) or something? And then have `check_model` just wrap this?
Yes, that would be an option. An alternative could be `check_model(; return_trace=Val(true))` (or maybe even just a Boolean flag, given that type stability should not matter here?).
I thought this "optionally changing the return value" pattern was somewhat frowned upon, or nah?
Yeah, it's not something I'd generally do. But I have the feeling it's somewhat OK-ish for functions that are (a) only/mainly used in the REPL and (b) expected to be called with the default arguments most of the time.
But I agree, the cleaner approach would be to have two methods and let one of them discard the additional output.
Gotcha, that seems sensible :+1:
I went the `check_model_and_trace` route for now.
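For reference, the split could look roughly like this (a minimal sketch, assuming `check_model_and_trace` keeps the return shape shown in the example above):

```julia
# Sketch only: `check_model` discards the trace and returns just the Boolean.
function check_model(model; kwargs...)
    issuccess, _ = check_model_and_trace(model; kwargs...)
    return issuccess
end
```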
Any thoughts on the implementation @devmotion? I realize I haven't overloaded `show` much, so am uncertain if what I've done is a good idea or not.
I'd say the main rules are to avoid type piracy, to absolutely never define `show` methods for `Type`s, and otherwise follow the conventions in the docs (https://docs.julialang.org/en/v1/manual/types/#man-custom-pretty-printing; main points: the two-argument `show` method is a compact one-line representation, whereas `show(io::IO, ::MIME"text/plain", x)` is used for longer descriptions and will also be used by `display(x)` in the REPL).
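For anyone following along, a minimal sketch of that convention, using a hypothetical struct rather than the actual `Stmt` type from this PR:

```julia
struct Foo
    x::Float64
end

# Two-argument `show`: compact, single-line representation.
Base.show(io::IO, f::Foo) = print(io, "Foo(", f.x, ")")

# Three-argument `show` with MIME"text/plain": longer description;
# this is what `display(f)` (and hence the REPL) uses.
function Base.show(io::IO, ::MIME"text/plain", f::Foo)
    println(io, "Foo with field:")
    print(io, "  x = ", f.x)
end
```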
But then it seems like my current approach is good?
Btw @yebai, do you have any thoughts on this?
Another aspect I think would be useful: extract the trace from the model a few times to determine whether the model is "static" (in some sense) or not. We could then warn the user if, say, they're running HMC on a model with a non-static number of parameters.
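A rough sketch of that idea (names and return shape assumed from the example above; not the actual implementation):

```julia
# Evaluate the model several times and compare which variable names show up
# in each trace; if they differ across runs, the model is not "static".
function is_static_sketch(model; num_evals=5)
    varnames_per_eval = map(1:num_evals) do _
        issuccess, (trace, _) = check_model_and_trace(model)
        issuccess || error("model check failed")
        # Only assume-statements carry a `varname` field.
        [stmt.varname for stmt in trace if hasproperty(stmt, :varname)]
    end
    return all(==(first(varnames_per_eval)), varnames_per_eval)
end
```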
EDIT: Added a `has_static_constraints` method. This is also related to https://github.com/TuringLang/Turing.jl/issues/2195.
EDIT 2: Note that after this PR we can immediately improve the user experience by adding a `check_model=true` keyword to `Turing.sample` and running `DynamicPPL.check_model(model)` before we hit inference.
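That integration could be as simple as something like this (hypothetical sketch, not part of this PR; `check_model` here is the Boolean-returning variant discussed above):

```julia
# Hypothetical: run the model check before handing off to inference.
function sample(model, sampler, N; check_model=true, kwargs...)
    if check_model
        DynamicPPL.check_model(model) ||
            @warn "`check_model` detected issues with the model; see the warnings above."
    end
    # ... continue with inference as before ...
end
```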
This is pretty much ready to go:)
We can likely introduce some checks performed after each MCMC step in the future, which can help catch additional issues like changing model dimensionality.