DynamicPPL.jl
                        Improve docs, tutorials and README
I haven't checked the tutorial (BTW there exist already some on the Turing webpage which could be updated and moved here) but I think it might be better to add it to the documentation instead of the README. Then it would even be possible to run code and ensure that it does not become outdated.
@devmotion thanks, I encouraged Carlos to write this tutorial since he has recently been learning how DynamicPPL works. The idea is to have a Turing-independent walk-through example of how DynamicPPL operates. It might be obvious to seasoned Turing developers but turns out to be tricky for newcomers. Keeping a simple tutorial inside DynamicPPL helps to demystify things here, I think.
> I haven't checked the tutorial (BTW there exist already some on the Turing webpage which could be updated and moved here) but I think it might be better to add it to the documentation instead of the README. Then it would even be possible to run code and ensure that it does not become outdated.
I did suggest that to Hong, but it's up to him. We could also put it in both places, or alternatively just put a link to the docs in the README.
> Keeping a simple tutorial inside DynamicPPL helps to demystify things here, I think.
I think it's great to improve documentation (we really have to!), and I also think it should be moved to DynamicPPL. I just thought it might be better to add it to the documentation in the docs folder (in DynamicPPL), since it's a longer write-up and could be executed.
I'm also not certain if writing a sampler is the best way to demonstrate how to work with DynamicPPL, in particular a simple one like MH which only requires logjoint and logjoint_unconstrained and shouldn't have to explicitly touch things like SimpleVarInfo.
The API used in this PR is indeed a bit overkill for RWMH, but that's beside the point. The goal is to explain these APIs. I suggest we add a new section that provides a simpler MH implementation based on the logjoint APIs.
I'd say I agree with @devmotion here: better to add docs where this could go, instead of putting this "somewhat" longer write-up in the README.
We have a plan to implement a doc feature where the README file is compiled into the default “home” page of the documentation. So keeping the “get started” tutorial in the README doesn’t matter that much. For example, both AdvancedHMC and Bijectors keep docs in the README file; these will go into their respective docs when the suggested doc workflow is implemented. But I'm also happy to move it to docs and add a link in the README if there is a strong preference for that.
> The API used in this PR is indeed a bit overkill for RWMH, but that’s beside the point. The goal is to explain these APIs.
It's exactly because of that goal that I think a complicated implementation of RWMH is the wrong thing to do here :shrug:
An example implementation of something like a sampler should do either of the following:
- Clarify the feature we're trying to explain by demonstrating its application.
- Bring together several features in an application to demonstrate why the collection of features is useful; "usefulness" I define here as a) making the implementation of said application much simpler than the norm, or b) enabling the implementation of an application which wasn't (easily) implementable before.
 
IMHO a demo implementing RWMH doesn't really do either:
- `evaluate!!`, `logjoint` and `getlogp` are the only "features" (other than how to define a `Model`) which are used, and none of these is particularly difficult to understand.
- All we really need for RWMH is `logjoint` (and transformation of variables, but this isn't covered here anyway). This is not really something that is now suddenly enabled by DPPL. If all one wanted was a `logjoint` corresponding to a generative model, then you might as well implement that particular function, ignoring everything that DPPL has to offer.
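To make the "you might as well implement that particular function" point concrete: for a toy model like `m ~ Normal(0, 1); x .~ Normal(m, 1)`, a hand-written log joint is a few lines of plain Julia with no DPPL involved at all (the names `lognormal_pdf` and `demo_logjoint` are made up for this sketch):

```julia
# Hand-rolled log joint for a toy model: m ~ Normal(0, 1), x_i ~ Normal(m, 1).
# No DynamicPPL (or even Distributions) needed -- for plain RWMH this one
# function is all the "model" you have to provide.
lognormal_pdf(y, mu, sigma) = -0.5 * ((y - mu) / sigma)^2 - log(sigma) - 0.5 * log(2pi)

function demo_logjoint(m, x)
    lp = lognormal_pdf(m, 0.0, 1.0)      # prior: m ~ Normal(0, 1)
    for xi in x
        lp += lognormal_pdf(xi, m, 1.0)  # likelihood: x_i ~ Normal(m, 1)
    end
    return lp
end

demo_logjoint(0.0, [1.2, 3.5, 2.4])
```

Of course, the flip side (and arguably the selling point of DPPL) is that `logjoint(model, samples)` gives you this for any `@model` without writing it by hand.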
I'm definitely not against including a tutorial! Nor am I against large parts of this PR; e.g. I very much appreciate trying to explain what's going on with `evaluate!!` etc. I just don't think the chosen example really motivates why one should care about DPPL, nor why it's particularly useful.
Maybe we should instead demonstrate an example of a sampler which actually makes use of the tilde-pipeline, transformed variables, etc.? I think demonstrating how to "hook into" the tilde-pipeline would be a nice addition.
I agree with some parts of Tor's comments here -- notably, I think this fits better in a tutorial than on the README.
That being said, I'm concerned about throwing too much stuff into one tutorial and ending up with it being unreadable. A lot of the current tutorials are definitely like that. It's definitely much easier to make a tutorial too long than to make it too short, so we should err on the side of making them short. The costs are very asymmetric -- if you make the tutorial too long, people will refuse to read it, but if you make it too short, people just have to click through to the next page slightly more often.
Maybe I should note that this tutorial is already longer than around 60% of pages in the Stan reference manual. There's nothing wrong with a short tutorial -- some pages in the Stanual are shorter than this comment!
So basically -- I'd be happy to write another tutorial on hooking into the tilde-pipeline, but I think it should be separate. This is meant to be a very brief+gentle introduction to DynamicPPL.
To make my comments more concrete and also more constructive. This is a rough sketch of a simple MH sampler for DynamicPPL:
```julia
using DynamicPPL
using AbstractMCMC
using Random

struct MHSampler{P} <: AbstractMCMC.AbstractSampler
    proposal::P
end

function AbstractMCMC.step(
    rng::Random.AbstractRNG, model::DynamicPPL.Model, sampler::MHSampler;
    init_from::Union{Nothing,NamedTuple}=nothing,
    kwargs...
)
    if init_from === nothing
        x = SimpleVarInfo(
            model, SamplingContext(rng, SampleFromPrior(), DefaultContext())
        )
        samples = DynamicPPL.values_as(x, NamedTuple)
        logjoint_samples = getlogp(x)
    else
        samples = init_from
        logjoint_samples = logjoint(model, samples)
    end
    return samples, (samples, logjoint_samples)
end

function AbstractMCMC.step(
    rng::Random.AbstractRNG, model::DynamicPPL.Model, sampler::MHSampler, state;
    kwargs...
)
    samples, logjoint_samples = state
    proposed = sampler.proposal(rng, samples)
    logjoint_proposed = logjoint(model, proposed)
    if logjoint_samples < logjoint_proposed + randexp(rng)
        return proposed, (proposed, logjoint_proposed)
    else
        return samples, (samples, logjoint_samples)
    end
end
```
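As an aside, the acceptance test `logjoint_samples < logjoint_proposed + randexp(rng)` is the standard log-space MH check: accept when `log u < log p(proposed) - log p(current)` for `u ~ Uniform(0, 1)`, and since `-log(u)` is `Exp(1)`-distributed, `randexp` draws it directly. A self-contained sanity check of the equivalence (helper names here are made up):

```julia
using Random

# Two formulations of the same MH acceptance event:
#   rand() < exp(lp_prop - lp_cur)     (textbook form)
#   lp_cur < lp_prop + randexp(rng)    (log-space form used above)
accept_uniform(rng, lp_cur, lp_prop) = rand(rng) < exp(lp_prop - lp_cur)
accept_exp(rng, lp_cur, lp_prop) = lp_cur < lp_prop + randexp(rng)

# The individual draws differ, so compare empirical acceptance rates; both
# should be close to the true acceptance probability exp(-1) ≈ 0.368.
rng = Random.MersenneTwister(42)
n = 100_000
r_uniform = count(accept_uniform(rng, -1.0, -2.0) for _ in 1:n) / n
r_exp = count(accept_exp(rng, -1.0, -2.0) for _ in 1:n) / n
(r_uniform, r_exp)
```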
```julia
# Example
using Distributions

@model function demo(x)
    m ~ Normal(0, 1)
    x .~ Normal(m, 1)
end

model = demo([1.2, 3.5, 2.4])
sampler = MHSampler((rng, x) -> (m = randn(rng) + x.m,))

sample(model, sampler, 100)
sample(model, sampler, 100; init_from=(m=10.0,))
sample(model, sampler, 100; discard_initial=100, thinning=10)
sample(model, sampler, MCMCSerial(), 100, 4)
sample(model, sampler, MCMCThreads(), 100, 4)
sample(model, sampler, MCMCDistributed(), 100, 4)
```
In fact it would be even simpler if DynamicPPL supported initial parameters for SimpleVarInfo as well; currently it always uses VarInfo. As you can see, one gets all kinds of functionality for free by implementing the AbstractMCMC interface. It's designed exactly for this purpose, and hence we want users and downstream developers to implement it when implementing a sampler. The actual MH step is very simple, and the idea is not to deal with SimpleVarInfo there at all but only with the NamedTuple and logjoint. Again, it's designed for this purpose, so we should tell people about it :slightly_smiling_face: The only place where more internal/advanced stuff is used is when the initial sample is generated. Maybe we should implement something like
```julia
function Base.rand(rng::Random.AbstractRNG, ::Type{T}, model::Model) where {T}
    x = SimpleVarInfo(
        model, SamplingContext(rng, SampleFromPrior(), DefaultContext())
    )
    return DynamicPPL.values_as(x, T)
end
```
Then we could use just `rand(rng, NamedTuple, model)` and `logjoint` there as well and write:
```julia
function AbstractMCMC.step(
    rng::Random.AbstractRNG, model::DynamicPPL.Model, sampler::MHSampler;
    init_from::Union{Nothing,NamedTuple}=nothing,
    kwargs...
)
    samples = if init_from === nothing
        rand(rng, NamedTuple, model)
    else
        init_from
    end
    logjoint_samples = logjoint(model, samples)
    return samples, (samples, logjoint_samples)
end
```
This is great, thanks @devmotion! I'll definitely use it.
Is this a fully-working sampler? I'm asking because it doesn't seem to implement some of the things in the AbstractMCMC tutorial -- namely it's missing an InferenceAlgorithm and a SamplerState type.
It's fully working. The AbstractMCMC tutorial is focused on Turing specifics and largely outdated. The correct, (hopefully) complete and up-to-date docs are in the AbstractMCMC repo: https://turinglang.github.io/AbstractMCMC.jl/dev/
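For the intuition behind "implement `step` and get the rest for free": the pattern can be mocked in a few lines of plain Julia. A generic driver that only knows about `step` can drive any sampler; AbstractMCMC's `sample` plays this role (plus progress logging, chain construction, parallelism). All names below are toy stand-ins, not the real AbstractMCMC internals:

```julia
# Toy mock of the AbstractMCMC pattern: samplers implement two `step`
# methods (initial and transition), and a generic loop does the rest.
abstract type ToySampler end

function toy_sample(sampler::ToySampler, n::Int)
    draw, state = toy_step(sampler)             # initial step: no state yet
    draws = [draw]
    for _ in 2:n
        draw, state = toy_step(sampler, state)  # transition: thread state through
        push!(draws, draw)
    end
    return draws
end

# A trivial "sampler" that counts up, just to exercise the interface.
struct Counter <: ToySampler end
toy_step(::Counter) = (1, 1)
toy_step(::Counter, state) = (state + 1, state + 1)

toy_sample(Counter(), 5)  # == [1, 2, 3, 4, 5]
```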
With #381 being merged and released, the MH sampler can be simplified to:
```julia
using DynamicPPL
using AbstractMCMC
using Random

struct MHSampler{P} <: AbstractMCMC.AbstractSampler
    proposal::P
end

function AbstractMCMC.step(
    rng::Random.AbstractRNG, model::DynamicPPL.Model, sampler::MHSampler;
    init_from::Union{Nothing,NamedTuple}=nothing,
    kwargs...
)
    samples = init_from === nothing ? rand(rng, model) : init_from
    logjoint_samples = logjoint(model, samples)
    return samples, (samples, logjoint_samples)
end

function AbstractMCMC.step(
    rng::Random.AbstractRNG, model::DynamicPPL.Model, sampler::MHSampler, state;
    kwargs...
)
    samples, logjoint_samples = state
    proposed = sampler.proposal(rng, samples)
    logjoint_proposed = logjoint(model, proposed)
    if logjoint_samples < logjoint_proposed + randexp(rng)
        return proposed, (proposed, logjoint_proposed)
    else
        return samples, (samples, logjoint_samples)
    end
end
```
Examples:
```julia
using Distributions

@model function demo(x)
    m ~ Normal(0, 1)
    x .~ Normal(m, 1)
end

model = demo([1.2, 3.5, 2.4])
sampler = MHSampler((rng, x) -> (m = randn(rng) + x.m,))

sample(model, sampler, 100)
sample(model, sampler, 100; init_from=(m=10.0,))
sample(model, sampler, 100; discard_initial=100, thinning=10)
sample(model, sampler, MCMCSerial(), 100, 4)
sample(model, sampler, MCMCThreads(), 100, 4)
sample(model, sampler, MCMCDistributed(), 100, 4)
```
No SimpleVarInfo, VarInfo, evaluate!! etc. :slightly_smiling_face:
@ParadaCarleton There are a lot of excellent improvements in this PR. However, it is a bit difficult for other team members to review and approve. I suggest that we break this PR into a few self-contained smaller PRs. We can quickly review and merge each individual PR.
Closed in favor of making several smaller PRs.