DynamicPPL.jl
Probability interface tutorial
First addition to the DynamicPPL tutorials; breaking this up as Hong suggested. Goes over how to use the basic interfaces (e.g. `logjoint`, `loglikelihood`, `logdensityof`).
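For context, a minimal sketch of the kind of calls the tutorial covers, using a hypothetical one-parameter model; the model, data, and parameter values below are illustrative placeholders rather than the tutorial's own, and the last line assumes DynamicPPL's DensityInterface implementation accepts a `VarInfo`:

```julia
using DynamicPPL, Distributions
using DensityInterface: logdensityof

# Hypothetical model: standard normal prior on m, Gaussian likelihood for x.
@model function demo(x)
    m ~ Normal(0, 1)
    x ~ Normal(m, 1)
end

model = demo(2.0)        # condition the model on the observation x = 2.0
theta = (m = 0.5,)       # a point in parameter space as a NamedTuple

logprior(model, theta)       # log p(m)
loglikelihood(model, theta)  # log p(x | m)
logjoint(model, theta)       # log p(x, m) = log prior + log likelihood

# Evaluate the joint density at the values stored in a VarInfo
# (here, a fresh draw from the prior).
logdensityof(model, VarInfo(model))
```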
Many thanks @ParadaCarleton - looks like a good PR!
bors r+
It seems it was not tested and is not included in the documentation?
Sure, the plan is to have a series of small PRs to replace #365. It is probably OK to add testing in a subsequent PR, but if @ParadaCarleton can quickly add a fix, that would be good too.
Yeah, that would be good. It's apparently also a .jmd file, but we don't support those in DynamicPPL yet.
bors r-
Canceled.
@yebai should be ready to merge
bors r+
bors r-
Canceled.
bors r+
:-1: Rejected by code reviews
bors r+
:-1: Rejected by code reviews
@devmotion, it seems bors is not happy for me to both contribute to and approve this PR. Can you approve and merge it?
@devmotion can you take another look? We would like to move this forward and can make additional improvements even after it is merged.
The main issue is that, as mentioned above, currently the cross-validation example does not work: https://github.com/TuringLang/DynamicPPL.jl/actions/runs/3726471522/jobs/6319942583#step:5:248
Even some other examples fail (e.g., https://github.com/TuringLang/DynamicPPL.jl/actions/runs/3726471522/jobs/6319942583#step:5:231), which, however, can be fixed easily by making FillArrays a direct dependency and loading it.
Probably it would be good to make the docs build fail if there are any errors in the examples.
The errors can also be deduced from the missing output in https://beta.turing.ml/DynamicPPL.jl/previews/PR404/tutorials/prob-interface/.
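For reference, a minimal sketch of the FillArrays part of that fix, assuming the tutorial's setup chunk is plain Julia and that FillArrays has already been added as a direct dependency of the docs/tutorials environment (both details are assumptions on my part):

```julia
# Load FillArrays explicitly instead of relying on it being available transitively.
using FillArrays

# `Fill` is FillArrays' lazy constant array, e.g. a length-10 vector of zeros
# stored in O(1) memory; presumably the example uses this or a similar name.
Fill(0.0, 10)
```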
I would suggest the following:
- Change the model to a Gaussian model with a normal-inverse-gamma prior
- Write a function `bayes_loss(dataset)`
- Use samples from the exact posterior instead of samples from NUTS
Then the example can be executed without depending on Turing and without having to deal with Chains or StructArray. IMO that will make the example more readable; I imagine it could look something like this:
function bayes_loss(dataset::Vector{Float64}; nsamples=1_000, nfolds=5)
    posterior_samples_mu = Vector{Float64}(undef, nsamples)
    posterior_samples_sigma2 = Vector{Float64}(undef, nsamples)
    loss = 0.0
    for (training, validation) in kfolds(dataset, nfolds)
        # Sample from the exact posterior
        rand!(InverseGamma(...), posterior_samples_sigma2)
        rand!(MvNormal(..., posterior_samples_sigma2 ./ ...), posterior_samples_mu)
        # Estimate Bayes loss
        model = gdemo(length(validation)) | (x=validation,)
        loss += sum(1:nsamples) do i
            logjoint(model, (mu=posterior_samples_mu[i], sigma2=posterior_samples_sigma2[i]))
        end
    end
    return loss
end
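To make the sketch above self-contained, the surrounding pieces could look roughly like the following; the `gdemo` definition, its hyperparameters, and the choice of MLUtils.jl as the source of `kfolds` are my assumptions, and the posterior parameters elided with `...` above would still need to be filled in with the usual normal-inverse-gamma conjugate updates:

```julia
using DynamicPPL, Distributions, LinearAlgebra, Random
using MLUtils: kfolds   # assumed provider of `kfolds(data, k)` -> (training, validation) pairs

# Hypothetical Gaussian model with a normal-inverse-gamma prior; the
# hyperparameters (2, 3, 0) below are placeholders, not values from this PR.
@model function gdemo(n)
    sigma2 ~ InverseGamma(2, 3)
    mu ~ Normal(0, sqrt(sigma2))
    x ~ MvNormal(fill(mu, n), sigma2 * I)
end
```

With a definition along these lines, `bayes_loss(randn(100))` should run as written once the two `rand!` calls are given the actual posterior parameters.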
bors r+
Pull request successfully merged into master.
Build succeeded:
thanks @ParadaCarleton and @devmotion!
@devmotion Just curious about the above code block, maybe two stupid questions:
- does `logjoint` accept `(model, NamedTuple)`?
- are we calculating the loss or just the log likelihood here? If it's the loss, shall we return `-loss`?
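On the first question: the sketch above already calls `logjoint` with a `(model, NamedTuple)` pair, so a quick self-contained check against the hypothetical `gdemo` from the earlier sketch would be:

```julia
# Evaluate the log joint of the conditioned model at a single NamedTuple of
# parameter values; no chain object is involved.
model = gdemo(3) | (x = [0.1, -0.2, 0.3],)
logjoint(model, (mu = 0.0, sigma2 = 1.0))
```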