MCMCChains.jl icon indicating copy to clipboard operation
MCMCChains.jl copied to clipboard

DIC example is broken

Open rikhuijzer opened this issue 3 years ago • 12 comments

Currently, MCMCChains contains a broken DIC example, see https://github.com/TuringLang/MCMCChains.jl/issues/267 for more information.

rikhuijzer avatar Feb 28 '21 10:02 rikhuijzer

Upon second read, this issue isn't very clear. It is about the untested usage example at

https://github.com/TuringLang/MCMCChains.jl/blob/b0da9f1800b3b96cdb4f3f8dbb4fef1db7bd35bd/src/modelstats.jl#L17-L31

rikhuijzer avatar Mar 27 '21 15:03 rikhuijzer

Hmm, I'm thinking about adding more model selection / IC options to MCMCChains but I'm not fully clear about the current API. Wouldn't it make more sense with an interface like:

dic(model::DynamicPPL.Model, chains::Chains)

and then implement a logpdf method based on pointwise_loglikelihoods that takes a model and a chains value:

import Distributions: logpdf
function logpdf(model::DynamicPPL.Model, chain::Chains)
    logliks_per_input = pointwise_loglikelihoods(model, chain)
    Float64[mean(lls) for (_, lls) in logliks_per_input]
end

?

It seems to me one can then implement waic and maybe even loo with a similar interface. Comments welcome, maybe I'm missing something fundamental so stop me before I go ahead implementing this... ;)

robertfeldt avatar Jun 17 '21 12:06 robertfeldt

The fundamental problem is that MCMCChains should not depend on DynamicPPL. But I assume one could/should work around this problem by moving some of the interface in DynamicPPL to AbstractMCMC which (of course) MCMCChains already depends on.

devmotion avatar Jun 17 '21 12:06 devmotion

Ok, I see and it kind of makes sense.

So maybe a logpdf function making method makes sense then and this could be added to either Turing.jl, to DynamicPPL, or to a new package ModelSelection.jl?

function make_logpdf_func(model::DynamicPPL.Model)
    function logpdffunc(c::Chains) 
        Float64[mean(lls) for (_, lls) in pointwise_loglikelihoods(model, c)]
    end
    logpdffunc
end

and then it can be used with MCMCChains like:

dic(chains, make_logpdf_func(model))
waic(chains, make_logpdf_func(model))

etc.

robertfeldt avatar Jun 17 '21 13:06 robertfeldt

For psisloo() and waic() wouldn't you need the pointwise_loglikelihoods(model, c) matrix? Not sure for wdic.

I really like the name ModelSelection.jl, maybe in JuliaStats?

goedman avatar Jun 17 '21 13:06 goedman

aic etc. are defined in StatsBase: https://github.com/JuliaStats/StatsBase.jl/blob/2faa6e80b7966b915086d8cd5a4a4d89a2126db5/src/statmodels.jl#L192 I think dic etc. should be defined in a similar way if possible: there's some lightweight functional interface (such as loglikelihood) and if it is implemented for a model then aic etc. work automatically.

devmotion avatar Jun 17 '21 13:06 devmotion

BTW just recently also a pointwise loglikelihood loglikelihood(model, observation) (with loglikelihood(model, :) for all observations) was added: https://github.com/JuliaStats/StatsBase.jl/pull/685

devmotion avatar Jun 17 '21 13:06 devmotion

David, is your thinking to have something like an MCMCModel <: StatisticalModel?

goedman avatar Jun 17 '21 14:06 goedman

Yes, and then one would just make sure that StatsBase.loglikelihood(::MCMCModel) etc. are implemented. Since e.g. in DynamicPPL we don't want to subtype StatisticalModel (there's already a different supertype), one could maybe define a way to construct a MCMCModel from e.g. DynamicPPL.Model and mainly use it internally (e.g. dic(model::DynamicPPL.Model) = dic(MCMCModel(model)). Maybe it could even just be a wrapper of other models such as DynamicPPL.Model. Maybe even DynamicPPL could define its custom StatisticalModel construct and it would not have to be unified across packages.

devmotion avatar Jun 17 '21 14:06 devmotion

aic etc. are defined in StatsBase: https://github.com/JuliaStats/StatsBase.jl/blob/2faa6e80b7966b915086d8cd5a4a4d89a2126db5/src/statmodels.jl#L192 I think dic etc. should be defined in a similar way if possible: there's some lightweight functional interface (such as loglikelihood) and if it is implemented for a model then aic etc. work automatically.

DIC or WAIC? I'm of the opinion that DIC probably shouldn't be implemented at all, unless you're using it for pedagogical purposes I guess. It's kind of a hacked-together tool that only took off because it's about a third of the way to being Bayesian and it got added to BUGS. DIC isn't transformation-invariant, it only works for normal distributions, it uses point estimates rather than the full posterior, and it's got even more problems besides that. On the other hand, I think implementing WAIC and WBIC in StatsBase is a good idea and am all for it.

The thing I don't think is quite as good of an idea is implementing PSIS-LOO in StatsBase, for a couple reasons. First, it's quite a heavy -- I'm trying to build an implementation right now and it's definitely taking a while. (Anyone who wants to lend a hand is welcome to help! Message me on the Slack.) If you take a look at the loo package by the Stan team in R, it takes one file to implement WAIC and something like 7 or 8 to implement PSIS-LOO. The opposite holds too -- I don't think people should have to import all of StatsBase just to get PSIS-LOO. The second thing is that PSIS isn't inherently only applicable to cross validation -- it's just a modified version of importance sampling, which you can use anywhere you'd use importance sampling. It would actually make more sense to implement it as a Turing sampling algorithm, I think.

ParadaCarleton avatar Jun 18 '21 02:06 ParadaCarleton

Thanks for all your input. I'll be holding off on PSIS-LOO then, hearing that there is already ongoing work on it. First focus is to explore good/simple ways to add WAIC and possible WBIC to StatsBase and make them useful with MCMCChains + DynamicPPL.Model as discussed above.

robertfeldt avatar Jun 18 '21 07:06 robertfeldt

Thanks for all your input. I'll be holding off on PSIS-LOO then, hearing that there is already ongoing work on it. First focus is to explore good/simple ways to add WAIC and possible WBIC to StatsBase and make them useful with MCMCChains + DynamicPPL.Model as discussed above.

I'd check out the discussion here for implementation tips on avoiding confusion from WBIC, e.g. by referring to it as WSIC and making sure documentation explains the differing goals of WSIC and WAIC (Picking the correct model vs. picking the model with best expected fit).

If you're interested in adding WSIC/WBIC/PSIS-LOO functions to Julia, feel free to reach out to me on Slack!

ParadaCarleton avatar Jun 18 '21 16:06 ParadaCarleton