AbstractPPL.jl
`condition` method with weights
It would be nice to have a new method for `condition` that accepts weights from StatsBase; `FrequencyWeights` in particular would be very useful for things like minibatching or repeated observations.
Maybe one should just use a special set of weighted observations, i.e., `condition(model, observations)` where the weights are included in `observations`, to keep the API consistent.
Related: https://github.com/TuringLang/DynamicPPL.jl/issues/208
That could work, but wouldn't that require adding another struct or using something like a tuple for the observations argument? Not sure if that's any cleaner than adding a new argument.
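For comparison, here is a rough sketch of what the struct option could look like. `WeightedObservations`, `normal_logpdf` and `loglikelihood_weighted` are made-up names for illustration, not anything that exists in AbstractPPL, DynamicPPL or StatsBase, and the unit-variance normal likelihood is only a stand-in for a real model.

```julia
# Hypothetical wrapper type; not part of AbstractPPL, DynamicPPL or StatsBase.
struct WeightedObservations{T,W<:AbstractVector{<:Real}}
    values::T
    weights::W
end

# Log-density of a unit-variance normal, written out to keep the sketch dependency-free.
normal_logpdf(y, μ) = -0.5 * (log(2π) + (y - μ)^2)

# Weighted log-likelihood: each observation's term is scaled by its weight.
function loglikelihood_weighted(μ, obs::WeightedObservations)
    return sum(w * normal_logpdf(y, μ) for (y, w) in zip(obs.values, obs.weights))
end

# Three distinct values with frequency weights [2, 1, 3] ...
obs = WeightedObservations([0.5, 1.0, 1.5], [2, 1, 3])

# ... give the same log-likelihood as the six expanded observations.
expanded = [0.5, 0.5, 1.0, 1.5, 1.5, 1.5]
@assert loglikelihood_weighted(0.0, obs) ≈ sum(normal_logpdf.(expanded, 0.0))
```

With something like this, the call would stay `condition(model, observations)`, and only the type of `observations` would change.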
Wouldn't this functionality be a generalization of the `MiniBatchContext`? With a vector instead of `loglike_scalar`, and a names field for the weighted variables?
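Written out, the generalization would look roughly like this, assuming `loglike_scalar` multiplies the summed log-likelihood of the batch $\mathcal{B}$ by a single factor $s$ (as far as I understand, that is what `MiniBatchContext` does), and writing $w_i$ for a hypothetical per-observation weight:

$$
\log p(\theta) + s \sum_{i \in \mathcal{B}} \log p(y_i \mid \theta)
\quad\longrightarrow\quad
\log p(\theta) + \sum_{i} w_i \log p(y_i \mid \theta)
$$

Setting $w_i = s$ for every observation in the batch and $w_i = 0$ otherwise recovers the scalar case. With frequency weights, $w_i$ would be the number of times $y_i$ was observed.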
It’s similar, although I’m not sure it would need a names field? The weights are on observations, rather than variables. Mostly I want something I can use with AbstractPPL instead of just DynamicPPL.
> It’s similar, although I’m not sure it would need a names field? The weights are on observations, rather than variables.
Oh, I have no idea, that was just spontaneous generalization. It wouldn't hurt to implement the generalized case I guess.
> Mostly I want something I can use with AbstractPPL instead of just DynamicPPL.
True, that's critical. I really have no better idea, but my hesitation against an extra argument is: it becomes part of the interface and thus privileged. How far should we go with other special cases?
The alternative idea of using a special type sounds more generalizable to me. I have previously been thinking of allowing other extensible expressions, along the lines of the speculative "probability expressions" used in the design document, something like
condition(m, @P(do(X = ...)))
where the model is responsible for accepting such things or not. A weighting scheme, like
condition(m, @P(weighted(X = ..., weights)))
fits into that scheme.
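To illustrate the "model decides what it accepts" part, here is a minimal sketch using plain dispatch instead of a macro. Every name in it (`ProbExpr`, `Weighted`, `Intervention`, `ToyModel`, `condition_sketch`) is hypothetical and only there for illustration; none of it is AbstractPPL API, and `@P` itself is not implemented here.

```julia
# All names below are hypothetical; none of them exist in AbstractPPL.
abstract type ProbExpr end

# Stands in for weighted(X = ..., weights).
struct Weighted{V<:NamedTuple,W} <: ProbExpr
    values::V
    weights::W
end

# Stands in for do(X = ...).
struct Intervention{V<:NamedTuple} <: ProbExpr
    values::V
end

struct ToyModel end

struct ConditionedToyModel{E<:ProbExpr}
    model::ToyModel
    expr::E
end

# Fallback: a model rejects every expression it has not opted into.
function condition_sketch(model, expr::ProbExpr)
    throw(ArgumentError("$(typeof(model)) does not support $(typeof(expr))"))
end

# ToyModel opts into weighted observations only.
condition_sketch(model::ToyModel, expr::Weighted) = ConditionedToyModel(model, expr)

m = ToyModel()
cm = condition_sketch(m, Weighted((X = [1.0, 2.0],), [3, 1]))
```

Calling `condition_sketch(m, Intervention((X = 1.0,)))` hits the fallback and throws, which is the "or not" branch: supporting a new kind of expression means adding a method, not a new argument to the interface.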
Just to add my thoughts to the discussion:
- Adding weights as a keyword is a no-no IMO, for the same reasons as @phipsgabler mentioned. Weighting the observations (or the computation of any random variable) should be a separate thing as it's more general than just the use-cases it would have in `condition`.
- Should it not be an action on the `Model`, e.g. `weight(model, syms...)` or something along those lines? In DPPL we would just implement this in a similar manner as `condition`, i.e. using contexts under the hood but a nicer user-facing function.