AbstractPPL.jl icon indicating copy to clipboard operation
AbstractPPL.jl copied to clipboard

`condition` method with weights

Open ParadaCarleton opened this issue 3 years ago • 6 comments

It would be nice to have a new method for condition that accepts weights from StatsBase; FrequencyWeights in particular would be very useful for things like minibatching or repeat observations.

ParadaCarleton avatar Nov 04 '21 18:11 ParadaCarleton

Maybe one should just use a special set of weighted observations, i.e., condition(model, observations) where the weights are included in observations, to keep the API consistent.

Related: https://github.com/TuringLang/DynamicPPL.jl/issues/208

devmotion avatar Nov 04 '21 18:11 devmotion

Maybe one should just use a special set of weighted observations, i.e., condition(model, observations) where the weights are included in observations, to keep the API consistent.

Related: TuringLang/DynamicPPL.jl#208

That could work, but wouldn't that require adding another struct or using something like a tuple for the observations argument? Not sure if that's any cleaner than adding a new argument.

ParadaCarleton avatar Nov 04 '21 20:11 ParadaCarleton

Wouldn't this functionality be a generalization of the MiniBatchContext? With a vector instead of loglike_scalar, and a names field for the weighted variables?

phipsgabler avatar Nov 09 '21 16:11 phipsgabler

Wouldn't this functionality be a generalization of the MiniBatchContext? With a vector instead of loglike_scalar, and a names field for the weighted variables?

It’s similar, although I’m not sure it would need a names field? The weights are on observations, rather than variables. Mostly I want something I can use with AbstractPPL instead of just DynamicPPL.

ParadaCarleton avatar Nov 09 '21 18:11 ParadaCarleton

It’s similar, although I’m not sure it would need a names field? The weights are on observations, rather than variables.

Oh, I have no idea, that was just spontaneous generalization. It wouldn't hurt to implement the generalized case I guess.

Mostly I want something I can use with AbstractPPL instead of just DynamicPPL.

True, that's critical. I really have no better idea, but my hesitation against an extra argument is: it becomes part of the interface and thus priviledged. How far should be go with other special cases?

The alternative idea of using a special type sounds more generalizable to me. I have previously been thinking of allowing other extensible expressions the speculative "probability expressions" used in the design document, something like

condition(m, @P(do(X = ...)))

where the model is responsible of accepting such things or not. A weighting scheme, like

condition(m, @P(weighted(X = ..., weights)))

fits into that scheme.

phipsgabler avatar Nov 20 '21 19:11 phipsgabler

Just to add my thoughts to the discussion:

  1. Adding weights as a keyword is a no-no IMO, for the same reasons as @phipsgabler mentioned. Weighting the observations (or the computation of any random variable) should be a separate thing as it's more general than just the use-cases it would have in condition.
  2. Should it not be an action on the Model, e.g. weight(model, syms...) or something along those lines? In DPPL we would just implement this in a similar manner as condition, i.e. using contexts under the hood but a nicer user-facing function.

torfjelde avatar Nov 21 '21 16:11 torfjelde