
Remove `givens_dict` argument in sample posterior predictive internal function

Open ferrine opened this issue 1 year ago • 21 comments

https://github.com/pymc-devs/pymc/blob/473c9528e56fc291e323e30082c8f4295dbd6149/pymc/sampling/forward.py#L618

If I pass variables to the compile forward function via `givens_dict`, would that be a correct do-inference?

ferrine avatar Mar 20 '23 12:03 ferrine

That was the idea but we never tested it carefully, so that's why it's not announced in the docstrings.

This is more of a Discourse issue than a GitHub one though.

ricardoV94 avatar Mar 20 '23 12:03 ricardoV94

Just a note: it is currently impossible to pass this argument without modifying the code, so it is not exposed in the user API at the moment.

UPD: As far as I have tested, this modification does what is expected: the downstream variables are affected by the givens dict in the way I would expect them to be.

ferrine avatar Mar 20 '23 12:03 ferrine

Right, that's what I meant.

I think we should refactor a bit the code that:

  1. Creates a function that outputs draws from a (possibly partial) subset of inputs, possibly with modifications like givens.
  2. Calls the function with points from the InferenceData as inputs and stores the outputs back into an InferenceData

Posterior predictive tries to do too much right now (e.g., infer variables that are new or have changed, or that depend on variables that are missing from the trace...).

ricardoV94 avatar Mar 20 '23 12:03 ricardoV94

I found the implementation overwhelming as well, and I am not sure all the parts are required. Here is the snippet I used to verify that my intuition about this argument holds:

import arviz as az
import pymc as pm

with pm.Model() as m:
    b = pm.Normal("b")
    c = pm.Normal("c", b, 1, observed=1)
    d = pm.Normal("d")
    e = pm.Normal("e", c * 2 + d, 1, observed=3)
    t = pm.sample()

then you can play with the givens dict and check which variables were affected

# posterior predictive (givens_dict requires the local code modification mentioned above)
p = sample_posterior_predictive(t, m, givens_dict={
    m.named_vars['b']: 100,
    m.named_vars['d']: 10,
})
az.plot_posterior(p.posterior_predictive["e"])
az.plot_posterior(p.posterior_predictive["c"])

I tested different combinations and it behaves intuitively. Even with multiple observed variables, you can do do-inference on one of them and the effect propagates to the other variables.

ferrine avatar Mar 20 '23 13:03 ferrine

In your example no posterior information is being used, because all your variables depend on the givens_dict, so they are all sampled from the prior. You should get the same results as in:

with pm.Model() as m:
    b = 100
    c = pm.Normal("c", b, 1, observed=1)
    d = 10
    e = pm.Normal("e", c * 2 + d, 1, observed=3)
    t = pm.sample_prior_predictive()

ricardoV94 avatar Mar 20 '23 13:03 ricardoV94

You can modify the givens dict and get different results:

ferrine avatar Mar 20 '23 13:03 ferrine

Something like that

# posterior predictive
p = sample_posterior_predictive(t, m, givens_dict={
    m.named_vars['c']: 100,
    m.named_vars['d']: 10,
})
az.plot_posterior(p.posterior_predictive["e"])
az.plot_posterior(p.posterior_predictive["c"])

So it is now mixed: the intermediate observed variable is not sampled, and one of the posterior variables is not sampled.

ferrine avatar Mar 20 '23 13:03 ferrine

Anyway, I was thinking of a lower level API like:

with pm.Model() as m:
    b = pm.Normal("b")
    c = pm.Normal("c", b, 1, observed=1)
    d = pm.Normal("d")
    e = pm.Normal("e", c * 2 + d, 1, observed=3)
    t = pm.sample()

pp_fn = m.compile_fn(inputs=[c], outs=[e], givens={d: np.array(100.0)})
pm.sample_fn_from_trace(pp_fn, t)

Which pm.sample_posterior_predictive can use under the hood.

ricardoV94 avatar Mar 20 '23 13:03 ricardoV94
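The proposed `pm.sample_fn_from_trace` does not exist yet; a minimal sketch of the idea in plain NumPy (with a stand-in lambda playing the role of the compiled forward function, and `d` already fixed via givens) might look like:

```python
import numpy as np

def sample_fn_from_trace(fn, trace_points):
    """Hypothetical helper: apply a compiled function to each point of a trace.

    trace_points is a list of dicts mapping input names to values, as one
    might extract from InferenceData posterior draws.
    """
    return [fn(**point) for point in trace_points]

# Stand-in for a compiled forward function for e = c * 2 + d,
# with d replaced by 100 through givens at compile time:
pp_fn = lambda c: 2 * c + 100.0

trace_points = [{"c": c} for c in np.random.normal(size=5)]
draws = sample_fn_from_trace(pp_fn, trace_points)
```

The appeal of this split is that compilation (with or without givens) and iteration over trace points become two independent, composable steps.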

> Something like that
>
> # posterior predictive
> p = sample_posterior_predictive(t, m, givens_dict={
>     m.named_vars['c']: 100,
>     m.named_vars['d']: 10,
> })
> az.plot_posterior(p.posterior_predictive["e"])
> az.plot_posterior(p.posterior_predictive["c"])
>
> So it is now mixed, the intermediate observed is not sampled, and one of the posterior variables is not sampled

That example is still not using any posterior information. But anyway it doesn't matter for your question.

ricardoV94 avatar Mar 20 '23 13:03 ricardoV94

Just an idea, but the API name is worth considering. I'm not saying I don't like givens_dict, but perhaps it could just be givens (without the _dict), or observed, or conditioned_upon?

drbenvincent avatar Mar 21 '23 12:03 drbenvincent

What about a more mechanical thing like replace?

ricardoV94 avatar Mar 24 '23 06:03 ricardoV94

Actually, why not do? I think that would be extremely good. It's still mechanical, as you say, but it would really allow us to push the do-operator/causal-inference angle.

drbenvincent avatar Mar 24 '23 06:03 drbenvincent

I agree that do would be a nice name, even though it's not very technical, it's what people would look for and know what it does conceptually.

twiecki avatar Mar 24 '23 11:03 twiecki

I imagine do is just one of many uses you can give to this functionality. If I replace an RV by another RV it's not a do, but still a valid use of this kwarg.

The docstrings and notebooks can (and should) still refer to do of course.

ricardoV94 avatar Mar 24 '23 17:03 ricardoV94

How about some redundancy where there is a do kwarg and a givens kwarg? A user would provide values for only one of them.

drbenvincent avatar Mar 25 '23 09:03 drbenvincent

I was going to suggest the same, just with do and replace.


twiecki avatar Mar 25 '23 09:03 twiecki

Is there any consensus on how to implement this? I like the do idea.

ferrine avatar Apr 19 '23 12:04 ferrine

I propose to have both, even if it's a bit unusual.

twiecki avatar Apr 19 '23 12:04 twiecki

If we all agree that the do operation works as an intervention that removes the causal links feeding into the intervened node, then we can call this kwarg do. If instead the consensus is that the do operator should set a value to condition on (which would be equivalent to setting an observed value), then we should use a different name.

I vote that we use a single kwarg, but try to make it consistent with the definitions from the causal inference literature. The only reason I called the argument of compile_forward_sampling_function givens_dict was that pytensor.function has an argument givens that must be a list of tuples, and I didn't want to overload the argument name with different types.

lucianopaz avatar May 12 '23 13:05 lucianopaz

@drbenvincent pointed me to Pearl's "Causality", section 1.3.1, which is quite clear in saying that do removes the incoming causal links of the intervened node. With that, I agree with renaming the argument to do.

lucianopaz avatar May 12 '23 14:05 lucianopaz

Superseded by the do method. We should perhaps remove the givens_dict logic to reduce the number of moving pieces.

Related to #6876, which would make this functionality fully redundant.

ricardoV94 avatar Feb 06 '24 09:02 ricardoV94