dowhy
Couple of questions on causal_identifier.py
I am trying to understand the code base (and am also learning causal reasoning), and I have a couple of questions about causal_identifier.py:
- At around line 319, we have:

      is_identified = [self._graph.all_observed(bset["backdoor_set"]) for bset in backdoor_sets]
      if all(is_identified):
          self.logger.info("All common causes are observed. Causal effect can be identified.")

However, when calculating the backdoor sets, we might not enumerate all possible backdoor sets (e.g. if we hit the 100000 limit in the exhaustive method). Is this an issue for the `is_identified` variable? Is the claim 'All common causes are observed' legitimate in such circumstances?
- I think `identify_nie_effect` and `identify_nde_effect` are identical. Is this intentional? If so, what is the reason for it?
- This is a good point. Ideally we should remove the 100000 limit and let the user specify that value. By default, it can be `None`, and we let the algorithm run its full exhaustive search. Btw, if the current code outputs "All common causes are observed", then that claim is always correct. The issue is that, because of the 100000 limit, the search may stop before checking a valid backdoor set, and then we may incorrectly conclude that "causal effect cannot be identified" when in fact it can be.
To your point, perhaps the message needs to change too. We could say "The identify algorithm found a config where all common causes are observed" or "The identify algorithm failed to find a config where all common causes are observed".
- DoWhy currently has limited support for NDE and NIE (only simple mediators in the graph and only a linear model for estimation). In these settings, estimating NDE and NIE requires the same three sets of variables: backdoor variables, mediators, and the first-stage and second-stage mediators' confounders. So currently the two methods return the same information: whether mediators of this type can be identified in the graph. The plan is to implement a more complete identification for mediation, at which point these two functions may become different.
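
To make the first point (the 100000 limit and the log message) concrete, here is a rough sketch of the idea, not DoWhy's actual code. Everything in it is hypothetical: the function names, the `max_iterations` parameter, the toy validity rule, and the third message for the "stopped at the limit" case are assumptions for illustration only. It enumerates candidate adjustment sets with an optional cap (`None` means a full exhaustive search) and only makes a negative claim when the search actually finished.

```python
from itertools import combinations


def find_observed_adjustment_sets(candidates, observed, is_valid_backdoor,
                                  max_iterations=None):
    """Check subsets of `candidates`, stopping after `max_iterations` checks.

    Hypothetical helper, not part of DoWhy. Returns (found, exhausted):
    `found` lists the valid sets whose variables are all observed, and
    `exhausted` is True only if every subset was checked.
    """
    def all_subsets():
        for size in range(len(candidates) + 1):
            for subset in combinations(candidates, size):
                yield frozenset(subset)

    found = []
    checked = 0
    exhausted = True
    for subset in all_subsets():
        if max_iterations is not None and checked >= max_iterations:
            exhausted = False  # unchecked subsets remain
            break
        checked += 1
        if is_valid_backdoor(subset) and subset <= observed:
            found.append(subset)
    return found, exhausted


def identification_message(found, exhausted):
    """Pick a message that matches what the search actually established."""
    if found:
        return ("The identify algorithm found a config where all common "
                "causes are observed.")
    if exhausted:
        return ("The identify algorithm failed to find a config where all "
                "common causes are observed.")
    # Assumed wording for the truncated case; not from the thread above.
    return ("The identify algorithm failed to find a config where all common "
            "causes are observed, but the search stopped at the iteration "
            "limit, so the effect may still be identifiable.")


# Toy setup: a set counts as a valid backdoor set only if it contains
# both W1 and W2, and U is unobserved.
toy_candidates = ["W1", "W2", "U"]
toy_observed = {"W1", "W2"}


def toy_is_valid(subset):
    return {"W1", "W2"} <= subset


full = find_observed_adjustment_sets(toy_candidates, toy_observed, toy_is_valid)
capped = find_observed_adjustment_sets(toy_candidates, toy_observed,
                                       toy_is_valid, max_iterations=4)
print(identification_message(*full))
print(identification_message(*capped))
```

In the toy run, the uncapped search finds the set {W1, W2}, while the capped search stops after the empty set and the three singletons, so it reports that the limit was hit instead of claiming the effect cannot be identified. A positive result stays trustworthy either way, which matches the point above that "All common causes are observed" is always correct when it is printed.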