dowhy

Why am I getting exactly the same results for two different models? (Related to estimand/identification method)

Open zahs123 opened this issue 2 years ago • 3 comments

Hi,

I am running the below code:

causal_graph = """digraph {

}"""


import dowhy
from IPython.display import Image, display

# df_dowhy is the input DataFrame (not shown here)
#print(df_dowhy)
model = dowhy.CausalModel(data=df_dowhy,
                          graph=causal_graph.replace("\n", " "),
                          treatment="drug",
                          outcome="outcome")
model.view_model()
display(Image(filename="causal_model.png"))

# List the estimands that can be identified from the graph
identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)

# Estimate the effect on the treated (ATT) with a backdoor method
estimate = model.estimate_effect(identified_estimand,
                                 method_name="backdoor.propensity_score_stratification",
                                 target_units="att")
print(estimate)

I have left the digraph blank above, but I have tested two models: [image]

When I compute the estimate for the two models above, the results are exactly the same. It seems that only the non-grey variables matter when identifying the estimand. However, the output of 'identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)' gives me both an ### Estimand : 1 and an ### Estimand : 2.

Estimand 1 is the expectation over the 3 coloured variables, whereas estimand 2 also involves all the grey ones. My question is: why am I getting exactly the same estimate for the 2 different models above? How is the variable 'identified_estimand' being used? Is it using estimand 1 or 2? It seems to use only the first one, which includes only the 3 coloured variables.

I don't understand how these backdoor variables are being taken into account if removing them gives me the same estimate; it is exactly the same result.

zahs123 • Jul 21 '22 14:07

To remove the confounding effect, we only need the variables that cause both the treatment (drug) and the outcome. That's why b1, b2, b3 are redundant in your graph. They can be included, but they would then increase the variance of the estimate.

That's also why identify_effect shows both options, but estimation then uses Estimand 1 by default.
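
The variance point can be checked with a small simulation. This is only a sketch under stated assumptions, not DoWhy's estimator: it uses plain linear regression in place of propensity score stratification, a made-up linear data-generating process, a hypothetical confounder named w, and a single variable b1 that causes only the treatment. Both adjustment sets should recover the true effect on average, while the one that also includes b1 should be noisier.

```python
import random
import statistics


def slope(y, x):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def residualize(y, x):
    """Residuals of a simple least-squares regression of y on x (with intercept)."""
    b = slope(y, x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def simulate_once(rng, n=500, true_effect=1.0):
    # Assumed (hypothetical) data-generating process:
    w = [rng.gauss(0, 1) for _ in range(n)]   # confounder: causes drug and outcome
    b1 = [rng.gauss(0, 1) for _ in range(n)]  # causes drug only
    drug = [wi + 2 * bi + rng.gauss(0, 1) for wi, bi in zip(w, b1)]
    outcome = [true_effect * d + wi + rng.gauss(0, 1) for d, wi in zip(drug, w)]

    # Adjust for w only: partial w out of drug and outcome (Frisch-Waugh-Lovell).
    d_w, y_w = residualize(drug, w), residualize(outcome, w)
    est_minimal = slope(y_w, d_w)

    # Adjust for w and b1: additionally partial out b1 (itself residualized on w).
    b_w = residualize(b1, w)
    d_wb, y_wb = residualize(d_w, b_w), residualize(y_w, b_w)
    est_with_b1 = slope(y_wb, d_wb)
    return est_minimal, est_with_b1


rng = random.Random(0)
results = [simulate_once(rng) for _ in range(200)]
minimal = [r[0] for r in results]
with_b1 = [r[1] for r in results]

print("adjust for w:      mean=%.3f sd=%.3f"
      % (statistics.fmean(minimal), statistics.stdev(minimal)))
print("adjust for w, b1:  mean=%.3f sd=%.3f"
      % (statistics.fmean(with_b1), statistics.stdev(with_b1)))
```

Adjusting for b1 removes the part of the treatment variation that b1 explains, so less variation is left to estimate the effect from, which is why the spread grows.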

amit-sharma • Jul 23 '22 08:07

Thanks. I'm guessing that if I want to use estimand 2, this relates to one of the IV methods?

zahs123 • Jul 24 '22 17:07

Yes, in this graph you can use any of b1, b2, b3 as an instrument.
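
For intuition about what an instrument buys you, here is a minimal sketch of the simplest IV estimator (the Wald ratio) on simulated data. The data-generating process, the unobserved confounder u, and the effect size are all assumptions made up for illustration; this is not DoWhy's implementation. In DoWhy itself, the IV estimand is selected by passing an iv.* method name (for example "iv.instrumental_variable") to estimate_effect; check the documentation of your version for the exact parameters.

```python
import random
import statistics


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


rng = random.Random(0)
n = 20000
true_effect = 1.5  # assumed effect of drug on outcome

# Assumed (hypothetical) data-generating process:
u = [rng.gauss(0, 1) for _ in range(n)]   # unobserved confounder
b1 = [rng.gauss(0, 1) for _ in range(n)]  # instrument: affects outcome only via drug
drug = [2 * bi + ui + rng.gauss(0, 1) for bi, ui in zip(b1, u)]
outcome = [true_effect * d + ui + rng.gauss(0, 1) for d, ui in zip(drug, u)]

# Naive regression slope of outcome on drug: biased because u is not adjusted for.
naive_estimate = covariance(drug, outcome) / covariance(drug, drug)

# Wald / IV estimate: cov(instrument, outcome) / cov(instrument, drug).
iv_estimate = covariance(b1, outcome) / covariance(b1, drug)

print("true effect:    %.3f" % true_effect)
print("naive estimate: %.3f" % naive_estimate)
print("IV estimate:    %.3f" % iv_estimate)
```

The IV estimate uses only the variation in drug that is driven by b1, so it stays close to the true effect even though the confounder is never observed.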

amit-sharma • Jul 25 '22 04:07

Closing as the question seems to be answered.

petergtz • Oct 14 '22 13:10