
Mediation analysis with GLMs

Open rokapre opened this issue 3 years ago • 1 comments

Hi,

I am new to DoWhy and am trying to run a two-stage mediation analysis in which the second stage is a logistic regression, but I am having trouble initializing the model in the first place. The example at https://microsoft.github.io/dowhy/example_notebooks/dowhy_mediation_analysis.html covers linear regression only, and I get many errors when I try to use a GLM instead. I tried the following syntax:

causal_estimate_nie = model.estimate_effect(
    identified_estimand_nie,
    method_name="mediation.two_stage_regression",
    confidence_intervals=False,
    test_significance=False,
    method_params={
        'first_stage_model': dowhy.causal_estimators.linear_regression_estimator.LinearRegressionEstimator,
        'second_stage_model': dowhy.causal_estimators.generalized_linear_model_estimator.GeneralizedLinearModelEstimator,
        'glm_family': sm.families.Binomial
    })

The error I got is "TypeError: starting_mu() missing 1 required positional argument: 'y'"

For reference, I am using the "VitD" dataset from the R package "ivtools" (I saved it as a CSV to load into Python). The code before this is:


import pandas as pd
import statsmodels.api as sm
import dowhy
from dowhy import CausalModel

vitD = pd.read_csv("IV_VitDdata.csv")
vitD["vitd_std"] = (vitD["vitd"] - 20)/20

G = """graph[directed 1 node [id "filaggrin" label "filaggrin"]
                node[id "age" label "age"]
                node[id "vitd_std" label "vitd_std"]
                node[id "death" label "death"]
                edge[source "filaggrin" target "vitd_std"]
                edge[source "age" target "vitd_std"]
                edge[source "age" target "death"]
                edge[source "vitd_std" target "death"]]"""

model = CausalModel(data=vitD,treatment="filaggrin",outcome="death",
                    graph=G)

identified_estimand_nie = model.identify_effect(estimand_type="nonparametric-nie",
                                            proceed_when_unidentifiable=True)
print(identified_estimand_nie)

rokapre avatar Jul 11 '21 21:07 rokapre

@rokapre The mediation functionality is experimental and currently only supports linear regression, as mentioned in the notebook.

Let me try to add support for logistic regression. Thanks for raising this issue @rokapre
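
Separately, the specific TypeError quoted in the question is the message Python typically raises when an instance method is called on a class rather than on an instance, which is what would happen if 'glm_family' receives `sm.families.Binomial` instead of `sm.families.Binomial()`. The sketch below reproduces that mechanism with a hypothetical stand-in class (`Family` and `fit` are illustrative names, not statsmodels or DoWhy code), so it only illustrates the likely cause. Passing an instance would not by itself make the two-stage estimator support GLMs.

```python
class Family:
    """Hypothetical stand-in for a GLM family class (not the statsmodels one)."""

    def starting_mu(self, y):
        # Instance method: Python fills in `self` only when called on an instance.
        return [(v + 0.5) / 2 for v in y]


def fit(family, y):
    # Mimics a fitting routine that expects an already-constructed family object.
    return family.starting_mu(y)


y = [0, 1, 1, 0]

try:
    fit(Family, y)  # class passed by mistake: `y` is consumed as `self`
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

mu = fit(Family(), y)  # instance passed: the call succeeds
print(mu)
```

In the question's code the analogous change would be writing `sm.families.Binomial()` with parentheses, assuming the estimator forwards 'glm_family' to the underlying GLM unchanged.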

amit-sharma avatar Jul 16 '21 12:07 amit-sharma

@amit-sharma Has support for logistic regression been implemented?

olamagnusandersson avatar Oct 14 '22 12:10 olamagnusandersson

@olamagnusandersson @rokapre I've created an enhancement issue for this to make tracking easier. Please subscribe there:

  • https://github.com/py-why/dowhy/issues/688

Closing this issue in favor of the new one.

petergtz avatar Oct 14 '22 13:10 petergtz