dowhy Mediation analysis with GLMs

Open: rokapre opened this issue 3 years ago • 1 comment

Hi,

I am new to DoWhy and am trying to run a two-stage mediation analysis in which the second stage is a logistic regression, but I am having trouble initializing the model in the first place. The example here https://microsoft.github.io/dowhy/example_notebooks/dowhy_mediation_analysis.html covers linear regression only. I get lots of errors when I try to use a GLM. I tried the following syntax:

causal_estimate_nie = model.estimate_effect(
    identified_estimand_nie,
    method_name="mediation.two_stage_regression",
    confidence_intervals=False,
    test_significance=False,
    method_params={
        'first_stage_model': dowhy.causal_estimators.linear_regression_estimator.LinearRegressionEstimator,
        'second_stage_model': dowhy.causal_estimators.generalized_linear_model_estimator.GeneralizedLinearModelEstimator,
        'glm_family': sm.families.Binomial,
    },
)

The error I got is "TypeError: starting_mu() missing 1 required positional argument: 'y'"
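
A note on that message: in Python, this exact wording typically appears when an instance method is called on the class itself rather than on an instance, because the single argument gets bound to `self` and `y` is then reported as missing. The sketch below reproduces the mechanism with a hypothetical stand-in class (`BinomialLike` and `initial_mean` are illustrative names, not statsmodels or DoWhy code). That the snippet above hits the same mechanism is an assumption, not something confirmed in this post.

```python
class BinomialLike:
    """Hypothetical stand-in for a GLM family class such as sm.families.Binomial."""

    def starting_mu(self, y):
        # An ordinary instance method: it needs both `self` and `y`.
        return [(v + 0.5) / 2 for v in y]


def initial_mean(family, y):
    """Mimics an estimator that calls family.starting_mu(y) internally (assumed)."""
    return family.starting_mu(y)


y = [0, 1, 1, 0]

# Passing the class: `y` is bound to `self`, so the real `y` is reported missing.
try:
    initial_mean(BinomialLike, y)
    class_error = None
except TypeError as exc:
    class_error = str(exc)

# Passing an instance: the call goes through.
instance_result = initial_mean(BinomialLike(), y)

print(class_error)
print(instance_result)
```

If the same thing is happening here, the `'glm_family'` entry would need to be an instance, `sm.families.Binomial()`, with parentheses. Whether the two-stage estimator forwards that parameter to the second-stage model is a separate question that this sketch does not answer.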

For reference, I am using the "VitD" dataset from the R package "ivtools" (I saved it as a CSV to load into Python). The code before this is:


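import pandas as pd
import statsmodels.api as sm
import dowhy
# Import the estimator submodules explicitly so the dotted names used in method_params resolve.
import dowhy.causal_estimators.linear_regression_estimator
import dowhy.causal_estimators.generalized_linear_model_estimator
from dowhy import CausalModel

# Load the VitD data exported from R and rescale the vitamin D level.
vitD = pd.read_csv("IV_VitDdata.csv")
vitD["vitd_std"] = (vitD["vitd"] - 20) / 20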

G = """graph[directed 1 node [id "filaggrin" label "filaggrin"]
                node[id "age" label "age"]
                node[id "vitd_std" label "vitd_std"]
                node[id "death" label "death"]
                edge[source "filaggrin" target "vitd_std"]
                edge[source "age" target "vitd_std"]
                edge[source "age" target "death"]
                edge[source "vitd_std" target "death"]]"""

model = CausalModel(data=vitD, treatment="filaggrin", outcome="death",
                    graph=G)

identified_estimand_nie = model.identify_effect(estimand_type="nonparametric-nie",
                                                proceed_when_unidentifiable=True)
print(identified_estimand_nie)