Wrong ATE estimation result, expected positive ATE got negative ATE

Open xiaogangzhu opened this issue 2 years ago • 2 comments

Hi,

I am fitting a DML model to my data. I know the ATE of my treatment is positive, but the model gives me a negative result. Why does this happen, and how can I explain it? Is there any way to fix this wrong estimate?

xiaogangzhu avatar Aug 03 '23 06:08 xiaogangzhu

It's hard to say based on the information you provided, but here are a few things to keep in mind:

  • What do the confidence intervals look like? Depending on your data, it's possible that your estimate is just very imprecise (i.e. the confidence intervals are wide), such that the point estimate has the wrong sign but the "true" value is still within the confidence intervals.
  • The quality of the estimate depends on the quality of the first stage models, particularly the treatment model. Have you chosen appropriate models given the structure of your data?
  • DML models make several assumptions about your data (e.g. that the treatment effect is linear in the treatment and that there are no unmeasured confounders). If these assumptions are not satisfied in your setting, then you shouldn't expect to get accurate results.
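To make the last point concrete, here is a minimal sketch of how a confounder can flip the sign of the raw association between T and Y. It is not EconML's implementation: it uses only the Python standard library, linear first-stage fits, and no cross-fitting, and it illustrates only the residual-on-residual idea behind DML. The data-generating process, the coefficients, and the helper names `slope` and `residualize` are all made up for illustration.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x (the intercept is handled by centering)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def residualize(w, v):
    """Residuals of v after a linear fit on w (a stand-in for a first-stage model)."""
    n = len(w)
    mean_w, mean_v = sum(w) / n, sum(v) / n
    b = slope(w, v)
    return [(vi - mean_v) - b * (wi - mean_w) for wi, vi in zip(w, v)]

rng = random.Random(0)
n = 5000
TRUE_EFFECT = 1.0  # ASSUMPTION: hypothetical positive effect of T on Y

# ASSUMPTION: a single confounder W pushes T up and Y down (made-up coefficients).
W = [rng.gauss(0, 1) for _ in range(n)]
T = [2.0 * w + rng.gauss(0, 1) for w in W]
Y = [TRUE_EFFECT * t - 5.0 * w + rng.gauss(0, 1) for t, w in zip(T, W)]

# Ignoring W (as if it were unmeasured): the slope of Y on T comes out negative.
naive = slope(T, Y)

# Partialling W out of both T and Y first recovers the positive effect.
adjusted = slope(residualize(W, T), residualize(W, Y))

print(f"naive slope of Y on T:   {naive:.3f}")
print(f"residual-on-residual:    {adjusted:.3f}")
```

In this simulation the raw relationship between T and Y is negative even though the effect of T is positive by construction, so the sign of the observed correlation alone says little about the sign of the ATE. If W were not passed to the model, the estimate would behave like the naive slope.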

If you have a concrete example you can share, then it might be possible to provide more specific guidance.

kbattocchi avatar Aug 04 '23 18:08 kbattocchi

Our goal is to find what makes our outcome Y decrease, and Y should always decrease given the treatment T. I am sorry that I cannot share the data, but in my case giving the treatment should make the outcome decrease, so Y and T are negatively correlated. However, I also have some wrong data in which Y decreases a bit when T increases. I used Gradient Boosting for the first-stage and second-stage models with LinearDML; my code is below.

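from econml.dml import LinearDML
from sklearn.ensemble import GradientBoostingRegressor

est1 = LinearDML(
        model_y=GradientBoostingRegressor(),  # first-stage model for the outcome Y
        model_t=GradientBoostingRegressor(),  # first-stage model for the treatment T
        cv=5)  # 5-fold cross-fitting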

I find that the ATE for one of my treatments t is negative: -5.658953873151606e-05, with a CI of (-6.571635269071133e-05, -4.746272477232079e-05). I don't know whether the data could make the estimate wrong, or how I can explain this result if it is actually a data issue.

xiaogangzhu avatar Aug 07 '23 08:08 xiaogangzhu
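
As a quick arithmetic check on the interval quoted above, the sketch below tests whether it contains zero and backs out the implied standard error. The post does not state the confidence level or how the interval was computed, so the 95% two-sided normal interval used here is an assumption; the variable names are illustrative only.

```python
from statistics import NormalDist

# Values quoted in the post above.
point = -5.658953873151606e-05
ci_low, ci_high = -6.571635269071133e-05, -4.746272477232079e-05

contains_zero = ci_low <= 0.0 <= ci_high
midpoint = (ci_low + ci_high) / 2
half_width = (ci_high - ci_low) / 2

# ASSUMPTION: a 95% two-sided normal-based interval (not stated in the post).
z_crit = NormalDist().inv_cdf(0.975)
implied_se = half_width / z_crit
z_score = point / implied_se

print(f"interval contains zero: {contains_zero}")
print(f"midpoint equals point estimate: {abs(midpoint - point) < 1e-12}")
print(f"implied standard error: {implied_se:.3e}")
print(f"point estimate / standard error: {z_score:.1f}")
```

Under that assumption the interval lies entirely below zero, so a wide, imprecise interval does not by itself account for the sign here. The choice of first-stage models and the modelling assumptions remain the things to check.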