
Continuous Treatment & Marginal CATE & Effect

Open Leo-T-Zang opened this issue 2 years ago • 1 comments

Hi EconML Team,

I used EconML's CausalForestDML with a continuous treatment. Here are a couple of questions about the APIs and the underlying theory that I hope you can clarify.

For continuous Treatment

  1. How does the model compute the marginal CATE? I understand that we could translate it to estimating a local gradient around a treatment vector conditional on observables. But given that the treatment takes continuous values, how is this done?

  2. What is the difference between const_marginal_ate(X=None) and const_marginal_effect ? Is that just for train and test sets?

  3. What is the difference or relationship between const_marginal_ate(X=None) and effect(X=None, *, T0=0, T1=1) for continuous treatments? I assume const_marginal_ate relies on the assumption that the outcome is linear in the treatment vector.

  4. What is the difference or relationship between marginal_ate(T, X=None) and effect(X=None, *, T0=0, T1=1)?

Thank you in advance!

Leo-T-Zang avatar Oct 28 '23 20:10 Leo-T-Zang

CausalForestDML computes a model which is linear in T, roughly

Y = \theta(X) T + \beta(X) + Y_0(X,W)

where theta and beta are forest-based models, and Y_0 is a nuisance function.

So the marginal CATE is just dY/dT = \theta(X). We don't need to compute a gradient because we're assuming linearity in T and the final model is fitting this directly.
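To make the linearity point concrete, here is a minimal pure-Python sketch. It does not use EconML; `theta`, `beta`, and `outcome` are hypothetical stand-ins for the fitted forest models, and the nuisance term is dropped. A numerical derivative of the outcome with respect to T comes out the same at every treatment level, which is why no local gradient has to be estimated.

```python
import math

def theta(x):
    # Hypothetical heterogeneous effect; stands in for the fitted forest.
    return 1.0 + 0.5 * x

def beta(x):
    # Hypothetical baseline term that does not depend on the treatment.
    return math.sin(x)

def outcome(x, t):
    # Noise-free version of the structural model: linear in t for a fixed x.
    return theta(x) * t + beta(x)

def finite_difference(x, t, h=1e-3):
    # Central-difference approximation to dY/dT at the point (x, t).
    return (outcome(x, t + h) - outcome(x, t - h)) / (2 * h)

x = 2.0
# The slope is evaluated at several very different treatment levels.
slopes = [finite_difference(x, t) for t in (-3.0, 0.0, 0.7, 10.0)]
print(slopes, theta(x))
```

Every entry of `slopes` agrees with `theta(x)` up to floating-point error, whatever value of `t` is used.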

For your second question, const_marginal_ate is the average marginal treatment effect over all of the Xs passed in, whereas const_marginal_effect is the per-X heterogeneous treatment effect, so the former is just the mean of the latter (but to be clear, CausalForestDML does not support X=None as a valid input to either method).
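The relationship between the two methods can be sketched without EconML. The function names below carry a `_sketch` suffix because they are illustrative stand-ins, not the library's code, and `theta` is again a hypothetical effect function for a single scalar treatment.

```python
def theta(x):
    # Hypothetical per-sample effect of a one-unit change in the treatment.
    return 1.0 + 0.5 * x

def const_marginal_effect_sketch(X):
    # One effect per row of X: the heterogeneous (per-X) quantity.
    return [theta(x) for x in X]

def const_marginal_ate_sketch(X):
    # The average of the per-row effects over the rows that were passed in.
    effects = const_marginal_effect_sketch(X)
    return sum(effects) / len(effects)

X = [0.0, 1.0, 2.0, 3.0]
per_row = const_marginal_effect_sketch(X)
average = const_marginal_ate_sketch(X)
print(per_row, average)
```

Passing a different set of rows changes the average, which is why the ATE-style method is always relative to the X you supply.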

Because the treatment effect is assumed to be linear for all of our DML methods, they implement effect in terms of const_marginal_effect (not const_marginal_ate, because again that would average over the samples instead of giving the heterogeneous effects): effect(X, T0, T1) is basically just const_marginal_effect(X) @ (T1-T0), where @ is matrix multiplication applied row by row.
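As a rough illustration of that product, the sketch below takes per-row coefficient vectors and dots each one with the change in that row's treatment vector. `effect_sketch` and the numbers are made up for the example, and a single outcome is assumed, so this is not EconML's implementation.

```python
def effect_sketch(cme, T0, T1):
    # cme[i] holds one coefficient per treatment dimension for sample i.
    # The effect for sample i is the dot product of cme[i] with (T1[i] - T0[i]).
    return [
        sum(c * (t1 - t0) for c, t0, t1 in zip(coefs, row0, row1))
        for coefs, row0, row1 in zip(cme, T0, T1)
    ]

# Two samples, two treatment dimensions (made-up coefficients).
cme = [[1.0, 2.0], [0.5, -1.0]]
T0 = [[0.0, 0.0], [1.0, 1.0]]
T1 = [[1.0, 0.0], [3.0, 2.0]]

effects = effect_sketch(cme, T0, T1)
print(effects)
```

With a single scalar treatment and the defaults T0=0, T1=1, the dot product reduces to the coefficient itself, so the effect and the constant marginal effect coincide.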

Again, because the treatment effect is always linear for these models, the marginal effect is the same for any value of T, and marginal_effect(T, X) == const_marginal_effect(X) and marginal_ate(T, X) == const_marginal_ate(X) for every T.

Hope that helps.

kbattocchi avatar Nov 13 '23 19:11 kbattocchi