
Chapter 22 - Non-Parametric Double/Debiased ML

Open · tiantiancoder opened this issue Aug 20 '21 · 7 comments

Thank you for your tutorials! But I am confused about the Non-Parametric Double/Debiased ML method.

[image: screenshot of the Non-Parametric Double/Debiased ML loss function from Chapter 22]

From the loss function, the CATE still looks like a fixed constant for each unit X. So how does the method learn a non-linear CATE? Looking forward to your reply!

tiantiancoder commented Aug 20 '21

It doesn't! It learns a locally linear CATE. I try to explain that in the section that follows. Have you read it? If it's not clear, let me know.
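For reference, here is my reconstruction of the final-stage loss (a LaTeX sketch, not a verbatim quote from the chapter; the tildes denote the outcome and treatment residuals from the orthogonalisation step):

$$
\hat{L}_n(\tau(x)) = \frac{1}{n}\sum_{i=1}^{n}\Big(\tilde{Y}_i - \tau(X_i)\,\tilde{T}_i\Big)^2 = \frac{1}{n}\sum_{i=1}^{n}\tilde{T}_i^{2}\Big(\frac{\tilde{Y}_i}{\tilde{T}_i} - \tau(X_i)\Big)^2
$$

The right-hand side is exactly the weighted regression in the code: regress y_star = residual of Y over residual of T on the features, with the squared treatment residuals as weights. For a fixed unit, the effect is linear in the treatment residual, but tau(X) itself can be an arbitrary non-linear function of X, which is what the final ML model picks up.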

matheusfacure commented Aug 20 '21

Thank you very much for clearing up my confusion. I have read the following section. When applying Non-Parametric Double/Debiased ML to data where discount affects sales non-linearly, I want to know why X is the discount residual rather than the discount itself in the final non_param model. As far as I know, what is fed into the final-stage model in DML is the features X, not the residual.

import numpy as np
from lightgbm import LGBMRegressor
from sklearn.model_selection import cross_val_predict

debias_m = LGBMRegressor(max_depth=3)
denoise_m = LGBMRegressor(max_depth=3)

# orthogonalising step
discount_res = discount.ravel() - cross_val_predict(debias_m, np.ones(discount.shape), discount.ravel(), cv=5)
sales_res = sales.ravel() - cross_val_predict(denoise_m, np.ones(sales.shape), sales.ravel(), cv=5)

# final, non parametric causal model
non_param = LGBMRegressor(max_depth=3)
w = discount_res ** 2 
y_star = sales_res / discount_res

# here: why is X discount_res and not discount?
non_param.fit(X=discount_res.reshape(-1,1), y=y_star.ravel(), sample_weight=w.ravel());

tiantiancoder commented Aug 20 '21

You are correct: X is what goes into the final model as the features. But I can't find that piece of code. Can you point me to it? Here is what I found in the book:

model_final = LGBMRegressor(max_depth=3)
 
# create the weights
w = train_pred["price_res"] ** 2 
 
# create the transformed target
y_star = (train_pred["sales_res"] / train_pred["price_res"])
 
# use a weighted regression ML model to predict the target with the weights.
model_final.fit(X=train[X], y=y_star, sample_weight=w);
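For completeness, predictions then come from the same feature matrix; a minimal sketch, assuming train and the feature list X are defined as in the book:

cate_hat = model_final.predict(train[X])  # CATE as a function of the features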

matheusfacure commented Aug 20 '21

I found it in the section named "What Is Non-Parametric About?" of Chapter 22. Here is the link: https://matheusfacure.github.io/python-causality-handbook/22-Debiased-Orthogonal-Machine-Learning.html#what-is-non-parametric-about. It is in the third code block.

tiantiancoder commented Aug 20 '21

Oh, I see. That's a bug :) Since there are no features in this case, X should just be a constant. I'll fix it.

matheusfacure commented Aug 20 '21

It should be:

debias_m = LGBMRegressor(max_depth=3)
denoise_m = LGBMRegressor(max_depth=3)

# orthogonalising step
discount_res =  discount.ravel() - cross_val_predict(debias_m, np.ones(discount.shape), discount.ravel(), cv=5)
sales_res =  sales.ravel() - cross_val_predict(denoise_m, np.ones(sales.shape), sales.ravel(), cv=5)

# final, non parametric causal model
non_param = LGBMRegressor(max_depth=3)
w = discount_res ** 2 
y_star = sales_res / discount_res

# fit on a constant feature: with no X, the final model can only learn a constant (the ATE)
non_param.fit(X=np.ones(discount_res.shape).reshape(-1, 1), y=y_star.ravel(), sample_weight=w.ravel());
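With a constant feature, the fit can only return one number: the weighted mean of y_star, which is the ATE. A quick look at what predictions would give (a hypothetical check, reusing the variables above):

ate_hat = non_param.predict(np.ones((5, 1)))  # same constant for every row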

matheusfacure commented Aug 20 '21

Hi. Sorry to come back to this old issue, but I am having the same difficulty understanding the code. Fitting non_param on np.ones does not make much sense, does it? How am I going to run predictions with this? In the DGP the elasticity depends on the discount, so I would pass the discount as a feature when predicting, which means I should also use the discount as X when fitting. Am I missing anything?
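Something like this is what I have in mind (a hypothetical sketch, reusing the variables from the snippets above):

# fit the final model with the actual discount as the feature
non_param = LGBMRegressor(max_depth=3)
non_param.fit(X=discount.reshape(-1, 1), y=y_star.ravel(), sample_weight=w.ravel())

# then the predicted elasticity can vary with the discount level
elasticity_hat = non_param.predict(np.array([[1.0], [5.0], [10.0]]))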

Thanks for the great book,

Andrea.

andreadisimone commented Apr 08 '22