EconML
Shapley Values from CausalForestDML
I am working with the Customer Segmentation notebook found here.
I wanted to look at the Shapley values for the X and W variables in the data set and get the constant marginal effects. The code below shows how I ran the model and produced the charts. Given the model DAG assumptions mentioned in this thread, this leads to two related questions.
- If the second chart shows the effect of the W's on Y (demand), why wouldn't the chart also include the effect of income (X) and price (T)?
- Given the question above, is there a way to recover Shapley values that account for the effects of X, T and the W's together? I have looked at the Shapley values in the dicts that are produced. They look fine, but each dictionary has a different set of base values, which indicates to me that they are being computed (as the code suggests) from completely separate explanations. Is there a proper or recommended way to recover the Shapley values from a single model? My goal is to use the Shapley values to say "this W accounts for this much movement away from the base value, while X accounts for this much...".
import shap
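from econml.dml import CausalForestDML
from sklearn.ensemble import GradientBoostingRegressor
# log_Y, log_T, X and W are prepared as in the Customer Segmentation notebook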
est = CausalForestDML(model_y=GradientBoostingRegressor(), model_t=GradientBoostingRegressor())
est.tune(log_Y, log_T, X=X, W=W)
est.fit(log_Y, log_T, X=X, W=W)
shap_values_w = est.shap_values(W)
shap_values_x = est.shap_values(X)
shap.plots.beeswarm(shap_values_x['demand']['price'])
shap.plots.beeswarm(shap_values_w['demand']['price'])
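
To make the goal concrete, here is a toy sketch of the kind of single additive decomposition I am after. It does not use EconML or the shap package: `toy_demand`, the feature names and all the numbers are hypothetical, and the Shapley values are computed by brute force against one baseline point. The point is only that, within one model, the base value plus all the contributions adds up to the prediction, which is what lets each feature's share be compared.

```python
from itertools import permutations


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, relative to a single baseline point."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)      # start from the baseline point
        prev = f(current)
        for name in order:
            current[name] = x[name]   # switch this feature to its actual value
            new = f(current)
            totals[name] += new - prev  # marginal contribution in this ordering
            prev = new
    return {name: total / len(orderings) for name, total in totals.items()}


# Hypothetical toy model: price enters additively, income interacts with a control.
def toy_demand(row):
    return (10.0 - 2.0 * row["price"] + 0.5 * row["income"]
            + 0.3 * row["income"] * row["w_age"])


baseline = {"price": 1.0, "income": 2.0, "w_age": 1.0}  # made-up reference point
x = {"price": 1.5, "income": 3.0, "w_age": 2.0}         # made-up observation

phi = shapley_values(toy_demand, x, baseline)
base_value = toy_demand(baseline)
prediction = toy_demand(x)

# One shared base value, so the contributions are directly comparable.
print("base value:", base_value)
for name, value in phi.items():
    print(name, "contributes", value)
print("base + contributions:", base_value + sum(phi.values()))
print("prediction:", prediction)
```

With two separate calls, as in my code above, each result carries its own base value, so I cannot line the contributions up against each other like this.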