
Check how explain_weights works on regression problems

lopuhin opened this issue on Apr 27 '17 · 3 comments

More of a note to self - I'll either expand this into something more reproducible or close the issue:

  • explain_weights for the xgboost regressor does not have a bias - is it possible to recover it, or some other indication of how good the features are vs. just predicting the mean? Maybe this applies to classification problems as well?
  • explain_weights for Lasso looks suspicious on the diabetes + leak dataset - the leak does not stand out among the other features, but it does for xgboost (need to check the model first).
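One reason the leak may not stand out for Lasso is that explain_weights for linear models reports raw coefficients, which depend on feature scale. A minimal sketch of the second bullet (assuming scikit-learn; the noisy leak column and the alpha value are my own construction, not the dataset from the linked experiment) shows that after standardizing the inputs, the leak does dominate the coefficients:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
rng = np.random.default_rng(0)

# hypothetical "leak" column: a noisy copy of the target
leak = (y + rng.normal(scale=1.0, size=y.shape)).reshape(-1, 1)
X_leak = StandardScaler().fit_transform(np.hstack([X, leak]))

model = Lasso(alpha=0.1, max_iter=10_000).fit(X_leak, y)

# with standardized inputs the coefficient magnitudes are comparable
# across features, and the leak (column index 10) should dominate
print(np.argmax(np.abs(model.coef_)))
```

If the coefficients are inspected without such scaling, a leaked feature measured in small units can look unremarkable next to large-scale features, which would match the suspicious behaviour described above.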

lopuhin avatar Apr 27 '17 21:04 lopuhin

Hm, I'm not sure bias makes sense for explain_weights + xgboost regressor: the regressor predicts values regardless of the mean, and GBMs can handle shifts in the data without any special handling, so there is no need to account for bias explicitly. I haven't seen feature importances include a "bias" term for decision trees or ensembles.
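To illustrate why the bias needs no explicit entry, here is a sketch using scikit-learn's GradientBoostingRegressor as a stand-in for xgboost (an assumption on my part; xgboost keeps an analogous `base_score`): the constant "bias" is held by the initial estimator, which simply predicts the target mean, while the trees only model deviations from it, so `feature_importances_` has one entry per feature and nothing for bias:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor

X, y = load_diabetes(return_X_y=True)
gbm = GradientBoostingRegressor(n_estimators=50, random_state=0).fit(X, y)

# the "bias" lives in the fitted initial estimator, which predicts
# the target mean; the trees only fit residuals from this baseline
baseline = gbm.init_.predict(X[:1])[0]
print(baseline, y.mean())

# importances cover the 10 features only - no bias entry
print(gbm.feature_importances_.shape)
```

Because a constant shift of the target only moves this baseline, the trees (and hence the importances) are unaffected by it.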

But maybe I'm wrong and it is possible to introduce some notion of bias which makes sense. For example, in LightGBM the first iteration is a synthetic tree which always predicts the bias; as I understand it, this is not required, but it helps with convergence in practice. So maybe the way to look at it is to compare the first tree with the following trees in the ensemble, or to check several of the low-iteration trees; this is a more general approach which is not specific to bias. I haven't tried it, but it may work.
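The "compare early iterations against the bias-only model" idea can be sketched without LightGBM, using scikit-learn's `staged_predict` (my substitution, not the API discussed above). The ratio below is one possible answer to the original question of "how good the features are vs. just predicting the mean":

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

X, y = load_diabetes(return_X_y=True)
gbm = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(X, y)

# training MSE of always predicting the mean (the "bias-only" model) ...
mse_bias = mean_squared_error(y, np.full_like(y, y.mean()))

# ... versus training MSE after each boosting iteration
mse_staged = [mean_squared_error(y, p) for p in gbm.staged_predict(X)]

# fraction of the bias-only error the ensemble has removed:
# a rough indication of how much the features add over the mean
print(1 - mse_staged[-1] / mse_bias)
```

Inspecting the first few entries of `mse_staged` similarly shows how quickly the low-iteration trees move away from the bias-only prediction.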

kmike avatar May 03 '17 13:05 kmike

@lopuhin regarding BIAS: I wanted to comment here but then noticed the issue is closed. Reading the blog post you linked in that comment, my understanding is that BIAS is the starting point, and every shift from it along the path is how the feature contribution is defined for each prediction.

If BIAS ends up being the most "relevant" feature in the explanation, doesn't that mean the shift from the mean along the path taken is minimal, and none of the features plays a key role? In other words, no feature is a discriminator for this prediction - it's just the expected value?
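For context, the decomposition being described can be sketched for a single scikit-learn regression tree (a treeinterpreter-style walk; the variable names are mine, and this is only an illustration of the idea, not eli5's implementation): the prediction equals the root value (BIAS, i.e. the training mean) plus the value shift attributed to each feature split along the decision path. If those shifts roughly cancel out, BIAS dominates and the prediction sits near the expected value.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

t = tree.tree_
x = X[0]  # explain the prediction for the first sample

# walk the decision path: start from the root's value (the BIAS term)
# and attribute each change in node value to the feature split on
node = 0
value = t.value[0][0][0]
bias = value
contributions = np.zeros(X.shape[1])
while t.children_left[node] != -1:  # while not at a leaf
    feat = t.feature[node]
    if x[feat] <= t.threshold[node]:
        node = t.children_left[node]
    else:
        node = t.children_right[node]
    contributions[feat] += t.value[node][0][0] - value
    value = t.value[node][0][0]

# BIAS + per-feature contributions reconstructs the prediction
print(bias + contributions.sum(), tree.predict(X[:1])[0])
```

Under this decomposition, a prediction "explained" mostly by BIAS is exactly the case where the path's contributions are small relative to the starting mean.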

alzmcr avatar Mar 05 '19 16:03 alzmcr

@alzmcr yes, I think your interpretation is fair 👍

lopuhin avatar Mar 06 '19 07:03 lopuhin