Non-Treebased base models

Open JoshuaC3 opened this issue 3 years ago • 5 comments

Amazing stuff. From what I can tell, you use simple decision trees as your base estimator. Is it possible to use linear models, polynomial regression models or even cubic splines?

In my experience, stakeholders prefer smooth, non-stepped functions when discussing interpretability, and they seem to accept them more readily.

JoshuaC3, Nov 24 '20 16:11

Hi @JoshuaC3,

It's a good question! Right now, we're focused on tree-based models as the base estimators in EBMs. We've found that trees tend to yield the best performance in most cases, and are also easier to use "out of the box" as they naturally adapt to categorical data and are agnostic to the scale of the features.
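As a rough illustration of the scale point, here is a toy single-split "stump" in plain Python. It is only a sketch and not InterpretML's implementation: `best_split` and the data are made up for the example. It shows that rescaling a feature, or applying any strictly increasing transform, leaves the chosen partition of the rows unchanged.

```python
import math


def best_split(x, y):
    """Return the set of row indices sent left by the best squared-error split."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best_sse, best_left = math.inf, None
    for k in range(1, len(order)):
        if x[order[k - 1]] == x[order[k]]:
            continue  # cannot split between equal feature values
        left, right = order[:k], order[k:]
        sse = 0.0
        for side in (left, right):
            mean = sum(y[i] for i in side) / len(side)
            sse += sum((y[i] - mean) ** 2 for i in side)
        if sse < best_sse:
            best_sse, best_left = sse, frozenset(left)
    return best_left


# Made-up data: the target jumps between the third and fourth rows.
x = [0.1, 0.4, 0.5, 2.0, 3.5, 9.0]
y = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]

split_raw = best_split(x, y)
split_scaled = best_split([1000 * v for v in x], y)   # change of units
split_logged = best_split([math.log(v) for v in x], y)  # monotone transform

print(split_raw == split_scaled == split_logged)
```

The split only depends on the ordering of the feature values, which is why no standardisation step is needed before fitting trees. A linear or spline base learner would not have this property.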

That being said, we're currently experimenting with some parameters that seem to significantly improve the smoothness of the learned functions. We're planning on exposing them in the package after some more experimentation, and we'll update this thread once they're in.

-InterpretML Team

interpret-ml, Dec 21 '20 20:12

Thanks for that update. Having gone away and understood the algorithm behind this a little better, I can see why you would use decision trees (DTs) as your default base learner.

From what I understand, there would be nothing stopping someone* from making their own approximations of the graphs (or a single graph), if they really desired a completely smooth function (spline, poly or even linear).
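A minimal sketch of what such an approximation could look like, assuming a stepped shape function is available as bin edges plus one score per bin. The numbers and the names `bin_edges`, `bin_scores`, `step_score` and `smooth_score` are made up for illustration and are not taken from the InterpretML API. It uses a piecewise-linear interpolant through the bin midpoints rather than a true spline, which is the simplest non-stepped option.

```python
from bisect import bisect_right

# Hypothetical step-shaped term: bin edges and one score per bin (made-up numbers).
bin_edges = [0.0, 1.0, 2.0, 4.0, 8.0]
bin_scores = [-0.5, -0.1, 0.3, 0.9]


def step_score(x):
    """Original stepped lookup: constant score within each bin."""
    i = bisect_right(bin_edges, x) - 1
    i = min(max(i, 0), len(bin_scores) - 1)  # clamp to the outermost bins
    return bin_scores[i]


# Knots for the approximation: the midpoint of every bin.
midpoints = [(lo + hi) / 2 for lo, hi in zip(bin_edges[:-1], bin_edges[1:])]


def smooth_score(x):
    """Continuous piecewise-linear approximation through the bin midpoints."""
    if x <= midpoints[0]:
        return bin_scores[0]
    if x >= midpoints[-1]:
        return bin_scores[-1]
    i = bisect_right(midpoints, x) - 1
    x0, x1 = midpoints[i], midpoints[i + 1]
    y0, y1 = bin_scores[i], bin_scores[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


for x in (0.5, 1.0, 1.5, 2.0, 5.0):
    print(x, step_score(x), round(smooth_score(x), 3))
```

The approximation agrees with the stepped function at each bin midpoint and ramps linearly in between, so it will differ from the original near the bin edges. That difference is exactly the accuracy cost mentioned in the footnote.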

All that said, I think it would still be a nice feature to be able to include other base learners, even if just for academic purposes.


*Excluding the obvious: effects on model accuracy, compatibility problems, complications with calculating SHAP values, the potential need for retraining weights, etc.

JoshuaC3, Jan 20 '21 11:01

I would especially look forward to Isotonic Regression as base estimators because then I could model monotone functions with it.
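For readers unfamiliar with it, isotonic regression finds the non-decreasing sequence closest (in least squares) to the observed targets. Below is a small sketch of the standard pool-adjacent-violators algorithm in plain Python. The function name `isotonic_fit` is made up for this example, and it assumes `y` is already ordered by the feature value. It is not how any particular library implements it.

```python
def isotonic_fit(y, weights=None):
    """Pool Adjacent Violators: least-squares non-decreasing fit to y."""
    if weights is None:
        weights = [1.0] * len(y)
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for value, w in zip(y, weights):
        blocks.append([float(value), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, w2, n2 = blocks.pop()
            mean1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(mean1 * w1 + mean2 * w2) / total, total, n1 + n2])
    # Expand blocks back to one fitted value per input point.
    fitted = []
    for mean, _, n in blocks:
        fitted.extend([mean] * n)
    return fitted


print(isotonic_fit([1, 3, 2, 4]))  # the out-of-order pair (3, 2) is pooled to its mean
```

The fitted values are piecewise constant, so as a base learner this would still give a stepped shape function, just one that is guaranteed to be monotone.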

Garve, Apr 02 '21 13:04

@Garve I have been using your ExplainableMetaRegressor with isotonic regression in a piece of work where I know the variables should have a monotonic relationship with the dependent variable. So far the results have been competitive with InterpretML's EBMs, and in several cases the holdout training scores have been slightly better!

Here is a bit of a hack-job to get the mEBMs plotting global explainability: https://github.com/JoshuaC3/scikit-bonus/tree/mEBM-utils

Use as follows:

# from XXX import utils
# from interpret import show

# your mEBM code
# note: I use mEBM to distinguish interpret's EBM from your meta EBM


selector = utils.make_selector(X)
feature_names = X.columns.tolist()

mebm_global = utils.explain_global(mebm, feature_names, feature_importances, selector, name=None)

show(mebm_global)

Happy to tidy up, add tests and docs, and open a PR if desired.

All that said, I think having isotonic/monotonic/base-model options at the training level in InterpretML itself would be far more desirable! 😄

JoshuaC3, Apr 07 '21 07:04

Heya Joshua!

Awesome, will try it out later! :) And I agree, having that in InterpretML would definitely be best, especially if you can assign a different model to each feature of the dataset.

Best, Robert

Garve, Apr 07 '21 08:04