deep_autoviml

Add SHAP or other explainability function to Deep_AutoViML

[Open] govindjeevan opened this issue 3 years ago · 10 comments

govindjeevan avatar Jan 09 '22 13:01 govindjeevan

Any hints on how the explainability integration should work for the end-user?

For example, the What-If Tool could be used through TensorBoard. @AutoViML

govindjeevan avatar Jan 09 '22 15:01 govindjeevan

Hi @govindjeevan 👍 Great question! If you could look at LIME, SHAP, or even the new Explainer Dashboard and see whether any of them could be used in place of TensorBoard and the What-If Tool, it would be good to compare them. If I remember right, the What-If Tool installs somewhat differently in a Jupyter Notebook vs. a Colab vs. a Kaggle Kernel. See the documentation here. For example, in a Jupyter Notebook it is as follows:

pip install witwidget
jupyter nbextension install --py --symlink --sys-prefix witwidget
jupyter nbextension enable --py --sys-prefix witwidget

Hence it would be cumbersome to implement the What-If Tool in Deep_AutoViML, since it would require different installations for different environments. A simpler, lightweight setup that still does the job would be preferable.

You can research the above and make some suggestions here. AutoViML

AutoViML avatar Jan 12 '22 15:01 AutoViML

Hey @AutoViML

So in terms of usability post-installation, how should deep_autoviml expose these tools to the user?

Should it be encapsulated in a function similar to the fit and predict currently offered? E.g., deepauto.explain(model, test_dataset=test), where this method configures the required library objects and renders the respective explainability tooling/log output?
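Roughly, I am imagining something along these lines (a sketch only; the explain signature and its arguments are hypothetical, not an existing deep_autoviml API):

    # hypothetical entry point, sitting alongside the existing fit/predict
    def explain(model, test_dataset, explainer="lime"):
        """Configure the required library objects for the trained model
        and render the explainability tooling / log output."""
        # 1. unpack test_dataset into the format the chosen explainer expects
        # 2. build the explainer around the model's predict function
        # 3. render the output (notebook widget or log text)
        ...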

Or should it be part of the fit/predict workflow itself, e.g., as part of the log output?

govindjeevan avatar Jan 17 '22 20:01 govindjeevan

Great question @govindjeevan 👍

Should it be encapsulated in a function similar to the fit and predict currently offered?
E.g., deepauto.explain(model, test_dataset=test), where this method configures the required library objects and renders the respective explainability tooling/log output?

Yes, exactly as you have explained. The user will feed in the trained model and test data, and you need to explain the predictions. This will differ based on whether the data is (see the sketch after this list):

  1. Tabular
  2. NLP
  3. Image
  4. etc.
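A rough sketch of how that dispatch might look (purely illustrative; the function names are hypothetical):

    # hypothetical: dispatch on the data modality
    def explain(model, test_dataset, data_type="tabular"):
        if data_type == "tabular":
            return explain_tabular(model, test_dataset)   # hypothetical helper
        elif data_type == "nlp":
            return explain_text(model, test_dataset)      # hypothetical helper
        elif data_type == "image":
            return explain_image(model, test_dataset)     # hypothetical helper
        raise ValueError(f"No explainer implemented for data_type={data_type!r}")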

You can start with explaining tabular, NLP, or image data (whichever you are comfortable with). Thanks Ram

AutoViML avatar Jan 17 '22 22:01 AutoViML

A sizable fraction of the predict function in predict_model.py would be useful for the proposed .explain method. Everything up to the actual model.predict call in that function seems relevant. Does it make sense to move the data-processing code into a separate function so it can be reused, or is duplication okay?

For the library, I found LIME to meet the requirements. It supports image and tabular data and can work with PyTorch. Installation is fairly straightforward and uniform across platforms. Some of the other tools offer fancier algorithms, but LIME fits perfectly in terms of "simple but gets the job done". For NLP, I need to look at it more closely.
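For tabular data the LIME usage is short. A minimal sketch, assuming X_train/X_test are plain 2-D numpy arrays, feature_names is a list of strings, and the trained Keras model's predict() returns a single sigmoid probability column (all of these are my assumptions):

    import numpy as np
    from lime.lime_tabular import LimeTabularExplainer

    # assumed to exist: X_train, X_test, feature_names, and a trained `model`
    explainer = LimeTabularExplainer(
        X_train,
        feature_names=feature_names,
        mode="classification",
    )

    # LIME perturbs the row and calls predict_fn on a 2-D array; it expects
    # one probability per class, so stack the sigmoid output into two columns
    predict_fn = lambda x: np.hstack([1.0 - model.predict(x), model.predict(x)])

    exp = explainer.explain_instance(X_test[0], predict_fn, num_features=10)
    exp.show_in_notebook()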

govindjeevan avatar Jan 18 '22 18:01 govindjeevan

If there aren't any potential disadvantages to doing so, I was thinking of interfacing the function with both SHAP and LIME and letting the user configure which explainer deep_autoviml invokes from the backend (with a pre-configured default).
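A rough sketch of what that configuration could look like (names and defaults are my own; nothing here is an existing deep_autoviml API, and it assumes model.predict returns one probability per class):

    import shap
    from lime.lime_tabular import LimeTabularExplainer

    # hypothetical wrapper: the user picks the backend, "lime" by default
    def explain(model, X_background, X_test, explainer="lime", feature_names=None):
        if explainer == "lime":
            lime_exp = LimeTabularExplainer(
                X_background, feature_names=feature_names, mode="classification"
            )
            return lime_exp.explain_instance(X_test[0], model.predict)
        elif explainer == "shap":
            # KernelExplainer is model-agnostic: it only needs a predict
            # function and a small background sample
            shap_exp = shap.KernelExplainer(model.predict, shap.sample(X_background, 100))
            return shap_exp.shap_values(X_test)
        raise ValueError(f"Unknown explainer: {explainer!r}")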

govindjeevan avatar Jan 18 '22 18:01 govindjeevan

@AutoViML thoughts?

govindjeevan avatar Jan 21 '22 15:01 govindjeevan

Hi @govindjeevan 👍 I totally agree with this part:

I was thinking of interfacing the function with both SHAP and LIME and letting the user configure which explainer deep_autoviml invokes from the backend (with a pre-configured default).

Go for it! AutoViML

AutoViML avatar Jan 23 '22 12:01 AutoViML

The input format LIME and SHAP expect differs from that of the model constructed by deep_autoviml (a dict-based MapDataset).

I am trying to work around this. Passing the dict-based MapDataset to the LIME and SHAP functions results in an error, since they expect a dataframe-like input.

govindjeevan avatar Jan 28 '22 13:01 govindjeevan

Hi @govindjeevan 👍 I believe there are other ways to pass the data to the TF model. You don't need to use a MapDataset to make a prediction. You can use the following link to understand how to make predictions: https://stackoverflow.com/questions/53384168/keras-tensorflow-predict-using-tf-data-dataset-api
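For example, one workaround is to keep LIME/SHAP working on plain numpy arrays and rebuild the dict of named inputs only inside the predict function. A minimal sketch, assuming feature_names matches the model's input layer names in column order:

    import numpy as np

    def make_predict_fn(model, feature_names):
        """Wrap model.predict so LIME/SHAP can hand it plain 2-D numpy arrays."""
        def predict_fn(x):
            x = np.asarray(x)
            # rebuild the {input_name: column} dict the Keras model expects
            batch = {name: x[:, i] for i, name in enumerate(feature_names)}
            return model.predict(batch)
        return predict_fn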

If you still have trouble, let me know and I will set up a time to walk you through it. AutoViML

AutoViML avatar Feb 03 '22 14:02 AutoViML