Example notebooks out of date

Open ascillitoe opened this issue 3 years ago • 7 comments

The recent API updates (i.e. separating models, utils.prediction, etc. into tensorflow and pytorch) mean a number of the example notebooks are now out of date. Some examples below:

  • Many examples import predict_batch with from alibi_detect.utils.prediction import predict_batch instead of from alibi_detect.utils.tensorflow.prediction import predict_batch. The arguments and usage of predict_batch also need updating, i.e. clf is now the second argument, and the absence of the return_class arg means these notebooks need reworking to convert prediction probabilities to classes (or this functionality needs to be added back in internally). E.g. see examples/cd_distillation_cifar10.ipynb.
  • Some fetch methods (such as fetch_vaegmm used in examples/od_aegmm_kddcup.ipynb) fail when attempting to load state from meta.pickle files. These files need to be updated to match the new API (or this can be fixed on the code side with sub-classes, aliases, etc.).
  • General updates to imports in notebooks needed.
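
For the missing return_class behaviour, a notebook could convert the returned probabilities to class labels itself. The snippet below is a minimal sketch of that conversion only; proba_to_class is a hypothetical helper (not part of alibi-detect), and the probs array stands in for whatever predict_batch returns.

```python
import numpy as np


def proba_to_class(probs, threshold: float = 0.5) -> np.ndarray:
    """Convert predicted probabilities to integer class labels.

    Hypothetical helper, not an alibi-detect function.
    - shape (n, k) with k > 1: one column per class, take the argmax per row.
    - shape (n,) or (n, 1): single probability per instance, threshold it.
    """
    probs = np.asarray(probs)
    if probs.ndim == 2 and probs.shape[1] > 1:
        return probs.argmax(axis=1)
    return (probs.reshape(-1) > threshold).astype(int)


# Stand-in for the probabilities a notebook would get back from predict_batch.
probs = np.array([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]])
labels = proba_to_class(probs)
```

Doing this in the notebook keeps predict_batch unchanged; adding the functionality back internally would be the alternative mentioned above.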

The above is not exhaustive. A general sweep to check all the examples is needed.
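
As a starting point for such a sweep, something like the following could list notebooks that still use the old import path. It is a rough sketch using only the standard library; find_outdated_imports is a hypothetical name, and the pattern covers only the one outdated import named above.

```python
import json
import re
from pathlib import Path

# Old import path named in this issue; extend the pattern as more are found.
OUTDATED = re.compile(r"from\s+alibi_detect\.utils\.prediction\s+import")


def find_outdated_imports(notebook_dir):
    """Return (notebook name, cell index, line) for each outdated import found."""
    hits = []
    for path in sorted(Path(notebook_dir).glob("*.ipynb")):
        nb = json.loads(path.read_text(encoding="utf-8"))
        for i, cell in enumerate(nb.get("cells", [])):
            if cell.get("cell_type") != "code":
                continue  # only code cells can contain live imports
            source = cell.get("source", "")
            # A cell's source may be stored as one string or a list of lines.
            lines = source.splitlines() if isinstance(source, str) else source
            for line in lines:
                if OUTDATED.search(line):
                    hits.append((path.name, i, line.strip()))
    return hits
```

Calling find_outdated_imports("examples") would then give a list of places to fix by hand.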

ascillitoe avatar Sep 02 '21 13:09 ascillitoe

We'll be looking into notebook testing (maybe a new issue?) as a longer-term initiative. Example of how this is done in Seldon Core:

  • https://github.com/SeldonIO/seldon-core/blob/3cfe519ae918acc33690ed9da5390093135ca035/testing/scripts/seldon_e2e_utils.py#L645 (create script from notebook)
  • https://github.com/SeldonIO/seldon-core/blob/master/notebooks/convert.tpl (conversion template, has a lot of sleep calls between cells because their examples make network requests that can time out, we don't need to do this)
  • https://github.com/SeldonIO/seldon-core/blob/master/testing/scripts/test_notebooks.py (run script)
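
The same idea (turn a notebook into a script, then run it) can be sketched with the standard library alone. This is a simplified illustration, not the Seldon Core implementation, which uses a conversion template; notebook_to_script and run_notebook are hypothetical names.

```python
import json
from pathlib import Path


def notebook_to_script(path):
    """Concatenate a notebook's code cells into one Python source string."""
    nb = json.loads(Path(path).read_text(encoding="utf-8"))
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        text = source if isinstance(source, str) else "".join(source)
        # Drop IPython magics and shell escapes, which plain Python cannot run.
        # This is a line-based heuristic and ignores multi-line edge cases.
        kept = [ln for ln in text.splitlines()
                if not ln.lstrip().startswith(("%", "!"))]
        chunks.append("\n".join(kept))
    return "\n\n".join(chunks) + "\n"


def run_notebook(path):
    """Execute the notebook's code in a fresh namespace.

    Any exception raised by a cell propagates, which fails the calling test.
    """
    namespace = {"__name__": "__main__"}
    exec(compile(notebook_to_script(path), str(path), "exec"), namespace)
    return namespace
```

A real test suite would more likely run each notebook in its own process with a timeout, so one slow or crashing notebook does not take the others down.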

jklaise avatar Sep 07 '21 09:09 jklaise

Many are fixed in this PR: #333.

The following are outstanding:

  • [x] od_if_kddcup - the saved version of the model must be using an older sklearn version which triggers a ModuleNotFoundError: No module named 'sklearn.ensemble.iforest' as it seems they shuffled around internal modules - needs updating the saved detector

  • [x] od_vae_adult.ipynb - due to load_outlier_detector=False, the notebook trains and attempts to save a detector but fails because of #335 .

  • [x] od_vae_adult.ipynb - ValueError: shape too large to be a matrix when training from scratch

  • [x] od_aegmm_kddcup.ipynb - outdated metadata files causing fetch to fail

  • [ ] od_prophet_weather.ipynb - outdated metadata files causing fetch to fail

  • [ ] ad_ae_cifar10.ipynb - cannot deserialize saved detector - needs updating

  • [x] Fix links to methods in notebooks.

The remaining list may not be exhaustive, we will have a better view once notebook tests are in place.
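
For the ModuleNotFoundError on sklearn.ensemble.iforest, one possible code-side workaround is an unpickler that redirects moved module paths. The sketch below uses the documented pickle.Unpickler.find_class hook; RenamingUnpickler and load_with_renames are hypothetical names, and the target path sklearn.ensemble._iforest is an assumption to verify against the installed scikit-learn. It only fixes the import lookup, so a detector whose attributes have also changed would still need re-saving.

```python
import io
import pickle

# Assumed mapping from old to new module path; verify against the installed
# scikit-learn before relying on it.
MODULE_RENAMES = {"sklearn.ensemble.iforest": "sklearn.ensemble._iforest"}


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that redirects lookups for modules that have been moved."""

    def __init__(self, file, renames=None):
        super().__init__(file)
        self.renames = MODULE_RENAMES if renames is None else renames

    def find_class(self, module, name):
        # Swap in the new module path when the pickled one is known to be stale.
        return super().find_class(self.renames.get(module, module), name)


def load_with_renames(data: bytes, renames=None):
    """Unpickle `data`, applying the module renames during class lookup."""
    return RenamingUnpickler(io.BytesIO(data), renames).load()
```

As with any pickle loading, this should only be used on artefacts from a trusted source.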

jklaise avatar Sep 14 '21 13:09 jklaise

To avoid future incompatibilities with older artefacts, we could have a CI workflow that, after every release, trains all detectors with the newest alibi-detect version and pushes them to the bucket, then extend the fetch_ methods to look for an appropriately versioned artefact (see #336).
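
The lookup side of this could work roughly as follows. This is only a sketch of one possible selection rule (newest artefact not newer than the installed library); the function names and the <base_url>/<version>/<name> bucket layout are hypothetical, not an existing alibi-detect convention.

```python
def parse_version(version: str) -> tuple:
    """Turn '0.7.2' into (0, 7, 2). Only plain numeric versions are handled."""
    return tuple(int(part) for part in version.split("."))


def pick_artefact_version(installed: str, available: list):
    """Return the newest artefact version that is not newer than `installed`.

    Returns None when every available artefact is newer than the library.
    """
    target = parse_version(installed)
    candidates = [v for v in available if parse_version(v) <= target]
    return max(candidates, key=parse_version) if candidates else None


def artefact_url(base_url: str, version: str, name: str) -> str:
    """Hypothetical bucket layout: <base_url>/<version>/<name>."""
    return f"{base_url.rstrip('/')}/{version}/{name}"
```

Comparing parsed tuples rather than raw strings matters once versions reach two digits, since "0.10.0" sorts before "0.9.0" as a string.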

jklaise avatar Sep 14 '21 13:09 jklaise

There also seem to be some broken links following the re-organization of the docs, e.g. the links at the top of https://docs.seldon.io/projects/alibi-detect/en/latest/examples/cd_mol.html point to the old methods pages and are broken.

jklaise avatar Oct 15 '21 14:10 jklaise

Good spot, I missed these. I guess they weren't flagged as Sphinx errors because they're absolute URLs rather than relative links to the Sphinx source. I'll grep for all absolute links and fix them.
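
A small Python equivalent of that grep might look like this. It is a sketch only; find_absolute_doc_links is a hypothetical helper, and the pattern assumes the links of interest all live under the docs.seldon.io/projects/alibi-detect/ prefix seen in the URL above.

```python
import re

# Absolute links into the hosted docs; relative Sphinx links are left alone.
ABSOLUTE_DOC_LINK = re.compile(
    r"https?://docs\.seldon\.io/projects/alibi-detect/[^\s)\"'>\]]+"
)


def find_absolute_doc_links(text: str) -> list:
    """Return every absolute docs URL in `text`, minus trailing punctuation."""
    return [link.rstrip(".,;") for link in ABSOLUTE_DOC_LINK.findall(text)]
```

Running it over each notebook's markdown cells would give the list of links to rewrite as relative ones.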

ascillitoe avatar Oct 15 '21 14:10 ascillitoe

Adding to the list of remote artefacts that need updating/fixing:

  • [ ] The attack dataset. The metadata for attack='slide' is incorrectly listed as attack='sl1bia'. This is causing the test_fetch_attack test to sometimes fail. The use of fetch_attack in ad_ae_cifar10.ipynb is unaffected since the notebook does not use the metadata.
  • [ ] Fetched models more generally. See https://github.com/SeldonIO/alibi-detect/issues/321.

ascillitoe avatar Nov 30 '21 17:11 ascillitoe

@jklaise I didn't want to cause confusion by editing your comment, but perhaps you can add these to the list of notebooks with errors:

  • [ ] od_if_kddcup.ipynb - AttributeError: 'IsolationForest' object has no attribute 'n_features_' from sklearn iforest. Is this new? I know we had an issue with an outdated module name in the saved detector, but wasn't aware of this issue.
  • [ ] cd_text_imdb.ipynb - Timeout. Don't know how long this takes, we might be able to adjust the notebook (or timeout limit?) slightly.

These were uncovered in https://github.com/SeldonIO/alibi-detect/pull/402. We should probably also have a second look at the weekly CI test results.

Also, because the two below are still failing, we probably want to add these to EXCLUDE_NOTEBOOKS until we update remote artefacts.

  • [ ] od_prophet_weather.ipynb - outdated metadata files causing fetch to fail
  • [ ] ad_ae_cifar10.ipynb - cannot deserialize saved detector - needs updating
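
The exclusion itself can be as simple as filtering the collected notebooks against a set of names. The sketch below assumes EXCLUDE_NOTEBOOKS is a set of file names; notebooks_to_test is a hypothetical helper, and the actual test script may collect notebooks differently.

```python
from pathlib import Path

# Notebooks named in this thread as failing until remote artefacts are updated.
EXCLUDE_NOTEBOOKS = {"od_prophet_weather.ipynb", "ad_ae_cifar10.ipynb"}


def notebooks_to_test(notebook_dir, exclude=EXCLUDE_NOTEBOOKS):
    """Return sorted notebook paths in `notebook_dir`, skipping excluded names."""
    return [
        p for p in sorted(Path(notebook_dir).glob("*.ipynb"))
        if p.name not in exclude
    ]
```

The resulting list could then be passed to pytest.mark.parametrize so that each remaining notebook is reported as its own test.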

ascillitoe avatar Dec 07 '21 19:12 ascillitoe