alibi issues

AnchorText - Pytorch backend

`AnchorText` with `sampling_strategy=language_model` uses TensorFlow language models from [transformers](https://github.com/huggingface/transformers). Benchmarking showed that the TensorFlow models are at least 1.5x slower than the corresponding PyTorch implementations. Thus, for full performance,...

RobertSamoilescu

AnchorText

Priority: Medium

AnchorText - GPU sharing

`AnchorText` works with black-box models, so the black box can itself be a transformer-based model. In this case, both transformers (the one corresponding to the black-box model and the one used...

RobertSamoilescu

AnchorText

Priority: High

internal-mle

AnchorText - stopword functionality for unknown/similarity

The AnchorText language model extension supports the option of including a list of `stopwords`. This means that the words inside the list will not be perturbed. **Should we include the...

RobertSamoilescu

AnchorText

Priority: Medium
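
To make the "will not be perturbed" behaviour concrete, here is a minimal sketch of stopword-aware masking. It is not alibi's implementation: `mask_words`, `mask_token`, and `p_mask` are hypothetical names introduced for illustration, and the matching is assumed to be case-insensitive.

```python
import random


def mask_words(words, stopwords, mask_token="<mask>", p_mask=0.5, seed=0):
    """Hypothetical helper: mask each word with probability p_mask, never a stopword."""
    rng = random.Random(seed)
    blocked = {w.lower() for w in stopwords}  # assumption: case-insensitive match
    return [
        w if w.lower() in blocked or rng.random() >= p_mask else mask_token
        for w in words
    ]


sentence = ["The", "movie", "is", "great"]
# With p_mask=1.0 every word that is not a stopword gets masked.
perturbed = mask_words(sentence, stopwords=["the", "is"], p_mask=1.0)
```

The same filter could in principle be applied before any sampling strategy, which is the question this issue raises.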

[AnchorText] - Example explaining Transformers models

This would serve several purposes:
- Realistic example using a common architecture
- Performance benchmarking of the Anchor algorithm with realistic models
- Uncover issues with GPU sharing (#437)

jklaise

Priority: High

internal-mle

[IntegratedGradients] - handle non-array `forward_kwargs`

Currently `forward_kwargs` is expected to contain arrays; this handles the use case of explaining `transformer` models. However, models can have more general `forward_kwargs`, which should also be handled. See [Captum...

jklaise

Priority: Medium
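
One possible way to handle mixed `forward_kwargs` when batching is sketched below. This is an assumption about a workable approach, not the library's code: `slice_forward_kwargs` and its `n_samples` argument are hypothetical, and the rule "slice only NumPy arrays whose first dimension equals the number of samples" is a heuristic.

```python
import numpy as np


def slice_forward_kwargs(forward_kwargs, start, stop, n_samples):
    """Hypothetical helper: slice per-sample arrays, pass all other values through."""
    batch_kwargs = {}
    for name, value in forward_kwargs.items():
        is_per_sample = (
            isinstance(value, np.ndarray)
            and value.ndim > 0
            and value.shape[0] == n_samples
        )
        # Non-array values (flags, strings, None, ...) are forwarded unchanged.
        batch_kwargs[name] = value[start:stop] if is_per_sample else value
    return batch_kwargs


attention_mask = np.ones((4, 8), dtype=np.int64)
kwargs = {"attention_mask": attention_mask, "training": False}
batch = slice_forward_kwargs(kwargs, start=0, stop=2, n_samples=4)
```

The heuristic misfires if a non-batched array happens to have a first dimension equal to `n_samples`, so an explicit list of per-sample argument names may be the safer design.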

Investigate extending IntegratedGradients for text to work on raw text

Whilst `AnchorText` works directly on raw text, `IntegratedGradients` works on the token level. One reason for this is that `IntegratedGradients` is use case agnostic - tabular data, images and text...

jklaise

Type: Method extension

Type: API

Priority: High

internal-mle

AnchorText performance with spacy 2.2.3 and 2.3.0

The way the `AnchorText` code finds word similarities differs between the two major spaCy versions, with implications for runtime and the quality of the words found. The scope of this would be...

jklaise

internal-mle

Priority: Medium

Type: Performance

Avoid pandas for groupby in ALE

This comes from the observation here: https://github.com/SeldonIO/alibi/pull/152#discussion_r428624088. Having done some performance tests, going via a `pandas` dataframe to do a groupby operation can take up to 25% of the computation...

jklaise

Type: HPC
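
As an illustration of a pandas-free groupby, the sketch below computes per-bin means with `np.bincount`. It assumes the group keys are non-negative integer bin indices; `grouped_mean` and the sample data are hypothetical and not taken from the ALE code.

```python
import numpy as np


def grouped_mean(values, bin_index, n_bins):
    """Mean of `values` per integer bin; empty bins yield NaN."""
    sums = np.bincount(bin_index, weights=values, minlength=n_bins)
    counts = np.bincount(bin_index, minlength=n_bins)
    means = np.full(n_bins, np.nan)
    nonempty = counts > 0
    means[nonempty] = sums[nonempty] / counts[nonempty]
    return means


effects = np.array([0.5, 1.5, 2.0, 4.0, 6.0])
bins = np.array([0, 0, 1, 2, 2])
means = grouped_mean(effects, bins, n_bins=4)
```

Whether this recovers the overhead mentioned above would need to be confirmed by benchmarking on realistic data sizes.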

ALE for second-order effects

The current ALE implementation only calculates first-order effects; an extension to second-order effects is possible. If `n` is the number of features then: - In ALE all `n` main...

jklaise

Type: Method extension

Priority: Medium

ALE for categorical features

ALE currently supports numerical features only. An extension to categorical features is possible, but comes with serious caveats for interpretability (see https://compstat-lmu.github.io/iml_methods_limitations/ale-misc.html) so I think some more research needs to...

jklaise

Type: Research

Type: Method extension

Priority: Low
