
Which anchor explainer to use for different types of input features

Open erkinaltuntas opened this issue 1 year ago • 1 comments

Hello everyone,

I have an ML (classification) model that uses different types of input features: numerical features and also text features (which are processed by doc2vec). I am unsure whether the anchor functionality works for this kind of problem. Do I have to use AnchorText or AnchorTabular?

erkinaltuntas, Sep 05 '22 10:09

Hi, for now the anchor algorithms only support tabular or text data separately but not both. We're looking into multi-modal explanation methods to support such use cases in the future.

That being said, if the input to your model is a concatenation of tabular features and text vectors, you could try AnchorTabular, but be warned that it does not scale well with the number of features (I assume the text embeddings are quite high-dimensional). Also, the output will likely not be interpretable: an anchor on a text embedding vector would not necessarily correspond to any words in natural language, because doc2vec and word2vec map variable-length phrases to fixed-length vectors, so the inverse mapping may not exist. For example, if the embedding dimension is 100 and your anchor says that the first 10 dimensions must stay fixed, how would you map "first 10 dimensions fixed, remaining 90 dimensions free to vary" back to a set of words?
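To make the last point concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `toy_embed` is a crude stand-in that averages random word vectors (it is not doc2vec), the tabular feature names and values are hypothetical, and the "anchor" is written by hand rather than produced by alibi. It only shows what the concatenated input looks like and why a rule over embedding dimensions is hard to read.

```python
import random

random.seed(0)

EMBED_DIM = 100  # embedding size from the example above


def toy_embed(words, vocab_vectors):
    """Average per-word vectors. A stand-in for doc2vec, NOT the real algorithm."""
    total = [0.0] * EMBED_DIM
    for w in words:
        for i, v in enumerate(vocab_vectors[w]):
            total[i] += v
    return [t / len(words) for t in total]


# Hypothetical vocabulary with random word vectors.
vocab = ["the", "service", "was", "good", "bad", "not"]
vocab_vectors = {w: [random.gauss(0, 1) for _ in range(EMBED_DIM)] for w in vocab}

# Hypothetical tabular features for one instance.
tabular_names = ["age", "income", "num_orders"]
tabular_values = [42.0, 55000.0, 7.0]

# Model input = tabular features concatenated with the text embedding.
text_vec = toy_embed(["the", "service", "was", "good"], vocab_vectors)
feature_names = tabular_names + [f"emb_{i}" for i in range(EMBED_DIM)]
x = tabular_values + text_vec

# A hand-written "anchor" that pins the first 10 embedding dimensions.
anchor_idx = [len(tabular_names) + i for i in range(10)]
anchor = {feature_names[i]: x[i] for i in anchor_idx}

# Loosely mimic a perturbation: keep anchored dims, resample the other 90.
# The result is a valid model input, but nothing guarantees that any
# document embeds to this vector, so it cannot be read back as words.
perturbed = list(x)
for i in range(len(tabular_names) + 10, len(x)):
    perturbed[i] = random.gauss(0, 1)

print(len(feature_names), "features; anchor over:", list(anchor)[:3], "...")
```

The anchor here is a set of conditions on `emb_0` through `emb_9`, which has no direct reading in terms of words, and the explainer would have to search over 103 features instead of 3.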

jklaise, Sep 05 '22 10:09