[Question] How do you obtain different relevance values for each subtoken in LIME explanations?
As far as I know, the LIME library generates rationales over the vocabulary of the sentence, i.e. a single relevance value is predicted for each unique token. For example:
from lime.lime_text import LimeTextExplainer
explainer = LimeTextExplainer()  # default bow=True: one feature per unique word
exp = explainer.explain_instance("Hello hello! I am Ritwik. I am human. I am alive.", my_predict_function, num_samples=50, labels=[1])
exp.as_list(1)
Output
[('am', -0.008500862588102119),
('Hello', -0.006200060595890539),
('hello', -0.005261094759960054),
('human', 0.00344380721723037),
('Ritwik', -0.0019497717589949612),
('alive', 0.0017217607625450034),
('I', 0.0010682594124306127)]
Notice that only one relevance value is predicted for repeated words such as "I" and "am". How, then, does the LIME implementation in ferret predict different relevance values for repeated words?
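
For reference, plain lime can already produce one score per word occurrence when its bag-of-words grouping is switched off, via the bow=False option of LimeTextExplainer. I am not sure whether ferret does exactly this internally, so this is only a guess at the mechanism. A minimal runnable sketch follows; my_predict_function here is a toy stand-in for a real classifier, not part of either library:

import numpy as np
from lime.lime_text import LimeTextExplainer

# Toy stand-in for my_predict_function: returns fake two-class
# probabilities for a batch of texts, just to make the sketch runnable.
def my_predict_function(texts):
    scores = np.array([(len(t) % 7) / 7.0 for t in texts])
    return np.column_stack([1.0 - scores, scores])

# bow=False makes lime treat every word *occurrence* as a separate
# interpretable feature, so repeated words like "I" and "am" each
# receive their own relevance value.
explainer = LimeTextExplainer(bow=False)
exp = explainer.explain_instance(
    "Hello hello! I am Ritwik. I am human. I am alive.",
    my_predict_function,
    num_samples=50,
    labels=[1],
)
print(exp.as_list(1))  # repeated words now appear once per occurrence

With bow=False the perturbed samples mask individual positions rather than all copies of a word at once, which is what makes per-occurrence relevance values possible; a subword-level split would presumably allow the same thing per subtoken.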