[Question] How do you obtain different relevance values for each subtoken in LIME explanations?
As far as I know, the LIME library generates rationales over the vocabulary of the sentence, i.e. a single relevance value is predicted for each unique token. For example:
from lime.lime_text import LimeTextExplainer
explainer = LimeTextExplainer()  # default bow=True: one feature per unique word
exp = explainer.explain_instance("Hello hello! I am Ritwik. I am human. I am alive.", my_predict_function, num_samples=50, labels=[1])
exp.as_list(1)
Output
[('am', -0.008500862588102119),
('Hello', -0.006200060595890539),
('hello', -0.005261094759960054),
('human', 0.00344380721723037),
('Ritwik', -0.0019497717589949612),
('alive', 0.0017217607625450034),
('I', 0.0010682594124306127)]
Notice that only one relevance value is predicted for repeated words such as "I" and "am". How, then, does the LIME implementation in ferret predict different relevance values for repeated words?
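
For reference, plain lime can already produce one score per word occurrence when its bag-of-words grouping is switched off, via the bow=False option of LimeTextExplainer. I am not sure whether ferret does exactly this internally, so this is only a guess at the mechanism. A minimal runnable sketch follows; my_predict_function here is a toy stand-in for a real classifier, not part of either library:

import numpy as np
from lime.lime_text import LimeTextExplainer

# Toy stand-in for my_predict_function: returns fake two-class
# probabilities for a batch of texts, just to make the sketch runnable.
def my_predict_function(texts):
    scores = np.array([(len(t) % 7) / 7.0 for t in texts])
    return np.column_stack([1.0 - scores, scores])

# bow=False makes lime treat every word *occurrence* as a separate
# interpretable feature, so repeated words like "I" and "am" each
# receive their own relevance value.
explainer = LimeTextExplainer(bow=False)
exp = explainer.explain_instance(
    "Hello hello! I am Ritwik. I am human. I am alive.",
    my_predict_function,
    num_samples=50,
    labels=[1],
)
print(exp.as_list(1))  # repeated words now appear once per occurrence

With bow=False the perturbed samples mask individual positions rather than all copies of a word at once, which is what makes per-occurrence relevance values possible; a subword-level split would presumably allow the same thing per subtoken.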