How to retrieve salience of some specific words?

Open CarhoJohn opened this issue 2 years ago • 1 comment

Hi. To obtain the salience map over the previous tokens when generating a new token, we can use the function shown in the example code:

output = lm.generate(prompt, generate=1, do_sample=True, attribution=['ig'])
res = output.primary_attributions(attr_method='ig')

However, with this standard method I can only get the salience map for whichever word the model happens to generate, which is sampled and not under my control.

Is it possible to obtain the salience map for a specific word? For example, given the sentence "I have a dog. He is very ...", I'd like to get the salience map for the word "cute" specifically, rather than for whatever word the model generates.

Thanks very much!

Sep 05 '23 08:09 CarhoJohn

From my understanding, this is not possible unless you do some algorithmic work yourself (some math). A salience map is computed by backpropagating from the output to the input embeddings. That process is just the chain rule, and if you break it you can get values for specific words, but unless the change is mathematically grounded, the approach fails.
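To make the chain-rule point concrete, here is a minimal sketch in plain Python. It is a toy, not Ecco's API and not integrated gradients: the model (`tanh` of the mean embedding followed by a dot product), the embeddings, and the names `forward`, `saliency_for_target`, `prompt_embs` and `output_rows` are all made up for illustration. It only shows that gradient x input can be started from the logit of any chosen token, such as "cute", instead of the sampled one; whether the resulting scores are meaningful for a real language model is a separate question.

```python
import math

def forward(embs, w_target):
    """Toy model: logit of one target token = w_target . tanh(mean(embs))."""
    n, d = len(embs), len(embs[0])
    h = [math.tanh(sum(e[k] for e in embs) / n) for k in range(d)]
    return sum(w_target[k] * h[k] for k in range(d)), h

def saliency_for_target(embs, w_target):
    """Gradient x input of the chosen target logit, one score per input token."""
    n, d = len(embs), len(embs[0])
    _, h = forward(embs, w_target)
    # Chain rule: d logit / d e_i[k] = w_target[k] * (1 - h[k]^2) / n
    grad = [w_target[k] * (1.0 - h[k] ** 2) / n for k in range(d)]
    return [sum(grad[k] * e[k] for k in range(d)) for e in embs]

# Hypothetical 2-d embeddings for a three-token prompt.
prompt_embs = [[0.5, -0.2], [0.1, 0.9], [-0.3, 0.4]]
# Hypothetical output-layer rows for two candidate next tokens.
output_rows = {"cute": [1.0, 0.5], "big": [-0.4, 0.8]}

# One salience vector per candidate token, regardless of what would be sampled.
scores = {tok: saliency_for_target(prompt_embs, w) for tok, w in output_rows.items()}
for tok, vals in scores.items():
    print(tok, [round(v, 4) for v in vals])
```

The only thing that changes between candidate words is which output row the gradient starts from; the backward pass to the embeddings is the same.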

Sep 11 '23 16:09 BiEchi