Inconsistent batching for DiscretizedIntegratedGradients attributions
🐛 Bug Report
Despite the fix making batched attribution consistent with individual attribution (see #110), the `DiscretizedIntegratedGradients` method still produces different results when applied to a batch of examples than when each example is attributed individually.
🔬 How To Reproduce
- Instantiate an `AttributionModel` with the `discretized_integrated_gradients` method.
- Perform an attribution for a batch of examples.
- Perform an attribution for a single example present in the previous batch.
- Compare the attributions obtained in the two cases.
Code sample
```python
import inseq

model = inseq.load_model("Helsinki-NLP/opus-mt-en-de", "discretized_integrated_gradients")
out_multi = model.attribute(
    [
        "This aspect is very important",
        "Why does it work after the first?",
        "This thing smells",
        "Colorless green ideas sleep furiously",
    ],
    n_steps=20,
    return_convergence_delta=True,
)
out_single = model.attribute(
    ["Why does it work after the first?"],
    n_steps=20,
    return_convergence_delta=True,
)
assert out_single.attributions == out_multi[1].attributions  # raises AssertionError
```
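As a side note, exact equality checks on float attributions are brittle: even once the bug is fixed, batched and single-example scores may differ by tiny numerical amounts. The sketch below, which assumes the attribution scores can be extracted as `torch.Tensor`s (the helper name is hypothetical, not part of the inseq API), uses a tolerance-based comparison to separate a real mismatch from float noise:

```python
import torch

# Hypothetical helper (not part of the inseq API): compare the attribution
# scores of one example attributed alone vs. inside a batch, using a
# numerical tolerance instead of exact elementwise equality.
def attributions_match(single: torch.Tensor, batched: torch.Tensor,
                       atol: float = 1e-5) -> bool:
    return single.shape == batched.shape and torch.allclose(single, batched, atol=atol)
```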
Environment
- OS: Ubuntu 20.04
- Python version: 3.8
📈 Expected behavior
Same as #110: attributions for an example should be identical whether it is attributed alone or as part of a batch.
📎 Additional context
The problem is most likely due to a faulty scaling of the gradients in the `_attribute` method of the `DiscretizedIntegratedGradients` class.
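For reference, DIG scales the gradient at each interpolation point by the difference between consecutive points of that example's own discretized path. The snippet below is a minimal sketch of that scaling, not inseq's actual `_attribute` implementation; the function name and the tensor layout (each example's path steps stacked along the batch dimension) are assumptions. If the step differences were computed on the stacked tensor without first separating examples, adjacent examples' paths would bleed into each other, producing exactly this kind of batch-dependent output:

```python
import torch

def dig_scaled_grads_sum(grads: torch.Tensor, scaled_features: torch.Tensor,
                         n_steps: int) -> torch.Tensor:
    """Sketch of DIG's Riemann-style sum: grad(x_k) * (x_{k+1} - x_k).

    Assumes `grads` and `scaled_features` have shape
    (batch * n_steps, seq_len, hidden), with each example's path steps
    stored contiguously along the first dimension.
    """
    batch = grads.shape[0] // n_steps
    g = grads.view(batch, n_steps, *grads.shape[1:])
    x = scaled_features.view(batch, n_steps, *scaled_features.shape[1:])
    # Differences between consecutive points of each example's own path;
    # computing them on the stacked (batch * n_steps) tensor instead would
    # mix the last step of one example with the first step of the next.
    step_diffs = x[:, 1:] - x[:, :-1]
    return (g[:, :-1] * step_diffs).sum(dim=1)  # (batch, seq_len, hidden)
```

With this per-example reshaping, attributing the four-sentence batch above and the single sentence should yield the same scores up to float tolerance.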
Hi @soumyasanyal, FYI our library supports your Discretized IG method for feature attribution, but at the moment we are experiencing some consistency issues between single-example and batched attribution (i.e. there seems to be an issue with the creation of orthogonal approximation steps for a batch; see also #114 for additional info). It would be great if you could have a look!