inseq

Set tqdm to iterate over sentences when doing attributions for multiple sentences

Open jumelet opened this issue 1 year ago • 1 comment

Description

When computing attributions for a list of sentences, the tqdm iterator reports progress per token, which gives no insight into how far along you are in the corpus of sentences being attributed. I would suggest that, when attributing over a list of strings, tqdm iterates per sentence and drops the per-token iteration.


Commit to Help

Happy to have a go at this if you agree this could be nice.

  • [x] I'm willing to help with this feature.

jumelet, May 11 '23 15:05

Hey @jumelet, nice idea, and of course any help would be more than welcome! :smile:

Some details and comments that might help in this sense:

  • The loop happens inside the attribute method call. When show_progress=True, it supports progress logging with both tqdm (in notebooks) and rich (in the console, if pretty_progress=True).
  • attribute receives only one batch of inputs at a time, since the full set of inputs is split into batches that are passed gradually by the batched decorator around prepare_and_attribute.
  • A simple way to achieve the desired behavior is to use tqdm for the outer loop inside batched and set show_progress=False to hide the per-example bar. Note, however, that in the console case keeping the per-example rich progress might still be desirable.
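
To make the last point concrete, here is a rough sketch of the outer-loop idea. It is not the actual inseq code: iter_batches, attribute_batch and attribute_corpus are hypothetical stand-ins for the batched decorator and the per-batch attribution step, and the fallback class only exists so the snippet also runs where tqdm is not installed.

```python
from typing import Callable, Iterator, List, Sequence

try:
    from tqdm import tqdm
except ImportError:
    class tqdm:  # minimal no-op stand-in, used only if tqdm is missing
        def __init__(self, *args, **kwargs): pass
        def __enter__(self): return self
        def __exit__(self, *exc): return False
        def update(self, n=1): pass


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def attribute_batch(batch: Sequence[str], show_progress: bool = True) -> List[int]:
    """Hypothetical placeholder for the per-batch attribution step.

    It just counts whitespace-separated tokens; a real implementation
    would run the per-token loop here and honor show_progress.
    """
    return [len(sentence.split()) for sentence in batch]


def attribute_corpus(
    sentences: Sequence[str],
    batch_size: int = 2,
    attribute_fn: Callable[..., List[int]] = attribute_batch,
) -> List[int]:
    """Attribute all sentences with a single bar that advances per sentence."""
    results: List[int] = []
    with tqdm(total=len(sentences), unit="sentence") as bar:
        for batch in iter_batches(sentences, batch_size):
            # Silence the inner per-token bar; the outer bar tracks the corpus.
            results.extend(attribute_fn(batch, show_progress=False))
            bar.update(len(batch))
    return results
```

Advancing the bar by len(batch) after each batch keeps the unit at sentences even though the work is done batch by batch, so the last, possibly smaller, batch is counted correctly.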

Hope this helps, let me know how it goes!

gsarti, May 11 '23 20:05