transformers icon indicating copy to clipboard operation
transformers copied to clipboard

PEGASUS-X

Open zphang opened this issue 3 years ago • 1 comments

What does this PR do?

Adds PEGASUS-X implementation.

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [X] Did you read the contributor guideline, Pull Request section?
  • [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • [X] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [X] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@patrickvonplaten, @patil-suraj


Note: The models are currently hosted on https://huggingface.co/zphang but should be transferred to the Google organization shortly.

zphang avatar Aug 10 '22 00:08 zphang

The documentation is not available anymore as the PR was closed or merged.

cc @ArthurZucker

LysandreJik avatar Aug 15 '22 20:08 LysandreJik

On it 🤗

ArthurZucker avatar Aug 15 '22 20:08 ArthurZucker

I can follow up on the rest of the feedback this weekend / early next week: most of it looks manageable.

One comment on DimensionInfo: I use it to capture all of the shape-related attributes I need for the various reshapes/padding: it felt cleaner/more manageable for me to keep it all in one data structure than to pass them around individually. I can expand the attributes to the full readable names as you mention above, but I think it's useful to keep the dataclass. Let me know what you think: I'm fine either way.

zphang avatar Aug 18 '22 22:08 zphang

I can follow up on the rest of the feedback this weekend / early next week: most of it looks manageable.

One comment on DimensionInfo: I use it to capture all of the shape-related attributes I need for the various reshapes/padding: it felt cleaner/more manageable for me to keep it all in one data structure than to pass them around individually. I can expand the attributes to the full readable names as you mention above, but I think it's useful to keep the dataclass. Let me know what you think: I'm fine either way.

Thanks for the quick comment! For me it's mostly the single uppercase letters that I would like to change. Ok for me to keep the class, even if we haven't done it before for models like LongT5, Longformer or BigBird. Think overall I'd prefer to not have the class at all, but ok for me to leave it if you feel stongly about it @zphang :-) Just it'd be super nice to write out the single upper-case letters

patrickvonplaten avatar Aug 19 '22 09:08 patrickvonplaten

Let me know if there is anything else I need to address!

zphang avatar Aug 30 '22 17:08 zphang

Let me ping Peter Liu on this. He should be able to pull and push to the Google org. I will update the paths in the PR when it is ready.

zphang avatar Aug 31 '22 16:08 zphang

Thanks for making the change! Test failures seem unrelated :-) Merging!

patrickvonplaten avatar Sep 02 '22 17:09 patrickvonplaten

Hi @zphang Thank you for adding this model! We have a few failing tests for this model, which could be found on this CI job run page. You can click [View raw logs] on the icon at the top-right corner.

  • One issue is the missing checkpoint pegasus-x-base:

    pegasus-x-base is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
    

    Do you know where is the correct checkpoint?

  • Another test failure is test_seq_to_seq_generation, where the model outputs

    PEGASUSX PEGASUS PEGASUS PEGASUS-X PEGASUS PEGASUS-X PEGASUS PEGASUS-X PEGASUS PEGASUS PEGASUS
    

    Could you check if you get the expected values we investigate the performance on your side, and/or (if possible) why this non-sense output occurs?

  • For the remaining failure test_torchscript_output_attentions, we will fix it on our side.

Thank you in advance!

ydshieh avatar Sep 05 '22 08:09 ydshieh

Here the PR to correct the naming: https://github.com/huggingface/transformers/pull/18896/files

patrickvonplaten avatar Sep 05 '22 09:09 patrickvonplaten

Fix in #19025

ydshieh avatar Sep 14 '22 11:09 ydshieh

Thanks for sharing this model @zphang! Do you intend to release the fine-tuned checkpoints? (pubmed-large, arxiv-large, govreport-large, etc)?

fonsc avatar Oct 22 '22 19:10 fonsc

The FLAX weights of the fine-tuned models can be found here https://github.com/google-research/pegasus/tree/main/pegasus/flax#checkpoints

And the FLAX to HF conversion script can be found here https://github.com/google-research/pegasus/blob/main/pegasus/flax/checkpoint_conversion/convert_from_flax_to_hf.py

I'll try to convert the models over and upload them to HF hub this week.

zphang avatar Oct 31 '22 18:10 zphang