PEGASUS-X
What does this PR do?
Adds PEGASUS-X implementation.
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [X] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- [X] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [X] Did you write any new necessary tests?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@patrickvonplaten, @patil-suraj
Note: The models are currently hosted on https://huggingface.co/zphang but should be transferred to the Google organization shortly.
The documentation is not available anymore as the PR was closed or merged.
cc @ArthurZucker
On it 🤗
I can follow up on the rest of the feedback this weekend / early next week: most of it looks manageable.
One comment on DimensionInfo: I use it to capture all of the shape-related attributes I need for the various reshapes/padding: it felt cleaner/more manageable for me to keep it all in one data structure than to pass them around individually. I can expand the attributes to the full readable names as you mention above, but I think it's useful to keep the dataclass. Let me know what you think: I'm fine either way.
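For illustration, here is a minimal sketch of what such a dataclass could look like with the attribute names written out in full, as suggested in the review. The attribute names and the padding helper below are hypothetical examples, not the actual fields of the `DimensionInfo` class in the PR:

```python
from dataclasses import dataclass


@dataclass
class DimensionInfo:
    """Bundles the shape-related values needed for the various reshapes/padding.

    Attribute names are spelled out in full (hypothetical examples)
    rather than single uppercase letters.
    """

    batch_size: int          # e.g. previously `B`
    sequence_length: int     # e.g. previously `T`
    hidden_dim: int          # e.g. previously `H`
    block_size: int          # local-attention block size
    padded_seq_length: int   # sequence length after padding to a block multiple


def make_dimension_info(
    batch_size: int, sequence_length: int, hidden_dim: int, block_size: int
) -> DimensionInfo:
    # Round the sequence length up to the next multiple of the block size,
    # so all shape bookkeeping lives in one place.
    padded = ((sequence_length + block_size - 1) // block_size) * block_size
    return DimensionInfo(batch_size, sequence_length, hidden_dim, block_size, padded)


print(make_dimension_info(2, 100, 64, 32).padded_seq_length)  # 128
```

The trade-off discussed above is exactly this: one frozen bundle of shapes passed to each helper, versus threading five scalar arguments through every function signature.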
Thanks for the quick comment! For me it's mostly the single uppercase letters that I would like to change. It's OK for me to keep the class, even though we haven't done that before for models like LongT5, Longformer or BigBird. Overall I'd prefer to not have the class at all, but I'm fine leaving it if you feel strongly about it @zphang :-) It would just be super nice to write out the single upper-case letters.
Let me know if there is anything else I need to address!
Let me ping Peter Liu on this. He should be able to pull and push to the Google org. I will update the paths in the PR when it is ready.
Thanks for making the change! Test failures seem unrelated :-) Merging!
Hi @zphang Thank you for adding this model! We have a few failing tests for this model, which can be found on this CI job run page. You can click [View raw logs] on the icon at the top-right corner.
- One issue is the missing checkpoint: `pegasus-x-base is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'`. Do you know where the correct checkpoint is?
- Another test failure is `test_seq_to_seq_generation`, where the model outputs `PEGASUSX PEGASUS PEGASUS PEGASUS-X PEGASUS PEGASUS-X PEGASUS PEGASUS-X PEGASUS PEGASUS PEGASUS`. Could you check if you get the expected values on your side, and/or (if possible) investigate why this nonsense output occurs?
- For the remaining failure, `test_torchscript_output_attentions`, we will fix it on our side.

Thank you in advance!
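As an aside on the first failure: the error message quoted above is what `from_pretrained` raises when the identifier is neither a local folder nor a model listed on the Hub. At the time of this PR the weights were hosted under a user namespace (see the note above about https://huggingface.co/zphang), so a bare `pegasus-x-base` without a `namespace/name` prefix could not resolve. The helper below is purely illustrative and not part of transformers; it just sketches the namespaced-identifier shape:

```python
def looks_like_namespaced_repo_id(model_id: str) -> bool:
    """Illustrative check (not a transformers API): a namespaced Hub
    identifier has the form 'namespace/name' with both parts non-empty.

    A bare 'pegasus-x-base' fails this check, which matches the CI error:
    it is neither a local folder nor a resolvable Hub model identifier.
    """
    parts = model_id.split("/")
    return len(parts) == 2 and all(parts)


print(looks_like_namespaced_repo_id("pegasus-x-base"))         # False
print(looks_like_namespaced_repo_id("google/pegasus-x-base"))  # True

# With a namespaced id, loading would look like (not executed here):
# from transformers import PegasusXForConditionalGeneration
# model = PegasusXForConditionalGeneration.from_pretrained("google/pegasus-x-base")
```

This matches the follow-up fix linked below, which corrects the checkpoint naming in the tests.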
Here the PR to correct the naming: https://github.com/huggingface/transformers/pull/18896/files
Fix in #19025
Thanks for sharing this model @zphang! Do you intend to release the fine-tuned checkpoints? (pubmed-large, arxiv-large, govreport-large, etc)?
The Flax weights of the fine-tuned models can be found here: https://github.com/google-research/pegasus/tree/main/pegasus/flax#checkpoints
And the Flax-to-HF conversion script can be found here: https://github.com/google-research/pegasus/blob/main/pegasus/flax/checkpoint_conversion/convert_from_flax_to_hf.py
I'll try to convert the models over and upload them to HF hub this week.