transformers
transformers copied to clipboard
[WIP]Add TF BEiT Implementation
Porting BEiT model from PyTorch to TensorFlow backend
What does this PR do?
Fixes #18085
Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the contributor guideline, Pull Request section?
- [x] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [x] Did you write any new necessary tests?
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
@amyeroberts @gante @LysandreJik @NielsRogge
@gante @amyeroberts Here's the WIP draft of BEiT!
Please tell me if I have done anything wrong, I'll make the changes right away!
Thanks!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Hi @MadElf1337 - thanks for opening a PR and for adding this model! Outline looks good.
As a quick overview, I see two main things that you'll want to add (alongside docs and tests):
-
# Copied from
in the TF data2vec model definition -
TFBeitForXxx
classes
Looking forward to seeing the full PR and having this model available for our TF users :)
@amyeroberts Sure! I'll make the changes!
@amyeroberts @gante So I think I'm done with the model, can you just look over it once while I'll finish writing the tests?
@MadElf1337 From a quick glance, the model code looks fine 👍 As always, the devil is in the details, so you likely come across issues in the tests. Lets us know if you get stuck in a particular test (tip: breakpoint()
+ comparing to PT are your friends)
Will do an in-depth review when the tests are added.
@MadElf1337 As discussed on the issue #18085 here for this model, we want to copy the relevant code in data2vec to modeling_tf_beit.py
, then add the necessary #Copied from
statements in modeling_tf_data2vec.py
i.e. modeling_tf_beit.py
and modeling_tf_data2vec.pyshould have the same structure and equivalent
#Copied fromstatements as in
modeling_beit.pyand
modeling_data2vec.py`. Let me know if any of this isn't clear or you need any help.
Yeah yeah it was clear, just wanted to see if the broad architecture was written correctly or not, once I complete the tests(I’m a bit stuck on the attention output test for tf), I’ll do the formatting, add the comments and then ask for a complete review
If you follow the same structure as the pytorch data2vec vision and beit, including the copied from statements, then almost all of the architecture considerations will be taken care of for you, and it will be easier for us as reviewers.
If you need any help with the tests, let us know and we can try and lend a hand.
Yeah so as I said, I just am stuck on the seq_len part in the attention output for TF, since that is one thing which is present in data2vec but not in BEIT, So just need to figure out that test
Hey @MadElf1337 -- we've just released a guide for TF conversions, might come handy to you :D
https://huggingface.co/docs/transformers/main/en/add_tensorflow_model
Yep thanks!
Mostly done with the tests as well, just a little hiccup that will be solved soon, else I’ll make sure to ask for help!
@gante @amyeroberts Terribly sorry for the delay, had to deal with some personal stuff that could not be avoided.
I think I'm done writing the tests and the model, can I get a review to see if I've missed anything/done anything wrong?
Thanks!
(Also I'll add the comments of #Copied from TFData2vec in the final commit)
@amyeroberts @gante
Can I get a review please?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
@amyeroberts Thanks for the review!
-
As suggested I've added the comments of #Copied from...(Sorry that you had to ask twice, I thought they were just comments and did not know that it was a part of the review process)
-
I've also added the missing code and the torch references have been changed!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Hi @MadElf1337 - thanks for the updates and iterating so quickly after review.
There's still a few files that need to be added for the model to be importable and fully integrated into the library. The guidelines in the document @gante details these. Here's a recent model PR for reference. As the overall architecture looks good, this is the next step for this PR.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
@amyeroberts @gante So I've done everything as specified in the docs(I think), can I get a review to see if I've missed anything?
Hey @amyeroberts @gante Can I get a review please?
@MadElf1337 Thanks for the update!
The next stage for this PR is getting all of the tests running - the fun part! The tests aren't running at the moment as the models can't be imported:
E ImportError: cannot import name 'TFBeitForImageClassification' from 'transformers' (/home/circleci/transformers/src/transformers/__init__.py)
One thing I can see that needs to be added is included the beit models in import_structure
in the __init__.py
e.g. here.
Some of the tests that are failing e.g. check_code_quality
you can fix and/or find the issues by running make fixup
locally.
Finally, the # Copied from
statements should be added to the data2vec vision model in modeling_tf_data2vec_vision.py
and the ones in modeling_tf_beit.py
removed.
# Copied from transformers.models.beit.modeling_beit.TFBeitModelOutputWithPooling with Beit->Data2VecVision
@amyeroberts Thanks for the review!
I can see that the original repo does not have the import structures in init.py, however I have added those to the init file in my dev branch, which is why it is showing a conflict for the same file
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hey, can I know what to do next to solve the merge conflict?
Hey @MadElf1337 -- You will have to rebase your PR with main :)
- Get the latest main
git checkout main
git pull
- Rebase
git checkout your_branch
git rebase origin/main
-
Handle conflicts manually (i.e. keep the desired changes and remove the unwanted ones in the conflicting files, and follow the instructions that git gives you)
-
Force-push your changes (force to avoid GitHub showing a diff of 666 files)
git push -u origin your_branch -f
There, I think I've solved the conflict but the test errors are occurring due to errors in data2vecvision
@MadElf1337 Some of the failures are because the # Copied from
statements point to a path that doesn't exist e.g.
# Copied from transformers.models.data2vec.modeling_data2vec_vision.TFData2VecVisionEmbeddings with Data2VecVision->Beit
is copying the object TFData2VecVisionEmbeddings
but is referring to the pytorch modeling file transformers.models.data2vec.modeling_data2vec_vision
.
Note: The copied from statement should be in the modeling_tf_data2vec_vision.py
file and should copy from the beit model e.g. # Copied from transformers.models.beit.modeling_tf_beit.TFBeitEmbeddings with Beit->Data2VecVision
. There shouldn't be any # Copied from
comments in the BEiT modeling file modeling_tf_beit.py
.
If you run make fixup
locally in the repo, you'll be able to reproduce the check_copies
and it will make
the check_code_quality
checks pass.
Ah I see, I will fix that right away
So the tests run locally, but whenever I run the make fix-copies
command it changes the docstring in data2vec from data2vec to beit, thus throwing the style change errors.
How do I go about fixing this?