transformers icon indicating copy to clipboard operation
transformers copied to clipboard

[WIP]Add TF BEiT Implementation

Open MadElf1337 opened this issue 1 year ago • 158 comments

Porting BEiT model from PyTorch to TensorFlow backend

What does this PR do?

Fixes #18085

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [x] Did you read the contributor guideline, Pull Request section?
  • [x] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • [x] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [x] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@amyeroberts @gante @LysandreJik @NielsRogge

MadElf1337 avatar Aug 10 '22 15:08 MadElf1337

@gante @amyeroberts Here's the WIP draft of BEiT!

Please tell me if I have done anything wrong, I'll make the changes right away!

Thanks!

MadElf1337 avatar Aug 10 '22 15:08 MadElf1337

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Hi @MadElf1337 - thanks for opening a PR and for adding this model! Outline looks good.

As a quick overview, I see two main things that you'll want to add (alongside docs and tests):

  • # Copied from in the TF data2vec model definition
  • TFBeitForXxx classes

Looking forward to seeing the full PR and having this model available for our TF users :)

amyeroberts avatar Aug 15 '22 13:08 amyeroberts

@amyeroberts Sure! I'll make the changes!

MadElf1337 avatar Aug 15 '22 15:08 MadElf1337

@amyeroberts @gante So I think I'm done with the model, can you just look over it once while I'll finish writing the tests?

MadElf1337 avatar Sep 04 '22 05:09 MadElf1337

@MadElf1337 From a quick glance, the model code looks fine 👍 As always, the devil is in the details, so you likely come across issues in the tests. Lets us know if you get stuck in a particular test (tip: breakpoint() + comparing to PT are your friends)

Will do an in-depth review when the tests are added.

gante avatar Sep 05 '22 08:09 gante

@MadElf1337 As discussed on the issue #18085 here for this model, we want to copy the relevant code in data2vec to modeling_tf_beit.py, then add the necessary #Copied from statements in modeling_tf_data2vec.py i.e. modeling_tf_beit.py and modeling_tf_data2vec.pyshould have the same structure and equivalent#Copied fromstatements as inmodeling_beit.pyandmodeling_data2vec.py`. Let me know if any of this isn't clear or you need any help.

amyeroberts avatar Sep 21 '22 12:09 amyeroberts

Yeah yeah it was clear, just wanted to see if the broad architecture was written correctly or not, once I complete the tests(I’m a bit stuck on the attention output test for tf), I’ll do the formatting, add the comments and then ask for a complete review

MadElf1337 avatar Sep 21 '22 14:09 MadElf1337

If you follow the same structure as the pytorch data2vec vision and beit, including the copied from statements, then almost all of the architecture considerations will be taken care of for you, and it will be easier for us as reviewers.

If you need any help with the tests, let us know and we can try and lend a hand.

amyeroberts avatar Sep 29 '22 11:09 amyeroberts

Yeah so as I said, I just am stuck on the seq_len part in the attention output for TF, since that is one thing which is present in data2vec but not in BEIT, So just need to figure out that test

MadElf1337 avatar Sep 29 '22 16:09 MadElf1337

Hey @MadElf1337 -- we've just released a guide for TF conversions, might come handy to you :D

https://huggingface.co/docs/transformers/main/en/add_tensorflow_model

gante avatar Oct 03 '22 14:10 gante

Yep thanks!

Mostly done with the tests as well, just a little hiccup that will be solved soon, else I’ll make sure to ask for help!

MadElf1337 avatar Oct 03 '22 15:10 MadElf1337

@gante @amyeroberts Terribly sorry for the delay, had to deal with some personal stuff that could not be avoided.

I think I'm done writing the tests and the model, can I get a review to see if I've missed anything/done anything wrong?

Thanks!

(Also I'll add the comments of #Copied from TFData2vec in the final commit)

MadElf1337 avatar Oct 28 '22 15:10 MadElf1337

@amyeroberts @gante

Can I get a review please?

MadElf1337 avatar Nov 11 '22 04:11 MadElf1337

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@amyeroberts Thanks for the review!

  1. As suggested I've added the comments of #Copied from...(Sorry that you had to ask twice, I thought they were just comments and did not know that it was a part of the review process)

  2. I've also added the missing code and the torch references have been changed!

MadElf1337 avatar Nov 12 '22 03:11 MadElf1337

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Hi @MadElf1337 - thanks for the updates and iterating so quickly after review.

There's still a few files that need to be added for the model to be importable and fully integrated into the library. The guidelines in the document @gante details these. Here's a recent model PR for reference. As the overall architecture looks good, this is the next step for this PR.

amyeroberts avatar Nov 16 '22 15:11 amyeroberts

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@amyeroberts @gante So I've done everything as specified in the docs(I think), can I get a review to see if I've missed anything?

MadElf1337 avatar Nov 29 '22 05:11 MadElf1337

Hey @amyeroberts @gante Can I get a review please?

MadElf1337 avatar Dec 05 '22 12:12 MadElf1337

@MadElf1337 Thanks for the update!

The next stage for this PR is getting all of the tests running - the fun part! The tests aren't running at the moment as the models can't be imported:

E   ImportError: cannot import name 'TFBeitForImageClassification' from 'transformers' (/home/circleci/transformers/src/transformers/__init__.py)

One thing I can see that needs to be added is included the beit models in import_structure in the __init__.py e.g. here.

Some of the tests that are failing e.g. check_code_quality you can fix and/or find the issues by running make fixup locally.

Finally, the # Copied from statements should be added to the data2vec vision model in modeling_tf_data2vec_vision.py and the ones in modeling_tf_beit.py removed. # Copied from transformers.models.beit.modeling_beit.TFBeitModelOutputWithPooling with Beit->Data2VecVision

amyeroberts avatar Dec 09 '22 13:12 amyeroberts

@amyeroberts Thanks for the review!

I can see that the original repo does not have the import structures in init.py, however I have added those to the init file in my dev branch, which is why it is showing a conflict for the same file

MadElf1337 avatar Dec 18 '22 08:12 MadElf1337

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jan 11 '23 15:01 github-actions[bot]

Hey, can I know what to do next to solve the merge conflict?

MadElf1337 avatar Jan 11 '23 15:01 MadElf1337

Hey @MadElf1337 -- You will have to rebase your PR with main :)

  1. Get the latest main
git checkout main
git pull
  1. Rebase
git checkout your_branch
git rebase origin/main
  1. Handle conflicts manually (i.e. keep the desired changes and remove the unwanted ones in the conflicting files, and follow the instructions that git gives you)

  2. Force-push your changes (force to avoid GitHub showing a diff of 666 files)

git push -u origin your_branch -f

gante avatar Jan 16 '23 12:01 gante

There, I think I've solved the conflict but the test errors are occurring due to errors in data2vecvision

MadElf1337 avatar Jan 17 '23 04:01 MadElf1337

@MadElf1337 Some of the failures are because the # Copied from statements point to a path that doesn't exist e.g. # Copied from transformers.models.data2vec.modeling_data2vec_vision.TFData2VecVisionEmbeddings with Data2VecVision->Beit is copying the object TFData2VecVisionEmbeddings but is referring to the pytorch modeling file transformers.models.data2vec.modeling_data2vec_vision.

Note: The copied from statement should be in the modeling_tf_data2vec_vision.py file and should copy from the beit model e.g. # Copied from transformers.models.beit.modeling_tf_beit.TFBeitEmbeddings with Beit->Data2VecVision. There shouldn't be any # Copied from comments in the BEiT modeling file modeling_tf_beit.py.

If you run make fixup locally in the repo, you'll be able to reproduce the check_copies and it will make the check_code_quality checks pass.

amyeroberts avatar Jan 17 '23 16:01 amyeroberts

Ah I see, I will fix that right away

MadElf1337 avatar Jan 17 '23 16:01 MadElf1337

So the tests run locally, but whenever I run the make fix-copies command it changes the docstring in data2vec from data2vec to beit, thus throwing the style change errors.

How do I go about fixing this?

MadElf1337 avatar Jan 17 '23 18:01 MadElf1337