Add methods to PreTrainedModel to use PyTorch's BetterTransformer
As per title.
Should be merged only on the next Optimum release that will include https://github.com/huggingface/optimum/pull/676
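For context, a minimal sketch of the kind of thin wrapper on top of Optimum's BetterTransformer integration being discussed here; the function name `to_bettertransformer` and the exact usage shown are assumptions based on this thread, not necessarily the merged API:

```python
# Rough sketch of delegating the conversion to Optimum's BetterTransformer
# integration; the wrapper name here is an assumption, not the final API.
import torch
from transformers import AutoModel, AutoTokenizer


def to_bettertransformer(model):
    """Swap supported layers for their BetterTransformer equivalents via optimum."""
    from optimum.bettertransformer import BetterTransformer

    return BetterTransformer.transform(model)


if __name__ == "__main__":
    model = AutoModel.from_pretrained("bert-base-uncased")
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = to_bettertransformer(model)
    inputs = tokenizer("Hello world", return_tensors="pt")
    with torch.inference_mode():
        model(**inputs)
```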
Before submitting
Tests are still to be done.
Who can review?
@younesbelkada @sgugger
The documentation is not available anymore as the PR was closed or merged.
As a side note: since in previous optimum versions the save_pretrained and push_to_hub methods are not blocked, I propose to explicitly block them for transformed models in this PR and/or require users to use a certain version of optimum.
Yes, we should probably require the next optimum version.
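A hypothetical sketch of the guard proposed above: once a model has been converted, refuse save_pretrained / push_to_hub until the conversion is reverted. The helper name and the way the methods are wrapped are made up here purely for illustration:

```python
# Hypothetical sketch: wrap save_pretrained and push_to_hub so they raise on a
# BetterTransformer-converted model. Not the actual implementation in this PR.
import functools


def block_serialization(model):
    """Make save_pretrained and push_to_hub raise on a converted model."""

    def make_blocked(method_name):
        @functools.wraps(getattr(model, method_name))
        def blocked(*args, **kwargs):
            raise ValueError(
                f"`{method_name}` is not supported on a BetterTransformer-converted "
                "model; revert the conversion (or upgrade optimum) first."
            )

        return blocked

    for name in ("save_pretrained", "push_to_hub"):
        setattr(model, name, make_blocked(name))
    return model
```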
Should be ready @sgugger , the documentation has been extended in https://moon-ci-docs.huggingface.co/docs/transformers/pr_21259/en/perf_infer_gpu_one .
Let me know if I should add a test - in which case optimum should be added in the setup.py, I guess.
@fxmarty there should be no need to add optimum in setup.py, we can do something similar to bitsandbytes and add optimum in the Dockerfile of the Docker image that will run the slow tests: https://github.com/huggingface/transformers/blob/0db5d911fc94604f9568b4b212e005ec4600d157/docker/transformers-all-latest-gpu/Dockerfile#L52
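Concretely, the change would be along these lines (placement next to the existing bitsandbytes line and any version pinning are left open):

```dockerfile
# Sketch of the addition to docker/transformers-all-latest-gpu/Dockerfile;
# exact pinning to be decided.
RUN python3 -m pip install --no-cache-dir optimum
```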
I very much agree that we should add tests, especially to test accelerate compatibility. Happy to help you on this - let me know if you need help.
Thanks, will do!
especially to test accelerate compatibility
Isn't this already tested on Optimum side?
Isn't this already tested on Optimum side?
Yes, but the tests require a GPU and are therefore not run on any of the optimum runners on a daily basis (not sure if they are tested somewhere else) - I just ask each contributor individually to run the accelerate test locally on their GPU before merging (only in cases where I have serious doubts that the PR breaks anything related to accelerate).
Since transformers tests are run on GPU on a daily basis, we can leverage that and set up a small BetterTransformer testing suite that covers the conversion plus accelerate compatibility. This also enables us to flag anything we need to upstream to accelerate if something breaks the BT integration with accelerate.
There are tests run on a daily basis on GPU in Optimum, for example https://github.com/huggingface/optimum/blob/main/.github/workflows/test_onnxruntime_train.yml and https://github.com/huggingface/optimum/blob/main/.github/workflows/test_onnxruntime_gpu.yml
In my opinion, thorough tests should be added in Optimum, not Transformers. The test I was thinking of in Transformers was only an integration one to check that there's no error.
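A rough sketch of the kind of smoke test meant here; the checkpoint, the method name `to_bettertransformer` and where the test would live are assumptions for illustration only:

```python
# Sketch of a minimal integration test: convert a tiny model and run a forward
# pass, just checking that no error is raised. Names are illustrative.
import unittest

import torch
from transformers import AutoModel, AutoTokenizer
from transformers.testing_utils import require_torch_gpu, slow


class BetterTransformerIntegrationTest(unittest.TestCase):
    @slow
    @require_torch_gpu
    def test_conversion_runs_without_error(self):
        checkpoint = "hf-internal-testing/tiny-random-bert"
        model = AutoModel.from_pretrained(checkpoint).to("cuda")
        tokenizer = AutoTokenizer.from_pretrained(checkpoint)
        model = model.to_bettertransformer()  # assumed name of the new method
        inputs = tokenizer("Hello world", return_tensors="pt").to("cuda")
        with torch.inference_mode():
            model(**inputs)
```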
There is an issue with accelerate-loaded models and the BT transform; let's wait until this gets fixed before merging this PR.
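For reference, the combination that currently fails is roughly the following (the checkpoint is just an example and the exact failure mode may differ):

```python
# Rough illustration of the problematic path: a model dispatched with accelerate
# (device_map="auto") followed by the BetterTransformer conversion.
from optimum.bettertransformer import BetterTransformer
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto")
model = BetterTransformer.transform(model)
```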
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
not stale
If you want this PR included in the next release, you should finish the work and have it merged sooner rather than later :-) The last I saw was Younes saying we should wait for a fix - was that fix added? This also needs a rebase on main since it has been a while.
Thanks for the headsup!
Indeed, we are working on fixing some bugs on the optimum side that were introduced by one of my PRs (the revert-transform PR) before adding the invert_transform method.
We could maybe merge this PR by keeping only the transform method and blocking the save_pretrained & push_to_hub methods after transforming the model.
you should finish the work and have it merged sooner rather than later :-)
There is substantial work left in Optimum before this should be merged. Marking as draft for now!
OK, so this won't be in the next release of Transformers (probably this week in preparation for PyTorch 2.0).
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hey @fxmarty and @younesbelkada, are there standing PRs in optimum that need to be merged for this to proceed/anything we can help with to have this move forward? Thanks :)
Hey @LysandreJik @sgugger
@fxmarty recently managed to fix all issues related to the decoder-based models integration in optimum! I believe this PR could be re-opened; in my understanding we just need to add a few tests and we should be good to go.
@sgugger @LysandreJik this is now ready for review!