transformers
Test summary with previous PyTorch/TensorFlow versions
Initiated by @LysandreJik, we ran the tests against previous PyTorch/TensorFlow versions. The goal is to determine whether we should drop (some of the) earlier PyTorch/TensorFlow versions.
- This is not exactly the same as the scheduled daily CI (`torch-scatter` and `accelerate` are not installed, etc.)
- Currently we only have the global summary (i.e. there is no count of test failures per model)
Here are the results (run around June 20, 2022):
- PyTorch testing has ~27100 tests
- TensorFlow testing has ~15700 tests
| Framework | No. Failures |
|---|---|
| PyTorch 1.10 | 50 |
| PyTorch 1.9 | 710 |
| PyTorch 1.8 | 1301 |
| PyTorch 1.7 | 1567 |
| PyTorch 1.6 | 2342 |
| PyTorch 1.5 | 3315 |
| PyTorch 1.4 | 3949 |
| TensorFlow 2.8 | 118 |
| TensorFlow 2.7 | 122 |
| TensorFlow 2.6 | 122 |
| TensorFlow 2.5 | 128 |
| TensorFlow 2.4 | 167 |
It looks like the number of failures in TensorFlow testing doesn't increase much.
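To put the raw counts in perspective, here is a quick sketch (plain Python, using the approximate totals above) that converts the failure counts into failure rates:

```python
# Failure counts from the table above, against ~27,100 PyTorch tests
# and ~15,700 TensorFlow tests.
PT_TOTAL, TF_TOTAL = 27_100, 15_700
pt_failures = {"1.10": 50, "1.9": 710, "1.8": 1301, "1.7": 1567,
               "1.6": 2342, "1.5": 3315, "1.4": 3949}
tf_failures = {"2.8": 118, "2.7": 122, "2.6": 122, "2.5": 128, "2.4": 167}

for version, n in pt_failures.items():
    print(f"PyTorch {version}: {n / PT_TOTAL:.1%} failing")
for version, n in tf_failures.items():
    print(f"TensorFlow {version}: {n / TF_TOTAL:.1%} failing")
```

Even the oldest TensorFlow tested (2.4) fails only ~1.1% of tests, while PyTorch 1.4 fails ~14.6%.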
So far my thoughts:
- All TF >= 2.4 should be (still) kept in the list of supported versions
Questions
- What's your opinion on which versions we should drop support for?
- Would you like to see the number of test failures per model?
- TensorFlow 2.3 needs CUDA 10.1 and requires building a special Docker image. Do you think we should make the effort to get the results for TF 2.3?
cc @LysandreJik @sgugger @patrickvonplaten @Rocketknight1 @gante @anton-l @NielsRogge @amyeroberts @alaradirik @stas00 @hollance to have your comments
TF 2.3 is quite old by now, and I wouldn't make a special effort to support it. Several nice TF features (like the Numpy-like API) only arrived in TF 2.4, and we're likely to use those a lot in future.
Hey @ydshieh, would you have a summary of the failing tests handy? I'm curious to see the reason why there are so many failures for PyTorch as soon as we leave the latest version. I'm quite confident that it's an issue in our tests rather than in our internal code, so seeing the failures would help. Thanks!
@LysandreJik I will re-run it. The previous run(s) have huge tables in the reports, and sending to Slack failed (3001 character limit). I finally ran it by disabling those blocks.
Before re-running it, I need an approval for #17921.
I ran the past CI again, which returns more information. Looking quickly at the report for PyTorch 1.4, here are some observations:
There is one error occurring in almost all models:
- `from_pretrained`: `OSError: Unable to load weights from pytorch checkpoint file`, caused by `torch.load`: `Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. Your PyTorch installation may be too old.`
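For context, PyTorch >= 1.6 serializes checkpoints as ZIP archives, which `torch.load` in older releases cannot read; that is the error above. A minimal, torch-free sketch of detecting the format (assuming `path` points at a saved checkpoint file):

```python
import zipfile

def is_zipfile_checkpoint(path):
    # PyTorch >= 1.6 saves checkpoints as ZIP archives; torch.load from
    # PyTorch < 1.6 cannot read them and raises the "maximum supported
    # version for reading is 2" error quoted above.
    return zipfile.is_zipfile(path)
```

When a checkpoint must stay loadable by pre-1.6 PyTorch, `torch.save(obj, path, _use_new_zipfile_serialization=False)` writes the legacy format instead.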
Another error also occurs a lot (in the TorchScript tests):
- (line 625) AttributeError: module 'torch.jit' has no attribute '_state'
An error occurs specifically in vision models (probably due to the convolution layers):
- (line 97) RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
BART has 108/106 failures:
- (line 240) RuntimeError: CUDA error: device-side assert triggered
- Don't know what's wrong here yet
Other `AttributeError`s (not exhaustive):
- AttributeError: module 'torch' has no attribute 'minimum'
- AttributeError: 'builtin_function_or_method' object has no attribute 'fftn'
- AttributeError: module 'torch' has no attribute 'square'
- AttributeError: module 'torch.nn' has no attribute 'Hardswish'
- AttributeError: module 'torch' has no attribute 'logical_and'
- AttributeError: module 'torch' has no attribute 'pi'
- AttributeError: module 'torch' has no attribute 'multiply'
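Most of these `AttributeError`s are simply APIs that landed after PyTorch 1.4 (`torch.minimum`, `torch.square`, `nn.Hardswish`, ...). One common way to guard tests against this is a minimum-version gate; a hypothetical, torch-free sketch (these helper names are illustrative, not the actual transformers utilities):

```python
def parse_version(v):
    # "1.6" or "1.6.0" -> (1, 6, 0); good enough for simple gates,
    # assuming plain numeric versions (no rc/dev suffixes).
    parts = (v.split(".") + ["0", "0"])[:3]
    return tuple(int(p) for p in parts)

def meets_minimum(installed, required):
    # e.g. skip a test that uses torch.minimum unless the installed
    # version is recent enough to have it.
    return parse_version(installed) >= parse_version(required)
```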
Thanks for the report! Taking a look at the PyTorch versions, here are the dates at which they were released:
- 1.4: Jan 16, 2020
- 1.5: Apr 21, 2020
- 1.6: Jul 28, 2020
- 1.7: Oct 27, 2020
- 1.8: Mar 4, 2021
- 1.9: Jun 15, 2021
- 1.10: Oct 21, 2021
- 1.11: Mar 10, 2022
Most of the errors in from_pretrained seem to come from the zipfile format introduced by PyTorch 1.6. I think this is the most annoying one to patch by far.
From a first look, I'd suggest dropping support for all PyTorch versions < 1.6, as these were released more than two years ago.
Do you have a link to a job containing all these failures? I'd be interested in seeing if the 2342 errors in PyTorch 1.6 are solvable simply or if they will require a significant refactor.
The link is here. But since it contains too many jobs (all models x all versions ~= 3200 jobs), it just shows [Unicorn!] This page is taking too long to load.
I can re-run specifically for PyTorch 1.6 only, and will post a link later.
From a first look, I'd suggest dropping support for all PyTorch versions < 1.6, as these were released more than two years ago.
I second that.
While we are at it, do we want to establish an official sliding window for how far back we support PyTorch versions? As in: at minimum, we support at least 2 years of PyTorch releases? If it's easy to support longer we would, but it'd be easy to cut off if need be.
Users can always pin an older transformers version if they really need support for a very old PyTorch.
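The sliding-window policy above can be sketched as a simple date check (a sketch under the proposed 2-year assumption, using a 365-day-year approximation for the cutoff):

```python
from datetime import date, timedelta

def still_supported(release_date, today, window_years=2):
    # A framework release stays supported while it is less than
    # ~window_years old at the time of the check.
    return (today - release_date) <= timedelta(days=365 * window_years)
```

As of the test run (~June 20, 2022), this keeps PyTorch 1.6 (released July 28, 2020) and drops 1.5 (April 21, 2020), matching the proposal above.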
Yes, that would work fine with me. If I understand correctly, that's how libraries in the PyData ecosystem (scikit-learn, NumPy) manage the support of Python versions: they drop support for versions older than 2 years (https://github.com/scikit-learn/scikit-learn/issues/20965, https://github.com/scikit-learn/scikit-learn/issues/20084, the SciPy toolchain: https://github.com/scipy/scipy/pull/14655).
Dropping support for PyTorch/Flax/TensorFlow versions that have been released more than two years ago sounds good to me. That is somewhat already the case (see failing tests), but we're just not aware.
Hi, I am wondering what it means for a PyTorch/TensorFlow/Flax version to be supported. I guess it doesn't imply that all models work under those framework versions, but I would like to know if there is a more explicit definition (for transformers, or more generally, in open source projects).
Ideally it should mean that all models work and all tests pass, apart from functionality that explicitly has version-specific tests (like CUDA bfloat16, or torch FX, where we test against a specific PyTorch version).