onnx
Remove test data from PyPI package
Describe the bug
The ONNX package on PyPI contains all test files found at https://github.com/onnx/onnx/tree/main/onnx/backend/test/data. Unpacked, they take up ~40 MB, which is more than 70% of the total package size.
System information
I checked the 1.15.0 macOS wheel, but judging by the compressed file sizes, all platforms appear to be affected: https://pypi.org/project/onnx/#files
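For anyone who wants to reproduce the size figures on another wheel, here is a minimal sketch. It assumes the wheel has already been downloaded locally and that the test data sits under `onnx/backend/test/data/` inside the archive (mirroring the repository layout). The function name `measure_prefix_share` is made up for illustration.

```python
import zipfile


def measure_prefix_share(wheel_path, prefix="onnx/backend/test/data/"):
    """Return (bytes under prefix, total bytes, fraction of total).

    Sizes are the uncompressed sizes recorded in the archive, i.e. what
    ends up on disk after installation. The default prefix is an
    assumption based on the repository layout.
    """
    # A wheel is a regular zip archive, so the standard library can read it.
    with zipfile.ZipFile(wheel_path) as wheel:
        infos = wheel.infolist()

    total = sum(info.file_size for info in infos)
    under_prefix = sum(
        info.file_size for info in infos if info.filename.startswith(prefix)
    )
    fraction = under_prefix / total if total else 0.0
    return under_prefix, total, fraction
```

Calling `measure_prefix_share("path/to/onnx.whl")` on a downloaded wheel returns the raw byte counts plus the share, so the same check can be repeated per platform.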
Expected behavior
These test files should not be installed in a production environment.
Other notes
Are there any plans to move those binary files out of git / generate them on the fly?
+1 on this. Otherwise the ONNX PyPI package will keep growing significantly as backend tests are added for more ops. See related discussion in this issue. For now, some users might still need the static backend tests from the package; in that case ONNX could publish two packages, one with test data and one without. Personally, though, I feel ONNX should eventually stop providing them via PyPI and just let users produce them on the fly.
We should.
One advantage of distributing the test data, of course, is that runtimes do not need the Python toolchain (protobuf Python, numpy, etc.) to run the tests.
When you say "distributing", do you mean shipping them in the PyPI package or having them in the repository? I have a hard time seeing a use case where a downstream project would rather fish the test files out of the PyPI package than use a git submodule.
Ah you are right. We can check those in without distributing them with the Python package.
I think the best way forward would be to move the onnx/onnx/backend/test folder out of the Python package. While we are at it, we may want to do the same with onnx/onnx/backend/sample.
I took a closer look at this issue. Unfortunately, the lines between tests that should not be packaged, test utilities, and the reference implementation are blurry. Moving the "tests" out of onnx/ is quite a large and technically breaking change. The least invasive way to keep those test files out of the install is to simply exclude them from the final package, as done in #5970.
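To illustrate the general idea of excluding files at packaging time rather than moving them, here is a rough sketch. It is not taken from #5970 and makes no claim about how that PR works. The pattern list and the function names are hypothetical. In a setuptools-based build, options such as `exclude_package_data` are commonly used for a similar purpose.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion list; the path is assumed from the repository layout.
EXCLUDE_PATTERNS = ["onnx/backend/test/data/*"]


def keep_in_package(path, patterns=EXCLUDE_PATTERNS):
    """Return True if the path matches none of the exclusion patterns.

    fnmatchcase is case-sensitive on every platform, and its "*" also
    matches "/" characters, so one pattern covers nested directories.
    """
    return not any(fnmatchcase(path, pattern) for pattern in patterns)


def filter_package_files(paths, patterns=EXCLUDE_PATTERNS):
    """Drop excluded files from a list of archive-relative paths."""
    return [path for path in paths if keep_in_package(path, patterns)]
```

With this approach the test utilities and the reference implementation stay importable from their current locations, and only the data files are dropped from the built package.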