ImportError on S3DirEntry from 'mirror' operation
Hello, I'm using a fresh install of bandersnatch[s3] in an attempt to establish a private S3-backed mirror. I discovered this issue on Python 3.9 and was able to reproduce it on Python 3.11. Here is an example configuration, followed by the resulting stack trace.
[mirror]
master = https://pypi.org
storage-backend = s3
directory = /my-s3-bucket/
diff-file = bandersnatch-diff
diff-append-epoch = true
json = false
stop-on-error = true
timeout = 30
keep_index_versions = 3
workers = 4
[plugins]
enabled =
    allowlist_project
    allowlist_release
    blocklist_project
    blocklist_release
    project_requirements
    project_requirements_pinned
    exclude_platform
[blocklist]
platforms =
    windows
    macos
    freebsd
[allowlist]
packages =
requirements_path = ./
requirements = *.txt
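As an aside on the multi-line values above (`enabled =` and `platforms =`), here is a minimal sketch of how such lists are read, assuming the file is parsed with the standard-library `configparser` semantics, where continuation lines must be indented. The helper name `multiline_values` and the trimmed-down `SAMPLE` text are hypothetical, for illustration only.

```python
import configparser

# Trimmed-down sample in the same shape as the configuration above.
# Continuation lines are indented so configparser treats them as
# part of the preceding option's value.
SAMPLE = """\
[plugins]
enabled =
    allowlist_project
    exclude_platform

[blocklist]
platforms =
    windows
    macos
"""


def multiline_values(text, section, option):
    """Return the non-empty lines of a multi-line option as a list.

    Hypothetical helper: not part of bandersnatch.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser.get(section, option)
    return [line.strip() for line in raw.splitlines() if line.strip()]


if __name__ == "__main__":
    print(multiline_values(SAMPLE, "plugins", "enabled"))
    print(multiline_values(SAMPLE, "blocklist", "platforms"))
```

This only checks how the file is parsed; it says nothing about which plugin names bandersnatch accepts.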
# bandersnatch -c bandersnatch.conf mirror
2024-03-15 15:50:23,977 INFO: Selected storage backend: s3 (configuration.py:133)
2024-03-15 15:50:23,977 INFO: Selected compare method: hash (configuration.py:181)
Traceback (most recent call last):
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch_storage_plugins/s3.py", line 26, in <module>
    from s3path import S3DirEntry as _S3DirEntry
ImportError: cannot import name 'S3DirEntry' from 's3path' (/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/s3path/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/bin/bandersnatch", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch/main.py", line 225, in main
    return asyncio.run(async_main(args, config))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch/main.py", line 190, in async_main
    return await bandersnatch.mirror.mirror(config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch/mirror.py", line 925, in mirror
    storage_backend_plugins(
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch/storage.py", line 403, in storage_backend_plugins
    return load_storage_plugins(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch/storage.py", line 370, in load_storage_plugins
    plugin_class = entry_point.load()
                   ^^^^^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/pkg_resources/__init__.py", line 2471, in load
    return self.resolve()
           ^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/pkg_resources/__init__.py", line 2477, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/bandersnatch_storage_plugins/s3.py", line 29, in <module>
    from s3path import _S3DirEntry
ImportError: cannot import name '_S3DirEntry' from 's3path' (/home/sasha/investigations/2024-03-15-bandersnatch-s3path/env/lib/python3.11/site-packages/s3path/__init__.py)
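The chained traceback shows s3.py trying `S3DirEntry` at line 26 and then `_S3DirEntry` at line 29, presumably as a fallback, with both imports failing. For anyone wanting to confirm the same condition in their own environment, a minimal diagnostic sketch follows. The helper names `probe_names` and `probe_s3path` are hypothetical and not part of bandersnatch or s3path.

```python
import importlib
import types
from typing import Dict, Iterable


def probe_names(module: types.ModuleType, names: Iterable[str]) -> Dict[str, bool]:
    """Report which of the given attribute names the module exposes."""
    return {name: hasattr(module, name) for name in names}


def probe_s3path() -> Dict[str, bool]:
    """Check the two names that the traceback above fails to import.

    Raises ModuleNotFoundError if s3path is not installed at all.
    """
    module = importlib.import_module("s3path")
    return probe_names(module, ("S3DirEntry", "_S3DirEntry"))


if __name__ == "__main__":
    try:
        print(probe_s3path())
    except ModuleNotFoundError:
        print("s3path is not installed in this environment")
```

If both entries come back `False`, the environment matches the failure reported here.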
The relevant change in the upstream s3path package appears to be https://github.com/liormizr/s3path/commit/5ea0bd23db60c6efd34da732534442ffb8894abf. It looks like this change first appeared in version 0.5.0, and the latest tag without it is 0.4.2.
A proper fix should probably update the Bandersnatch codebase to use the new public API, but a minimal fix in the meantime would be to adjust requirements_s3.txt to use s3path==0.4.2 rather than its current s3path==0.5.0.
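To check whether a given environment is on the affected side of that boundary, something like the following could be used. It is a rough sketch that takes the 0.5.0 cutoff from the observation above; the helpers `parse_release`, `is_affected`, and `installed_s3path_version` are hypothetical names, and the version parsing is deliberately simplistic (leading digits of each dotted component only).

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version string.

    Simplified on purpose: '0.5.0rc1' is read as (0, 5, 0).
    The result is padded with zeros to at least three components.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(version: str) -> bool:
    """True if the version is 0.5.0 or later (cutoff assumed from this report)."""
    return parse_release(version) >= (0, 5, 0)


def installed_s3path_version() -> Optional[str]:
    """Return the installed s3path version, or None if it is not installed."""
    try:
        return metadata.version("s3path")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_s3path_version()
    if version is None:
        print("s3path is not installed")
    else:
        print(f"s3path {version}: affected={is_affected(version)}")
```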
Interesting. We don't support or test Python 3.9 to begin with, and we may have used more recent Python syntax. I would have expected that you'd need to hack things to get it to install on 3.9. If not, that's a bug too.
That aside, I would have expected our CI to catch this on the PR that updated to 0.5.0. Happy to revert for now.
- https://github.com/pypa/bandersnatch/pull/1525
- we must have a testing gap there
Even better, I'd welcome a PR that upgrades to the latest APIs and plugs the missing testing. I started on this locally in https://github.com/pypa/bandersnatch/pull/1672 but haven't had the time to finish and test it. All help welcome.
I didn't have any trouble installing on 3.9, funnily enough. I was able to get my environment working by pinning the s3path version: the extra requirement listed in setup.cfg looks like it's just set to >= 0.4.0, so pip install 'bandersnatch[s3]' 's3path<0.5.0' is resolvable (quoted so the shell doesn't treat < as a redirect).
I haven't got the cycles to contribute today, unfortunately, but I'm back up and working with just that install tweak.
Let's retry on the latest version of bandersnatch and reopen if it's still an issue.