s3fs kwarg raises "unexpected keyword argument" error in `AioSession`
Might be related to #204
I'm trying to use the cache_type kwarg for s3 [source], but this causes issues down the line when the file is accessed:
>>> import upath
>>> url = "s3://codeocean-s3datasetsbucket-1u41qdg42ur9/39490bff-87c9-4ef2-b408-36334e748ac6/nwb/ecephys_620264_2022-08-02_15-39-59_experiment1_recording1.nwb"
>>> path = upath.UPath(url, cache_type="first")
>>> path
S3Path('s3://codeocean-s3datasetsbucket-1u41qdg42ur9/39490bff-87c9-4ef2-b408-36334e748ac6/nwb/ecephys_620264_2022-08-02_15-39-59_experiment1_recording1.nwb')
>>> path.exists()
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\upath\core.py", line 711, in exists
    return self.fs.exists(self.path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\fsspec\asyn.py", line 118, in wrapper
    return sync(self.loop, func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\fsspec\asyn.py", line 103, in sync
    raise return_result
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\fsspec\asyn.py", line 56, in _runner
    result[0] = await coro
                ^^^^^^^^^^
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\s3fs\core.py", line 1035, in _exists
    await self._info(path, bucket, key, version_id=version_id)
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\s3fs\core.py", line 1302, in _info
    out = await self._call_s3(
          ^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\s3fs\core.py", line 341, in _call_s3
    await self.set_session()
  File "c:\Users\ben.hardcastle\github\npc_io\.venv\Lib\site-packages\s3fs\core.py", line 502, in set_session
    self.session = aiobotocore.session.AioSession(**self.kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: AioSession.__init__() got an unexpected keyword argument 'cache_type'
upath: 0.2.2 python: 3.11.5
Hi @bjhardcastle
Please note the difference between storage options for AbstractFileSystems and options for their open() methods:
S3FileSystem class https://github.com/fsspec/s3fs/blob/efbe1e4c23a06e65b3df6a82f28fc49bab0dbd78/s3fs/core.py#L273-L297
The UPath constructor gathers all keyword arguments under **storage_options and uses those to instantiate the specific filesystem class.
S3FileSystem._open() method https://github.com/fsspec/s3fs/blob/efbe1e4c23a06e65b3df6a82f28fc49bab0dbd78/s3fs/core.py#L611-L625
If you want to pass specific options down to the filesystem-specific AbstractBufferedFile implementation, you would use the following in your case:
import upath
upath.UPath("s3://mybucket/myfile.txt").open(cache_type="first")
If you want to set this at the filesystem level for s3fs, you can do:
import upath
p = upath.UPath("s3://mybucket/myfile.txt", default_cache_type="first")
...
p.open() # will use the default_cache_type
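To make the distinction concrete, here is a rough, stdlib-only sketch of the mechanism. ToySession and ToyFileSystem are hypothetical stand-ins that I made up for illustration (they are not the real aiobotocore or s3fs classes), and the "readahead" default is just a placeholder. The point is only that unrecognized constructor kwargs are stored and forwarded to the session later, which is why the error shows up on first access rather than at construction:

```python
class ToySession:
    """Hypothetical stand-in for a session class that accepts only known options."""

    def __init__(self, profile=None):
        self.profile = profile


class ToyFileSystem:
    """Hypothetical, heavily simplified stand-in for a filesystem class."""

    def __init__(self, default_cache_type="readahead", **kwargs):
        # Recognized storage option: used as the fallback in open().
        self.default_cache_type = default_cache_type
        # Unrecognized storage options are kept and forwarded to the session later.
        self.kwargs = kwargs
        self.session = None

    def set_session(self):
        # The session is created lazily, so a bad option only fails on first access.
        self.session = ToySession(**self.kwargs)

    def exists(self, path):
        if self.session is None:
            self.set_session()
        return True

    def open(self, path, cache_type=None):
        # A per-call option wins; otherwise fall back to the filesystem default.
        return {"path": path, "cache_type": cache_type or self.default_cache_type}


# 1) Passing an open() option as a storage option: accepted silently at first...
fs = ToyFileSystem(cache_type="first")
try:
    fs.exists("s3://mybucket/myfile.txt")  # ...and rejected once the session is built
    error = None
except TypeError as exc:
    error = str(exc)

# 2) Passing the option per call to open().
per_call = ToyFileSystem().open("s3://mybucket/myfile.txt", cache_type="first")

# 3) Setting a filesystem-level default that open() picks up.
by_default = ToyFileSystem(default_cache_type="first").open("s3://mybucket/myfile.txt")
```

Cases 2 and 3 correspond to the two UPath snippets above; case 1 mirrors what happened in your traceback.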
Let me know if that helps! It would be wonderful if you could tell me how I could improve the text in the README to make this more intuitive. PRs are super welcome too!
Cheers, Andreas :smiley:
Hi Andreas,
Thank you very much for explaining in detail. That of course fixed it!
I don't think it was a problem with the README in this case, but the wording for the open() method (which I assumed came from pathlib):
Because it says "as the built-in does", I never would have thought to pass it config for the fsspec-related operations.
One of the reasons I use upath is so I don't need to set up anything manually: it just handles whatever I throw at it! Now that I'm trying to use different configurations, I'll refer to the documentation more and let you know if any parts aren't clear.
Cheers, ben