adlfs
`fsspec.core.url_to_fs` starts the authentication process when the URL starts with "abfs"
This behavior is unexpected for a user who has previously worked with s3-like URLs. Authentication should start when `glob` is called, not when the filesystem object is created.
The following code works well for s3-like URLs, but it doesn't work for abfs-like URLs:
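import fsspec
from botocore.exceptions import NoCredentialsError

# The wrapper name is illustrative; the original snippet is a function body.
def file_exists(file_path):
    # Try with credentials first.
    fs = fsspec.core.url_to_fs(file_path, anon=False)[0]
    exists = False
    try:
        exists = len(fs.glob(file_path)) > 0
    except (NoCredentialsError, PermissionError):
        pass
    # Fall back to anonymous access.
    fs = fsspec.core.url_to_fs(file_path, anon=True)[0]
    return exists or len(fs.glob(file_path)) > 0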
Logs:
>>> fsspec.core.url_to_fs(file_path, anon=False, **storage_options)
Traceback (most recent call last):
File "...\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 525, in _get_default_azure_credential
asyncio.get_child_watcher().attach_loop(self.loop)
File "...\Miniconda3\envs\modin\lib\asyncio\events.py", line 763, in get_child_watcher
return get_event_loop_policy().get_child_watcher()
File "...\Miniconda3\envs\modin\lib\asyncio\events.py", line 599, in get_child_watcher
raise NotImplementedError
NotImplementedError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "...\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 571, in do_connect
self._get_default_azure_credential()
File "...\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 529, in _get_default_azure_credential
raise ClientAuthenticationError(
azure.core.exceptions.ClientAuthenticationError: No explict credentials provided. Failed with DefaultAzureCredential!
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "...\Miniconda3\envs\modin\lib\site-packages\fsspec\core.py", line 407, in url_to_fs
fs = cls(**options)
File "...\Miniconda3\envs\modin\lib\site-packages\fsspec\spec.py", line 68, in __call__
obj = super().__call__(*args, **kwargs)
File "...\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 452, in __init__
self.do_connect()
File "...\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 588, in do_connect
raise ValueError(f"unable to connect to account for {e}")
ValueError: unable to connect to account for No explict credentials provided. Failed with DefaultAzureCredential!
>>> fsspec.core.url_to_fs(file_path, anon=True, **storage_options)
(<adlfs.spec.AzureBlobFileSystem object at 0x0000022E019FACA0>, 'nyctlc/green/puYear=2019/puMonth=*/*.parquet')
Can you post your expected behavior?
The expected behavior is that glob, not url_to_fs, throws the exception about missing credentials.
Example with s3:
>>> import fsspec
>>> file_path = "s3://noaa-ghcn-pds/csv/1785.csv"
>>> fs = fsspec.core.url_to_fs(file_path, anon=False)[0] # This function does not throw an exception with s3-like path
>>> len(fs.glob(file_path)) > 0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\asyn.py", line 85, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\asyn.py", line 65, in sync
raise return_result
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\asyn.py", line 25, in _runner
result[0] = await coro
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\s3fs\core.py", line 626, in _glob
return await super()._glob(path, **kwargs)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\asyn.py", line 621, in _glob
elif await self._exists(path):
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\s3fs\core.py", line 835, in _exists
await self._info(path, bucket, key, version_id=version_id)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\s3fs\core.py", line 1029, in _info
out = await self._call_s3(
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\s3fs\core.py", line 281, in _call_s3
raise err
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\s3fs\core.py", line 261, in _call_s3
out = await method(**additional_kwargs)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\client.py", line 173, in _make_api_call
http, parsed_response = await self._make_request(
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\client.py", line 193, in _make_request
return await self._endpoint.make_request(operation_model, request_dict)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\endpoint.py", line 77, in _send_request
request = await self.create_request(request_dict, operation_model)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\endpoint.py", line 70, in create_request
await self._event_emitter.emit(event_name, request=request,
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\hooks.py", line 27, in _emit
response = await handler(**kwargs)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\signers.py", line 16, in handler
return await self.sign(operation_name, request)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\signers.py", line 63, in sign
auth.add_auth(request)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\botocore\auth.py", line 378, in add_auth
raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials
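Until filesystem construction is lazy for abfs-like URLs, one possible workaround is to treat an error raised while creating the filesystem the same way as an error raised by glob. The sketch below is only an illustration: `exists_with_fallback` and `fsspec_factory` are hypothetical names, and catching `ValueError` is an assumption based on the adlfs traceback above, so it may also swallow unrelated errors.

```python
def exists_with_fallback(file_path, make_fs, auth_errors=(PermissionError, ValueError)):
    """Check whether file_path matches anything, trying credentials first.

    make_fs(file_path, anon) is a hypothetical factory that returns an object
    with a glob() method. It may raise while constructing the filesystem.
    """
    try:
        # Construction sits inside the try block, so a backend that
        # authenticates eagerly is handled like one that fails in glob().
        fs = make_fs(file_path, anon=False)
        if len(fs.glob(file_path)) > 0:
            return True
    except auth_errors:
        pass
    # Fall back to anonymous access; errors here are not suppressed.
    fs = make_fs(file_path, anon=True)
    return len(fs.glob(file_path)) > 0


def fsspec_factory(file_path, anon):
    """Factory backed by fsspec; the import is deferred until first use."""
    from fsspec.core import url_to_fs

    return url_to_fs(file_path, anon=anon)[0]
```

With fsspec installed, `exists_with_fallback(file_path, fsspec_factory)` would apply the same fallback to both s3-like and abfs-like URLs. For s3, `botocore.exceptions.NoCredentialsError` would need to be added to `auth_errors`.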
Thanks. So the glob part is perhaps behaving fine, and it's just the url_to_fs part? Can you post a reproducible example? This seems to work:
In [1]: import fsspec.core
In [2]: fsspec.core.url_to_fs("abfs://test/a/b/", anon=False, account_name="acc")
Out[2]: (<adlfs.spec.AzureBlobFileSystem at 0x7f801ee26ac0>, 'test/a/b/')
Yes.
UPD: looks like a problem with asyncio on Windows
fsspec: 2022.3.0
OS: Windows 10
>>> import fsspec.core
>>> fs = fsspec.core.url_to_fs('az://nyctlc/green/puYear=2019/puMonth=*/*.parquet', anon=False, account_name='azureopendatastorage')
Traceback (most recent call last):
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 525, in _get_default_azure_credential
asyncio.get_child_watcher().attach_loop(self.loop)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\asyncio\events.py", line 763, in get_child_watcher
return get_event_loop_policy().get_child_watcher()
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\asyncio\events.py", line 599, in get_child_watcher
raise NotImplementedError
NotImplementedError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 571, in do_connect
self._get_default_azure_credential()
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 529, in _get_default_azure_credential
raise ClientAuthenticationError(
azure.core.exceptions.ClientAuthenticationError: No explict credentials provided. Failed with DefaultAzureCredential!
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\core.py", line 407, in url_to_fs
fs = cls(**options)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\spec.py", line 68, in __call__
obj = super().__call__(*args, **kwargs)
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 452, in __init__
self.do_connect()
File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 588, in do_connect
raise ValueError(f"unable to connect to account for {e}")
ValueError: unable to connect to account for No explict credentials provided. Failed with DefaultAzureCredential!
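The first traceback above points at `asyncio.get_child_watcher()`, which the default Windows event loop policies do not implement. A small probe like the following can confirm that on a given machine without involving adlfs. `child_watcher_supported` is a hypothetical helper name, and the `getattr` guard is there because the function is deprecated and absent in recent Python versions.

```python
import asyncio
import warnings


def child_watcher_supported():
    """Return True if asyncio.get_child_watcher() is usable on this platform."""
    getter = getattr(asyncio, "get_child_watcher", None)
    if getter is None:
        # Removed in newer Python versions.
        return False
    try:
        with warnings.catch_warnings():
            # The call is deprecated on Python 3.12+; silence the warning.
            warnings.simplefilter("ignore", DeprecationWarning)
            getter()
    except NotImplementedError:
        # Same failure as in the traceback above.
        return False
    return True


print(child_watcher_supported())
```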
Can you try with fsspec 2022.04.0?
>>> import fsspec.core
>>> fs = fsspec.core.url_to_fs('az://nyctlc/green/puYear=2019/puMonth=*/*.parquet', anon=False, account_name='azureopendatastorage')
@TomAugspurger it seems that the latest fsspec release on PyPI is 2022.03.0.
It works with fsspec==2023.12.0, at least.
Thanks @TomAugspurger!