`fsspec.core.url_to_fs` starts the authentication process when the URL starts with "abfs"

Open anmyachev opened this issue 3 years ago • 6 comments

This behavior is unexpected for a user who has previously worked with s3-like URLs. Authentication is expected to start only when glob is called, not when the filesystem object is created.

The following code works well for s3-like URLs but fails for abfs-like URLs:

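import fsspec.core
from botocore.exceptions import NoCredentialsError

# The wrapper name is illustrative; the original snippet was a function body.
def file_exists(file_path):
    # Try with credentials first.
    fs = fsspec.core.url_to_fs(file_path, anon=False)[0]
    exists = False
    try:
        exists = len(fs.glob(file_path)) > 0
    except (NoCredentialsError, PermissionError):
        pass
    # Fall back to anonymous access.
    fs = fsspec.core.url_to_fs(file_path, anon=True)[0]
    return exists or len(fs.glob(file_path)) > 0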

Logs:

>>> fsspec.core.url_to_fs(file_path, anon=False, **storage_options)
Traceback (most recent call last):
  File "...\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 525, in _get_default_azure_credential
    asyncio.get_child_watcher().attach_loop(self.loop)
  File "...\Miniconda3\envs\modin\lib\asyncio\events.py", line 763, in get_child_watcher        
    return get_event_loop_policy().get_child_watcher()
  File "...\Miniconda3\envs\modin\lib\asyncio\events.py", line 599, in get_child_watcher        
    raise NotImplementedError
NotImplementedError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "...\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 571, in do_connect     
    self._get_default_azure_credential()
  File "...\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 529, in _get_default_azure_credential
    raise ClientAuthenticationError(
azure.core.exceptions.ClientAuthenticationError: No explict credentials provided. Failed with DefaultAzureCredential!

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\Miniconda3\envs\modin\lib\site-packages\fsspec\core.py", line 407, in url_to_fs     
    fs = cls(**options)
  File "...\Miniconda3\envs\modin\lib\site-packages\fsspec\spec.py", line 68, in __call__       
    obj = super().__call__(*args, **kwargs)
  File "...\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 452, in __init__       
    self.do_connect()
  File "...\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 588, in do_connect     
    raise ValueError(f"unable to connect to account for {e}")
ValueError: unable to connect to account for No explict credentials provided. Failed with DefaultAzureCredential!
>>> fsspec.core.url_to_fs(file_path, anon=True, **storage_options)  
(<adlfs.spec.AzureBlobFileSystem object at 0x0000022E019FACA0>, 'nyctlc/green/puYear=2019/puMonth=*/*.parquet')
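
Since the traceback shows the failure surfacing from the filesystem constructor as a ValueError, one possible workaround is to build the filesystem inside the same try block as the glob call. The following is a minimal sketch under stated assumptions: `path_exists`, `make_fs`, and the two stand-in classes are hypothetical names introduced only for illustration, and treating ValueError as a credentials error is an assumption based on the log above.

```python
def path_exists(file_path, make_fs, auth_errors=(PermissionError, ValueError)):
    """Return True if file_path matches anything, trying credentials first.

    make_fs is a hypothetical factory: make_fs(file_path, anon) -> filesystem.
    """
    try:
        # Build the filesystem inside the try block, because some backends
        # authenticate in the constructor rather than on the first request.
        fs = make_fs(file_path, anon=False)
        if len(fs.glob(file_path)) > 0:
            return True
    except auth_errors:
        pass
    # Fall back to anonymous access.
    fs = make_fs(file_path, anon=True)
    return len(fs.glob(file_path)) > 0


# Stand-ins for the two behaviors, used only to exercise the helper.
class EagerAuthFS:
    """Fails in the constructor when credentials are required."""

    def __init__(self, anon):
        if not anon:
            raise ValueError("unable to connect to account")

    def glob(self, path):
        return [path]


class LazyAuthFS:
    """Fails only when a request is made."""

    def __init__(self, anon):
        self.anon = anon

    def glob(self, path):
        if not self.anon:
            raise PermissionError("no credentials")
        return [path]


eager_result = path_exists("abfs://container/data.csv", lambda p, anon: EagerAuthFS(anon))
lazy_result = path_exists("s3://bucket/data.csv", lambda p, anon: LazyAuthFS(anon))
```

With fsspec, the factory could be `lambda p, anon: fsspec.core.url_to_fs(p, anon=anon)[0]`, and `NoCredentialsError` from botocore could be added to `auth_errors` for s3-like URLs. Catching ValueError is broad, so narrowing it may be preferable in real code.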

anmyachev avatar May 04 '22 21:05 anmyachev

Can you post your expected behavior?

TomAugspurger avatar May 05 '22 12:05 TomAugspurger

Expected behavior: glob raises the credentials exception, not the url_to_fs function.

Example with s3:

>>> import fsspec                                        
>>> file_path = "s3://noaa-ghcn-pds/csv/1785.csv"
>>> fs = fsspec.core.url_to_fs(file_path, anon=False)[0]  # This function does not throw an exception with s3-like path
>>> len(fs.glob(file_path)) > 0   
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\asyn.py", line 85, in wrapper        
    return sync(self.loop, func, *args, **kwargs)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\asyn.py", line 65, in sync
    raise return_result
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\asyn.py", line 25, in _runner        
    result[0] = await coro
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\s3fs\core.py", line 626, in _glob
    return await super()._glob(path, **kwargs)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\asyn.py", line 621, in _glob
    elif await self._exists(path):
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\s3fs\core.py", line 835, in _exists
    await self._info(path, bucket, key, version_id=version_id)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\s3fs\core.py", line 1029, in _info
    out = await self._call_s3(
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\s3fs\core.py", line 281, in _call_s3        
    raise err
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\s3fs\core.py", line 261, in _call_s3        
    out = await method(**additional_kwargs)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\client.py", line 173, in _make_api_call
    http, parsed_response = await self._make_request(
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\client.py", line 193, in _make_request
    return await self._endpoint.make_request(operation_model, request_dict)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\endpoint.py", line 77, in _send_request
    request = await self.create_request(request_dict, operation_model)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\endpoint.py", line 70, in create_request
    await self._event_emitter.emit(event_name, request=request,
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\hooks.py", line 27, in _emit    
    response = await handler(**kwargs)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\signers.py", line 16, in handler
    return await self.sign(operation_name, request)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\aiobotocore\signers.py", line 63, in sign   
    auth.add_auth(request)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\botocore\auth.py", line 378, in add_auth    
    raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials

anmyachev avatar May 05 '22 13:05 anmyachev

Thanks. So the glob part is perhaps behaving fine, and it's just the url_to_fs part? Can you post a reproducible example? This seems to work:

In [1]: import fsspec.core

In [2]: fsspec.core.url_to_fs("abfs://test/a/b/", anon=False, account_name="acc")
Out[2]: (<adlfs.spec.AzureBlobFileSystem at 0x7f801ee26ac0>, 'test/a/b/')

TomAugspurger avatar May 05 '22 13:05 TomAugspurger

Yes.

UPD: looks like a problem with asyncio on Windows

fsspec: 2022.3.0
OS: Windows 10

>>> import fsspec.core
>>> fs = fsspec.core.url_to_fs('az://nyctlc/green/puYear=2019/puMonth=*/*.parquet', anon=False, account_name='azureopendatastorage')
Traceback (most recent call last):
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 525, in _get_default_azure_credential
    asyncio.get_child_watcher().attach_loop(self.loop)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\asyncio\events.py", line 763, in get_child_watcher        
    return get_event_loop_policy().get_child_watcher()
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\asyncio\events.py", line 599, in get_child_watcher        
    raise NotImplementedError
NotImplementedError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 571, in do_connect     
    self._get_default_azure_credential()
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 529, in _get_default_azure_credential
    raise ClientAuthenticationError(
azure.core.exceptions.ClientAuthenticationError: No explict credentials provided. Failed with DefaultAzureCredential!

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\core.py", line 407, in url_to_fs     
    fs = cls(**options)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\fsspec\spec.py", line 68, in __call__       
    obj = super().__call__(*args, **kwargs)
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 452, in __init__       
    self.do_connect()
  File "C:\Users\amyachev\Miniconda3\envs\modin\lib\site-packages\adlfs\spec.py", line 588, in do_connect     
    raise ValueError(f"unable to connect to account for {e}")
ValueError: unable to connect to account for No explict credentials provided. Failed with DefaultAzureCredential!
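
The first traceback above starts with `asyncio.get_child_watcher()` raising NotImplementedError, which is then reported as a credentials failure. The sketch below shows one way such a call could be guarded so that an unsupported platform is not mistaken for missing credentials. It is an illustration only, not necessarily how adlfs or fsspec addressed the problem; `try_attach_child_watcher` is a hypothetical helper name, and the `getattr` check is there on the assumption that `get_child_watcher` may be absent in recent Python versions, where it has been deprecated.

```python
import asyncio
import warnings


def try_attach_child_watcher(loop):
    """Return True if a child watcher was attached, False if unsupported."""
    get_watcher = getattr(asyncio, "get_child_watcher", None)
    if get_watcher is None:
        # The function is not available in this Python version.
        return False
    with warnings.catch_warnings():
        # Newer Python versions emit a DeprecationWarning for this call.
        warnings.simplefilter("ignore", DeprecationWarning)
        try:
            get_watcher().attach_loop(loop)
        except NotImplementedError:
            # Raised by event loop policies without child watcher support,
            # as in the Windows traceback above.
            return False
    return True


loop = asyncio.new_event_loop()
attached = try_attach_child_watcher(loop)
loop.close()
```

The return value lets the caller decide whether to continue without a child watcher instead of raising an authentication error.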

anmyachev avatar May 05 '22 14:05 anmyachev

Can you try with fsspec 2022.04.0?

>>> import fsspec.core
>>> fs = fsspec.core.url_to_fs('az://nyctlc/green/puYear=2019/puMonth=*/*.parquet', anon=False, account_name='azureopendatastorage')

TomAugspurger avatar May 07 '22 12:05 TomAugspurger

@TomAugspurger it seems that the latest fsspec release on PyPI is 2022.03.0.

anmyachev avatar May 13 '22 07:05 anmyachev

It works with fsspec==2023.12.0, at least.

Thanks @TomAugspurger!

anmyachev avatar Jan 25 '24 16:01 anmyachev