adlfs
Example missing
Hello,
I usually authenticate to my Azure Blob Storage account using the following code:
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient
default_credential = DefaultAzureCredential()
account_url = 'https://test.blob.core.windows.net'
blob_service_client = BlobServiceClient(account_url, credential=default_credential)
container_client = blob_service_client.get_container_client("test")
I have searched in many places, but I can't find a concrete example of how to, e.g., list the files of the container test using the adlfs library.
The last thing I tried was this code (reusing the variables above):
fs = adlfs.AzureBlobFileSystem(account_url=account_url, credentials=default_credential, anon=False)
Any help would be appreciated.
Kind regards, Robin
Note that I have since found this code snippet in the comments, but the corresponding call doesn't work for me:
Authentication with DefaultAzureCredential
>>> abfs = AzureBlobFileSystem(account_name="XXXX", anon=False)
>>> abfs.ls('')
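For later readers, here is a minimal sketch of how listing a container might look when the credential is passed explicitly. It assumes that adlfs accepts a `credential` keyword (singular) and expects an async credential from `azure.identity.aio`; both should be checked against the installed adlfs version. The helper names `blob_fs_kwargs` and `list_container` are hypothetical and only serve the illustration.

```python
def blob_fs_kwargs(account_name, credential=None):
    """Build keyword arguments for adlfs.AzureBlobFileSystem (hypothetical helper)."""
    kwargs = {"account_name": account_name, "anon": False}
    if credential is not None:
        # Pass the credential explicitly instead of relying on the default lookup.
        kwargs["credential"] = credential
    return kwargs


def list_container(account_name, container):
    """List the entries at the top level of one container."""
    # Imported here so the helper above can be used without the Azure packages.
    # Assumption: adlfs wants the async variant of DefaultAzureCredential.
    import adlfs
    from azure.identity.aio import DefaultAzureCredential

    fs = adlfs.AzureBlobFileSystem(
        **blob_fs_kwargs(account_name, DefaultAzureCredential())
    )
    return fs.ls(container)
```

With the names from the first post, the call would be `list_container("test", "test")`, since both the storage account and the container are called test there.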
Can you share the traceback?
Here it is
from azure.identity import DefaultAzureCredential
import adlfs
default_credential=DefaultAzureCredential()
abfs = adlfs.AzureBlobFileSystem(account_name="XXX", anon=False)
abfs.ls('')
output
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/XXX/Library/Caches/pypoetry/virtualenvs/fast-heat-detection-kty-UEtt-py3.9/lib/python3.9/site-packages/adlfs/spec.py", line 757, in ls
files = sync(
File "/Users/XXX/Library/Caches/pypoetry/virtualenvs/fast-heat-detection-kty-UEtt-py3.9/lib/python3.9/site-packages/fsspec/asyn.py", line 65, in sync
raise return_result
File "/Users/XXX/Library/Caches/pypoetry/virtualenvs/fast-heat-detection-kty-UEtt-py3.9/lib/python3.9/site-packages/fsspec/asyn.py", line 25, in _runner
result[0] = await coro
File "/Users/XXX/Library/Caches/pypoetry/virtualenvs/fast-heat-detection-kty-UEtt-py3.9/lib/python3.9/site-packages/adlfs/spec.py", line 815, in _ls
containers = [c async for c in contents]
File "/Users/XXX/Library/Caches/pypoetry/virtualenvs/fast-heat-detection-kty-UEtt-py3.9/lib/python3.9/site-packages/adlfs/spec.py", line 815, in <listcomp>
containers = [c async for c in contents]
File "/Users/XXX/Library/Caches/pypoetry/virtualenvs/fast-heat-detection-kty-UEtt-py3.9/lib/python3.9/site-packages/azure/core/async_paging.py", line 154, in __anext__
return await self.__anext__()
File "/Users/XXX/Library/Caches/pypoetry/virtualenvs/fast-heat-detection-kty-UEtt-py3.9/lib/python3.9/site-packages/azure/core/async_paging.py", line 157, in __anext__
self._page = await self._page_iterator.__anext__()
File "/Users/XXX/Library/Caches/pypoetry/virtualenvs/fast-heat-detection-kty-UEtt-py3.9/lib/python3.9/site-packages/azure/core/async_paging.py", line 99, in __anext__
self._response = await self._get_next(self.continuation_token)
File "/Users/XXX/Library/Caches/pypoetry/virtualenvs/fast-heat-detection-kty-UEtt-py3.9/lib/python3.9/site-packages/azure/storage/blob/aio/_models.py", line 60, in _get_next_cb
process_storage_error(error)
File "/Users/XXX/Library/Caches/pypoetry/virtualenvs/fast-heat-detection-kty-UEtt-py3.9/lib/python3.9/site-packages/azure/storage/blob/_shared/response_handlers.py", line 181, in process_storage_error
exec("raise error from None") # pylint: disable=exec-used # nosec
File "<string>", line 1, in <module>
azure.core.exceptions.ClientAuthenticationError: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:aa5e1056-501e-0056-63d0-457e49000000
Time:2022-04-01T13:58:21.2085578Z
ErrorCode:AuthenticationFailed
authenticationerrordetail:Signature not valid in the specified time frame: Start [Fri, 10 Apr 2020 08:55:16 GMT] - Expiry [Sun, 11 Apr 2021 08:55:00 GMT] - Current [Fri, 01 Apr 2022 13:58:21 GMT]
Content: <?xml version="1.0" encoding="utf-8"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:aa5e1056-501e-0056-63d0-457e49000000
Time:2022-04-01T13:58:21.2085578Z</Message><AuthenticationErrorDetail>Signature not valid in the specified time frame: Start [Fri, 10 Apr 2020 08:55:16 GMT] - Expiry [Sun, 11 Apr 2021 08:55:00 GMT] - Current [Fri, 01 Apr 2022 13:58:21 GMT]</AuthenticationErrorDetail></Error>
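The AuthenticationErrorDetail in this traceback is worth reading closely: it reports a signature validity window that ended well before the request was made. The sketch below parses the three timestamps from that message using only the standard library and shows how long the signature had been expired. The function name `parse_validity_window` is hypothetical. Where the expired signature comes from is not stated in the thread; a stale SAS token picked up from the environment or configuration would be one possible source, but that is an assumption.

```python
import re
from email.utils import parsedate_to_datetime

# Error detail copied from the traceback above.
DETAIL = (
    "Signature not valid in the specified time frame: "
    "Start [Fri, 10 Apr 2020 08:55:16 GMT] - "
    "Expiry [Sun, 11 Apr 2021 08:55:00 GMT] - "
    "Current [Fri, 01 Apr 2022 13:58:21 GMT]"
)


def parse_validity_window(detail):
    """Return the Start, Expiry and Current timestamps as datetime objects."""
    fields = re.findall(r"(Start|Expiry|Current) \[([^\]]+)\]", detail)
    return {name: parsedate_to_datetime(value) for name, value in fields}


window = parse_validity_window(DETAIL)
# Time between the end of the validity window and the failing request.
expired_for = window["Current"] - window["Expiry"]
print(expired_for.days)  # 355
```

So the request was signed with something that had expired almost a year earlier, which points away from a freshly issued token from DefaultAzureCredential.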
Just want to chime in that I am also missing an example of how to use this in the readme. The only example there is about how to use it with Dask, but I, like Robin (and, I assume, others in the future as well), want to instantiate a file system instead. If this is already in the works, great :-) Otherwise, I also don't mind writing a bit of documentation myself as soon as I have figured out how to use it ;-)
To add to my last comment: my current manual way of interacting with the Azure file system looks like this. I am using the azure-identity package.
from azure.identity import InteractiveBrowserCredential
from azure.storage.filedatalake import FileSystemClient
credential = InteractiveBrowserCredential()
credential.authenticate()
storage_name = "mystoragename"
container = "mycontainer"
url = f"https://{storage_name}.dfs.core.windows.net/"
fs_client = FileSystemClient(
    account_url=url,
    credential=credential,
    file_system_name=container,
)
fs_client.exists()  # Should return True if successful
My two cents: I was able to use InteractiveBrowserCredential like this:
import pandas as pd
from azure.identity import InteractiveBrowserCredential
credentials = InteractiveBrowserCredential(tenant_id={tenant_id})
credentials.authenticate()
storage_options = {'account_name' : {dalake_account_name}, 'anon': False}
df= pd.read_csv('az://{CONTAINER_NAME}/test/*.csv', storage_options=storage_options)
df.head()
The user needs to have the Storage Blob Data Contributor or Storage Blob Data Reader role on CONTAINER_NAME.
UPDATE: This is not working as expected. It turns out that the credentials actually used came from the az CLI. When I make sure that there is no cached credential, I get the same issue as reported in #312.