adlfs
await file_obj.credential.close() : TypeError: object NoneType can't be used in 'await' expression
My problem
At the end of my Python script, I get a cleanup error: TypeError: object NoneType can't be used in 'await' expression
The complete trace is:
Traceback (most recent call last):
File ".../lib/python3.10/weakref.py", line 667, in _exitfunc
f()
File ".../lib/python3.10/weakref.py", line 591, in __call__
return info.func(*info.args, **(info.kwargs or {}))
File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
raise return_result
File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
result[0] = await coro
File ".../lib/python3.10/site-packages/adlfs/utils.py", line 78, in close_credential
await file_obj.credential.close()
TypeError: object NoneType can't be used in 'await' expression
Which is weird, because I don't do anything asynchronous.
A reproducible example
At least as much as I can:
from azure.identity import ChainedTokenCredential, ManagedIdentityCredential, AzureCliCredential
azure_cli = AzureCliCredential()
managed_identity = ManagedIdentityCredential()
credential_chain = ChainedTokenCredential(managed_identity, azure_cli)
import pandas as pd
pd.read_parquet("abfs://[email protected]/path_to_parquets.parquet", storage_options={"credential": credential_chain})
print("Done")
I do get the "Done" printed before failure.
My config
- ubuntu,
- python 3.10,
- azure-storage-blob==12.16.0
- pandas==2.0.0
- pyarrow==11.0.0
- adlfs==2023.9.0
fsspec / adlfs use async internally. Can you try using the credentials from azure.identity.aio instead?
(Sent by email in reply to ELToulemonde's post of Oct 6, 2023.)
@TomAugspurger @ELToulemonde I get the same error, and it makes no difference whether I import DefaultAzureCredential from azure.identity.aio or from azure.identity. Did you solve this?
Same error here, any solutions?
Importing DefaultAzureCredential from azure.identity.aio silenced that error for me.
Python 3.10.13 on Ubuntu.
Package Version
--------------------------- ---------
adlfs 2023.12.0
azure-ai-ml 1.12.1
azure-common 1.1.28
azure-core 1.29.6
azure-datalake-store 0.0.53
azure-identity 1.15.0
azure-mgmt-core 1.4.0
azure-mgmt-resource 23.0.1
azure-mgmt-storage 21.1.0
azure-mgmt-subscription 3.1.1
azure-storage-blob 12.19.0
azure-storage-file-datalake 12.14.0
azure-storage-file-share 12.15.0
pyarrow 14.0.2
I am aware that using azure.identity.aio silences the error, but why is async involved in a non-async call?
fsspec uses asyncio internally.
I'd recommend people use credentials from azure.identity.aio. If someone wants, we could add an inspect.iscoroutine check before we call .close.
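As a rough sketch of what such a check could look like (this is not the adlfs implementation; close_credential_safely and both stand-in classes are hypothetical names made up for illustration), the idea is to call close() first and only await the result when it is actually a coroutine:

```python
import asyncio
import inspect


class SyncCredentialStandIn:
    """Hypothetical stand-in for a sync credential: close() returns None."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class AsyncCredentialStandIn:
    """Hypothetical stand-in for an aio credential: close() is a coroutine function."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


async def close_credential_safely(credential):
    """Close a credential whether its close() is sync or async (illustrative helper)."""
    close = getattr(credential, "close", None)
    if close is None:
        return False  # nothing to close
    result = close()  # a sync close() has already run at this point
    if inspect.iscoroutine(result):
        await result  # only an async close() needs to be awaited
    return True


sync_cred = SyncCredentialStandIn()
async_cred = AsyncCredentialStandIn()
sync_ok = asyncio.run(close_credential_safely(sync_cred))
async_ok = asyncio.run(close_credential_safely(async_cred))
```

inspect.isawaitable would be a slightly broader test if close() could return other awaitables, such as a future.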
That may solve one of our issues: I am using adlfs as part of complex code that uses a ThreadPool. At the end of the run, I get this message. It does not change the exit code, so it is not fatal, but it looks a bit ugly:
Traceback (most recent call last):
File ".../lib/python3.10/weakref.py", line 667, in _exitfunc
f()
File ".../lib/python3.10/weakref.py", line 591, in __call__
return info.func(*info.args, **(info.kwargs or {}))
File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
raise return_result
File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
result[0] = await coro
File ".../lib/python3.10/site-packages/adlfs/utils.py", line 78, in close_credential
await file_obj.credential.close()
TypeError: object NoneType can't be used in 'await' expression
I did not manage to create a small reproducible example ...
Reproduction example
Just create an AzureBlobFileSystem with a non-AIO DefaultAzureCredential:
from adlfs import AzureBlobFileSystem
from azure.identity import DefaultAzureCredential
AzureBlobFileSystem(
account_name="_",
credential=DefaultAzureCredential(),
)
When the code exits an exception is raised:
Traceback (most recent call last):
File "/usr/lib/python3.12/weakref.py", line 666, in _exitfunc
f()
File "/usr/lib/python3.12/weakref.py", line 590, in __call__
return info.func(*info.args, **(info.kwargs or {}))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".venv/lib/python3.12/site-packages/fsspec/asyn.py", line 103, in sync
raise return_result
File ".venv/lib/python3.12/site-packages/fsspec/asyn.py", line 56, in _runner
result[0] = await coro
^^^^^^^^^^
File ".venv/lib/python3.12/site-packages/adlfs/utils.py", line 78, in close_credential
await file_obj.credential.close()
TypeError: object NoneType can't be used in 'await' expression
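The last frame of the traceback suggests what is going on: the return value of credential.close() is awaited unconditionally, and a sync close() returns None. The sketch below reproduces that mechanism without any Azure packages. Both stand-in classes and the try_close helper are hypothetical, and close_credential here only mirrors the single line shown in the traceback, not the real adlfs function:

```python
import asyncio


class SyncCredentialStandIn:
    """Hypothetical stand-in for a sync credential: close() is a plain method."""

    def close(self):
        return None


class AsyncCredentialStandIn:
    """Hypothetical stand-in for an aio credential: close() is a coroutine function."""

    async def close(self):
        return None


async def close_credential(credential):
    # Mirrors the failing line in the traceback: the result is always awaited.
    await credential.close()


def try_close(credential):
    """Run the close coroutine and report whether it raised TypeError."""
    try:
        asyncio.run(close_credential(credential))
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "closed"


sync_outcome = try_close(SyncCredentialStandIn())
async_outcome = try_close(AsyncCredentialStandIn())
print(sync_outcome)
print(async_outcome)
```

The sync stand-in ends in a TypeError because None cannot be awaited (the exact message text depends on the Python version), while the async stand-in closes cleanly.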
What about azure.identity.aio.DefaultAzureCredential
I'm using both azure.storage.blob.BlobServiceClient and adlfs.AzureBlobFileSystem in my code. I made the mistake of importing the AIO version and using it for both. ABFS then no longer raised an exception, but BlobServiceClient stopped working because it expects a sync credential.
I fixed that by using non-AIO credentials for the BlobServiceClient and AIO ones for AzureBlobFileSystem. It's not pretty having to create two credential objects, but it fixes the issue.
fsspec / adlfs use async internally. Can you try using the credentials from azure.identity.aio instead?
The azure.identity.aio solution worked for me in a simple script while testing out remote storage connectivity:
import io
import os
import adlfs
from azure.identity.aio import DefaultAzureCredential
credential = DefaultAzureCredential()
azfs = adlfs.AzureBlobFileSystem(
    account_name="<storage-account-name>",
    credential=credential,
)
with io.BytesIO() as buf:
    buf.write(str('hello from the byterealm!').encode())
    buf.seek(0)
    azfs.write_bytes('/container/path/msg.txt', buf)
file_output = azfs.read_text('/container/path/msg.txt')
print(file_output)
This prints:
$ python remote-storage.py
hello from the byterealm!
That may solve one of our issues: I am using adlfs as part of complex code that uses a ThreadPool. At the end of the run, I get this message. It does not change the exit code, so it is not fatal, but it looks a bit ugly:
Traceback (most recent call last):
File ".../lib/python3.10/weakref.py", line 667, in _exitfunc
f()
File ".../lib/python3.10/weakref.py", line 591, in __call__
return info.func(*info.args, **(info.kwargs or {}))
File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
raise return_result
File ".../lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
result[0] = await coro
File ".../lib/python3.10/site-packages/adlfs/utils.py", line 78, in close_credential
await file_obj.credential.close()
TypeError: object NoneType can't be used in 'await' expression
I did not manage to create a small reproducible example ...
Using an asynchronous object in a thread pool sounds like a headache... You could create a credential in each thread to isolate the async handlers per action. Alternatively, collecting the pool and finalising the credential once all threads are complete could work too. Practicing the former would help when you move from threads to processes or external Workers/Jobs.
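For the first option, one credential per thread, a minimal sketch could use threading.local so each worker lazily creates its own object and reuses it. CredentialStandIn, get_thread_credential and task are hypothetical names; in real code the stand-in would be replaced by an actual credential, and the filesystem object would be built from it inside the task:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CredentialStandIn:
    """Hypothetical placeholder for a real credential object."""

    def __init__(self):
        # Remember which thread created this credential.
        self.owner = threading.get_ident()


_local = threading.local()


def get_thread_credential():
    """Return this thread's credential, creating it on first use."""
    cred = getattr(_local, "credential", None)
    if cred is None:
        cred = CredentialStandIn()
        _local.credential = cred
    return cred


def task(i):
    cred = get_thread_credential()
    # Real code would create and use the filesystem with `cred` here.
    return i, cred.owner, threading.get_ident()


with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(task, range(8)))

# Every task saw a credential created by the thread it ran on.
owners = {owner for _, owner, _ in results}
print(f"{len(results)} tasks used {len(owners)} credential(s)")
```

Whether this removes the exit-time error depends on how each credential is closed, so treat it as a way to isolate state per thread rather than a confirmed fix.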