Concurrent download of multiple files
I am wondering whether it is possible to use the library to download multiple files concurrently (i.e. by exploiting the asyncio Python SDK for Azure Blob Storage), so that, given a list of files, e.g. ["file1.parquet", "file2.parquet", "file3.parquet"], I can start the download of all files in parallel with coroutines and await them. At the end I could then do something with the files, for instance concatenate them.
So far I have only tried reading a single file:
import pandas as pd

# Credentials are passed through pandas/fsspec to the adlfs backend
storage_options = {"connection_string": "MY_CONNECTION_STRING"}
CONTAINER = "MY_CONTAINER_NAME"
FILE = "PATH/TO/FILE.parquet"
FILEPATH = f"az://{CONTAINER}/{FILE}"

# Read one Parquet blob into a DataFrame
df = pd.read_parquet(FILEPATH, storage_options=storage_options)
adlfs (like most fsspec libraries) is asynchronous internally with a sync API on top. See https://filesystem-spec.readthedocs.io/en/latest/async.html for more.
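To make the idea of "start all downloads, then await them" concrete, here is a minimal sketch of the pattern using only the standard library. The names `fetch_all`, `fetch`, `max_concurrency`, and `fake_fetch` are hypothetical and not part of adlfs or fsspec; `fake_fetch` merely stands in for a real download coroutine. Following the fsspec async page linked above, the real coroutine would presumably be something like `_cat_file` on a filesystem created with `asynchronous=True`, but that is an assumption and has not been checked against adlfs here.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def fetch_all(
    paths: List[str],
    fetch: Callable[[str], Awaitable[bytes]],
    max_concurrency: int = 8,
) -> Dict[str, bytes]:
    """Run `fetch` for every path concurrently and return {path: bytes}."""
    # Cap the number of downloads that are in flight at the same time.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(path: str):
        async with semaphore:
            return path, await fetch(path)

    # gather schedules all coroutines and waits until every one has finished.
    pairs = await asyncio.gather(*(fetch_one(p) for p in paths))
    return dict(pairs)


async def fake_fetch(path: str) -> bytes:
    """Hypothetical stand-in for a real blob download coroutine."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"contents of {path}".encode()


if __name__ == "__main__":
    files = ["file1.parquet", "file2.parquet", "file3.parquet"]
    blobs = asyncio.run(fetch_all(files, fake_fetch))
    for name, data in blobs.items():
        print(name, len(data), "bytes")
```

With real Parquet bytes in hand, each value could be loaded with `pd.read_parquet(io.BytesIO(data))` and the resulting frames combined with `pd.concat`. The semaphore is a design choice rather than a requirement: it keeps a long file list from opening an unbounded number of connections at once.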
(Sent by email in reply to Luca Maurelli's message of Nov 30, 2023, 9:38 AM, https://github.com/fsspec/adlfs/issues/439.)