cloudpathlib
Investigate concurrency support
We probably want to support things like downloading many files in parallel. Async may help (#28), but some backends may also be able to do multipart upload/download in parallel, and we could parallelize across processes in addition to across threads.
Like #28, we'll want tests to make sure the gains are worth the complexity.
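As a rough illustration of the kind of check this implies, here is a minimal sketch that downloads many files through a thread pool and times it against a sequential loop. `fake_download` is a hypothetical stand-in that just sleeps to mimic network latency; it is not part of cloudpathlib or any backend SDK, and a real comparison would call the actual client.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(key: str) -> str:
    """Hypothetical stand-in for one file download; sleeps to mimic latency."""
    time.sleep(0.05)
    return f"contents of {key}"


def download_sequential(keys, download=fake_download):
    """Download each key one after another."""
    return [download(key) for key in keys]


def download_threaded(keys, download=fake_download, max_workers=8):
    """Download keys concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(download, keys))


def timed(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Comparing `timed(download_sequential, keys)` with `timed(download_threaded, keys)` on a realistic set of files would give a first idea of whether the speedup justifies the added complexity.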
Two options with Azure:
max_concurrency parameter - https://docs.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.blobclient?view=azure-python#download-blob-offset-none--length-none----kwargs--
New aio interface: https://docs.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.aio.blobclient?view=azure-python
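For the first option, a minimal sketch of what the call might look like. The helper name `download_blob_to_file` and the default of 4 are assumptions for illustration; `blob_client` is expected to behave like `azure.storage.blob.BlobClient`, whose `download_blob` accepts a `max_concurrency` keyword and returns a downloader with a `readinto` method.

```python
def download_blob_to_file(blob_client, dest_path, max_concurrency=4):
    """Download one blob to dest_path, letting the SDK fetch chunks in parallel.

    blob_client: an object behaving like azure.storage.blob.BlobClient.
    Returns the number of bytes written, as reported by readinto().
    """
    # max_concurrency sets how many parallel connections the SDK may use
    downloader = blob_client.download_blob(max_concurrency=max_concurrency)
    with open(dest_path, "wb") as f:
        return downloader.readinto(f)
```

Whether this actually speeds things up probably depends on blob size, since small blobs are fetched in a single request.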
For S3, it appears to be a matter of setting a transfer config, with something like:
from boto3.s3.transfer import TransferConfig
config = TransferConfig(
    ...,
    max_concurrency=10,
    use_threads=True
)
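A config like this would then presumably be passed to a transfer call through the `Config` argument. A minimal sketch, assuming `s3_client` behaves like `boto3.client("s3")`; the helper name `download_s3_object` and the bucket and key in the usage comment are hypothetical.

```python
def download_s3_object(s3_client, bucket, key, dest_path, transfer_config=None):
    """Download one S3 object to dest_path using the given transfer config.

    s3_client: an object behaving like boto3.client("s3").
    transfer_config: e.g. a boto3.s3.transfer.TransferConfig, or None for defaults.
    """
    # download_file handles multipart/threaded transfer according to Config
    s3_client.download_file(bucket, key, dest_path, Config=transfer_config)
    return dest_path


# Usage (requires boto3 and credentials; names below are placeholders):
#   import boto3
#   from boto3.s3.transfer import TransferConfig
#   config = TransferConfig(max_concurrency=10, use_threads=True)
#   download_s3_object(boto3.client("s3"), "my-bucket", "data/big.bin", "big.bin", config)
```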
For S3, aioboto3 is another option.