filesystem_spec
filesystem_spec copied to clipboard
make `fsspec.asyn._get_batch_size()` public
Being able to get the configured or system default batch size from fsspec is useful when using filesystems that support an additional level of concurrency (beyond the existing fs methods that support file batching like get()
/put()
).
See: adlfs, which supports setting concurrency for chunked/multipart uploads and downloads (and in this case concurrency is handled by sending a batch size to the underlying Azure SDK, scheduling the individual chunk upload/download in adlfs with _run_coros_in_chunks()
is not an option).
_get_batch_size()
is currently marked as protected (with the leading _
) and internally only gets called in asyn._run_coros_in_chunks()
, but it would be better if fsspec exposed a public method for getting the configured batch size
I suspect that there was simply no use case for this before to make it public. Probably can just contribute a PR.
Agree with @efiop
I will comment that fsspec is an unusual library, because it has a definite "end user" API of public functions/methods (as in the docs) and a whole load of other stuff that is used by other libraries or implementations that build on fsspec. In addition, the async methods of an async implementation also start with "_" to show they are special, but not hidden/private. Perhaps a better convention could have been chosen.
In addition, the async methods of an async implementation also start with "_" to show they are special, but not hidden/private. Perhaps a better convention could have been chosen.
I'm aware of this convention for the sync vs async AbstractFileSystem
implementation methods (and it's documented that it works this way).
But fsspec.asyn
also contains a mix of module level methods that look intentionally public (like fsspec.asyn.get_loop()
) and ones that don't (like fsspec.asyn._get_batch_size()
) which is why the concerns about using _get_batch_size()
were raised in the adlfs PR
If it's safe to assume _get_batch_size()
is relatively stable and is safe for fs implementations to use, this issue can probably just be closed.
The best way to make sure that it is safe is to write a test against it and add docstrings/comments to the code itself. I would say it is not planned for any changes, in its existence or the signature, however.