filesystem_spec
filesystem_spec copied to clipboard
bugfix: await on size, assuming it can be an async function
Checking size
may require a call to the async _info
function. On line 277, _cp_file
checks if f1.size
is async (callback.set_size(await maybe_await(f1.size))
), but then later it simply references f1.size
which fails if this function is async. As such, this PR corrects the code to ensure f1.size
references are wrapped in await maybe_await()
.
@martindurant what do you think?
I think you are right in your argument. However, I have a feeling that f.size is never async actually, so it won't matter. open_async/AsyncStreamedFiles could use work!
I think you are right in your argument. However, I have a feeling that f.size is never async actually, so it won't matter. open_async/AsyncStreamedFiles could use work!
https://github.com/fsspec/s3fs/pull/742 makes it async in order to call the async _info needed for _cp_file
Funny, because S3 is the only one where for sure we should know the size before starting the download: the content-size header is always populated and, possibly outside compressive transcoding, will be correct.
Funny, because S3 is the only one where for sure we should know the size before starting the download: the content-size header is always populated and, possibly outside compressive transcoding, will be correct.
Perhaps I should change https://github.com/fsspec/filesystem_spec/blob/master/fsspec/generic.py#L277 to just remove the assumption that f1.size
could be async? Or do you think this still might be useful for other implementations even though it isn't an issue for s3?
maybe_await should never be harmful and not add any overhead