API for conditional / exclusive write
Over in https://github.com/zarr-developers/zarr-python/pull/2262, we'd like to write a file but only if it doesn't already exist. On a local file system, this would be open(path, mode="xb"), which will fail with a FileExistsError if the file already exists.
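For reference, here is a minimal sketch of the local behaviour described above, using only the standard library. The helper name write_exclusive and the temporary path are made up for illustration and are not part of any library.

```python
import os
import tempfile


def write_exclusive(path, data):
    """Write data to path, failing if the path already exists."""
    # "x" requests exclusive creation; "b" selects binary mode.
    with open(path, mode="xb") as f:
        f.write(data)


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "chunk0")  # illustrative file name

write_exclusive(path, b"first")
try:
    # The file now exists, so exclusive creation must fail.
    write_exclusive(path, b"second")
    second_write_failed = False
except FileExistsError:
    second_write_failed = True

# The original contents are left untouched by the failed write.
with open(path, "rb") as f:
    contents = f.read()
```

The second call raises before any bytes are written, so the first writer's data survives.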
Now that S3 supports conditional writes, it should be possible to implement this for s3fs, gcsfs (if_generation_match=0), and adlfs (overwrite=False).
Would there be any appetite for standardizing this behavior? I'm not sure what API is best, but I lean towards something like an overwrite: bool parameter on pipe and similar methods. We could also try to support mode="xb" in some open-like methods, but I'm less sure about that.
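To make the proposed shape concrete, here is a rough sketch of what an overwrite flag could look like, written against the local file system only. The function below is a hypothetical stand-in, not the existing fsspec pipe_file, and the default of overwrite=True is an assumption chosen so that current behaviour would be unchanged.

```python
import os
import tempfile


def pipe_file(path, value, overwrite=True):
    """Hypothetical local stand-in for a pipe_file with an overwrite flag."""
    # "wb" truncates any existing file; "xb" raises FileExistsError instead.
    mode = "wb" if overwrite else "xb"
    with open(path, mode) as f:
        f.write(value)


tmpdir = tempfile.mkdtemp()
target = os.path.join(tmpdir, "zarr.json")  # illustrative file name

pipe_file(target, b"v1", overwrite=False)  # file is new, so this succeeds
pipe_file(target, b"v2")                   # default overwrites silently

try:
    pipe_file(target, b"v3", overwrite=False)
    raised = False
except FileExistsError:
    raised = True

with open(target, "rb") as f:
    final = f.read()
```

A remote backend would presumably translate overwrite=False into whatever precondition its service offers, rather than into a file mode.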
If this is only to apply to open, then mode="xb" would be fine, and the check would probably happen at open time. But I think you mean the put/pipe methods, right? A bool argument on those methods and their one-file variants would be enough.
A couple of thoughts:
- how does this interact with on_error when writing multiple files to a remote? Is it treated like any other IO error? Probably yes, so the other files would still get written (concurrently); this would not act as a lock on the whole operation
- on S3, I assume you are looking at If-None-Match; is if_generation_match=0 really the same, or does it mean "if no such filename ever existed"?
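The first point could behave roughly as in the sketch below, where an existing file fails on its own while the others are still written. Everything here is hypothetical: pipe_many is not a real fsspec method, the on_error values "raise" and "return" are assumptions for illustration, and local files stand in for remote objects.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def _write_one(path, value, overwrite):
    """Write one file; return the exception instead of raising it."""
    mode = "wb" if overwrite else "xb"
    try:
        with open(path, mode) as f:
            f.write(value)
        return None
    except OSError as exc:  # FileExistsError is a subclass of OSError
        return exc


def pipe_many(files, overwrite=True, on_error="raise"):
    """Hypothetical multi-file write; returns {path: exception or None}."""
    paths = list(files)
    # Each file is written independently, so one failure blocks nothing else.
    with ThreadPoolExecutor() as pool:
        results = list(
            pool.map(lambda p: _write_one(p, files[p], overwrite), paths)
        )
    outcome = dict(zip(paths, results))
    if on_error == "raise":
        for exc in outcome.values():
            if exc is not None:
                raise exc
    return outcome


tmpdir = tempfile.mkdtemp()
existing = os.path.join(tmpdir, "a")
fresh = os.path.join(tmpdir, "b")
with open(existing, "wb") as f:
    f.write(b"old")

outcome = pipe_many(
    {existing: b"new", fresh: b"new"}, overwrite=False, on_error="return"
)

with open(existing, "rb") as f:
    existing_contents = f.read()
with open(fresh, "rb") as f:
    fresh_contents = f.read()
```

Note that even with on_error="raise" in this sketch, the other files have already been written by the time the error surfaces, which matches the "not a lock on the whole operation" reading.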
Do you know how this interacts with multipart uploads, where many bytes might already have been sent but the file is not really written to the remote path until a final commit? At what point is the existence condition applied?
I'm not sure offhand.