cloudstorage
Overwrite behavior differs between store types when using store.NewWriter...
Some cloud providers overwrite a file as an atomic operation that takes place on the call to writer.Close(). For localfs and sftp, however, we currently remove the file when the writer is opened and then overwrite the object as we stream bytes to it. This leaves a window of time during which the file is in an inconsistent state on stores that don't support atomic replacement on Close().
One possible solution would be to write to a remote tmp file and then, in writer.Close(), move/rename that file to its final location.
Can't we just remove the backing file at the end of writer.Close()?
There isn't a backing file when you use store.NewWriter, since it writes/pipes directly to the target. I know GCS best: in GCS, when you begin piping data through a writer, that data/file isn't visible until you call Close(). Then the new data is swapped into place, replacing any existing object and its content.
ref: https://github.com/lytics/cloudstorage/pull/53/files#diff-9bae92b1bb719d9453e4c3469e9e7c8aR560 and https://github.com/lytics/cloudstorage/pull/53/files#diff-d96c12e79c906a3ad1845c783ac6a5c0R247 These lines both destroy the file when the writer is created.
We should discuss how big of a deal that is; maybe it's fine? We'll never get each store to have exactly the same side effects, so it's just a best effort on our part.