
Behavioral difference in overwriting between store types when using store.NewWriter...

Open epsniff opened this issue 6 years ago • 4 comments

Some cloud providers overwrite a file as an atomic operation that takes effect on the call to writer.Close(). For localfs and sftp, however, we currently remove the file when the writer is opened and then overwrite the object as we stream bytes to it. This leaves a window during which the file is in an inconsistent state on stores that don't support atomic replacement on Close().
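
To make that window concrete, here is a minimal local-filesystem sketch. It is not the cloudstorage code: it assumes `os.Create` (which truncates on open) as a stand-in for "remove the file when the writer is opened", and `observeTruncateGap` is a hypothetical helper name used only for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// observeTruncateGap overwrites a file in place and reports what a reader
// would see while the writer is still open, and again after Close().
// Hypothetical helper for illustration; not part of cloudstorage.
func observeTruncateGap() (during, after string, err error) {
	dir, err := os.MkdirTemp("", "overwrite-gap")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "object.txt")
	if err := os.WriteFile(path, []byte("old contents"), 0o644); err != nil {
		return "", "", err
	}

	// Assumption: os.Create stands in for the store's writer. It truncates
	// the existing file as soon as it is opened.
	w, err := os.Create(path)
	if err != nil {
		return "", "", err
	}

	// A reader arriving now sees neither the old nor the new contents.
	b, err := os.ReadFile(path)
	if err != nil {
		w.Close()
		return "", "", err
	}
	during = string(b)

	if _, err := w.Write([]byte("new contents")); err != nil {
		w.Close()
		return "", "", err
	}
	if err := w.Close(); err != nil {
		return "", "", err
	}

	b, err = os.ReadFile(path)
	if err != nil {
		return "", "", err
	}
	return during, string(b), nil
}

func main() {
	during, after, err := observeTruncateGap()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("while writer is open: %q\n", during)
	fmt.Printf("after Close():        %q\n", after)
}
```

With a store that swaps the object in on Close(), the first read would still return the old contents instead.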

epsniff, May 12 '18 15:05

One possible solution would be to write to a remote tmp file and then, in writer.Close(), move/rename it to its final location.
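
A rough sketch of that idea for the local filesystem case follows. The names `tmpRenameWriter`, `newTmpRenameWriter`, and `observeTmpRename` are hypothetical and not part of cloudstorage. It assumes the temp file lives in the same directory as the target so the rename stays on one filesystem; `os.Rename` replaces an existing file on POSIX systems, while other platforms and sftp servers may behave differently.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// tmpRenameWriter buffers writes in a temp file and only replaces the
// target on Close(). Hypothetical type for illustration.
type tmpRenameWriter struct {
	tmp   *os.File
	final string
}

func newTmpRenameWriter(final string) (*tmpRenameWriter, error) {
	// Same directory as the target, so the rename does not cross filesystems.
	tmp, err := os.CreateTemp(filepath.Dir(final), ".tmp-*")
	if err != nil {
		return nil, err
	}
	return &tmpRenameWriter{tmp: tmp, final: final}, nil
}

func (w *tmpRenameWriter) Write(p []byte) (int, error) { return w.tmp.Write(p) }

// Close flushes the temp file and renames it over the target.
// On any failure the temp file is removed and the target is left untouched.
func (w *tmpRenameWriter) Close() error {
	if err := w.tmp.Close(); err != nil {
		os.Remove(w.tmp.Name())
		return err
	}
	if err := os.Rename(w.tmp.Name(), w.final); err != nil {
		os.Remove(w.tmp.Name())
		return err
	}
	return nil
}

// observeTmpRename reports what a reader sees at the target path while the
// writer is open, and after Close().
func observeTmpRename() (during, after string, err error) {
	dir, err := os.MkdirTemp("", "tmp-rename")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "object.txt")
	if err := os.WriteFile(path, []byte("old contents"), 0o644); err != nil {
		return "", "", err
	}

	w, err := newTmpRenameWriter(path)
	if err != nil {
		return "", "", err
	}
	if _, err := w.Write([]byte("new contents")); err != nil {
		w.Close()
		return "", "", err
	}

	// The target is untouched until Close() performs the rename.
	b, err := os.ReadFile(path)
	if err != nil {
		w.Close()
		return "", "", err
	}
	during = string(b)

	if err := w.Close(); err != nil {
		return "", "", err
	}
	b, err = os.ReadFile(path)
	if err != nil {
		return "", "", err
	}
	return during, string(b), nil
}

func main() {
	during, after, err := observeTmpRename()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("while writer is open: %q\n", during)
	fmt.Printf("after Close():        %q\n", after)
}
```

The trade-off is an extra temp object per write and cleanup on failed writes, in exchange for readers never observing a partially written target.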

epsniff, May 12 '18 15:05

Can't we just remove the backing file at the end of writer.Close()?

araddon, May 12 '18 15:05

There isn't a backing file when you use store.NewWriter, since it writes/pipes directly to the target. I know GCS best: in GCS, when you begin piping data via a writer, that data/file isn't visible until you call Close(). Then the new data is swapped into place, replacing any existing object and its content.

epsniff, May 12 '18 16:05

ref: https://github.com/lytics/cloudstorage/pull/53/files#diff-9bae92b1bb719d9453e4c3469e9e7c8aR560 and https://github.com/lytics/cloudstorage/pull/53/files#diff-d96c12e79c906a3ad1845c783ac6a5c0R247. Both of these lines destroy the file when the writer is created.

We should discuss how big of a deal that is; maybe it's fine? We'll never get each store to have exactly the same side effects, so it's just a best effort on our part.

epsniff, May 12 '18 18:05