save: atomic file save
Sometimes, Visidata can encounter some error while saving to a file. This results in partial files being written on disk, possibly overwriting previous copies.
It could be better to write to a temporary file first, then move it in place only once the entire save succeeds, making the save operation atomic.
It would be good to know where the temporary files is as well, just in case you want it. Maybe output the path to a "silent status" (a status that doesn't get printed to the bar, and only appears in the status screen listing page)
Further contribution from @geekscrapy on data with privacy concerns: Make sure it's only accessible to the current user, so if probably stick it in the same dir they are trying to save the actual file to, just under a temporary name.
Agreed, it's important to stick to the same directory: putting it in another one could cause the file to be written on another mount, making the final move operation costly and no longer atomic.
Here's the commit I was trying locally for this. Unfortunately I add tables to a local sqlite db by "saving over it", which is a hacky but acceptable workflow for me; but with this patch, it erases the whole db. There should probably be a cleaner workflow for that, but until then, I can't make this happen. So we'll Kondo this wishlist item for now. (and if someone comes by and wants to add this fix, and also figure out the sqlite db add table workflow, I'm happy to review a PR for that!)
commit 70efc517e7f82a32afabc2621b645144b9eda065 (HEAD -> develop)
Author: Saul Pwanson <[email protected]>
Date: Mon Apr 17 19:47:59 2023 -0700
[save] tempfile and rename for better safety #1009
diff --git a/visidata/save.py b/visidata/save.py
index d2801a23..6edb0a77 100644
--- a/visidata/save.py
+++ b/visidata/save.py
@@ -1,3 +1,4 @@
+import string
import collections
from copy import copy
@@ -131,7 +132,7 @@ def saveSheets(vd, givenpath, *vsheets, confirm_overwrite=False):
# savefuncs(vd, p, vs) will have 3 argcount (vs counts as an arg, along with vd, path)
if savefunc.__code__.co_argcount == 3 and len(vsheets) > 1:
vd.fail(f'cannot save multiple {filetype} sheets to non-dir')
- return vd.execAsync(savefunc, givenpath, *vsheets)
+ return vd.execAsync(vd.atomicSave, savefunc, givenpath, vsheets)
# path is a dir
@@ -146,12 +147,29 @@ def saveSheets(vd, givenpath, *vsheets, confirm_overwrite=False):
def _savefiles(vsheets, givenpath, savefunc, filetype):
for vs in vsheets:
- p = Path((givenpath / vs.name).with_suffix('.'+filetype))
- savefunc(p, vs)
+ finalp = Path((givenpath / vs.name).with_suffix('.'+filetype))
+ vd.atomicSave(savefunc, finalp, [vs])
vs.hasBeenModified = False
+
return vd.execAsync(_savefiles, vsheets, givenpath, savefunc, filetype)
+def randstr(n=8, s=string.ascii_lowercase):
+ import random
+ return ''.join(s[random.randint(0, len(s))] for i in range(n))
+
+
[email protected]
+def atomicSave(vd, savefunc, p, sheets): #1009
+ tmpp = p.with_name(p.stem+'-'+randstr(8)+p.suffix)
+ vd.aside('actual save to', tmpp)
+ savefunc(tmpp, *sheets)
+ if p.exists():
+ p.unlink()
+ tmpp.rename(p)
+ return True
+
+
@VisiData.api
def save_zip(vd, p, *vsheets):
vd.clearCaches()