visidata
visidata copied to clipboard
[*-saver] when delete-selected is used while saving, save thread quits
Small description I was saving a large file and while saving (with progress bar in bottom right), decided to delete some selected rows. Once the rows had deleted (progress at 100% for row deletion), the save progress stopped.
Expected result For the rows to be deleted and the save process to continue.
Actual result with screenshot Unfortunately I dont have time right now to generate a demo, but needed to log this as I thought it was quite critical. I tried this without my funky vd config as well and the bug still existed.
Steps to reproduce with sample data and a .vd
Load a large file and save it (I tried with .jsonl)
Select some rows
Delete the rows with delete-selected
Save process stops
Additional context 2.7dev
This is expected, as modifying a list while iterating over it while being iterated has undefined behavior. We could make a snapshot of the list before iteration, which would probably prevent the save thread from stopping early in this case that you're seeing, but it would also prevent saving all the rows in some cases:
- start loading
- before loading finishes, start saving
- saving will ignore all rows loaded after saving started
Whereas in the current implementation, if loading finishes before saving finishes, all rows should get saved.
Another possibility is to disable row add/delete (and maybe sort/slide and anything else that modifies the rows list) during any operation that iterates over rows, including saving. This might be the most correct option overall, but of course is the most work, and possibly more annoying--it would prevent you from being able to delete rows at all until the save finished completely, which just enforces the current recommendation of "don't do that".
+1 for not modifying while iterating over rows. I've been bitten a few times because I got too eager and tried to do something while a file was still loading, which I assume is a similar issue? Perhaps commands could be buffered up, and applied after the iteration completes? You could apply the command visually immediately, but not actually change the underlying data until it's safe to do so?
Yea this makes sense. Question: how do other applications do this? Generally while loading you can't edit, and while saving it's generally accepted that the interface freezes while saving. I guess the one key difference here is the speed at which these ops happen and also how the data is saved. I'm presuming the speed is due to most apps being written in a faster language than Py, and also some file types are designed to save only that which has changed, and are generally binary formats.
One immediate thought is, to make a loader and a saver which acts as the intermediate "working copy", this is designed to be fast, and is capable of diff saving. Files are loaded into this format when read. This way, an export feature could be made which is single threaded and runs on a copy of the data while letting the user carry on working. Just a couple of thoughts 🤷♂️ Of the options presented above, I would lean towards disabling features which change the dataset while saving, primarily because this has caught me out first hand, and secondly it gives me an excuse to get a coffee while ts saving 😅
I wonder how this relates to replays? Do replays automatically wait until a load or save is finished before progressing to the next command?
Yes, that's what this vd.sync()
is for.
One proposal for this is to fail()
modifying commands, if there is a save in progress.
Modifying commands consistently have Undo
s, so we can use addUndo
to perform a check if a save is occurring, and abort if it is.
This would only work if every modifying function adds the undo before it tries the modify, so this would have to be checked before implementing.
If we decide that it feels like work that is not going to happen anytime soon, I propose documenting this behaviour (how modifying commands interact with saving) in jsvine's tutorial, or vd.sync()
to freeze VisiData during saving. The latter would probably feel very bad, though.
Okay I made it so commands can't create new threads while threads from previous commands are still running. This seems to be a pretty good solution, though notably search with e.g. /
spawns a thread, so you can't do that while loading or other long-running commands now. We could make a list of read-only commands like this that are exceptions if this change otherwise makes the whole experience better.