helix icon indicating copy to clipboard operation
helix copied to clipboard

Panic handler, that saves unsaved work to temporary files

Open cessen opened this issue 3 years ago • 25 comments

It would be good to have a panic handler that saves the user's work before crashing, so that when a panicking bug is encountered the user doesn't lose all their work.

It should probably save the work to temporary files, so that if the user didn't want to save they don't lose the on-disk data either. Maybe append ".crash-01/02/03/..." or something to the filename.

cessen avatar Jun 17 '21 21:06 cessen

We could possibly dump the Document's ChangeSet directly to the temporary file since it's unlikely that users will make radically large changes that would negate the space benefit of simply dumping the ChangeSet.

kirawi avatar Jun 17 '21 22:06 kirawi

I think persistent undo would partly solve this as well

archseer avatar Jun 18 '21 02:06 archseer

@kirawi: I don't know if optimizing for space/speed/etc. is the right thing to do in this case. This is an exceptional case (a crash), and I think reliability is most important.

So I would still advocate for simply dumping all unsaved buffers to disk (probably in utf8, regardless of what the original file was). Basically, just get the user's unsaved work to the disk in the dumbest, hardest-to-screw-up way possible. And in a format they can easily open with anything, so that if something odd is going on (Helix did crash, after all) they aren't beholden to Helix for conveniently accessing their data again.

In other words, I think this is a feature that should do absolutely nothing fancy. We can save the fancy stuff for when Helix is functioning as expected (which is ideally 100% of the time).

cessen avatar Jun 18 '21 06:06 cessen

I think persistent undo would partly solve this as well

Yeah, that's true to an extent. But unless we're really careful about how we implement those kinds of features, things can still potentially get into a weird state (e.g. maybe the on-disk undo history and on-disk file are out of sync, and it's hard to figure out where to start in the undo history to correctly recover the lost data).

This feature is basically like wearing your seat belt: you don't expect to crash, but just in case you do, you'd like to minimize irrecoverable damage. And I think it should be pretty simple to implement, at least to the extent that we're willing to trust Ropey to always stay in a valid state.

cessen avatar Jun 18 '21 06:06 cessen

Maybe swap file will do? For every change we keep a swap file in /tmp then if it panics we have a panic handler to report the swap file within /tmp.

pickfire avatar Jun 18 '21 09:06 pickfire

@cessen How about file compression then? If I was editing a big file for whatever reason and Helix crashed, I wouldn't want it to write it completely down. File compression and decompression on text files is exceptionally fast and well optimized. Maybe https://github.com/BurntSushi/rust-snappy

kirawi avatar Jun 18 '21 17:06 kirawi

Maybe swap file will do?

Yeah, we could do that. Basically like auto-save "light". Then on crash, the swap file will always still be there. That's probably more robust than the approach I've been suggesting.

How about file compression then?

The key point I'm trying to make is that this code will only run in exceptional circumstances, and ideally it will never run at all. It's also code that only runs when something has gone wrong (potentially seriously wrong).

All of these factors mean that it should be the dumbest, hardest-to-screw-up thing possible. Every additional layer of anything that we add on top of it is another code path that may have its own bugs. And for something that is supposed to be the safeguard in case of crashes, minimizing the potential for bugs trumps even major efficiency gains, I think.

Usually I'm all for making things faster, more efficient, etc. But this is one of the few cases where I think that's basically a non-consideration.

cessen avatar Jun 18 '21 18:06 cessen

Over the last day or so I used cargo-fuzz to fuzz the mutation methods on Ropey's Rope type. I let it run for ~14 hours on my 24-core workstation, for a total of ~1 billion iterations.

Each iteration includes a check that panics if the Rope doesn't pass any of Ropey's own validation checks. So the fuzzing is effectively searching not just for unexpected panics/memory-safety-violations/etc., but also for anything that can put the Rope into an invalid state by Ropey's own definition.

It found a single issue within the first 10 minutes or so (a benign validation-check violation), which I've fixed. But other than that, it's found zero issues. That's not proof of perfection, of course. But it does indicate that Ropey's Rope type is at least very difficult to get into an invalid state. So I think we can reasonably depend on it, if we want to go the route of dumping from memory.

cessen avatar Jun 19 '21 05:06 cessen

Should we prefer #401 to this?

kirawi avatar Sep 26 '21 14:09 kirawi

Should we prefer #401 to this?

I personally see those as two different features: for instance, I'd probably want to disable persistent undo as this is not a feature I want yet I'd like to have swap files in case of problems.

antoyo avatar Mar 03 '22 14:03 antoyo

#4215 author here. How difficult would you estimate this fix would be to implement? I'm a more or less advanced beginner when it comes to rust and i'd be interested in tackling this issue if it isn't too far out of my league.

sqwxl avatar Oct 11 '22 21:10 sqwxl

#4215 author here. How difficult would you estimate this fix would be to implement? I'm a more or less advanced beginner when it comes to rust and i'd be interested in tackling this issue if it isn't too far out of my league.

I think first a decision about how this should be implemented should be made. I also lean towards a general solution as #401. I don't think that this necessarily means persistent undo, but rather some kind of general session state that may be configurable, that is serialized to disk also in a panic handler (at least, if possible without serializing inconsistent state).

Philipp-M avatar Oct 11 '22 21:10 Philipp-M

I think we're missing the fact that most process crashes aren't due to internal panic but external factors, ie unexpected system shutdown, out-of-memory, some idiot forgetting his laptop is set to automatically log out after an hour of inactivity.

So while a panic handler is great, a swap file or operation journal is vital feature.

josephholsten avatar Nov 04 '22 15:11 josephholsten