helix icon indicating copy to clipboard operation
helix copied to clipboard

Write path fixes

Open dead10ck opened this issue 3 years ago • 17 comments
trafficstars

Edit: this PR was split up, now integration tests are in #2359 and this PR is based on that branch.

Well, I did not expect this PR to get this large, I'm so sorry about that. This PR started as an attempt to fix #1575, and along the way, I discovered that there was a larger problem: Document writes are submitted to a single job pool that all run unordered. I suspected that this could lead to file write inconsistencies if the user submitted multiple writes on the same document at the same time. I took this opportunity to continue work on the integration testing harness that @archseer started a while back. I added an integration test (see below) that confirms that indeed the file writes happen not only out of order, but simtultaneously, leading to the file being clobbered. In a test where we issue a thousand writes that each change the entire file to consist only of the iteration number (i.e., 1, 2, 3, ..., 1000), we can see that the file ends up with a seemingly random jumble:

failures:

---- integration::write::test_write_concurrent stdout ----
thread 'integration::write::test_write_concurrent' panicked at 'assertion failed: `(left == right)`
  left: `"1000"`,
 right: `"4010"`', helix-term/tests/integration/write.rs:66:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    integration::write::test_write_concurrent

This test is now passing with the changes to the write path.

Below is an explanation of some of my decisions in this PR. Opinions and suggestions are welcome!

Document writes

The approach taken was:

  1. Rather than submitting a future to the general job pool, each Document manages its own queue of writes with an mpsc channel. When a user submits a write, this merely queues up a write request on the internal channel.
  2. Document writes emit a new special event type that communicates the revision number that was written and the path. The main event loop is responsible for polling the futures, which prompt the runtime to carry out the actual writes. When writes finish, the main event loop will update the document's last saved revision and the path if the write was successful.
  3. All events that originate from the Editor were consolidated into one enum type. This was discussed in the chat room, and it was originally expressed that we could do this incrementally over time, but the borrow checker intervened with this plan: since the event polls require a mutable borrow, and we can only do this once within the event loop, we have to do this all at once.

Incidentally, this allowed fixes for two unrelated bugs that presently exist:

  • When a document write fails, the modified flag is still reset. By sending the revision number in the write event, the current revision number and the last saved revision number need not be the same at any given point in time. This would, e.g., allow large writes to carry out in the background while the user makes other edits at the same time, and correctly keep the modified flag even after the background write succeeds.
  • In a write command that sets a new path, if the write fails, the path is still changed to the argument path. Now, it will not update the path unless the write succeeds.

Also, the first attempt at fixing #1575 was done by adding fallibility to the job pool. Presently, if you join on all the jobs, any errors are discarded.

Since I was already here, I attempted to fix #1633 as well, and it kind of works, except that the status message quickly gets overwritten by LSP progress messages. I'm not quite sure how to fix this, but I think that can happen in another PR, as this one is quite big enough already.

Edit: integration tests were split off into their own PR: #2359

Fixes #3975 Fixes #2518 Fixes #3727

dead10ck avatar Apr 25 '22 15:04 dead10ck

Not really sure what's going on with the EAGAIN errors. Trying to dig into it. I'm guessing it has something to do with the terminal in the build environment.

dead10ck avatar Apr 26 '22 01:04 dead10ck

Would it be possible to split the PR up into two - one for the integration tests and one to fix document writes (probably based on the integration testing one)? It is quite large as you say 😅

the-mikedavis avatar Apr 27 '22 01:04 the-mikedavis

Sure, I have some of the changes mixed up, but I'll see if I can rebase and split them up

dead10ck avatar Apr 27 '22 13:04 dead10ck

I think you accidentally included out log file.

pickfire avatar May 15 '22 23:05 pickfire

I think you accidentally included out log file.

Thanks! I fixed this. Though as a side note, if you'd like to review, I'd recommend taking a look at #2359 first, as this PR is based on that one. A review would be appreciated. 🙂

dead10ck avatar May 17 '22 14:05 dead10ck

Can you rebase this now? :)

archseer avatar Jun 23 '22 13:06 archseer

Yep, will do next time I get a chance. Although I meant to make a comment here: I've been dogfooding this branch for a little while, and I've noticed one issue with the changes to the revision number tracking: when an auto format happens, helix doesn't mark it as unmodified because the revision number it saves for sending across the channel happens before the formatting change, which itself increases the revision number. I will have to think about how to fix this. I may have to split up the format and save operations into discrete steps so it can do the formatting first and then save the revision number before issuing the write operation.

dead10ck avatar Jun 23 '22 13:06 dead10ck

The issue of the auto format changing the revision number is proving to be kind of a tough nut to crack. I tried making it work by having the auto format callback kick off the save, but the problem here is that the callback actually needs to run, and the current set up is such that the main event loop is the only place callbacks are handled, so write-quit doesn't end up actually doing a write. I tried adding logic to handle callbacks during Jobs::finish, but we don't have the compositor available to us at this point, since that is a terminal specific thing.

I'm also considering if we could just somehow change the jobs so that any particular job can track when its callback has been handled, and then the write would just wait for that.

Progress has been slowed down recently due to personal stuff, but I am still trying to work through it when I can to get this PR done.

dead10ck avatar Jul 19 '22 07:07 dead10ck

Ok @archseer, I managed to get something working that waits for auto format and save to finish on write-quit

It turns the callback type into an enum to allow for callbacks that don't have a compositor and/or editor, and then basically allows you to tell Jobs::finish to ignore callbacks for which you don't give it a necessary reference (i.e. if you don't pass in a Compositor, ignore any callbacks that requires a compositor).

It's a bit of a hack, but it's simpler than some of the other ideas I was able to think of.

I'm open to other suggestions as well.

dead10ck avatar Jul 27 '22 02:07 dead10ck

I found a panic that happens when you quit immediately following a save when you have an active language server that auto-formats on save:

  1. cargo run helix-term/src/ui/editor.rs
  2. :w
  3. :q

I think the :q needs to be entered before the auto-formatting response is received.

backtrace...
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', helix-view/src/tree.rs:274:29
stack backtrace:
   0:     0x56463b08e67d - std::backtrace_rs::backtrace::libunwind::trace::h22893a5306c091b4
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x56463b08e67d - std::backtrace_rs::backtrace::trace_unsynchronized::h29c3bc6f9e91819d
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x56463b08e67d - std::sys_common::backtrace::_print_fmt::he497d8a0ec903793
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:66:5
   3:     0x56463b08e67d - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h9c2a9d2774d81873
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:45:22
   4:     0x56463b0b5f4c - core::fmt::write::hba4337c43d992f49
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/fmt/mod.rs:1194:17
   5:     0x56463b0878b1 - std::io::Write::write_fmt::heb73de6e02cfabed
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/io/mod.rs:1655:15
   6:     0x56463b0905f5 - std::sys_common::backtrace::_print::h63c8b24acdd8e8ce
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:48:5
   7:     0x56463b0905f5 - std::sys_common::backtrace::print::h426700d6240cdcc2
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:35:9
   8:     0x56463b0905f5 - std::panicking::default_hook::{{closure}}::hc9a76eed0b18f82b
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:295:22
   9:     0x56463b0902a9 - std::panicking::default_hook::h2e88d02087fae196
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:314:9
  10:     0x564639a491e3 - <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call::hb10f8f69f0ddaaee
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/boxed.rs:1875:9
  11:     0x564639a19808 - helix_term::application::Application::run::{{closure}}::{{closure}}::h3801ef15e785355d
                               at /home/michael/src/helix/hx/helix-term/src/application.rs:901:13
  12:     0x56463b090d00 - std::panicking::rust_panic_with_hook::habfdcc2e90f9fd4c
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:702:17
  13:     0x56463b090af9 - std::panicking::begin_panic_handler::{{closure}}::he054b2a83a51d2cd
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:586:13
  14:     0x56463b08eb34 - std::sys_common::backtrace::__rust_end_short_backtrace::ha48b94ab49b30915
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:138:18
  15:     0x56463b090869 - rust_begin_unwind
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:584:5
  16:     0x56463b0b42c3 - core::panicking::panic_fmt::h366d3a309ae17c94
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:143:14
  17:     0x56463b0b410d - core::panicking::panic::h8705e81f284be8a5
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:48:5
  18:     0x56463a3c6dfc - core::option::Option<T>::unwrap::hfb651dc565a95d85
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/option.rs:755:21
  19:     0x56463a3db6be - helix_view::tree::Tree::get::h9be0133c774319e5
                               at /home/michael/src/helix/hx/helix-view/src/tree.rs:274:9
  20:     0x564639b678b0 - helix_term::commands::make_format_callback::{{closure}}::{{closure}}::h35c41125ed72bca4
                               at /home/michael/src/helix/hx/helix-term/src/commands.rs:2479:23
  21:     0x564639c52998 - core::ops::function::FnOnce::call_once{{vtable.shim}}::haa758735588049f4
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:227:5
  22:     0x564639c45bfa - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::hb9b36a8e823b7b97
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/boxed.rs:1861:9
  23:     0x564639df971d - helix_term::job::Jobs::finish::{{closure}}::h94411ea9371700c4
                               at /home/michael/src/helix/hx/helix-term/src/job.rs:128:33
  24:     0x564639f6b4fe - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hb734ba7cf1d0f750
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
  25:     0x564639a19baf - helix_term::application::Application::close::{{closure}}::h50a58e8c7c952179
                               at /home/michael/src/helix/hx/helix-term/src/application.rs:919:72
  26:     0x564639a78c99 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h941c8e4690160904
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
  27:     0x564639a194c9 - helix_term::application::Application::run::{{closure}}::h685593b0af45da56
                               at /home/michael/src/helix/hx/helix-term/src/application.rs:906:21
  28:     0x564639a7946b - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hec3dd69f9ce717f3
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
  29:     0x564639a3273d - hx::main_impl::{{closure}}::h954da21fc956f2d1
                               at /home/michael/src/helix/hx/helix-term/src/main.rs:142:53
  30:     0x564639a7911b - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hccd7dc1c48f19f56
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
  31:     0x564639a2c8cc - tokio::park::thread::CachedParkThread::block_on::{{closure}}::h9fce7fe91ecde133
                               at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.20.1/src/park/thread.rs:263:54
  32:     0x564639a35dfa - tokio::coop::with_budget::{{closure}}::h5496a9b2ab317313
                               at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.20.1/src/coop.rs:102:9
  33:     0x564639a2d168 - std::thread::local::LocalKey<T>::try_with::h3beef407bce4b31e
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/thread/local.rs:442:16
  34:     0x564639a2ca4c - std::thread::local::LocalKey<T>::with::h5321acaa83343402
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/thread/local.rs:418:9
  35:     0x564639a2c242 - tokio::coop::with_budget::h06e05fd16a115c8a
                               at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.20.1/src/coop.rs:95:5
  36:     0x564639a2c242 - tokio::coop::budget::had8f4b1b93b3f2c1
                               at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.20.1/src/coop.rs:72:5
  37:     0x564639a2c242 - tokio::park::thread::CachedParkThread::block_on::h87d406f798c9e3a8
                               at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.20.1/src/park/thread.rs:263:31
  38:     0x564639a37d55 - tokio::runtime::enter::Enter::block_on::h6006bfc704f6ecd1
                               at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.20.1/src/runtime/enter.rs:152:13
  39:     0x564639a6b068 - tokio::runtime::thread_pool::ThreadPool::block_on::he30ecd538d338b64
                               at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.20.1/src/runtime/thread_pool/mod.rs:90:9
  40:     0x564639a7b306 - tokio::runtime::Runtime::block_on::h5800ac6a89a0990b
                               at /home/michael/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.20.1/src/runtime/mod.rs:484:43
  41:     0x564639a2b62d - hx::main_impl::h1b84079ec26e0a46
                               at /home/michael/src/helix/hx/helix-term/src/main.rs:144:5
  42:     0x564639a2b47e - hx::main::h058367f4d46396e7
                               at /home/michael/src/helix/hx/helix-term/src/main.rs:37:21
  43:     0x564639a59e0b - core::ops::function::FnOnce::call_once::h8031714db130c54d
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:227:5
  44:     0x564639a2e2ae - std::sys_common::backtrace::__rust_begin_short_backtrace::hacdace9adcecea47
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:122:18
  45:     0x564639a45761 - std::rt::lang_start::{{closure}}::ha07536b68a461eb7
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/rt.rs:145:18
  46:     0x56463b08157e - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::had4f69b3aefb47a8
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:259:13
  47:     0x56463b08157e - std::panicking::try::do_call::hf2ad5355fcafe775
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:492:40
  48:     0x56463b08157e - std::panicking::try::h0a63ac363423e61e
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:456:19
  49:     0x56463b08157e - std::panic::catch_unwind::h18088edcecb8693a
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panic.rs:137:14
  50:     0x56463b08157e - std::rt::lang_start_internal::{{closure}}::ha7dad166dc711761
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/rt.rs:128:48
  51:     0x56463b08157e - std::panicking::try::do_call::hda0c61bf3a57d6e6
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:492:40
  52:     0x56463b08157e - std::panicking::try::hbc940e68560040a9
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:456:19
  53:     0x56463b08157e - std::panic::catch_unwind::haed0df2aeb3fa368
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panic.rs:137:14
  54:     0x56463b08157e - std::rt::lang_start_internal::h9c06694362b5b80c
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/rt.rs:128:20
  55:     0x564639a45730 - std::rt::lang_start::h44af8e86ec90ecff
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/rt.rs:144:17
  56:     0x564639a2b6dc - main
  57:     0x7f4ec38d3237 - __libc_start_call_main
  58:     0x7f4ec38d32f5 - __libc_start_main_impl
  59:     0x564639a090f1 - _start
                               at /build/glibc-2.34/csu/../sysdeps/x86_64/start.S:116
  60:                0x0 - <unknown>

the-mikedavis avatar Aug 08 '22 00:08 the-mikedavis

Nice, good catch @the-mikedavis! This makes sense. It looks like the format callback needs the view to apply the LSP edits, but the shutdown (1) sequence(2) is initiated by closing the view, so by the time we execute the job, the view is gone.

This was fixed in write-quit by blocking on the jobs finishing before quitting, but it looks like this will have to happen even for a regular quit.

I suspect the only reason this isn't common already is because this job wasn't marked as wait before quit before, but now it is, so this code never got a chance to execute even though it would panic the same way.

Edit: actually, I'm a bit confused now. I just tried this out locally and when I do :q, it errors because the document is still modified. And this should be the case. The mod flag should not be cleared to allow quitting until the save is completely done, which now includes formatting.

Edit 2: ah, it can happen with a doc that isn't already formatted, but hasn't been modified in your session. I was able to reproduce.

dead10ck avatar Aug 08 '22 02:08 dead10ck

Ok, I think I fixed it

dead10ck avatar Aug 09 '22 15:08 dead10ck

Thanks for taking a look @pickfire ! Would you be willing to dogfood this build for a bit of time to see if you notice any issues in practice? If we have the ability, it would be prudent to field test a change of this significance before merging to master for everyone.

dead10ck avatar Aug 29 '22 03:08 dead10ck

@dead10ck Sure, will try this out when doing normal editing rather than master branch.

@the-mikedavis I think would be nice to have some experimental features toggling to try out experimental features, or maybe even a label (like testers-needed or something like that) would be nice in a way that can attract some people to try it out.

pickfire avatar Aug 29 '22 13:08 pickfire

I rebased and incorporated a fix with the same effect as #3684. I also did a little refactoring to be able to write a regression test for it; it is now possible to specify a custom TOML string that will get merged into the default languages config to enable testing of language config settings.

I spent a while trying to think of a way to do it in a config format agnostic way, since I know we're not planning on keeping TOML around forever, but I wasn't able to think of any alternative that let us simply override the defaults. Fully specifying the entire language config would be extremely repetitive and get outdated quickly because of the grammar definitions, so this seemed like the worse of the two options.

dead10ck avatar Sep 17 '22 14:09 dead10ck

Thanks a lot @pickfire , I rebased. I don't think that PR is related, though it might touch the same code.

dead10ck avatar Sep 20 '22 20:09 dead10ck

I currently encounter crashes infrequently in helix-core/src/transaction.rs:397:13 on the master branch, I wonder if this PR would fix them.

I haven't managed to reproduce it consistently so I cannot test yet

poliorcetics avatar Oct 11 '22 19:10 poliorcetics

I am not able to write to a file by helix but I am able to do with vim, what is the reason ?

dawkrish avatar Feb 01 '24 11:02 dawkrish

nevermind, it got automaticaly fixed

dawkrish avatar Feb 01 '24 11:02 dawkrish