helix icon indicating copy to clipboard operation
helix copied to clipboard

(Git) diff gutter implementation

Open pascalkuthe opened this issue 1 year ago • 54 comments

Supercedes #1623.

I took this PR up as the original PR was tagged as waiting-for-author/inactive.

This PR is rebased on the current master to avoid conflicts. Additionally, I have updated the PR with some improvements. The design is now much more similar to the design suggested by @p-e-w.

The original author already did most of the work by implementing incremental diffing. However, I have now streamlined the git interface. The incremental diff only requires that git provides the head state of a file while the rest is handled by helix. Therefore, I have simplified the git interface down to one function: get_file_head. This interface is now provided by a trait, that is implemented for git but is trivial to implement for other diff sources as well and could be exposed to the plugin system in the future.

The original PR cached the git repositories in memory. This has been removed because it did not handle nested repositories correctly (example: $HOME as git repo for dotfile management). It's also not required anymore as the git repo is only accessed when a file is open instead of every keystroke.

Finally, I have improved the git implementation by switching to gitoxide (as suggested in the original PR). During this switch, I found a few edge cases that were not handled correctly by the previous implementation:

  • Non utf-8 encoded files were not supported (now uses document encoding)
  • Directories/symlinks that were replaced by files would use the directory children/link path as the diff base (now the diff is not shown)

I have purposefully not added more features to this PR to keep it somewhat small and reviewable. In the future, I would like to expand on this PR. In particular, I would like to implement a mechanism to reload the diff base after a file is first opened. This avoids having to restart helix after every commit, to continue reviving complex diffs. However, this feature is more complex (requires watching .git) so I choose not to include it here.

Currently, all commits are separate to make review easier. I would like to squash the commits before this is merged, as the implementation changed quite a bit.

Note that there is currently one test failing in helix-core but that seems completely unrelated as the PR does not touch anything in helix-core

pascalkuthe avatar Sep 18 '22 16:09 pascalkuthe

It seems the pipeline is failing because gitoxide has MSVR of 1.59 but helix currently targets 1.57. Gitoxide has been targeting newer versions for quite a while so downgrading doesn't seem like an option. I am not sure what the policy on MSRV is for helix, perhaps a MSVR bump to 1.59 could be possible. Even Ubuntu has updated to 1.59 at this point so it seems a reasonable requirement

pascalkuthe avatar Sep 18 '22 17:09 pascalkuthe

We set the MSRV a while back, we're probably ok to update now (though that should be done separately from this PR). I think we set it to 1.57 on account of Void linux.

Is gitoxide really necessary though? It adds a very significant number of transitive dependencies (285 crates to build where master has 179 and with git2 we have 189) and compile time. The transitive dependencies also seem to add new compile-time dependencies on cmake and make (libz-sys).

the-mikedavis avatar Sep 18 '22 17:09 the-mikedavis

Whether gitoxide or git2-rs is used doesn't make too much difference for this PR. I was able to convert my implementation to git2-rs fairly quickly

I was able to remove the zlib-ng native dependency by disabling the gitoxide default features (this costs some performance but is hardly relevant for our use-case), git2-rs can be compiled in vendored mode. This means no matter what implementation we use, no dependency on system libraries is introduced. In that case the compile times are similar (clean debug build 27s git2-rs vs 27.5s gitoxide on my machine).

I think compile times are the main concern here, because both projects are currently used in cargo/will be used in cargo in the near future and are maintained by trusted members of the rust community so supply/maintenance concerns should not be a problem. Furthermore comparing number of dependencies between the project does not really make sense because git2-rs contains 350k LOC of C in a single crate, while gitoxide simply spreads its functionality across multiple crates.

I believe gitoxide is preferable because its development is much more active and its written in rust. Contributing to libgit2 is more difficult than contributing to gitoxide, because contributing to such a large C project is very difficult (the usual problems with C) compared to rust.

I hope that git support is expanded upon in helix in the future. If we run into issues with gi2 in during that process, we can not do much about it because even longstanding issues of much larger projects (like cargo) are undressed. By comparison contributing to gitoxide is much easier.

I personally believe that trading 0,5 seconds of compile time for a cleaner/easier to fix git implementation is worth but for this PR it doesn't make a difference so I could easily change the git implementation if that is preferred.

pascalkuthe avatar Sep 18 '22 21:09 pascalkuthe

I have discovered that the current diffing implementation I reused from #1623 has major shortcomings. I am working on improving this but it will take a while

pascalkuthe avatar Sep 19 '22 02:09 pascalkuthe

Very much in support of gitoxide too, because 1) it would help @byron know if there are any missing features that would be useful for helix and have more real-life testing 2) it's very easy to contribute to 3) less C to build means users will have much less unexpected problems because of unrelated-to-helix causes

poliorcetics avatar Sep 19 '22 07:09 poliorcetics

Thanks for pinging me! It's exciting to see gitoxide in helix, the editor 'of the new kind' that I liked most thus far and want to use ❤️!

Just now I was thinking about adding a line-counter to ein tool hours -p (now 15% faster than git at 1/10th of the peak memory) which would probably be the moment where this plumbing-level implementation of diffing I did when switching crates-index-diff (powering docs.rs) to gitoxide would turn into an easy-to-use API for consumption by everyone without sacrificing performance. I wanted to move on and postpone once again as to me this is 'candy' even though I have a lot of cargo related work to do… but now I think it's worth to see how fast I can put a higher-level API around the plumbing code that would hopefully also work here one day.

@pascalkuthe Thanks for this PR and for running the numbers, I never did and was positively surprised the compile times are only marginally different. I also took a look at the code and understand gitoxide is currently used to pull the base-object bytes to compare to out of the object database, with the diffing happening on top of that. I also hope that in future iterations it will be possible to re-use the Repository instance as discovering a git repository is costly in terms of IOps. Additionally I recommend to set a ceiling-directory for the discovery to assure that files that aren't in a git-repo don't cause it to look up to the root of the filesystem, if that such a ceiling is known at all - maybe the CWD could work initially? Since only the ODB is required here, one should probably adjust the open-options to not load global git configuration, but rely only on the repository-local one to avoid waste. The code for that is a bit cumbersome, but starship already went through the trouble and the result could be copied from there. Also note how it uses the git:: namespace to access many of the types, which helps to see more meaningful type names.

That said, I think it's safe to say that if helix-editor chooses to use gitoxide, I'd be there for instant-support as far as my sleep-cycle permits :D, and would make sure you get the fastest and most efficient git implementation available.

Byron avatar Sep 19 '22 08:09 Byron

In the previous PR diffing was implemented using the helix internal changeset after a suggestion by @archseer. While using this PR in practice I noticed that this approach does not work well. The helix internal diffs are meant to allow helix to efficiently apply edits to the text and not to produce a human readable diff. It is possible to fix this up somewhat (as the original PR did) but the fixup in the original PR was pretty naive and did not produce good results. For example removing a character produced a removed line symbol, adding it back did not change that, inserting new test after that character adds more removed line symbols.

I have invested quite a bit of effort into improving this and have fixed the most horrible cases. However this still does not work well. The problem essentially boils down to the fact that a linediff is fundamentally different from a character diff and trying to construct one from the other doesn't really work.

Therefore I decided to abandon this approach and instead use a more traditional async diff approach that is used in other editors (I looked at nvim-gitsgins and micro mainly). This implementation simply spawns an async task for each file that computes the diff on each file change. I have spend quite a bit of time optimizing this to address to comment made in the original PR regarding duplicate allocations. While this approach might not scale as-well to huge files as the other approach it is still plenty fast enough for any reasonable sized file. I have added a check that disables the diffing in case files get too huge to ensure helix is still able to deal large files.

The implementation (even with optimizations) is much simpler and more maintainable than the UI fixup code I had developed for the changset based approach while providing much better results. I did not notice any significant different between the async-diff and changset implementation for normal-sized files. Furthermore the performance could be improved further by optimizing the line iterator cessen/ropey#25 and potentially using a different diffing algorithm (histogram diffs should outperform patience and mayer for sourcecode: mitsuhiko/similar#9).

@Byron thanks for reaching out/the review/the support. I had my hands full with this and work today but I will take a look at the git implementation again tomorrow :)

pascalkuthe avatar Sep 20 '22 00:09 pascalkuthe

There are quite a few grammar mistakes in the comments. Should these be addressed in a follow-up PR so as to not clog up this one?

kirawi avatar Sep 20 '22 20:09 kirawi

There are quite a few grammar mistakes in the comments. Should these be addressed in a follow-up PR so as to not clog up this one?

I am still working on incorporating the suggestions from @Byron, once that is done I will quickly go trough all my comments again with a spellchecker. I hope that catches most things. I also think that further grammer fixes might then be better bundeled into a follow up PRs.

pascalkuthe avatar Sep 20 '22 22:09 pascalkuthe

@Byran Sorry it took a while to respond to your points but solving your suggestions turned out more complicated then expected.

I also hope that in future iterations it will be possible to re-use the Repository instance as discovering a git repository is costly in terms of IOps.

I wanted to include repository caching in this PR originally. However an efficent directory caching strategy turned out to be nontrivial. I already have a prototype locally that I spent a lot of time one (hence the delay) but the implementation still needs some work. Furthermore I think the caching implementation is best split into a seperate PR.

Additionally I recommend to set a ceiling-directory for the discovery to assure that files that aren't in a git-repo don't cause it to look up to the root of the filesystem, if that such a ceiling is known at all - maybe the CWD could work initially?

I think a ceiling directory can not generally be determined in a useful manner. Users might start helix in subdirectory (I frequently do) of the repo and I would expect the diff to still show up in that case. Anytime helix needs to find the root directory (for example when opening the file picker) it currently walks up the FS tree. Thats just as slow as gitoxide walking up the fs tree so its not worth adding to this PR. However my repository caching implementation actually ended up provide ceiling directories quite naturally so this will be improved in the future.

Think this PR is ready for a final review now and could be merged if there are no more comments (after a history squash).

pascalkuthe avatar Sep 21 '22 14:09 pascalkuthe

Been using this for about a week and it seems to work out very nicely. One thing that might be nice is to allow the user to configure the character used for the gutter. The defaults might be + for additions and - for negations and perhaps ~ for changed lines.

nrdxp avatar Oct 03 '22 20:10 nrdxp

I prefer the current set of markers but I agree that it should be customizable :)

archseer avatar Oct 03 '22 20:10 archseer

I prefer the current set of markers but I agree that it should be customizable :)

I do have my own fork with a couple different PRs merged together including this one, so maybe I should ensure that this is actually what the markers are supposed to look like: image

If so, a regular colored bar seems a bit bland for a default, but that's just my opinion. The default doesn't matter too much though if we can configure it :sweat_smile:

nrdxp avatar Oct 03 '22 20:10 nrdxp

It's similar to how it looks in other editors:

archseer avatar Oct 03 '22 20:10 archseer

I like the current default but also think customization would be nice. I was planning to implement customization in a follow-up PR to keep this one easier to review. Is that ok @archseer or do you see that as a requirement for merging?

pascalkuthe avatar Oct 03 '22 21:10 pascalkuthe

I haven't used a graphical editor for quite some time, so I was unaware that this was common, however after explaining that, it makes sense why its the default now.

If making it configurable is a hassle, I think it'd be fine to merge and work that out later assuming that's fine with everyone else.

nrdxp avatar Oct 03 '22 22:10 nrdxp

Using this for some time already, works pretty great so far.

A few things I noticed though:

  • Staged changes are still displayed as changes, git diff doesn't do that. AFAIK other git diff gutter implementations don't do that as well. Maybe add an optional different color for staged changes?
  • If a change outside of the helix occurs (e.g. git commit) the change isn't displayed.

These are nitpicks though, and especially the second change may require file-watching (unless git-oxide provides some kind of eventstream?) and should probably be resolved in a follow-up PR.

Philipp-M avatar Oct 13 '22 16:10 Philipp-M

These are nitpicks though, and especially the second change may require file-watching (unless git-oxide provides some kind of eventstream?)

Currently it does not, even though I believe it would be beneficial to have behind a feature toggle. gitoxide would have to do the watching, but could know exactly which files to watch (e.g. configuration files, the index, the HEAD, refs). I'd be happy to help with that, just give me a hint for when you need it and your ideas and it can be done. Note that it might be trivial to implement it just the way it's needed here as well, and such a feature might arrive earlier if done that way.

Byron avatar Oct 14 '22 04:10 Byron

Using this for some time already, works pretty great so far.

A few things I noticed though:

* Staged changes are still displayed as changes, `git diff` doesn't do that. AFAIK other git diff gutter implementations don't do that as well. Maybe add an optional different color for staged changes?

* If a change outside of the helix occurs (e.g. `git commit`) the change isn't displayed.

These are nitpicks though, and especially the second change may require file-watching (unless git-oxide provides some kind of eventstream?) and should probably be resolved in a follow-up PR.

Thanks for the feedback. I am very aware of these shortcomings and I have purposefully not opted to fix them in this PR. As @Byron already pointed out watching the repo for changes requires quite a bit of effort. My plan was just to watch the entire .git directory in helix (thats how other editors do it but I will definitly talk to @Byron when I get to that, thanks for offering help :smiley:). This will also tie into repository caching (which is also more complex feature I want to implement in the future). I want to avoid implementing these more complex features in this PR to ensure its not stuck in limbo forever.

However your comment did inspire me to implement a workaround: The reload command now reloads both the file itself and the file from the git repo it is compared against.

I opted to display the diff against HEAD instead of against the index because the index changes quite frequently, while committing is usually more rare. Having to reload each time you stage changes (which I personally do quite often) would be annoying and highlight the current shortcomings. However reloading each-time the file is committed seems reasonable to me. This PR is modeled after the micro editor which handles git diff signs similarly

pascalkuthe avatar Oct 14 '22 10:10 pascalkuthe

Been testing this branch for a while and it works pretty well for me, and is reasonably responsive. One minor issue I've encountered so far is that sometimes an update isn't triggered after I do a paste (p) or undo (u), and the gutters are only updated when I press another key (hjkl or really anything). Haven't figured out a reliable way to reproduce it though.

That said I hope this gets merged sooner, as it drastically improves the experience of editing existing files.

xJonathanLEI avatar Oct 18 '22 16:10 xJonathanLEI

OK I'm getting a panic (and hence complete crash of helix) when using this on files containing non-ASCII characters. The crash happens in helix_vcs::differ::DiffWorker::perform_diff based on the backtrace. I'm working on figuring out the reproduction steps. Will update shortly.

Panic message: byte index 24 is not a char boundary

Update: it happens every single time on the document I'm having the issue with, but it's kinda hard to come up with a minimal setup that reproduces it. This is likely gonna take a while.

xJonathanLEI avatar Oct 20 '22 11:10 xJonathanLEI

Thanks for the feedback @xJonathanLEI

Been testing this branch for a while and it works pretty well for me, and is reasonably responsive. One minor issue I've encountered so far is that sometimes an update isn't triggered after I do a paste (p) or undo (u), and the gutters are only updated when I press another key (hjkl or really anything). Haven't figured out a reliable way to reproduce it though.

That said I hope this gets merged sooner, as it drastically improves the experience of editing existing files.

I am currently working on making the diff a lot faster (similar is actually a pretty slow diff implementation) that should also improve latency/allow me to reduce the debounce time. I think the reason you are not seeing an update after a paste is that the diff is not updated asynchronously and may not finish before the redraw happens. The next redraw only happens when you press another key. Fixing this will require some kind of async redraw event. I have a rough design sketched out in my head but I will probably not get around to working on that before the weekend.

OK I'm getting a panic (and hence complete crash of helix) when using this on files containing non-ASCII characters. The crash happens in helix_vcs::differ::DiffWorker::perform_diff based on the backtrace. I'm working on figuring out the reproduction steps. Will update shortly.

Panic message: byte index 24 is not a char boundary

Update: it happens every single time on the document I'm having the issue with, but it's kinda hard to come up with a minimal setup that reproduces it. This is likely gonna take a while.

Please let me know if you have a reproducible example you can share. A full backtrace would be very useful aswell. I am sort of surprised by this crash because I am not actually dealing with byte indices (only line indices).

pascalkuthe avatar Oct 20 '22 11:10 pascalkuthe

@pascalkuthe I'm still working on getting a minimal example to work, but here's the full backtrace:

thread 'tokio-runtime-worker' panicked at 'byte index 24 is not a char boundary; it is inside '[REDACTED]' (bytes 23..26) of `[REDACTED]`', library/core/src/str/mod.rs:127:5
stack backtrace:
   0:     0x5590ce0f3ad4 - std::backtrace_rs::backtrace::libunwind::trace::h22893a5306c091b4
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x5590ce0f3ad4 - std::backtrace_rs::backtrace::trace_unsynchronized::h29c3bc6f9e91819d
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x5590ce0f3ad4 - std::sys_common::backtrace::_print_fmt::he497d8a0ec903793
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:66:5
   3:     0x5590ce0f3ad4 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h9c2a9d2774d81873
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:45:22
   4:     0x5590cd84eedc - core::fmt::write::hba4337c43d992f49
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/fmt/mod.rs:1194:17
   5:     0x5590ce0ed045 - std::io::Write::write_fmt::heb73de6e02cfabed
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/io/mod.rs:1655:15
   6:     0x5590ce0f597e - std::sys_common::backtrace::_print::h63c8b24acdd8e8ce
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:48:5
   7:     0x5590ce0f597e - std::sys_common::backtrace::print::h426700d6240cdcc2
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:35:9
   8:     0x5590ce0f597e - std::panicking::default_hook::{{closure}}::hc9a76eed0b18f82b
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:295:22
   9:     0x5590ce0f56ad - std::panicking::default_hook::h2e88d02087fae196
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:314:9
  10:     0x5590ce0f600b - std::panicking::rust_panic_with_hook::habfdcc2e90f9fd4c
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:702:17
  11:     0x5590ce0f5e34 - std::panicking::begin_panic_handler::{{closure}}::he054b2a83a51d2cd
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:588:13
  12:     0x5590ce0f4004 - std::sys_common::backtrace::__rust_end_short_backtrace::ha48b94ab49b30915
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:138:18
  13:     0x5590ce0f5b9d - rust_begin_unwind
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:584:5
  14:     0x5590cd7bee23 - core::panicking::panic_fmt::h366d3a309ae17c94
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:143:14
  15:     0x5590cd852097 - core::str::slice_error_fail_rt::h4a702334d6a8cbe3
  16:     0x5590cd848b77 - core::ops::function::FnOnce::call_once::h82ad8e60b04d5dd8
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:227:5
  17:     0x5590cd84d298 - core::intrinsics::const_eval_select::h6f036bf6ff0cbd0d
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/intrinsics.rs:2361:5
  18:     0x5590cd7bf0e2 - core::str::slice_error_fail::h83fedaa865594a4f
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/str/mod.rs:86:9
  19:     0x5590ce0acc06 - <ropey::slice::RopeSlice as core::cmp::PartialEq<ropey::slice::RopeSlice>>::eq::h6bf5c5ab8098096c
  20:     0x5590cde4520e - similar::algorithms::compact::shift_diff_ops_up::hb367b4fbddb00f31
  21:     0x5590cde446cb - <similar::algorithms::compact::Compact<Old,New,D> as similar::algorithms::hook::DiffHook>::finish::h5c87f98be3c719e3
  22:     0x5590cde56b4e - similar::algorithms::myers::diff_deadline::h1393bf66f65ae71a
  23:     0x5590cde491e8 - similar::common::capture_diff_slices_deadline::he9fd99b0ccc2db79
  24:     0x5590cde53296 - helix_vcs::differ::DiffWorker::perform_diff::h2a8fc7713734917a
  25:     0x5590cde56700 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h57728c80c1eb8cbb
  26:     0x5590cde5061a - tokio::runtime::task::core::CoreStage<T>::poll::h4e5864b3412f2549
  27:     0x5590cde5ae0e - tokio::runtime::task::harness::Harness<T,S>::poll::h037aae2d6cb197d7
  28:     0x5590ce13e9d3 - std::thread::local::LocalKey<T>::with::h32ce78d83cb8438d
  29:     0x5590ce1375a7 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::h6fe208a88942a4e0
  30:     0x5590ce1361fe - tokio::runtime::scheduler::multi_thread::worker::Context::run::h0935d99e9f2a637e
  31:     0x5590ce1402e7 - tokio::macros::scoped_tls::ScopedKey<T>::set::h375dd2ffcbb41b17
  32:     0x5590ce135fd5 - tokio::runtime::scheduler::multi_thread::worker::run::h90e208ae4403939d
  33:     0x5590ce117cc1 - <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll::h7b3db4a195860447
  34:     0x5590ce119a99 - tokio::runtime::task::harness::Harness<T,S>::poll::h5368f81052e23cce
  35:     0x5590ce12acfb - tokio::runtime::blocking::pool::Inner::run::hb1a9142cae4d0127
  36:     0x5590ce11e9e2 - std::sys_common::backtrace::__rust_begin_short_backtrace::h11fb56588b2a047f
  37:     0x5590ce122a0b - core::ops::function::FnOnce::call_once{{vtable.shim}}::h9450455c191bc87b
  38:     0x5590ce0fb0b5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::ha99802c2c52ada61
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/boxed.rs:1861:9
  39:     0x5590ce0fb0b5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::ha39aea1c57e28a15
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/boxed.rs:1861:9
  40:     0x5590ce0fb0b5 - std::sys::unix::thread::Thread::new::thread_start::h9f8e3d72b1f7662f
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys/unix/thread.rs:108:17
  41:     0x7ff87b6dc609 - start_thread
  42:     0x7ff87b4b2133 - clone
  43:                0x0 - <unknown>

Looks like the issue could be from inside the similar crate. Please let me know if this is enough information for you. I can DM you the full file causing the issue if needed. Thanks!

xJonathanLEI avatar Oct 20 '22 12:10 xJonathanLEI

@pascalkuthe I'm still working on getting a minimal example to work, but here's the full backtrace:

thread 'tokio-runtime-worker' panicked at 'byte index 24 is not a char boundary; it is inside '[REDACTED]' (bytes 23..26) of `[REDACTED]`', library/core/src/str/mod.rs:127:5
stack backtrace:
   0:     0x5590ce0f3ad4 - std::backtrace_rs::backtrace::libunwind::trace::h22893a5306c091b4
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x5590ce0f3ad4 - std::backtrace_rs::backtrace::trace_unsynchronized::h29c3bc6f9e91819d
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x5590ce0f3ad4 - std::sys_common::backtrace::_print_fmt::he497d8a0ec903793
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:66:5
   3:     0x5590ce0f3ad4 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h9c2a9d2774d81873
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:45:22
   4:     0x5590cd84eedc - core::fmt::write::hba4337c43d992f49
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/fmt/mod.rs:1194:17
   5:     0x5590ce0ed045 - std::io::Write::write_fmt::heb73de6e02cfabed
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/io/mod.rs:1655:15
   6:     0x5590ce0f597e - std::sys_common::backtrace::_print::h63c8b24acdd8e8ce
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:48:5
   7:     0x5590ce0f597e - std::sys_common::backtrace::print::h426700d6240cdcc2
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:35:9
   8:     0x5590ce0f597e - std::panicking::default_hook::{{closure}}::hc9a76eed0b18f82b
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:295:22
   9:     0x5590ce0f56ad - std::panicking::default_hook::h2e88d02087fae196
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:314:9
  10:     0x5590ce0f600b - std::panicking::rust_panic_with_hook::habfdcc2e90f9fd4c
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:702:17
  11:     0x5590ce0f5e34 - std::panicking::begin_panic_handler::{{closure}}::he054b2a83a51d2cd
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:588:13
  12:     0x5590ce0f4004 - std::sys_common::backtrace::__rust_end_short_backtrace::ha48b94ab49b30915
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:138:18
  13:     0x5590ce0f5b9d - rust_begin_unwind
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:584:5
  14:     0x5590cd7bee23 - core::panicking::panic_fmt::h366d3a309ae17c94
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:143:14
  15:     0x5590cd852097 - core::str::slice_error_fail_rt::h4a702334d6a8cbe3
  16:     0x5590cd848b77 - core::ops::function::FnOnce::call_once::h82ad8e60b04d5dd8
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:227:5
  17:     0x5590cd84d298 - core::intrinsics::const_eval_select::h6f036bf6ff0cbd0d
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/intrinsics.rs:2361:5
  18:     0x5590cd7bf0e2 - core::str::slice_error_fail::h83fedaa865594a4f
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/str/mod.rs:86:9
  19:     0x5590ce0acc06 - <ropey::slice::RopeSlice as core::cmp::PartialEq<ropey::slice::RopeSlice>>::eq::h6bf5c5ab8098096c
  20:     0x5590cde4520e - similar::algorithms::compact::shift_diff_ops_up::hb367b4fbddb00f31
  21:     0x5590cde446cb - <similar::algorithms::compact::Compact<Old,New,D> as similar::algorithms::hook::DiffHook>::finish::h5c87f98be3c719e3
  22:     0x5590cde56b4e - similar::algorithms::myers::diff_deadline::h1393bf66f65ae71a
  23:     0x5590cde491e8 - similar::common::capture_diff_slices_deadline::he9fd99b0ccc2db79
  24:     0x5590cde53296 - helix_vcs::differ::DiffWorker::perform_diff::h2a8fc7713734917a
  25:     0x5590cde56700 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h57728c80c1eb8cbb
  26:     0x5590cde5061a - tokio::runtime::task::core::CoreStage<T>::poll::h4e5864b3412f2549
  27:     0x5590cde5ae0e - tokio::runtime::task::harness::Harness<T,S>::poll::h037aae2d6cb197d7
  28:     0x5590ce13e9d3 - std::thread::local::LocalKey<T>::with::h32ce78d83cb8438d
  29:     0x5590ce1375a7 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::h6fe208a88942a4e0
  30:     0x5590ce1361fe - tokio::runtime::scheduler::multi_thread::worker::Context::run::h0935d99e9f2a637e
  31:     0x5590ce1402e7 - tokio::macros::scoped_tls::ScopedKey<T>::set::h375dd2ffcbb41b17
  32:     0x5590ce135fd5 - tokio::runtime::scheduler::multi_thread::worker::run::h90e208ae4403939d
  33:     0x5590ce117cc1 - <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll::h7b3db4a195860447
  34:     0x5590ce119a99 - tokio::runtime::task::harness::Harness<T,S>::poll::h5368f81052e23cce
  35:     0x5590ce12acfb - tokio::runtime::blocking::pool::Inner::run::hb1a9142cae4d0127
  36:     0x5590ce11e9e2 - std::sys_common::backtrace::__rust_begin_short_backtrace::h11fb56588b2a047f
  37:     0x5590ce122a0b - core::ops::function::FnOnce::call_once{{vtable.shim}}::h9450455c191bc87b
  38:     0x5590ce0fb0b5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::ha99802c2c52ada61
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/boxed.rs:1861:9
  39:     0x5590ce0fb0b5 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::ha39aea1c57e28a15
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/boxed.rs:1861:9
  40:     0x5590ce0fb0b5 - std::sys::unix::thread::Thread::new::thread_start::h9f8e3d72b1f7662f
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys/unix/thread.rs:108:17
  41:     0x7ff87b6dc609 - start_thread
  42:     0x7ff87b4b2133 - clone
  43:                0x0 - <unknown>

Looks like the issue could be from inside the similar crate. Please let me know if this is enough information for you. I can DM you the full file causing the issue if needed. Thanks!

Hmm I don't think similar is causing this. This looks more likely a bug in ropey, comapring two RopeSlices should never panic as far as I am aware. If you could send me the file that is causing this (and the file from git you are comparing it against) as an email to [email protected] that would be great.

pascalkuthe avatar Oct 20 '22 12:10 pascalkuthe

@pascalkuthe OK I trimmed down my file quite a bit, and redacted everything so I can publish it here for everyone to see.

It's not exactly "minimal", but it should help reproduce the issue.

First, set up a repository with the file commited into HEAD:

$ mkdir gutter-bug
$ cd gutter-bug
$ git init
$ git config user.name "Temp"
$ git config user.email "[email protected]"
$ curl -OL "https://github.com/helix-editor/helix/files/9830422/test.txt"
$ git add .
$ git commit -m "Comit"

To cause the crash, open helix:

$ hx test.txt

And press these EXACT keys:

47ggv9j12ld (the key after 12 is lower-case L, not 1)

I haven't figured out why, but no, using neovim to make the exact same deletion and then opening it in Helix doesn't cause the crash. In fact, if you delete just one less character in Helix, :wq and open it again, go to line 47 and delete one char, it doesn't crash. Technically both using another editor and separating the edits should make no difference, but there you go.

Please let me know if you can reproduce the crash with this setup. Thanks!

xJonathanLEI avatar Oct 20 '22 13:10 xJonathanLEI

BTW I tried the following Helix versions:

  • Current master (8c9bb236): no crash
  • PR head (b7513220): crash
  • PR rebased on master: crash

xJonathanLEI avatar Oct 20 '22 13:10 xJonathanLEI

Wow that's a pretty obscure edgecase. How did you manage to come up with the key sequence :D I would have never found that, thanks for reporting that! I will try my best to fix it. Just looking at the backtrace I am not sure whether this PR is actually causing that. I think this might actually be a super nieche bug in helix or ropey that is just exposed by this PR. Nonetheless I will look into this the next few days and try to fix either in this PR or as a separate fix for helix/ropey

pascalkuthe avatar Oct 20 '22 13:10 pascalkuthe

Great! Looking forward to the fix.

For the key sequence, I actually noticed that sometimes I can trigger the crash by deleting chars. So I started looking for a continuous block of chars that can be deleted to trigger it. I started somewhere (line 47 in this case), and keep pressing d until it crashes, and keep track of how many chars I've deleted, and then reverse engineered the sequence lol.

Speaking of this, one more observation that might be helpful: if I simply hold d down and wait for chars to be deleted, sometimes it would skip pass the crash point without crashing. I figured this has to do with the async design so it should be irrelevant, but I'm bringing up here just in case.

xJonathanLEI avatar Oct 20 '22 13:10 xJonathanLEI

Just looking at the backtrace I am not sure whether this PR is actually causing that.

Yeah I suspect some part of the diffing algorithm is using byte indexing instead of char indexing or something like that.

archseer avatar Oct 21 '22 01:10 archseer

Reporting yet another problem, this time directly related to the PR.

I'm not sure what the support status for Android is, but I've been using Helix on Android myself and it works great. However, this PR doesn't build on Android Termux:

$ cargo install --locked --path helix-term
---- SNIP ----
error[E0201]: duplicate definitions with name `interpolate_user`:
   --> /data/data/com.termux/files/home/.cargo/registry/src/github.com-1ecc6299db9ec823/git-config-0.7.1/src/values/path.rs:174:5
    |
166 | /     fn interpolate_user(
167 | |         self,
168 | |         _home_for_user: fn(&str) -> Option<PathBuf>,
169 | |     ) -> Result<Cow<'a, std::path::Path>, interpolate::Error> {
    | |_____________________________________________________________- previous definition of `interpolate_user` here
...
174 | /     fn interpolate_user(
175 | |         self,
176 | |         home_for_user: fn(&str) -> Option<PathBuf>,
177 | |     ) -> Result<Cow<'a, std::path::Path>, interpolate::Error> {
    | |_____________________________________________________________^ duplicate definition

For more information about this error, try `rustc --explain E0201`.
error: could not compile `git-config` due to previous error
warning: build failed, waiting for other jobs to finish...
error: failed to compile `helix-term v0.6.0 (/data/data/com.termux/files/home/repos/helix/helix-term)`, intermediate artifacts can be found at `/data/data/com.termux/files/home/repos/helix/target`

Looks like the new dependency introduced in this PR (git-config 0.7.1) doesn't work on Android. I haven't checked their code but looks like a cfg misuse.

xJonathanLEI avatar Oct 22 '22 06:10 xJonathanLEI