nvim-rs icon indicating copy to clipboard operation
nvim-rs copied to clipboard

Nvim crash on 10th request

Open eulegang opened this issue 1 year ago • 3 comments

I stumbled upon this issue https://github.com/neovim/neovim/issues/23781 while using this library.

I'm using a nvim_rs::create::new_parent (like thing), which uses tokio::io::Stdout underneath the hood.

This seems to have line buffering on it since whenever I hit the 10th message neovim crashes.

The 10th message id emits 0x0A (10 in positive fix int in the msgpack encoding) which flushes a partial message.

I've switched to unsafe { File::from_raw_fd(1) } and it seems to continue beyond the 10th request.

This might not be a viable solution because of unsafe code, but the flushing (to often / out of control) seems to be an issue over local unix io streams.

eulegang avatar Dec 12 '23 04:12 eulegang

Hey thanks for your report!

So, if what you're diagnosing is true, the scorched earth example should crash after 10 messages, too? Since you mentions unix streams, this is on linux, I presume?

What's weird is that tokio's Stdout wraps std::io::Stdout, which does the line-buffering I think, and which is also used by the predecessor of nvim-rs (namely neovim-lib). I've used that extensively in the past, I'd surely have noticed such a problem.

Or is this a problem with newer neovim version only? There's no mention in the linked issue, but I wonder.

KillTheMule avatar Dec 12 '23 10:12 KillTheMule

That logic seems sound to me. I tried to run the example and I've seen it just work as expected. yet when I run something like

let buf = nvim.get_current_buf().await.unwrap();
for i in 0..20 {
  buf.set_name(&format!("msg-{i}")).await.unwrap()
}

after nvim is initialized, It triggers.

this does seem quite strange.

It's late here. I'll look at it further tomorrow.

eulegang avatar Dec 13 '23 04:12 eulegang

I've hit this problem too with this plugin, it seems it's really an issue on Neovim to me, I've looked into it a bit and found that the problem is easily reproducible by simply using Neovim's RPC and only sending a subset of what you mean to send, the right flush will trigger this, and flushing should be OK.

That being said if this workaround works might be worth going with it as the Neovim team haven't offered much in the way of a fix yet.

Can be triggered by just doing this with a relatively recent version of Neovim:

echo -e -n '\x94\x00\x0a' | ./build/bin/nvim --embed

strottos avatar Dec 22 '23 15:12 strottos

That logic seems sound to me. I tried to run the example and I've seen it just work as expected. yet when I run something like

let buf = nvim.get_current_buf().await.unwrap();
for i in 0..20 {
  buf.set_name(&format!("msg-{i}")).await.unwrap()
}

after nvim is initialized, It triggers.

this does seem quite strange.

It's late here. I'll look at it further tomorrow.

I tried to reproduce, but could not using both an older version of neovim as well as the latest master. Could you share the full code, both on the rust side and what you called from neovim? And share specifics about your system?

KillTheMule avatar Dec 31 '23 22:12 KillTheMule

Can be triggered by just doing this with a relatively recent version of Neovim:

echo -e -n '\x94\x00\x0a' | ./build/bin/nvim --embed

Even that did not reproduce on latest neovim master, can you share specifics about your system?

KillTheMule avatar Dec 31 '23 22:12 KillTheMule

Can be triggered by just doing this with a relatively recent version of Neovim:

echo -e -n '\x94\x00\x0a' | ./build/bin/nvim --embed

Even that did not reproduce on latest neovim master, can you share specifics about your system?

Wonder if they've fixed it, checking. It is possible it's windows only, though I use Linux too...

Will recheck on both.

strottos avatar Jan 04 '24 16:01 strottos

Literally hit straight away on latest version of Neovim on master:

strotter:neovim$ make distclean
rm -rf .deps build
make clean
<snip>
make[1]: Leaving directory '/home/strotter/code/nvim/neovim'
strotter:neovim$ git status
On branch master
Your branch is up-to-date with 'origin/master'.

nothing to commit, working tree clean
strotter:neovim$ git pull
Already up-to-date.
strotter:neovim$ make
mkdir -p ".deps"
<snip>
[384/414] Generating nvim.pot
[414/414] Generating sk.cp1250.mo
strotter:neovim$ echo -e -n '\x94\x00\x0a' | ./build/bin/nvim --embed --clean
nvim: /home/strotter/code/nvim/neovim/src/nvim/rbuffer.c:126: rbuffer_consumed: Assertion `count && count <= buf->size' failed.
Aborted (core dumped)

System: Linux mint 21.1 x86_64

strottos avatar Jan 04 '24 16:01 strottos

In fact it seems to trigger on anything that it doesn't see as valid msgpack, which is not completely unreasonable, except in the case that it's incomplete msgpack that's just flushed early.

strottos avatar Jan 04 '24 17:01 strottos

Ok, so with current neovim HEAD (8df3742), I can reproduce that, both on my PC and my thinkpad. When running the rust code, the last buffer name change I can see is msg-8, so it seems to crash on the 9th request. Strangely enough, this seems not to happen with build type "Release". Could you try that on your end, just to make sure? Maybe the assert simply gets compiled out in Release mode? But it DOES happen for "RelWithDebInfo" builds...

What I'm grappling with right now is the following. Seems to me that the line buffering means that whenever there's a \n char in the stream, the buffer gets flushed (which it also does when it is full). However, when I run

  nvim.feedkeys("iab\ncd", "", true).await.unwrap();

there's a newline as well, but this works just as expected. Maybe I'm misunderstanding something? I'll try examining the bytes we're writing and try to compare them to what neovim sees... Unfortunately, neovim seems not to write a log when crashing :(

KillTheMule avatar Jan 07 '24 14:01 KillTheMule

@strottos In the neovim issue, you wrote

I've tracked it down to that Rust sometimes will send just this bytes snippet:

How did you track that down? I can control what nvim-rs puts into the writer, but that's obviously not the problem, so how did you see what actually arrives at neovim's side?

KillTheMule avatar Jan 07 '24 19:01 KillTheMule

Sorry just seen this, I'll try and look properly tomorrow again but I think just on my Rust based nvim plugin: https://github.com/strottos/nvim-vadre The plugin is basically a step through debugger because I've never found one I'm totally happy with (nor this one yet). It's been easily reproducible for me by me just putting a breakpoint in a new file and Nvim will abort. I think I tracked the above by simply logging bytes out to a file, can't remember how exactly now, was a good few months back. Possibly by just manually adding logging to Nvim (which might make this a heisenbug of course, but it didn't seem to).

strottos avatar Jan 19 '24 22:01 strottos

I did investigate some, I've added my findings to the neovim issue.

KillTheMule avatar Jan 20 '24 17:01 KillTheMule

A lot of your conclusions looked similar to mine. The linked PRs did nothing to fix this and after a day or two of trying to figure out a fix I gave up, seemed hard. Seems like something is amiss in the RPC itself that you wouldn't trigger in Lua (as per most nvim plugins).

strottos avatar Jan 20 '24 22:01 strottos

Sorry for posting and running, life got busy.

Looks you two are more familiar with nvim. I was just spelunking through the code base and running gdb, but it's hard to track of what's going on for me.

I'm fine with this being closed out with out me being involved.

Also using unsafe is fine for me since it's just in an application where I'm using unsafe any way. But I'm sure it should be used in this crate.

eulegang avatar Jan 21 '24 18:01 eulegang

I've written some details in the neovim issue, but overall the line buffering from rust's stdout is triggering the issue, so I'll work on removing that.

KillTheMule avatar Jan 21 '24 21:01 KillTheMule

Anyone interested in trying out https://github.com/KillTheMule/nvim-rs/pull/48? This should fix the issue from nvim-rs's side, although I'll be looking into fixing it in neovim later on.

KillTheMule avatar Jan 26 '24 16:01 KillTheMule

Seems to work for me now on first test, and this was really something that was happening constantly before for my plugin.

strottos avatar Jan 28 '24 09:01 strottos

Thanks for all your hard work on this @KillTheMule , really is appreciated 👏

strottos avatar Jan 28 '24 09:01 strottos

Closed by https://github.com/KillTheMule/nvim-rs/pull/48, will cut a new release.

KillTheMule avatar Feb 03 '24 12:02 KillTheMule