Dune crashes when editing a file in exec watch mode
Expected Behavior
Dune not to crash when editing data.ml
Actual Behavior
Dune crashes when data.ml is edited, even if you just save the file without changing anything, with this output:
Error: _build/default/.hello.eobjs/dune__exe__Hello.impl.d: No such file or
directory
-> required by _build/default/.hello.eobjs/dune__exe__Hello.impl.all-deps
-> required by _build/default/hello.exe
Had 1 error, waiting for filesystem changes...
Reproduction
clone https://github.com/LAC-Tech/dunebug run
dune exec -w ./hello.exe
Save file data.ml in your editor. (saving hello.ml works fine). You may sometimes have to save 2 or 3 times to get it to crash.
Specifications
dune 3.16.0 The OCaml toplevel, version 5.2.0 Arch Linux, kernel 6.10.6, x86_64
This might be related to #10959.
@LAC-Tech could you try again and tell me if it is working with the latest version of dune? It seems to have been fixed with #10960. I can't reproduce with your example again using main version of dune.
Hi, excuse the late reply and basic question - but what's the best way to get this main dune version? build myself, install with opam flag? Just point me in the right direction.
You can either build it yourself or use the binary version from https://preview.dune.build (nightly version of dune built from the main branch).
Thanks. Exact same issue I'm afraid.
$ ~/.dune/bin/dune --version
"Dune Developer Preview: build 2024-11-02T02:13:00Z, git revision
c2e0160fd0b593c63536c7081238da1da5140720"
Then running with: ~/.dune/bin/dune exec -w ./hello.exe, and saving data.ml (which seems to be what's causing the race condition), I get the same error message.
Tried with a plain neovim, default config no plugins incase an LSP was causing an issue. But no luck.
Is there any information I can provide that might help?
Thanks for trying! I was betting on the fact that the mentioned PR introduced delays in computation that could possibly solve the issue. Apparently not.
You already provided a reproduction case so we don't need extra information.
I am unable to reproduce this. Is anybody else able to reproduce the issue?
Greetings future people trawling obscure github issues trying to solve their problem!
I have solved the issue, it was actually a neovim issue. Something about the way neovim (and I believe vim) saves files interacts very poorly with hot reloading (I had this same issue with bun).
Turns out you need to add this to your init.lua:
vim.opt.backupcopy = "yes"
I'm reliably informed by an LLM that it's the following in vim script, but I cannot confirm:
set backupcopy=yes
So this was my mistake, I should have tried another editor. But on the plus side, this should help other n(vim) users.
Thanks for providing the feedback and the notes for future people! I'm closing as completed 👍
I'm still curious why this made Dune crash however.
Maybe you could clear something up. Was it a crash or an error that you were experiencing?
@Alizter it was an error I think, perhaps crash was a poor choice of words. The process did not terminate, but was stuck in a broken state while in watch mode.
Thanks, but I suppose it was able to recover once you edited the file again. I'm really just interested in knowing if dune was able to recover from this.
It was not able to recover, no.
You can confirm for yourself with a default neovim
Excellent! I was able to reproduce this. We definitely have something racy going on here and Dune misses a dependency. I will investigate further into what exactly neovim is doing and try to reproduce this in a cram test.
Here is a repro case using nvim:
$ DONE_FLAG=_build/done_flag
$ cat > dune-project <<EOF
> (lang dune 3.18)
> EOF
$ cat > dune <<EOF
> (executable
> (name foo)
> (libraries unix))
> EOF
$ cat > touch.ml <<EOF
> let touch path =
> let fd = Unix.openfile path [ Unix.O_CREAT ] 0o777 in
> Unix.close fd
> ;;
> EOF
$ cat > foo.ml <<EOF
> let () =
> Touch.touch "$DONE_FLAG"
> ;;
> EOF
$ dune exec -w -- ./foo.exe &
Success, waiting for filesystem changes...
Error: _build/default/.foo.eobjs/dune__exe__Foo.impl.d: No such file or
directory
-> required by _build/default/.foo.eobjs/dune__exe__Foo.impl.all-deps
-> required by _build/default/foo.exe
Had 1 error, waiting for filesystem changes...
$ PID=$!
$ ./wait-for-file.sh $DONE_FLAG
$ nvim --headless touch.ml -c 'write' \
> -c 'quit' 2> /dev/null
$ kill $PID
this uses the watching stuff in test-cases/exec-watch.
Strange difference with exec and build here is that dune build -w doesn't run into this problem, it manages to build ok. dune exec on the other hand, including my implementation in #11621 suffers from this issue. I suspect it is something to do with the file watcher. I will investigate further.
Some progress today. I've managed to reproduce the issue with dune without neovim by using inotifywait and observing exactly what nvim (and probably vim) is doing. It's good that vim users can fix this behaviour, but dune should be more resilient in these situations and should be able to recover.
I've checked our inotify library, our file_watcher library and everything reported there looks sane. I suspect the issue lies with Fs_memo, but I need to dissect this issue further to be certain.
Here is my new and improved repro case:
Reproduction case for #11010. Neovim saves files in a perculiar way which interacts poorly
with dune watch mode.
$ DONE_FLAG=_build/done_flag
$ cat > dune-project <<EOF
> (lang dune 3.18)
> EOF
$ cat > dune <<EOF
> (executable
> (name foo)
> (libraries unix))
> EOF
$ cat > touch.ml <<EOF
> let touch path =
> let fd = Unix.openfile path [ Unix.O_CREAT ] 0o777 in
> Unix.close fd
> ;;
> EOF
$ cat > foo.ml <<EOF
> let () =
> Touch.touch "$DONE_FLAG"
> ;;
> EOF
Simulate the detailed vim save sequence observed via inotifywait.
Timing is crucial and needs to be tuned via sleep based on observation.
$ simulate_nvim_save() {
> local original_fn="$1"
> local sleep_duration=${2:-0.01}
> local backup_fn="${original_fn}~"
>
> # Simulates MOVED_FROM, MOVED_TO
> mv "${original_fn}" "${backup_fn}"
>
> # Sleep after rename to backup
> sleep "$sleep_duration"
>
> # Simulate new file creation and write
> echo "let () = print_endline \"saved\"" >> "${original_fn}"
>
> # Sleep after new original creation/write/close
> sleep "$sleep_duration"
> rm "${backup_fn}"
> }
$ dune exec -w -- ./foo.exe --debug-backtraces &
Success, waiting for filesystem changes...
Error: _build/default/.foo.eobjs/dune__exe__Foo.impl.d: No such file or
directory
-> required by _build/default/.foo.eobjs/dune__exe__Foo.impl.all-deps
-> required by _build/default/foo.exe
Had 1 error, waiting for filesystem changes...
$ PID=$!
$ ./wait-for-file.sh $DONE_FLAG
This might not trigger correctly on every machine, so this parameter can be tuned
accordingly.
$ simulate_nvim_save touch.ml 0.01
$ kill $PID
I have a fix for this issue, but I am not yet sure it is the right fix. The idea is to introduce a small delay just before we run our build and program via Scheduler.sleep ~seconds:0.1. Doing this avoids the race condition observed in the cram test.
This might just be pushing the real problem under the rug however and I am not too fond of making all rebuilds slower for no reason.
Can we make it so that dune just doesn't crash on this error and just keeps waiting for the next file change? Presumably at some change the save will be finished and dune will be able to build successfully?
@yawaramin Yes, that is the goal here. The issue is that Dune's internal file system tracking becomes invalidated leading it to believe a file is missing when it is actually present. This is likely due to a race-condition with how we are loading rules in dune exec -w.
Once this internal representation is broken, subsequent rebuilds cannot progress due to the erroneous state. We don't get this issue in dune build -w since things are correctly loaded there.
This is reported as a user-error but it is really an unexpected side-effect of dune's current implementation that needs to be fixed. I don't think special-casing this for a full-reload of the rules is a good idea when we can just fix it.
I'm still not sure what the best fix is, but I will come up with something soon.