GitButler unable to open repo because it tries to watch node_modules
Version
Commit b80af79d3a0f95afa40d4ae5359e01148bfe70ec
Operating System
Linux
Distribution Method
deb (Linux) (I built the deb myself since I'm on Ubuntu 22.04)
Describe the issue
I am trying to open a monorepo that contains multiple .gitignore files in subfolders. In the frontend folder, the .gitignore ignores /node_modules/.
However, Git Butler refuses to load the repo because it gets an "OS file watch limit reached" error, and in the log I see it is trying to watch all files located inside the frontend/node_modules directory.
I'm not sure if it's related, but I also have another repository open in the same Git Butler window. When I'm on the safe repo and try to switch to the buggy repo, everything seems to work at first, but after a few seconds I'm automatically switched back to the safe repo with an error in the corner saying "This project is already open in another window OS file watch limit reached." (Of course, I have no other Git Butler window open.)
How to reproduce (Optional)
No response
Expected behavior (Optional)
No response
Relevant log output (Optional)
Thanks a lot for reporting!
The issue seems related to #6076 where the file watcher is slow in setting up the watchers, while additionally exhausting the filewatch limit. The OP there mentions that they raised the limit for that so it became a non-issue, yet setting up the watches seems to take a long time, repeatedly for unknown reasons.
This makes opening even moderately large repositories unbearably slow, if it works at all.
VSCode doesn't seem to have such problems, even though it was reported to also run out of handles unless the value is increased.
In any case, it's great to know that this issue should readily be reproducible on a linux box, maybe even on the GitButler repository which also sports a node_modules directory, with a total of 143742 files.
Finally, we are working on a fix for the bogus message about "This project is already open in another window".
Hello, thanks for replying!
I've done some quick tests to get more meaningful numbers:
My repo:
❯ find . | wc -l
269666
❯ git ls-files | wc -l
4545
The GitButler repo:
❯ find . | wc -l
117356
❯ git ls-files | wc -l
1911
I've been able to easily reproduce the issue using the git butler repository with a slight cheat:
- In the gitbutler gitignore, change node_modules to /node_modules so it only ignores the one at root
- cp -r node_modules apps/node_modules to create a copy of a huge folder
- echo "/node_modules/" > apps/.gitignore --> The leading slash means "root" but is relative to the file (that might be the underlying bug in whatever library you use to handle the gitignores)
Now you have a setup that gitbutler won't be able to load by switching!
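For anyone who wants to check the .gitignore semantics behind this setup without copying a real node_modules, here is a minimal sketch. It builds a throwaway repository with the same layout and asks Git itself which paths it considers ignored. The temporary directory, the file names, the extra apps/sub directory and the check helper are made up for illustration; only git init and git check-ignore are real Git commands.

```shell
#!/bin/sh
# Sketch: a leading slash in a nested .gitignore anchors the pattern to the
# directory containing that .gitignore, not to the repository root.
set -eu

repo=$(mktemp -d)   # throwaway repository, hypothetical layout
cd "$repo"
git init -q .

mkdir -p node_modules/pkg apps/node_modules/pkg apps/sub/node_modules/pkg
touch node_modules/pkg/index.js \
      apps/node_modules/pkg/index.js \
      apps/sub/node_modules/pkg/index.js

echo "/node_modules" > .gitignore         # matches only ./node_modules
echo "/node_modules/" > apps/.gitignore   # matches only ./apps/node_modules

# git check-ignore exits 0 when the path is ignored and 1 when it is not.
check() {
    if git check-ignore -q "$1"; then echo "ignored"; else echo "not ignored"; fi
}

root_result=$(check node_modules/pkg/index.js)
nested_result=$(check apps/node_modules/pkg/index.js)
deep_result=$(check apps/sub/node_modules/pkg/index.js)

echo "node_modules:          $root_result"
echo "apps/node_modules:     $nested_result"
echo "apps/sub/node_modules: $deep_result"
```

If Git reports apps/node_modules as ignored here, a watcher that still descends into it is not applying the nested .gitignore the way Git does.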
I have a small repo loaded in gitbutler, I add the modified gitbutler repo and it doesn't let me switch to gitbutler. After about 30s, it reverts back to my small repo and prints the error message I've mentioned in my OP.
Of course, it's possible that increasing the watched file limit fixes the issue but I'd rather avoid it, it's not a real fix.
Agreed, and the time it takes is unacceptable as well.
Further, the notify crate was updated to 8.0 which hopefully helps us to get better results on Linux. There is also the new… but watch sub-command to watch filesystem events as they come in. This is useful to see how fast the watcher responds and starts up, particularly to facilitate debugging on Linux.
If you want to give it a shot, you can now run just the watch component like this:
cargo run -p but-cli -- watch
It should need nothing more than a Rust installation to work, the but-cli crate doesn't have all the heavy tauri dependencies.
Something I'd be interested in seeing is whether there is a huge delay when starting up the watcher. In theory, this process can also be strace'd much more cleanly just to see what it spends its time on.
My expectation would be that it should take about 30s to start up and then fail due to the exhausted file watch limit, just like the user interface.
I tried it but I get the same error and it takes around 15s to appear. The watcher still tries watching files from node_modules despite the directory being gitignored:
I've done the following after a git pull (I'm on c9e4d1f27)
❯ cargo run -p but-cli -- watch
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.23s
Running `target/debug/but-cli watch`
Error: failed to start watcher
Caused by:
OS file watch limit reached. about ["/home/cedric/tmp/gitbutler/./apps/node_modules/.pnpm/@[email protected]/node_modules/@modelcontextprotocol"]
Thanks so much!
Since the elapsed time isn't printed, this would mean the watcher sets up the watches synchronously, and that runs out of resources.
On my VM that works and takes around 8s (on a shared volume), a clear sign that it steps into node_modules and that it is slow. Interestingly, the watch didn't work, so changing anything either on Linux or on macOS didn't trigger a notification.
Also, the version under test is the latest one with all dependencies updated, so that didn't really help with anything.
Even if it worked, 8s (or more) clearly is unacceptable here.
On a native volume it takes 0.7 seconds only, but still, no events are actually coming in.
The main problem I found was that no event is coming in at all.
However, when doing…
git clone https://github.com/notify-rs/notify
cargo run --example monitor_raw -- .
…notify events come in for everything. This is certainly something to fix on our side.
The other problem is that watching can exhaust the watch limit, something I couldn't reproduce on my (virtual) machine.
I will take a look at notify to see if it's possible to filter folders somehow, to have a chance to avoid watching ignored directories.
Here are some notes from my investigation:
Running into the watch limit
- We are using a RecommendedWatcher which chooses the best implementation to actually watch a filesystem.
- On Linux, inotify is used and there is no way to specify more than recursive or non-recursive watching.
  - Generally, there is no way to do more with any watcher implementation than is supported by the Watcher trait.
- Thus, it's probably easiest to work around the watch-handle exhaustion by increasing the limit.
  - However, we could fork the inotify watcher and implement a Git-aware version of it, maybe even in such a way so that it is behind a feature toggle and can be contributed.
Not responding to any event
This seems to be on us, as we see log messages of events that are received in the watcher implementation, but we seem to filter them out early.
On smaller folders it actually works, but on GitButler events seem to be filtered.
It turned out that this might have been an issue mostly with but-cli as the paths it received were canonicalized paths, while the path it would use to check if it's inside of the worktree may even have been relative.
It did look like everything else worked alright.
Conclusion
The only way to solve this properly is to implement an inotify watcher with .gitignore support, or to contribute it to the upstream project outright. The latter seems like the way to go, but we have to postpone it in favor of helping folks to increase their watch limits for now.
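To get a feeling for how much a .gitignore-aware watcher could save, the following rough sketch compares the number of directories in a worktree with the number of directories that hold at least one file Git does not ignore. It assumes that a recursive inotify-based watch needs roughly one watch per directory, which is my understanding and not something measured in this thread. The helper names all_dirs and unignored_dirs and the demo repository are hypothetical; paths with unusual characters may be quoted by git ls-files, so treat the numbers as estimates.

```shell
#!/bin/sh
# Sketch: estimate watches for a plain recursive watch versus a watcher that
# skips ignored directories (assumption: one watch per directory).
set -eu

# Every directory below $1, excluding Git's own metadata.
all_dirs() {
    find "$1" -type d ! -path '*/.git' ! -path '*/.git/*' | wc -l
}

# Directories (including ancestors) that contain at least one tracked file
# or one untracked file that is not ignored.
unignored_dirs() {
    git -C "$1" ls-files --cached --others --exclude-standard |
        awk -F/ '{
            p = "."; seen[p] = 1
            for (i = 1; i < NF; i++) { p = p "/" $i; seen[p] = 1 }
        }
        END { n = 0; for (k in seen) n++; print n }'
}

# Tiny demo repository (hypothetical layout) so the sketch runs anywhere.
demo=$(mktemp -d)
git init -q "$demo"
mkdir -p "$demo/src/lib" "$demo/node_modules/pkg/dist"
touch "$demo/src/lib/main.rs" "$demo/node_modules/pkg/dist/index.js"
echo "/node_modules/" > "$demo/.gitignore"

total=$(( $(all_dirs "$demo") ))
needed=$(( $(unignored_dirs "$demo") ))
echo "directories in worktree:       $total"
echo "directories that are not ignored: $needed"
```

Pointing the two helpers at a real checkout instead of the demo directory gives a quick estimate of whether the default watch limit would be enough once ignored directories are skipped.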
byron@gitbutler:~/gitbutler$ sysctl -a 2>/dev/null | grep inotify
fs.inotify.max_queued_events = 1048576
fs.inotify.max_user_instances = 1048576
fs.inotify.max_user_watches = 1048576
user.max_inotify_instances = 1048576
user.max_inotify_watches = 1048576
It would be interesting to see what happens if you adjust these, maybe this helps?
sudo sysctl fs.inotify.max_user_watches=524288 # Or a higher value, like 1048576
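The sysctl command above only lasts until the next reboot. Here is a small sketch that reads the current limit and, if it is below a target value, prints the commands that would make the change persistent. The target of 524288 is taken from the command above; the file name 90-inotify.conf is a made-up example, and the script deliberately only prints the privileged commands instead of running them.

```shell
#!/bin/sh
# Sketch: compare the current inotify watch limit with a target value and
# print (not run) the commands that would raise it persistently.
set -eu

limit_file=/proc/sys/fs/inotify/max_user_watches   # procfs view of the sysctl
wanted=524288                                      # value from the command above
conf=/etc/sysctl.d/90-inotify.conf                 # hypothetical file name

# Fall back to 0 when the file does not exist (for example, not on Linux).
current=$(cat "$limit_file" 2>/dev/null || echo 0)

if [ "$current" -lt "$wanted" ]; then
    echo "current limit $current is below $wanted; to raise it persistently:"
    echo "  echo 'fs.inotify.max_user_watches=$wanted' | sudo tee $conf"
    echo "  sudo sysctl --system"
else
    echo "current limit $current is already at or above $wanted"
fi
```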