
File Watcher stops when git reverting multiple files at the same time

Open yoshiotobe opened this issue 1 year ago • 20 comments

Type: Bug

Does this issue occur when all extensions are disabled?: Yes

  • VS Code Version: Code 1.71.1
  • Remote OS Version: Linux x64 5.10.102.1-microsoft-standard-WSL2 (Ubuntu 20.04)
  • Remote OS Version: Linux x64 5.4.0-125-generic (Ubuntu 20.04 on Virtualbox/Vagrant)
  • OS Version: Windows_NT x64 10.0.19044

Steps to Reproduce:

  1. Move a folder that contains about 30-50 files into another folder in a git repository
  2. Press "Discard All Changes" in the Source Control tab; the file watcher then stops
  3. Alternatively, after committing everything, reset the commit with the command "git reset --hard HEAD^"; the file watcher then stops

Error logs

  • [remoteagent] [error] [File Watcher (universal)] restarting watcher after error: terminated by itself with code null, signal: SIGSEGV
  • [remoteagent] [error] [File Watcher (parcel)] Unexpected error: std::bad_alloc (EUNKNOWN)
  • [remoteagent] [error] [File Watcher (universal)] restarting watcher after error: std::bad_allo

It seems that the file watcher stops when git tries to restore many files (about 40) at once. A plain git commit does not cause the problem.

This happens only in remote environments (WSL and Virtualbox/Vagrant), not on the Windows host, which has much more CPU and memory. So it might just be a lack of resources, but it also happens on a remote machine with 8GB of memory.

I don't think 40 files is that many. I hope the file watcher can handle that many file changes.

System Info
Item | Value
CPUs | Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz (4 x 3193)
GPU Status | 2d_canvas: enabled
           | canvas_oop_rasterization: disabled_off
           | direct_rendering_display_compositor: disabled_off_ok
           | gpu_compositing: enabled
           | multiple_raster_threads: enabled_on
           | opengl: enabled_on
           | rasterization: enabled
           | raw_draw: disabled_off_ok
           | skia_renderer: enabled_on
           | video_decode: enabled
           | video_encode: enabled
           | vulkan: disabled_off
           | webgl: enabled
           | webgl2: enabled
           | webgpu: disabled_off
Load (avg) | undefined
Memory (System) | 15.94GB (6.08GB free)
Process Argv | --folder-uri=vscode-remote://wsl+ansible/home/ubuntu/venv/ansible --remote=wsl+ansible --crash-reporter-id 4b441a4a-eaf1-4b7e-bd8f-9410d751d669
Screen Reader | no
VM | 0%

Item | Value
Remote | WSL: ansible
OS | Linux x64 5.10.102.1-microsoft-standard-WSL2
CPUs | Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz (4 x 3192)
Memory (System) | 3.83GB (3.37GB free)
VM | 0%

Item | Value
Remote | SSH: test
OS | Linux x64 5.4.0-125-generic
CPUs | Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz (1 x 3192)
Memory (System) | 3.84GB (3.09GB free)
VM | 50%

yoshiotobe avatar Sep 14 '22 14:09 yoshiotobe

@aeschli what was the setting again to force polling?

bpasero avatar Sep 14 '22 15:09 bpasero

It even happens when I delete about 20 files from the editor at once in a git repository. It seems that the file watcher cannot keep up with the git operation once the number of files exceeds a certain limit. It does not happen when I delete files from the CLI, though.

yoshiotobe avatar Sep 15 '22 04:09 yoshiotobe

I cannot reproduce this, tried with both Ubuntu 20 and 22. Maybe as a workaround configure "remote.WSL.fileWatcher.polling": true

bpasero avatar Sep 16 '22 07:09 bpasero

We offer polling only for WSL1 distros. The setting is remote.WSL.fileWatcher.polling.

@yoshiotobe Are the files on the Windows mount (/mnt/...) or on folder in the Linux file system (e.g. the home folder)?

In WSL2, there's unfortunately a known issue with missing file events on '/mnt/..' folders: https://github.com/microsoft/vscode-remote-release/issues/5000.

aeschli avatar Sep 16 '22 09:09 aeschli

Thank you for the replies.

I use WSL2 but tried the remote.WSL.fileWatcher.polling setting anyway, and it didn't work. The files are under /home, so they are not under /mnt.

It happens with remote Virtualbox/Vagrant as well, so I guess the problem is not WSL. But it does not happen on the Windows host, so it has something to do with the remote environment.

Also, I found that it does not happen with tracked files when I click "Discard Tracked files".

[Screenshot: 2022-09-16 221900]

But it happens with untracked files when I click "Delete Files".

[Screenshot: 2022-09-16 221931]

[Screenshot: 2022-09-16 223127]

This time it was just 9 files. Might it have something to do with the combination of the built-in git and the file watcher on the remote?

yoshiotobe avatar Sep 16 '22 13:09 yoshiotobe

I would say [File Watcher (parcel)] Unexpected error: std::bad_alloc (EUNKNOWN) is the cause and I suggest we move this to the parcel watcher, @bpasero ?

aeschli avatar Sep 19 '22 11:09 aeschli

Yeah maybe, but without steps to reproduce or the full crash dump, I would not file an issue. And this does not seem to happen on normal Linux distros, only WSL. I have not seen this issue so far reported elsewhere.

bpasero avatar Sep 19 '22 13:09 bpasero

On WSL (WSL2, Ubuntu) I can reproduce the error message. I already get it when copying a folder with many files into the file explorer:

Unexpected error: std::bad_alloc (EUNKNOWN) (path: /home/aeschli/workspaces/foo)
[2022-09-19 15:41:16.770] [remoteagent] [error] [File Watcher (universal)] restarting watcher after error: std::bad_alloc

I tried to get the core dump file, but none was produced (instructions here: https://code.visualstudio.com/docs/remote/troubleshooting#_the-server-fails-to-start-with-a-segmentation-fault)

aeschli avatar Sep 19 '22 14:09 aeschli

@deepak1556 can maybe advise how to get the crash report if any

bpasero avatar Sep 19 '22 14:09 bpasero

I am able to reproduce as well now, maybe I was not looking at the correct log files.

bpasero avatar Sep 19 '22 15:09 bpasero

@bpasero can you share some repro steps? I can get a stack trace from my setup. @aeschli the core dump location can change depending on your system configuration; you can try:

* coredumpctl list <path-to-executable> // You can find the PID that caused the crash
* coredumpctl info <pid-that-caused-the-crash>

deepak1556 avatar Sep 19 '22 15:09 deepak1556

@deepak1556 I fail to run coredumpctl somehow, here are the steps:

  • open vscode folder inside WSL2 Ubuntu 20 (not as mount, but really inside)
  • rename build folder to build2
  • undo from git changes view

bpasero avatar Sep 20 '22 07:09 bpasero

I also tried again, no luck getting a dump (if there's one)

coredumpctl
Failed to acquire bus: No such file or directory
No journal files were found.
No coredumps found.

aeschli avatar Sep 20 '22 14:09 aeschli

Deepak got it.

bpasero avatar Sep 20 '22 14:09 bpasero

@aeschli @bpasero I had to remove the build configuration https://github.com/parcel-bundler/watcher/blob/478a1ad66d44663cb24f3f73428ff2b52a244098/binding.gyp#L5 so that the exception bubbled up and ended up creating a coredump. I had to make this change only when debugging in WSL. After creating a minimal repro outside of VSCode, I was able to trigger the crash on vanilla Linux without any build config changes, so it would be interesting to find out what is different with WSL.

As for the crash, it is triggered in the following function https://github.com/parcel-bundler/watcher/blob/478a1ad66d44663cb24f3f73428ff2b52a244098/src/linux/InotifyBackend.cc#L149 on L152

149     bool InotifyBackend::handleSubscription(struct inotify_event *event, std::shared_ptr<InotifySubscription> sub) {
150       // Build full path and check if its in our ignore list.
151       Watcher *watcher = sub->watcher;
152       std::string path = std::string(sub->entry->path);
153       if (event->len > 0) {
154         path += "/" + std::string(event->name);
155       }

Basically, sub->entry is pointing to invalid memory, which causes an allocation failure for std::string. But the root issue is the corrupted value of sub->entry. The following is a minimal repro:

// The following contents are from the file test.js, created under the root of parcel-bundler/watcher
// Adjust the require call depending on where you place the file

const watcher = require('./');

async function start() {
  const subscription = await watcher.subscribe('<path>/test-dir', () => {}, { backend: 'inotify' });
}

start()

// In terminal
* Create <path>/test-dir containing a large directory; for my testing I copied the `vscode/build` folder into this path
* git init // we need a certain git action to trigger this crash
* node test.js // start the test file


// In a different terminal

* cd <path>/test-dir
* mv build build1
* git checkout -q -- build
* rm -rf build1
* mv build build2
* git checkout -q -- build
* rm -rf build2 // CRASH

sub->entry points to an entry from DirTree, which is held by sub->tree. The crash would come from the watcher deleting entries from the tree without invalidating sub->entry, which then points to invalid memory regions and is somehow still used for subsequent notifications. Possible locations: https://github.com/parcel-bundler/watcher/blob/478a1ad66d44663cb24f3f73428ff2b52a244098/src/linux/InotifyBackend.cc#L175 and https://github.com/parcel-bundler/watcher/blob/478a1ad66d44663cb24f3f73428ff2b52a244098/src/linux/InotifyBackend.cc#L203. How the notification sequences are generated for the above test case is still pending investigation.
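
To make the suspected lifetime problem concrete, here is a rough sketch, not the watcher's actual code. `SketchEntry`, `SketchTree` and `ObservingSubscription` are hypothetical stand-ins for the real types, and a `std::weak_ptr` is used in place of the raw pointer purely so that the stale state can be observed without undefined behaviour.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a directory entry; not the watcher's real type.
struct SketchEntry {
  std::string path;
};

// Hypothetical stand-in for the tree that owns all entries.
struct SketchTree {
  std::unordered_map<std::string, std::shared_ptr<SketchEntry>> entries;

  std::shared_ptr<SketchEntry> add(const std::string &path) {
    auto entry = std::make_shared<SketchEntry>(SketchEntry{path});
    entries[path] = entry;
    return entry;
  }

  // Erasing the key ends the entry's lifetime (the tree is the only owner).
  void remove(const std::string &path) { entries.erase(path); }
};

// The subscription only observes the entry, as a raw pointer would.
// weak_ptr lets us ask "is it still alive?" instead of dereferencing garbage.
struct ObservingSubscription {
  std::weak_ptr<SketchEntry> entry;
};

// Returns true if the observed entry is gone once the tree drops it.
bool entryIsStaleAfterRemove() {
  SketchTree tree;
  ObservingSubscription sub{tree.add("/repo/build")};
  tree.remove("/repo/build");  // e.g. the directory was renamed or deleted
  return sub.entry.expired();  // a raw pointer here would now dangle
}
```

With a raw pointer there is no `expired()` to consult, so a later notification that reads the path through it would touch freed memory, which would be consistent with the failing `std::string` construction at L152.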

deepak1556 avatar Sep 21 '22 04:09 deepak1556

@deepak1556 thanks a ton! I wonder if https://github.com/parcel-bundler/watcher/pull/103 is a related fix? Sorry for not bringing that up earlier, I remember I had seen the PR some time ago...

bpasero avatar Sep 21 '22 04:09 bpasero

Yup, that change would address the crash, since it replaces the problematic pointer with a copy; I also verified the change locally with the above test case. But we need to assess whether the problem we encounter is the same as the one described in that PR. I was not able to trigger the crash with a single-level parent->child directory; it required a complex directory structure, so it might be a race in the order of inotify events depending on the directory structure. I would say a bit more investigation with the test case, to understand the root issue and how git helps to trigger the crash, would be good.
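
As an illustration of why holding a copy sidesteps the crash, here is a minimal sketch under assumed names. `TreeEntry`, `EntryTree`, `PathSubscription` and `eventPath` are hypothetical and only loosely modelled on the snippet quoted earlier; this is not the code from the referenced PR.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins; the real watcher types differ.
struct TreeEntry {
  std::string path;
};

struct EntryTree {
  std::unordered_map<std::string, TreeEntry> entries;
};

// The subscription owns a copy of the path instead of pointing into the tree.
struct PathSubscription {
  std::string path;
};

// Mirrors the path building in the quoted handleSubscription snippet:
// directory path, plus "/" and the event name when a name is present.
std::string eventPath(const PathSubscription &sub, const std::string &name) {
  std::string path = sub.path;
  if (!name.empty()) {
    path += "/" + name;
  }
  return path;
}

// Builds an event path after the tree entry has already been erased.
std::string pathAfterEntryRemoved() {
  EntryTree tree;
  tree.entries.emplace("/repo/build", TreeEntry{"/repo/build"});
  PathSubscription sub{tree.entries.at("/repo/build").path};  // copy taken here
  tree.entries.erase("/repo/build");  // entry is gone, the copy is not
  return eventPath(sub, "gulpfile.js");
}
```

The trade-off is one extra string per subscription, in exchange for the subscription no longer depending on when the tree erases its entries. It does not explain why a late notification arrives for a removed directory in the first place, which is the part still under investigation.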

deepak1556 avatar Sep 21 '22 07:09 deepak1556

Should we move this issue to them though? I feel it's entirely upstream now, right, with your steps.

bpasero avatar Sep 21 '22 12:09 bpasero

Hi, I am facing this same issue. Is there a possible workaround, please? I tried installing the previous version of VSCode (1.70.2) but it did not solve the issue.

In my case the issue happens systematically after a couple of minutes every time I launch VSCode, remote into a Fedora35 VirtualBox VM and open my project. I first suspected it was the Java+Maven extensions triggering this at startup when loading and compiling the projects (about 700 Java classes).

amdmdi avatar Sep 21 '22 13:09 amdmdi

@bpasero yup that sounds good.

deepak1556 avatar Sep 21 '22 15:09 deepak1556

Reported as https://github.com/parcel-bundler/watcher/issues/110

bpasero avatar Sep 22 '22 04:09 bpasero