Random errors and performance issues using memfs as compilation dir

Open alvinhochun opened this issue 5 years ago • 3 comments

I finally got to try out this project today and ran into a few issues. I don't have enough info to file separate bug reports for them, so I thought I'd start with a consolidated one.

Background: I tried to compile a fairly large project (Krita) with a native mingw-w64 toolchain, using a memfs drive for both the output dir and the TEMP dir. The build process writes a lot of small files and occasionally some large files. The build runs parallel jobs.

My memfs command: memfs-x64.exe -i -F MEMFS -n 65536 -s 1073741824 -m Z:

Issues:

  1. During the build process, I randomly encountered errors like "permission denied" and "invalid argument" emitted by g++ when it is trying to look for some header files relative to the output dir which don't exist. The same error doesn't reappear immediately when I re-run (resume) the build, but it can reappear for other files. I got these errors several times over the course of a build. Any idea what could be causing them?
  2. The performance isn't great at times. During certain build steps that write large amounts of data to a file, the memfs process can really spin the CPU for a long time (it became the bottleneck). I think this is because the realloc strategy only resizes the buffer to the new size (with alignment) on every write. When I compiled my own memfs with a growth factor of 2, writing large files became quite a bit faster, though it still hogged the CPU at times. Is this just how memfs is?
  3. When compiling the memfs included in the samples folder with VS2017, I hit several const-correctness errors that I had to fix. (There's a particularly ugly const_cast that I had to put in.) Just thought I'd note this here.
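
To illustrate the realloc point in item 2, here is a minimal sketch (not taken from the MEMFS source) that compares an exact-fit resize policy with a growth factor of 2. The names `AlignUp`, `ExactCapacity`, `GeometricCapacity` and `CountReallocs`, and the 4096-byte alignment, are assumptions made for the example. It only counts how often the capacity would change; it does not allocate anything.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical alignment; chosen for the example, not read from MEMFS.
constexpr std::size_t kAlign = 4096;

// Round n up to the next multiple of kAlign.
inline std::size_t AlignUp(std::size_t n)
{
    return (n + kAlign - 1) / kAlign * kAlign;
}

// Exact-fit policy: the capacity is always the aligned requested size.
inline std::size_t ExactCapacity(std::size_t /*current*/, std::size_t needed)
{
    return AlignUp(needed);
}

// Geometric policy: when the buffer is too small, at least double it.
// (Overflow of current * 2 is ignored in this sketch.)
inline std::size_t GeometricCapacity(std::size_t current, std::size_t needed)
{
    if (needed <= current)
        return current;
    std::size_t doubled = current * 2;
    return AlignUp(doubled > needed ? doubled : needed);
}

// Count capacity changes while a file grows to `total` bytes through
// appends of `writeSize` bytes each.
template <typename Policy>
std::size_t CountReallocs(std::size_t total, std::size_t writeSize, Policy policy)
{
    assert(writeSize > 0);
    std::size_t capacity = 0, size = 0, reallocs = 0;
    while (size < total) {
        size += writeSize;
        std::size_t next = policy(capacity, size);
        if (next != capacity) {
            capacity = next;
            ++reallocs;
        }
    }
    return reallocs;
}
```

Under these assumptions, growing a file to 1 MiB in 4 KiB appends changes the capacity 256 times with the exact-fit policy and 9 times with doubling. Each change may copy the whole buffer, which would fit the CPU usage seen when writing large files.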

I've put my half-assed changes up for reference: https://github.com/alvinhochun/winfsp-memfs-test

WinFsp version: 1.4.19016
Windows: Windows 10 1803 (17134.648)

alvinhochun, Apr 03 '19 18:04

Thanks for the report. I might try your experiment to see this first hand. My comments inline:

> 1. During the build process, I randomly encountered errors like "permission denied" and "invalid argument" emitted by g++ when it is trying to look for some header files relative to the output dir which don't exist.

I would like to understand these issues better. MEMFS should behave like NTFS in most respects; anything else should be considered a bug in MEMFS and/or the WinFsp core.

Some speculation follows:

  • The "permission denied" problem is often attributed to trying to use a file that has been deleted but not closed (the original NTOS status code is STATUS_DELETE_PENDING). This is unfortunately a common problem in Windows file systems and I have also seen it with NTFS. (A usual culprit for this problem is AntiVirus software; BTW, what AntiVirus software are you using?)

  • I am not certain where the "invalid argument" comes from, but often POSIX layers on Windows translate unknown errors to EINVAL. So unfortunately we have no real indication of the original error.
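
The two bullets above can be sketched as a translation table. This is a hypothetical mapping written for illustration, not the one used by mingw-w64 or any particular C runtime. The Win32 error values are the documented ones, defined locally so that the sketch builds without `<windows.h>`. As far as I know, `STATUS_DELETE_PENDING` surfaces at the Win32 level as `ERROR_ACCESS_DENIED`, which is how it can end up as "permission denied".

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>

// Documented Win32 error values, redefined here to stay portable.
constexpr std::uint32_t kErrorFileNotFound     = 2;   // ERROR_FILE_NOT_FOUND
constexpr std::uint32_t kErrorPathNotFound     = 3;   // ERROR_PATH_NOT_FOUND
constexpr std::uint32_t kErrorAccessDenied     = 5;   // ERROR_ACCESS_DENIED
constexpr std::uint32_t kErrorSharingViolation = 32;  // ERROR_SHARING_VIOLATION

// Hypothetical translation from a Win32 error to a POSIX errno value.
// Which codes are listed is an assumption made for this example.
inline int Win32ErrorToErrno(std::uint32_t win32Error)
{
    switch (win32Error) {
    case kErrorFileNotFound:
    case kErrorPathNotFound:
        return ENOENT;   // "no such file or directory"
    case kErrorAccessDenied:
    case kErrorSharingViolation:
        return EACCES;   // "permission denied"
    default:
        return EINVAL;   // anything not in the table: "invalid argument"
    }
}
```

With a table like this, any error the layer does not recognize is reported as `EINVAL`, so the original code is lost by the time g++ prints its message.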

One possibility would be to enable debug log messages on MEMFS (use the -d -1 -D <LOGFILE> options) or watch the file system under FileSpy, but you would probably be overwhelmed by the sheer amount of logging information.

> The performance isn't great at times. During certain build steps that write large amounts of data to a file, the memfs process can really spin the CPU for a long time (it became the bottleneck).

This is true. MEMFS has a simple design and keeps file data in a single contiguous block of memory. This means lots of realloc calls, which can kill performance for processes with lots of extending WRITE's. MEMFS was developed over time as the test file system for WinFsp, so I opted for simplicity over performance in many cases. (There was an attempt to alleviate this problem with the "large heap" functions, but it did not solve the problem for all scenarios. A better solution would be to keep data in memory chunks rather than contiguous memory to avoid the cost of realloc.)
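
A minimal sketch of the "memory chunks" idea mentioned above, assuming fixed-size chunks of 64 KiB. The `ChunkedData` class and its members are hypothetical and are not part of MEMFS, which keeps each file in one contiguous buffer. Extending the file appends new chunks, so existing data is never copied.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical chunked file-data store (illustration only).
class ChunkedData
{
public:
    static constexpr std::size_t kChunkSize = 64 * 1024;  // assumed chunk size

    std::size_t Size() const { return size_; }

    // Write len bytes at offset, extending the file if needed.
    void Write(std::size_t offset, const void *data, std::size_t len)
    {
        if (len == 0)
            return;
        const std::size_t end = offset + len;
        // Append zero-filled chunks until the range is covered.
        // Only the vector of pointers may move; chunk contents stay put.
        while (chunks_.size() * kChunkSize < end)
            chunks_.push_back(std::make_unique<unsigned char[]>(kChunkSize));
        const unsigned char *src = static_cast<const unsigned char *>(data);
        while (len > 0) {
            const std::size_t index = offset / kChunkSize;
            const std::size_t within = offset % kChunkSize;
            const std::size_t n = std::min(len, kChunkSize - within);
            std::memcpy(chunks_[index].get() + within, src, n);
            offset += n;
            src += n;
            len -= n;
        }
        size_ = std::max(size_, end);
    }

    // Read up to len bytes at offset; returns the number of bytes copied.
    std::size_t Read(std::size_t offset, void *out, std::size_t len) const
    {
        if (offset >= size_)
            return 0;
        len = std::min(len, size_ - offset);
        const std::size_t total = len;
        unsigned char *dst = static_cast<unsigned char *>(out);
        while (len > 0) {
            const std::size_t index = offset / kChunkSize;
            const std::size_t within = offset % kChunkSize;
            const std::size_t n = std::min(len, kChunkSize - within);
            std::memcpy(dst, chunks_[index].get() + within, n);
            offset += n;
            dst += n;
            len -= n;
        }
        return total;
    }

private:
    std::vector<std::unique_ptr<unsigned char[]>> chunks_;
    std::size_t size_ = 0;
};
```

The trade-off is that reads and writes crossing a chunk boundary need a loop, and the file data can no longer be handed out as a single pointer. Truncation and freeing of chunks are left out of the sketch.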

billziss-gh, Apr 03 '19 22:04

BTW, I note that you changed MEMFS_SECTOR_SIZE from 512 to 4096. I agree that this will improve performance, because it forces WinFsp and MEMFS to use whole pages instead of page fragments and also improves the internal memory management that WinFsp does.

While I recommend that file systems use a block/sector size of 4096, MEMFS uses 512 in order to exercise those parts of WinFsp that handle buffers that are page fragments rather than whole pages. I should probably have made this a command-line option so that it could be easily changed.
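
A small worked example of the difference, assuming a 4096-byte page and assuming that a file's allocation is rounded up to a whole number of sectors. The function names are hypothetical and are not taken from MEMFS.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kPageSize = 4096;  // assumed page size (typical on x86/x64)

// Round a file size up to a whole number of sectors.
constexpr std::size_t RoundUpToSector(std::size_t fileSize, std::size_t sectorSize)
{
    return (fileSize + sectorSize - 1) / sectorSize * sectorSize;
}

// True when a size is made of whole pages, with no page fragment at the end.
constexpr bool IsWholePages(std::size_t size)
{
    return size % kPageSize == 0;
}

// A 5000-byte file with 512-byte sectors ends in a page fragment...
static_assert(RoundUpToSector(5000, 512) == 5120, "10 sectors of 512 bytes");
static_assert(!IsWholePages(5120), "5120 is not a multiple of 4096");

// ...while 4096-byte sectors always give whole pages.
static_assert(RoundUpToSector(5000, 4096) == 8192, "2 sectors of 4096 bytes");
static_assert(IsWholePages(8192), "8192 is a multiple of 4096");
```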

billziss-gh, Apr 03 '19 22:04

Thanks for the quick reply. I'll reply inline:

> Thanks for the report. I might try your experiment to see this first hand. My comments inline:
>
> > 1. During the build process, I randomly encountered errors like "permission denied" and "invalid argument" emitted by g++ when it is trying to look for some header files relative to the output dir which don't exist.
>
> I would like to understand these issues better. MEMFS should behave like NTFS in most respects; anything else should be considered a bug in MEMFS and/or the WinFsp core.
>
> Some speculation follows:
>
> * The "permission denied" problem is often attributed to trying to use a file that has been deleted but not closed (the original NTOS status code is `STATUS_DELETE_PENDING`). This is unfortunately a common problem in Windows file systems and I have also seen it with NTFS. (A usual culprit for this problem is AntiVirus software; BTW, what AntiVirus software are you using?)

This doesn't sound plausible to me, because the files it tried to access did not exist before and never will.

I only have the built-in Windows Defender. I have the drive path in the exclusion list.

> * I am not certain where the "invalid argument" comes from, but often POSIX layers on Windows translate unknown errors to `EINVAL`. So unfortunately we have no real indication of the original error.
>
> One possibility would be to enable debug log messages on MEMFS (use the -d -1 -D <LOGFILE> options) or watch the file system under FileSpy, but you would probably be overwhelmed by the sheer amount of logging information.

Okay. I might gather some logs, though my go-to tool is Process Monitor.

On another note, I also tried using airfs. Apart from having to fix #226, I didn't get any random errors like I did with memfs. This isn't conclusive, though.

> > The performance isn't great at times. During certain build steps that write large amounts of data to a file, the memfs process can really spin the CPU for a long time (it became the bottleneck).
>
> This is true. MEMFS has a simple design and keeps file data in a single contiguous block of memory. This means lots of realloc calls, which can kill performance for processes with lots of extending WRITE's. MEMFS was developed over time as the test file system for WinFsp, so I opted for simplicity over performance in many cases. (There was an attempt to alleviate this problem with the "large heap" functions, but it did not solve the problem for all scenarios. A better solution would be to keep data in memory chunks rather than contiguous memory to avoid the cost of realloc.)

Understandable. I have thought of writing a toy implementation myself but I'm already overwhelmed with stuff I want to do.

alvinhochun, Apr 04 '19 06:04