nitro icon indicating copy to clipboard operation
nitro copied to clipboard

perf: limit open files in generateFSTree

Open marvin-j97 opened this issue 1 year ago โ€ข 1 comments

๐Ÿ”— Linked issue

https://github.com/unjs/nitro/issues/1833

โ“ Type of change

  • [ ] ๐Ÿ“– Documentation (updates to the documentation, readme, or JSdoc annotations)
  • [x] ๐Ÿž Bug fix (a non-breaking change that fixes an issue)
  • [x] ๐Ÿ‘Œ Enhancement (improving an existing functionality like performance)
  • [ ] โœจ New feature (a non-breaking change that adds functionality)
  • [ ] ๐Ÿงน Chore (updates to the build process or auxiliary tools and libraries)
  • [ ] โš ๏ธ Breaking change (fix or feature that would cause existing functionality to change)

๐Ÿ“š Description

generateFSTree opens up to N files at the same time. Operating systems limit the amount of open files per process (linux defaults to 1024). Also, awaiting many Promises adds overhead, especially when a function is CPU-bound (gzipping for example). Collecting files has been extracted into collectFiles, which uses globby & tiny-async-pool to process 8 (subject to change) files at a time.

Should limit the amount of open files to fix https://github.com/unjs/nitro/issues/1833

๐Ÿ“ Checklist

  • [x] I have linked an issue or discussion.
  • [x] I have updated the documentation accordingly.

marvin-j97 avatar Feb 11 '24 16:02 marvin-j97

Thanks for this PR. Quickly looking it seems nice changes. I might delay it to next releases to test better.

pi0 avatar Feb 27 '24 16:02 pi0

While the fix here mainly concerns memory exhaustion caused by too many open promises at once, I propose some further optimizations to additionally work around possible performance issues caused by processing larger files.

It seems unnecessary to read entire files into memory just to calculate gzip size on them. gzip-size already contains methods to do this on a path directly, which internally uses streams for reduced memory usage. gzipSize(src) could be in this case replaced by gzipSizeFromFile(path). By using this method instead, the entire file read process can be eliminated in cases when gzip size is not required.

Obtaining the uncompressed file size could be done through (await fsp.stat(path)).size instead.

Also Node v19.4.0, v18.14.0 introduced os.availableParallelism(), this however isn't available in older versions still supported by nitro and I think falling back to a sane number like 8 or os.cpus().length on such versions could work. Something like const parallelism = os.availableParallelism?.() ?? os.cpus().length, perhaps with a comment to remove the optional & fallback after mentioned nodejs versions become minimum required by nitro.

Ingramz avatar Apr 14 '24 13:04 Ingramz

@marvin-j97 Did you close this PR by removing your repo deliberately ๐Ÿ˜…?

@pi0 should we create a new one ๐Ÿค”?

katywings avatar May 21 '24 10:05 katywings

@katywings

Damnit I think I did. I cleaned up my repos because I have way too many and GitHub doesn't show you you have open PRs :skull:

marvin-j97 avatar May 21 '24 10:05 marvin-j97

@marvin-j97 Ohh I see ๐Ÿ™ˆ, unlucky ๐Ÿ˜…! Do you wanna create a new PR or would you prefer if someone else would take care of this?

katywings avatar May 21 '24 12:05 katywings

@marvin-j97 Ohh I see ๐Ÿ™ˆ, unlucky ๐Ÿ˜…! Do you wanna create a new PR or would you prefer if someone else would take care of this?

https://github.com/unjs/nitro/pull/2458

marvin-j97 avatar May 21 '24 12:05 marvin-j97