perf: limit open files in generateFSTree
๐ Linked issue
https://github.com/unjs/nitro/issues/1833
โ Type of change
- [ ] ๐ Documentation (updates to the documentation, readme, or JSdoc annotations)
- [x] ๐ Bug fix (a non-breaking change that fixes an issue)
- [x] ๐ Enhancement (improving an existing functionality like performance)
- [ ] โจ New feature (a non-breaking change that adds functionality)
- [ ] ๐งน Chore (updates to the build process or auxiliary tools and libraries)
- [ ] โ ๏ธ Breaking change (fix or feature that would cause existing functionality to change)
๐ Description
generateFSTree opens up to N files at the same time. Operating systems limit the amount of open files per process (linux defaults to 1024). Also, awaiting many Promises adds overhead, especially when a function is CPU-bound (gzipping for example). Collecting files has been extracted into collectFiles, which uses globby & tiny-async-pool to process 8 (subject to change) files at a time.
Should limit the amount of open files to fix https://github.com/unjs/nitro/issues/1833
๐ Checklist
- [x] I have linked an issue or discussion.
- [x] I have updated the documentation accordingly.
Thanks for this PR. Quickly looking it seems nice changes. I might delay it to next releases to test better.
While the fix here mainly concerns memory exhaustion caused by too many open promises at once, I propose some further optimizations to additionally work around possible performance issues caused by processing larger files.
It seems unnecessary to read entire files into memory just to calculate gzip size on them. gzip-size already contains methods to do this on a path directly, which internally uses streams for reduced memory usage. gzipSize(src) could be in this case replaced by gzipSizeFromFile(path). By using this method instead, the entire file read process can be eliminated in cases when gzip size is not required.
Obtaining the uncompressed file size could be done through (await fsp.stat(path)).size instead.
Also Node v19.4.0, v18.14.0 introduced os.availableParallelism(), this however isn't available in older versions still supported by nitro and I think falling back to a sane number like 8 or os.cpus().length on such versions could work. Something like const parallelism = os.availableParallelism?.() ?? os.cpus().length, perhaps with a comment to remove the optional & fallback after mentioned nodejs versions become minimum required by nitro.
@marvin-j97 Did you close this PR by removing your repo deliberately ๐ ?
@pi0 should we create a new one ๐ค?
@katywings
Damnit I think I did. I cleaned up my repos because I have way too many and GitHub doesn't show you you have open PRs :skull:
@marvin-j97 Ohh I see ๐, unlucky ๐ ! Do you wanna create a new PR or would you prefer if someone else would take care of this?
@marvin-j97 Ohh I see ๐, unlucky ๐ ! Do you wanna create a new PR or would you prefer if someone else would take care of this?
https://github.com/unjs/nitro/pull/2458