fflate
No cap on web-workers created
When calling async functions (e.g. `inflate`), a new web worker is created.
This is problematic when calling them thousands of times, as it'll easily overload the browser with worker instantiations.
When file sizes are small it's especially wasteful, because the ~70 ms worker instantiation takes longer than the decompression itself.
The solution that comes to mind would be a worker pool (whether implemented in the library or by the user).
It may be worth noting this in the documentation for future readers.
Aside: it may be worth capping the number of worker threads based on the hardware, e.g.
`const maxProcs = Math.max(navigator.hardwareConcurrency - 1, 1)`
Note that `navigator.hardwareConcurrency` is not supported in Safari/IE.
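To make the aside above concrete, a small helper could derive the cap while degrading gracefully on browsers that lack `navigator.hardwareConcurrency`. This is only a sketch; the fallback of 2 logical cores is an assumed default, not a measured value:

```javascript
// Sketch: derive a worker-pool cap from the reported core count.
// The fallback of 2 cores is an assumption for browsers without
// navigator.hardwareConcurrency (Safari/IE).
function maxWorkers(hardwareConcurrency) {
  const cores = hardwareConcurrency || 2;
  // Leave one core free for the main thread, but always allow one worker.
  return Math.max(cores - 1, 1);
}

// In a browser:
// const maxProcs = maxWorkers(navigator.hardwareConcurrency);
```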
I considered this a long time ago, even as I was building the first asynchronous function, but I found that a worker pool was simply too limiting. To create a system that would figure out when a worker thread was finished and reuse it would increase bundle size significantly, and when I did try it the performance on small files was just as bad as before. Therefore, I simply recommend synchronous functions for smaller files.
I am aware of the `navigator.hardwareConcurrency` property; it was the default maximum size of the worker pool when I had one.
If you have an alternative idea for a worker pool in `fflate`, I'll implement it, since I'm also not happy with the current system, but if it's still slow AND complex, I don't think I'll make the changes.
Thanks, that's great to know.
In the case of many similarly small files, what about a "fire and forget" thread system? Does eliminating the requirement of figuring out when a worker thread is finished simplify things? If so:
- Requests could pick a pool thread incrementally (mod the pool-size), or even randomly.
- Threads could queue the requests and run them in-order.
The additional complexity would be implementing a queue in the worker-threads (which may not actually be simpler, but I'd expect it might be, hence the suggestion).
I'd expect this to help substantially with many small files, though it has a worst case when many large files are mixed with many small files, as the large files would block the smaller jobs. There may be solutions for that case (e.g. creating a new independent worker outside the pool when the file size is deemed "large"), but that seems to be an edge case compared to the main goal of parallelizing many small files.
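The round-robin dispatch described above might look roughly like this. It is only a sketch of the suggestion: the pool is modeled over objects with a `postMessage`-style method (in a browser these would be real `Worker` instances), and it relies on each worker's own message queue running jobs in arrival order, so no completion tracking is needed:

```javascript
// Sketch of the fire-and-forget scheme: requests pick a pool thread
// incrementally (mod the pool size). Each worker's message queue runs
// jobs in arrival order, so no completion tracking is required.
class RoundRobinPool {
  constructor(workers) {
    // e.g. Array.from({ length: n }, () => new Worker(workerUrl))
    this.workers = workers;
    this.next = 0;
  }
  dispatch(job) {
    const worker = this.workers[this.next];
    this.next = (this.next + 1) % this.workers.length;
    worker.postMessage(job);
    return worker;
  }
}
```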
Another, even simpler alternative may be to have a pool size of 1. A queue would still be needed, but the thread picking would be trivial.
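The pool-of-one variant reduces to a single serial queue. Here is a minimal promise-chain sketch standing in for one long-lived worker (the worker itself is omitted, since a worker's message queue already runs jobs in order):

```javascript
// Sketch: pool size of 1. Jobs run strictly in submission order by
// chaining onto a single promise tail; no thread picking is required.
function makeSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(job) {
    // Each job starts only after every previously enqueued job settles,
    // even if an earlier job failed (second handler keeps the queue alive).
    tail = tail.then(job, job);
    return tail;
  };
}
```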
I hope that's some useful food-for-thought.
I'm looking into this suggestion for future versions of `fflate` and will keep the issue open, but it's on the back burner for the time being.