
No cap on web-workers created

Open brianmhunt opened this issue 3 years ago • 4 comments

When calling async functions (e.g. inflate), a new web-worker is created.

This is problematic when calling it thousands of times, as it'll easily overload the browser with worker instantiations.

When file sizes are small, this is especially problematic: the ~70 ms worker instantiation takes longer than the decompression itself.

The solution that comes to mind would be a worker pool (whether in-library or user-managed).

It may be worth noting this in the documentation for future readers.

brianmhunt avatar May 05 '21 14:05 brianmhunt

Aside: it may be worth capping the number of worker threads based on the hardware, e.g.

const maxProcs = Math.max(navigator.hardwareConcurrency - 1, 1)

Note that navigator.hardwareConcurrency is not supported in Safari/IE.
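To make that cap robust, the missing-property case can be handled with a conservative fallback. This is a sketch, not part of fflate; the `poolSize` helper and the 2-core default are assumptions:

```javascript
// Hypothetical helper: derive a worker-pool size from the reported core
// count, falling back to a conservative default where
// navigator.hardwareConcurrency is unavailable (e.g. Safari/IE).
function poolSize(hardwareConcurrency) {
  const cores = hardwareConcurrency || 2; // assume 2 cores when unreported
  return Math.max(cores - 1, 1);          // leave one core for the main thread
}

// In a browser this would be called as:
//   const maxProcs = poolSize(navigator.hardwareConcurrency);
```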

brianmhunt avatar May 05 '21 14:05 brianmhunt

I considered this a long time ago, even as I was building the first asynchronous function, but I found that a worker pool was simply too limiting. To create a system that would figure out when a worker thread was finished and reuse it would increase bundle size significantly, and when I did try it the performance on small files was just as bad as before. Therefore, I simply recommend synchronous functions for smaller files.
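That recommendation (sync for small inputs, async for large ones) can be wrapped in a small dispatcher. This is a sketch under assumptions: the `decompress` wrapper and the 64 KiB threshold are not part of fflate, and the sync/async functions are injected so the logic is self-contained; with fflate they would be `inflateSync` and the callback-style `inflate`:

```javascript
// Assumed threshold below which worker startup dominates; tune per workload.
const SYNC_THRESHOLD = 64 * 1024;

// Always returns a Promise, but runs small inputs inline on the main
// thread. With fflate this might be called as:
//   decompress(buf, fflate.inflateSync, (d, cb) => fflate.inflate(d, cb))
function decompress(data, syncFn, asyncFn) {
  if (data.length < SYNC_THRESHOLD) {
    // Small input: decompress synchronously, no worker spin-up cost.
    return Promise.resolve(syncFn(data));
  }
  // Large input: hand off to the callback-style async (worker) API.
  return new Promise((resolve, reject) =>
    asyncFn(data, (err, out) => (err ? reject(err) : resolve(out)))
  );
}
```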

I am aware of the navigator.hardwareConcurrency property; it was the default maximum size of the worker pool, when I had one.

If you have an alternative idea for a worker pool in fflate, I'll implement it, since I'm also not happy with the current system, but if it's still slow AND complex, I don't think I'll make the changes.

101arrowz avatar May 05 '21 15:05 101arrowz

Thanks, that's great to know.

In the case of many similarly small files, what about a "fire and forget" thread system? Would eliminating the requirement of figuring out when a worker thread is finished simplify things? If so:

  • Requests could pick a pool thread incrementally (mod the pool-size), or even randomly.
  • Threads could queue the requests and run them in-order.

The additional complexity would be implementing a queue in the worker-threads (which may not actually be simpler, but I'd expect it might be, hence the suggestion).
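The round-robin-plus-queue idea above can be sketched as follows. This is purely illustrative and not fflate's implementation: the `RoundRobinPool` class is hypothetical, and plain promise chains stand in for Web Workers (each chain playing the role of one worker's in-order message queue):

```javascript
// Hypothetical "fire and forget" dispatcher: jobs are assigned to pool
// slots round-robin (mod pool size), and each slot runs its jobs in
// order by chaining them on a promise.
class RoundRobinPool {
  constructor(size) {
    // One serial queue per slot; a real pool would hold a Worker per slot.
    this.queues = Array.from({ length: size }, () => Promise.resolve());
    this.next = 0;
  }

  // job: () => value | Promise<value>. Returns a Promise of the result.
  run(job) {
    const slot = this.next;
    this.next = (this.next + 1) % this.queues.length; // round-robin pick
    const result = this.queues[slot].then(job);
    // Keep the slot's chain alive even if a job rejects.
    this.queues[slot] = result.catch(() => {});
    return result;
  }
}
```

Note that `run` never waits for an idle worker, which matches the "fire and forget" framing: dispatch is O(1), at the cost of the head-of-line blocking described below when job sizes are uneven.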

I'd expect this to help substantially with many small files, though it has a worst case when many large files are mixed with many small files, as the large files would block the smaller jobs. There may be solutions for that case (e.g. creating a new independent worker outside the pool when a file is deemed "large"), but that seems to be an edge case compared to the desire to parallelize many small files.

Another, even simpler, alternative may be a pool size of 1. A queue would still be needed, but the thread-picking would be trivial.
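The pool-of-one idea reduces to a single serial queue, which can be sketched in a few lines. The `enqueue` helper is hypothetical; in a real implementation the jobs would be messages posted to one long-lived worker:

```javascript
// Minimal sketch of a pool size of 1: a single serial queue built from a
// promise chain. Each enqueued job runs only after all previously
// enqueued jobs, so at most one job is active at a time.
let tail = Promise.resolve();

function enqueue(job) {
  const result = tail.then(job);
  tail = result.catch(() => {}); // a failed job must not stall the queue
  return result;
}
```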

I hope that's some useful food-for-thought.

brianmhunt avatar May 05 '21 15:05 brianmhunt

I'm looking into this suggestion for future versions of fflate and will keep the issue open, but it's on the back burner for the time being.

101arrowz avatar May 23 '21 17:05 101arrowz