js-lingui
lingui-extract-experimental.ts: run `extractFromFiles` concurrently?
Is your feature request related to a problem? Please describe.
For a big project, `extractFromFiles` is too slow. After some investigation: how about making `extractFromFiles` in lingui-extract-experimental.ts run concurrently?
https://github.com/lingui/js-lingui/blob/04b7cef9876168bc376b55424ecc0b0ebde976c1/packages/cli/src/lingui-extract-experimental.ts#L73
Describe proposed solution
I considered implementing a worker thread pool for this, but for the first iteration I stopped at what is there now. You are probably the first user who has really started experimenting with it, and we are only now starting to get feedback.
If you have capacity to implement worker threads, I would be happy to help. But for now I'm out of capacity to do it on my own.
BTW, p-limit doesn't really help here, because during extraction we invoke Babel on the bundles, each of which is a single big file. Babel is very CPU-bound and synchronous by nature. The surrounding code doesn't have many async operations either, so you will not benefit from running them concurrently in one Node process.
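To illustrate the point about synchronous work, here is a minimal sketch. `fakeExtract` is a hypothetical stand-in for a Babel pass over one bundle, not lingui code. Wrapping it in `async` functions and awaiting them with `Promise.all` still runs the bundles strictly one after another, which is why a concurrency limiter alone would not speed things up.

```typescript
// Hypothetical stand-in for a synchronous, CPU-bound Babel pass over one bundle.
function fakeExtract(bundle: string, log: string[]): number {
  log.push(`start ${bundle}`);
  let acc = 0;
  for (let i = 0; i < 1_000_000; i++) {
    acc = (acc + i) % 9973; // busy work that never yields to the event loop
  }
  log.push(`end ${bundle}`);
  return acc;
}

// "Concurrent" extraction inside a single Node process.
async function extractAll(bundles: string[]): Promise<string[]> {
  const log: string[] = [];
  // Each async callback runs to completion before the next one starts,
  // because nothing inside fakeExtract ever awaits.
  await Promise.all(bundles.map(async (bundle) => fakeExtract(bundle, log)));
  return log;
}

extractAll(["a", "b"]).then((log) => {
  // The start/end pairs never interleave.
  console.log(log.join(", "));
});
```

Real parallelism for this kind of work needs separate threads or processes.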
Thanks for your patient explanation. I'm not familiar with worker threads, but I'm very interested in them. I'll study the theory first, and I'd be glad to join the work if possible.
FYI https://www.npmjs.com/package/jest-worker
The caveat of working with workers: you don't have shared memory between them. Treat them as a few standalone Node.js programs run by another one.
So if you want to expose something to all workers, you cannot just store it in a global variable. Usually, passing data between the main and child processes is done by serializing it and storing it somewhere, then reading and deserializing it on the other side. So you cannot pass anything non-serializable, say a function or a class instance, from the main process to a child.
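A small sketch of what gets lost at such a boundary. It uses a JSON round trip as a stand-in for the serialization step; the `Catalog` class and the `config` object here are hypothetical and only mimic the shape of the problem. (Node's `worker_threads` uses structured cloning rather than JSON, and that refuses to clone functions at all instead of silently dropping them.)

```typescript
// Hypothetical stand-in class, not lingui's real Catalog.
class Catalog {
  constructor(public path: string) {}
  describe(): string {
    return `catalog at ${this.path}`;
  }
}

// Simulates sending a value to another process: serialize, then deserialize.
function sendAcrossBoundary<T>(value: T): unknown {
  return JSON.parse(JSON.stringify(value));
}

// Hypothetical config with a custom extractor given as a function.
const config = {
  locales: ["en", "cs"],
  extractors: [(code: string) => code.length],
};

const received = sendAcrossBoundary(config) as {
  locales: string[];
  extractors: unknown[];
};
console.log(received.locales);    // plain data survives
console.log(received.extractors); // the function is gone (replaced by null)

const revived = sendAcrossBoundary(new Catalog("locale/en.po")) as Catalog;
console.log(revived instanceof Catalog); // false: only the plain fields arrive
```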
In lingui there might be a few places where this matters and which should be redesigned:
- Passing the lingui config to child workers (the config is not serializable to JSON, as it might contain custom formatters / extractors as functions). So you would instead need to read the config in each thread on its own (this might bring significant overhead!)
- Passing a Catalog instance object; this should just be designed in a different way.
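One possible shape for the first point, as a sketch under assumptions: the main process sends only a serializable task (a config path plus the entry to process), and each worker loads the config itself, caching it so the cost is paid once per worker rather than once per task. All names here (`ExtractTask`, `loadConfigFromDisk`, `handleTask`) are hypothetical, and the loader is faked so the snippet runs without touching the filesystem.

```typescript
type LoadedConfig = {
  configPath: string;
  extractors: Array<(code: string) => string[]>;
};

// Only plain, serializable data crosses the main/worker boundary.
// `source` is inlined here to keep the sketch self-contained; a real
// worker would read the entry file itself.
type ExtractTask = { configPath: string; entry: string; source: string };

let loadCount = 0;

// Hypothetical stand-in for reading and evaluating a config file.
function loadConfigFromDisk(configPath: string): LoadedConfig {
  loadCount++;
  return {
    configPath,
    extractors: [(code) => code.split(/\s+/).filter((w) => w.startsWith("msg."))],
  };
}

// Per-worker cache: module-level state is private to each worker.
const configCache = new Map<string, LoadedConfig>();

function getConfig(configPath: string): LoadedConfig {
  let config = configCache.get(configPath);
  if (!config) {
    config = loadConfigFromDisk(configPath);
    configCache.set(configPath, config);
  }
  return config;
}

// What a worker would run for each task it receives.
function handleTask(task: ExtractTask): string[] {
  const config = getConfig(task.configPath);
  return config.extractors.flatMap((extract) => extract(task.source));
}

const first = handleTask({ configPath: "lingui.config.js", entry: "a.js", source: "x msg.hello y" });
const second = handleTask({ configPath: "lingui.config.js", entry: "b.js", source: "msg.bye" });
console.log(first, second, loadCount);
```

The result returned from a worker should likewise be plain data (for example a list of extracted messages) that the main process merges, rather than a Catalog instance.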
Got it. How about having each worker extract one entry? The entries seem isolated.
In your very first message you pointed to the right place in the source code that should be parallelized. Start from there.
I know that Vitest uses Piscina (https://www.npmjs.com/package/piscina) instead of jest-worker. It is far more robust than jest-worker and could probably be a good addition here.