microjob
Persistent context
We need the ability to pass some context up front so that it is always available in the worker pool. For example, consider a CPU-intensive geo task: checking whether a point lies inside a set of polygons. The polygons are heavy, and serialising/deserialising them on every call is expensive. It would be better to pass them once:
await job(() => {
}, {persistentCtx: {polygons: [/* many-many polygons */]}});
Then, on every job execution, it would always be accessible:
await job(() => {
  polygons // still accessible here
}, {data: {point: [12.3434, 56.3434]}});
Related issue: https://github.com/wilk/microjob/issues/42
Related PR: https://github.com/wilk/microjob/pull/48
@darky Thanks for this issue!
Let me check if I understood correctly: you need a global bucket shared between worker threads, to avoid multiple massive serialisations/deserialisations, correct? You could achieve this yourself with SharedArrayBuffer (shared memory). That said, yes, it could be a useful feature to embed in microjob.
Anyway, your PR is moving the serialisation/deserialisation problem from the user to the core: https://github.com/darky/microjob/commit/67c21aec41ec0ddc3903d6f28cfaae490e41fc95#diff-c9253097723f89dd4716748fab2e00cdR108
Every time the user invokes `job`, the whole `persistentCtx` gets serialised, sent via postMessage, and then deserialised in the worker thread.
I think a good solution could be to pass a global shared context from an external facade, convert it to a SharedArrayBuffer and then convert it back with a proper getter from the worker.
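The shared-memory idea can be sketched as follows (illustrative only; the flat coordinate layout and variable names are assumptions, not microjob's API). The point is that workers view the same bytes instead of receiving a serialised copy:

```javascript
// Flatten polygon coordinates into one array of doubles
// (hypothetical layout: x0, y0, x1, y1, ...).
const coords = [12.3434, 56.3434, 13.01, 55.99, 12.80, 56.20];

// Allocate shared memory and copy the coordinates in, once.
const sab = new SharedArrayBuffer(coords.length * Float64Array.BYTES_PER_ELEMENT);
const shared = new Float64Array(sab);
shared.set(coords);

// A worker thread receiving `sab` via postMessage would create its own
// view over the SAME memory -- no serialisation, no copy:
const workerView = new Float64Array(sab);

// Mutations are visible through every view immediately.
workerView[0] = 99.9;
console.log(shared[0]); // 99.9 -- same underlying bytes
```

The trade-off is that SharedArrayBuffer holds raw bytes, so structured data (objects, nested arrays) has to be encoded into typed arrays by hand, which is why a "proper getter from the worker" is needed.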
I wouldn't use the `job` interface to define a global context: it's ambiguous.
> Every time the user invokes `job`, the whole `persistentCtx` gets serialised, sent via postMessage, and then deserialised in the worker thread.
It occurs only once, the first time; after that, it's always available via https://github.com/darky/microjob/commit/67c21aec41ec0ddc3903d6f28cfaae490e41fc95#diff-5bfbc2def8d97c3939b537c3f6f31b3eR3
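The one-time behaviour being described can be sketched as a worker-side cache (all names here are hypothetical, not microjob's actual internals): the context is deserialised on first use and the resulting object is reused from module scope on every later job.

```javascript
// Hypothetical worker-side cache: pay the deserialisation cost once,
// then hand back the same plain object on every subsequent job.
let cachedCtx = null;

function getPersistentCtx(serialised) {
  if (cachedCtx === null) {
    cachedCtx = JSON.parse(serialised); // first call: deserialise
  }
  return cachedCtx; // later calls: free, same object
}

const payload = JSON.stringify({ polygons: [[1, 2], [3, 4]] });
const first = getPersistentCtx(payload);
const second = getPersistentCtx(payload);
console.log(first === second); // true -- one shared cached object
```

A side effect of this pattern is that mutations made to the cached object inside one job are visible to later jobs on the same worker, which is relevant to the mutability discussion below.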
> I think a good solution could be to pass a global shared context from an external facade, convert it to a SharedArrayBuffer and then convert it back with a proper getter from the worker.
Can you please provide a little example? You could also close #42 with that example :)
> I wouldn't use the `job` interface to define a global context: it's ambiguous.
Yep, I agree. Maybe it would be better to use the `start` function for this purpose?
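As a rough sketch of that API shape, independent of microjob's real implementation (the pool and its methods here are entirely hypothetical): the context is handed over once when the pool starts, and every job can read it without re-serialisation.

```javascript
// Toy model of "configure context at start()": the pool stores the
// context once, and every job receives it as an argument.
function createPool(ctx) {
  const pool = { ctx };                        // stored once at start
  pool.job = (fn, data) => fn(pool.ctx, data); // ctx always available
  return pool;
}

const pool = createPool({ polygons: [[0, 0], [10, 0], [10, 10]] });
const count = pool.job((ctx) => ctx.polygons.length);
console.log(count); // 3
```

This keeps `job` unambiguous: it only ever carries per-call `data`, while the long-lived context belongs to the pool's lifecycle.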
> Yep, I agree. Maybe it would be better to use the `start` function for this purpose?
In this scenario, would `persistentCtx` be mutable (from within a `job`, for example)?
I have a bit of a weird use case:
- in one job that runs every N minutes, some data is passed in via context; a synchronous algorithm builds a sharded index from the data, then returns it from the job to the main thread.
- this index is stored in memory along with the data, where a synchronous search algorithm uses the index and data to compute search results.
Ideally I'd like to be able to do the following:
- keep both `index` and `data` in the persistent state of the job (mutable)
- run the search algorithm inside jobs, instead of in the main thread as it is now
Unfortunately, the serialization cost is too high without persistent state, and mutable state would be advantageous; otherwise I'd have to `stop` and `start` a new worker pool every time I need to update the dataset.
@r3wt PR #48 can satisfy your needs regarding mutation.