Common data

Open wlandau opened this issue 2 years ago • 5 comments

clustermq has a feature called "common data": objects that are part of the worker environment for all tasks. These objects need only get sent once, rather than with each new task. If the objects get assigned to .GlobalEnv on the worker (@mschubert, is this what clustermq does?), then I can implement this feature in crew without asking @shikokuchuo to implement it in mirai::server().
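A minimal base R sketch of the idea, assuming the worker can be modeled as a persistent environment. It does not use the crew, mirai, or clustermq APIs, and `worker_env`, `assign_common_data()`, and `run_task()` are hypothetical names used only for illustration. The common objects are assigned once, and each later task carries only a small expression.

```r
# Hypothetical stand-in for a worker process: an environment that
# persists across tasks (a real worker would use its own .GlobalEnv).
worker_env <- new.env(parent = globalenv())

# Hypothetical helper: assign the common objects to the worker once.
assign_common_data <- function(worker, data) {
  list2env(data, envir = worker)
  invisible(worker)
}

# Hypothetical helper: evaluate a task expression in the worker environment.
# Only the expression is passed per task, not the common objects.
run_task <- function(worker, expr) {
  eval(expr, envir = worker)
}

# One-time transfer of the (potentially large) shared objects.
assign_common_data(
  worker_env,
  list(big_table = data.frame(x = 1:5), offset = 10)
)

# Each task reuses big_table and offset without resending them.
results <- lapply(1:3, function(i) {
  run_task(worker_env, bquote(sum(big_table$x) + offset + .(i)))
})
```

In this sketch the saving is that `big_table` crosses the boundary once instead of once per task. Whether that reduces network traffic in a real setup depends on how the backend ships objects to workers.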

wlandau, Mar 07 '23 14:03

Yes, that's what clustermq currently does.

However, please note that I'm transitioning to a system where the master environment gets updated on the workers automatically.

mschubert, Mar 07 '23 14:03

Thanks for explaining. In the new system, is the master environment still the same as .GlobalEnv? When you say the master environment gets updated automatically, does that have to do with the handshake that uses the set_common_data() and send_common_data() methods?

wlandau, Mar 07 '23 14:03

> is the master environment still the same as .GlobalEnv?

Yes, at least for now.

> When you say the master environment gets updated automatically

You will be able to add objects to the master at any point, and they will then be propagated to the workers, which can reuse them in subsequent calls.

mschubert, Mar 07 '23 14:03

I did more experiments with nanonext, and it seems that the listener and the dialer both need to be connected for me to send the messages I would need to implement common data. This forces a level of synchronicity that would require an active daemon to watch for common data messages, which is not feasible in crew. Fortunately, mirai is very fast, so common data may not even be necessary.

wlandau, Mar 12 '23 03:03

Now that mirai 2.0.0 is simpler, common data in crew might be achievable. I'll think about it.

wlandau, Apr 11 '25 21:04

Just out of curiosity, will this feature reduce memory requirements if many tasks need the same large data object?

psychelzh, Apr 22 '25 09:04

Possibly. If it works, the main advantage would be speed because it would avoid repeated sends of the same data over the network in some cases.

wlandau, Apr 22 '25 12:04

On reflection, I don't think this is likely to become possible in the foreseeable future. Migrating to a discussion.

wlandau, Aug 19 '25 20:08