rfcs Many Worlds

Building on the ideas introduced in #16 by @NathanSWard (and the associated thread), this RFC presents a general-purpose multiple worlds API.

The TL;DR:

The app contains many worlds, each with their own WorldLabel.
Each schedule is associated with a WorldLabel.
By default, all schedules run in parallel, then a special global schedule is run, then AppCommands are applied.
AppCommands can be used to send data between worlds, move entities, make new worlds, clone schedules and so on.
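The tick order described above can be sketched in a few lines. This is a toy model only, not the RFC's API: `World`, `System`, the `AppCommand` variants and `report_sum` are all hypothetical stand-ins, and the per-world schedules run sequentially here where the RFC would run them in parallel.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the RFC's `WorldLabel`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum WorldLabel {
    Main,
    Simulation,
}

/// Toy world: just a bag of integers.
#[derive(Default, Debug)]
struct World {
    values: Vec<i32>,
}

/// Hypothetical cross-world operations, queued during a tick.
enum AppCommand {
    Send { to: WorldLabel, value: i32 },
    NewWorld(WorldLabel),
}

/// Toy system: may mutate its own world and queue app-level commands.
type System = fn(&mut World, &mut Vec<AppCommand>);

#[derive(Default)]
struct App {
    worlds: HashMap<WorldLabel, World>,
    schedules: HashMap<WorldLabel, Vec<System>>,
}

impl App {
    fn tick(&mut self) {
        let mut commands = Vec::new();
        // 1. Run each world's schedule (sequential in this sketch).
        for (label, world) in self.worlds.iter_mut() {
            if let Some(systems) = self.schedules.get(label) {
                for system in systems {
                    system(world, &mut commands);
                }
            }
        }
        // 2. A global schedule would run here.
        // 3. Apply the queued commands once no schedule is running.
        for command in commands {
            match command {
                AppCommand::Send { to, value } => {
                    if let Some(world) = self.worlds.get_mut(&to) {
                        world.values.push(value);
                    }
                }
                AppCommand::NewWorld(label) => {
                    self.worlds.entry(label).or_default();
                }
            }
        }
    }
}

/// Example system: report the sum of this world's values to the main world.
fn report_sum(world: &mut World, commands: &mut Vec<AppCommand>) {
    let value = world.values.iter().sum();
    commands.push(AppCommand::Send { to: WorldLabel::Main, value });
}
```

Deferring the commands until every schedule has finished is what lets the worlds stay independent during the tick itself.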

This should be useful for:

cleaning up pipelined-rendering logic
rollback networking
simulation with varying hyperparameters
sharding / chunking the game world
much, much more!

Nov 15 '21 22:11 alice-i-cecile

Would this help making a future bevy editor a reality? Having an editorworld and a gameworld and having the editor interact between the two?

Nov 16 '21 13:11 Weibye

Would this help making a future bevy editor a reality? Having an editorworld and a gameworld and having the editor interact between the two?

Oh, that's a neat architecture. I haven't dug deep enough into that problem space to say for sure, but I suspect it's worth exploring.

Nov 16 '21 16:11 alice-i-cecile

I have a basic proof of concept for 0.5 for the scientific simulation case here. It's very limited in features, but should work well for that narrow use case.

Nov 26 '21 05:11 alice-i-cecile

Here's a task-pool-based multi-world design I had, which can be implemented fully user-side:

fn add_world_to_app<P, F>(app: &mut App, mut schedule: Schedule, mut sync_fn: F)
where
    P: SystemParam,
    F: for<'w, 's> FnMut(&mut World, <P::Fetch as SystemParamFetch<'w, 's>>::Item),
    for<'w, 's> <P::Fetch as SystemParamFetch<'w, 's>>::Item: SystemParam<Fetch = P::Fetch>,
    F: Send + Sync + 'static,
{
    let (task_to_world_tx, task_to_world_rx) = async_channel::unbounded();
    let (world_to_task_tx, world_to_task_rx) = async_channel::unbounded();

    let task = async move {
        let mut world = Some(World::new());
        let tx = task_to_world_tx;
        let rx = world_to_task_rx;

        loop {
            schedule.run(world.as_mut().unwrap());
            tx.send(world.take().unwrap()).await.unwrap();
            world = Some(rx.recv().await.unwrap());
        }
    };

    let system = move |system_param: <P::Fetch as SystemParamFetch<'_, '_>>::Item| {
        let tx = &world_to_task_tx;
        let rx = &task_to_world_rx;
        if let Ok(mut world) = rx.try_recv() {
            sync_fn(&mut world, system_param);
            tx.try_send(world).unwrap();
        }
    };

    app.add_system(system);
    app.world
        .get_resource::<AsyncComputeTaskPool>()
        .unwrap()
        .spawn(task)
        .detach();
}

fn main() {
    let mut app = App::new();
    // the turbofish shouldn't be necessary, but it is. blame rustc
    add_world_to_app::<Res<u32>, _>(&mut app, Schedule::default(), |world, res| {
        // this closure is run in a system in the main world.
        // and here you have access to the entire subworld and any resources / queries you want from the main world.
        // this function runs once per subworld tick for synchronization
        // and a subworld tick does not block main world progress
    })
}

Nov 27 '21 09:11 TheRawMeatball

Hey sorry I left a comment and then deleted it because I realized that I hadn't fully grokked the RFC when I wrote it, and I want to leave more coherent feedback. I like the general direction of the RFC.

I'm thinking about how it will address my particular use case of running a discrete time simulation where I can manually call Schedule::run_once. One obvious blocker is that the executors assume there is a ComputeTaskPool on whatever World we use to run the schedule, but we should prefer to use a global pool; I believe this is properly addressed by the RFC.

Then, for example, let's say I want to run my schedule every time a specific key is pressed. Doing this from a custom runner might work, although I don't think the RFC proposed a specific interface for custom runners. If we just have exclusive access to the App, then I could manually inspect the global input events and then run the schedule.

But I also think it might be nicer if I could run my schedule within another system via ResMut<(Schedule, World)>. Then so long as my system can access the global task pool, I have a more ergonomic interface for accessing resources.

For example, I could have this:

fn update_simulation(
    mut sim: ResMut<Simulation>,
    mut tick_signal: EventReader<TickEvent>,
) {
    let Simulation { world, schedule } = &mut *sim;
    for _ in tick_signal.iter() {
        // Presumably this runs on a global task pool, which can be accessed via any world. It would be nice if this could be
        // async as well so long simulations don't block rendering.
        schedule.run_once(world);
    }
}

Finally, the ergonomics, discoverability and potential for optimization of a custom-built API for this very expressive feature is much better than Res<World>-based designs.

Could you expand on this?

Dec 01 '21 04:12 bonsairobo

There are two things that jump out to me as being desirable here, although I'm not sure whether either is feasible.

It would be cool to be able to create a world in which all components and resources are statically typechecked to implement some trait (or collection of traits), including the ability to safely enumerate them as dyn Trait. Hiding worlds behind wrappers that enforce this is one option, but we would somehow need to deal with the fact that .set_world() ... insert_resource() would bypass that. This is motivated by https://github.com/bevyengine/bevy/issues/3877 which would benefit from Serialize worlds, but it's also conceivable that a user may want to do this with a custom trait on a custom world. It's not obvious to me that Rust's type system can support this.
If people are dealing with multiple worlds, I predict a common class of bug where an Entity is used in the wrong world. We could maybe do something with 'world lifetime magic here, but maybe that would add undesirable complexity to Entity in other places. Would this even be reliably caught at runtime in the current proposal?
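For the second point, one way a wrong-world lookup could be caught at runtime is to tag each entity id with the world that issued it. The sketch below is a toy illustration, not Bevy's `Entity`: `WorldId`, `TaggedEntity`, `LookupError` and this `World` are hypothetical, and the extra field is exactly the kind of added cost to `Entity` mentioned above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct WorldId(u32);

/// Hypothetical entity id that remembers which world issued it.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TaggedEntity {
    world: WorldId,
    index: u32,
}

#[derive(Debug, PartialEq, Eq)]
enum LookupError {
    WrongWorld { expected: WorldId, found: WorldId },
    NoSuchEntity,
}

/// Toy world storing one name per entity.
struct World {
    id: WorldId,
    names: Vec<String>,
}

impl World {
    fn new(id: WorldId) -> Self {
        World { id, names: Vec::new() }
    }

    fn spawn(&mut self, name: &str) -> TaggedEntity {
        self.names.push(name.to_string());
        TaggedEntity { world: self.id, index: (self.names.len() - 1) as u32 }
    }

    /// Rejects entities issued by a different world before touching storage.
    fn name(&self, entity: TaggedEntity) -> Result<&str, LookupError> {
        if entity.world != self.id {
            return Err(LookupError::WrongWorld { expected: self.id, found: entity.world });
        }
        self.names
            .get(entity.index as usize)
            .map(String::as_str)
            .ok_or(LookupError::NoSuchEntity)
    }
}
```

Without the tag, an index that happens to be valid in both worlds would silently return the wrong entity's data.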

May 06 '22 00:05 SamPruden

Does this RFC support simulating different worlds at different update rates?

Jun 17 '22 02:06 ottworks

is this still relevant now that we have sub apps with their own world?

Jun 17 '22 06:06 mockersf

Does this RFC support simulating different worlds at different update rates?

Currently no. I'm not fully happy with the current state of this RFC; moving it to draft until I have some cycles to devote to it.

is this still relevant now that we have sub apps with their own world?

Very. This is / was an attempt to try and wrangle the API and complexities of the subapps to create a coherent design.

Jun 17 '22 12:06 alice-i-cecile

While working on https://github.com/bevyengine/bevy/issues/3877 I noticed that a central hurdle when working with multiple worlds is the question of relating entities. Obviously if there is a correspondence between entities in two worlds there could just be a HashMap<Entity, Entity> that stores this relation, or maybe a component RelatedEntity(Entity), both of which would probably work fine.
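A minimal sketch of the HashMap approach, using toy types rather than Bevy's (`Entity`, `EntityMap` and `Parent` here are hypothetical stand-ins). It also shows the manual step needed for any component that stores an entity reference:

```rust
use std::collections::HashMap;

/// Toy entity id (not Bevy's `Entity`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Entity(u32);

/// Records which entity in the target world corresponds to one in the source world.
#[derive(Default)]
struct EntityMap {
    source_to_target: HashMap<Entity, Entity>,
}

impl EntityMap {
    fn relate(&mut self, source: Entity, target: Entity) {
        self.source_to_target.insert(source, target);
    }

    fn get(&self, source: Entity) -> Option<Entity> {
        self.source_to_target.get(&source).copied()
    }
}

/// Example component holding an entity reference that must be remapped
/// when the component is copied into the other world.
#[derive(Debug, PartialEq, Eq)]
struct Parent(Entity);

impl Parent {
    /// Returns `None` if the referenced entity has no counterpart yet.
    fn remapped(&self, map: &EntityMap) -> Option<Parent> {
        map.get(self.0).map(Parent)
    }
}
```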

But it would be nice if worlds could just share the same "entity namespace", i.e. related entities just have the same id. This is how the relation between the main and render world works right now, which I think is a very valuable example for anything that works with multiple worlds. I'm not sure, though, if this is implementable without changing very fundamental ECS code; the Entities struct basically needs to be thread safe, I think.

As a side note: the proposed AppCommands means that data is "pushed" rather than "pulled". I agree with this design, but the main <-> render world extraction switched from a "push" style to a "pull" style. I don't know the reason for the switch, but that should be investigated as the reasoning might also apply here.

Oct 19 '22 09:10 MDeiml

This seems to be a quite hard problem even without sharing entity namespaces. See https://github.com/bevyengine/bevy/issues/3096 for why AppCommands::spawn basically can't return an Entity. As I see it there are 3 solutions to this:

Make Entities thread safe, e.g. put it behind a RwLock. This should be ok as there should only be one write per cycle to flush reserved entities, so most of the time acquiring the lock shouldn't block.
Don't run schedules in parallel with worlds they need to access the entity namespace of. Also works fine, but the implementation might get pretty ugly as the execution model suddenly becomes very complicated. This is the approach that RenderStage::Extract is taking right now.
Have some intrinsic entity mapping in AppCommands. E.g. have an extra namespace for entities spawned in other worlds and keep a HashMap of relations that is updated when AppCommands are applied. This would probably be the easiest to implement, but it doesn't solve the problem that AppCommands can't know about entity references stored in components, so developers will have to write some manual mapping for them.

Personally I like 1. the most. With some effort we might even get around using RwLocks or similar if ownership of Entities is given to the App or global world. Entities already supports this "mutate in parallel, then flush synchronously" flow.
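The "mutate in parallel, then flush synchronously" flow can be illustrated with a standalone toy allocator. This is an assumption-laden sketch, not Bevy's actual `Entities` implementation (no free list, no generations): ids are reserved through `&self` with an atomic counter, and only `flush` needs `&mut self`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Toy entity allocator: ids are plain `u32`s and are never freed.
struct Entities {
    /// Ids below this value are live.
    flushed: u32,
    /// Number of ids reserved since the last flush.
    pending: AtomicU32,
}

impl Entities {
    fn new() -> Self {
        Entities { flushed: 0, pending: AtomicU32::new(0) }
    }

    /// Reserve a fresh id through a shared reference; callable from many threads.
    fn reserve(&self) -> u32 {
        self.flushed + self.pending.fetch_add(1, Ordering::Relaxed)
    }

    /// Make all reserved ids live. Needs exclusive access, so it belongs in a
    /// synchronous step. Returns how many ids were flushed.
    fn flush(&mut self) -> u32 {
        let reserved = std::mem::replace(self.pending.get_mut(), 0);
        self.flushed += reserved;
        reserved
    }

    fn is_live(&self, id: u32) -> bool {
        id < self.flushed
    }
}

/// Reserve ids from several threads that share one `&Entities`.
fn reserve_from_threads(entities: &Entities, threads: usize, per_thread: usize) -> Vec<u32> {
    thread::scope(|scope| {
        let handles: Vec<_> = (0..threads)
            .map(|_| {
                scope.spawn(move || {
                    (0..per_thread).map(|_| entities.reserve()).collect::<Vec<u32>>()
                })
            })
            .collect();
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}
```

Because `reserve` never takes a lock, worlds sharing this allocator would not block each other; the only exclusive access is the flush.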

Oct 19 '22 09:10 MDeiml

Proposal for what to add to the RFC:

Entities will no longer be stored in World. Methods in World that need access to &Entities will take it as a parameter.

For convenience a Universe (I like that name :sweat_smile:) struct is added, which is a wrapper around World and &Entities. It will support the same interface that World supports now (no Entities as method parameters) except entities_mut.

Basically anything that is now a &mut World or &World will become a Universe. Exclusive systems, for example, will get a &mut Universe instead of &mut World. (EDIT: It probably makes sense to rename World to something else and name the wrapper World. This way most of the code should stay unchanged.)

Alongside a list of worlds App will also contain namespaces: Vec<Entities> as well as namespace_map: HashMap<WorldId, usize>, which determines which namespace to use for each world.

All in all execution of worlds will not be impacted. During the "global step" entities are flushed.
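A rough sketch of the namespace bookkeeping, with toy types: this `Entities` is a bare counter, and the `add_world_*` and `entities_mut` helpers are hypothetical names, not part of the proposal. Only the `namespaces` and `namespace_map` fields come from the text above.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct WorldId(u32);

/// Toy allocator standing in for the real `Entities`.
#[derive(Default, Debug)]
struct Entities {
    next: u32,
}

impl Entities {
    fn alloc(&mut self) -> u32 {
        let id = self.next;
        self.next += 1;
        id
    }
}

#[derive(Default)]
struct App {
    namespaces: Vec<Entities>,
    namespace_map: HashMap<WorldId, usize>,
}

impl App {
    /// Register a world with its own fresh namespace; returns the namespace index.
    fn add_world_with_new_namespace(&mut self, world: WorldId) -> usize {
        self.namespaces.push(Entities::default());
        let index = self.namespaces.len() - 1;
        self.namespace_map.insert(world, index);
        index
    }

    /// Register a world that shares the namespace of an existing world.
    fn add_world_sharing(&mut self, world: WorldId, with: WorldId) -> Option<usize> {
        let index = *self.namespace_map.get(&with)?;
        self.namespace_map.insert(world, index);
        Some(index)
    }

    /// Exclusive access to a world's namespace, e.g. during the global step.
    fn entities_mut(&mut self, world: WorldId) -> Option<&mut Entities> {
        let index = *self.namespace_map.get(&world)?;
        self.namespaces.get_mut(index)
    }
}
```

Worlds that map to the same index draw ids from one allocator, so an id issued for one of them cannot collide with an id in the other.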

Pros:

No performance impact
Developers can't mess up when copying / moving components to other worlds

Cons:

More difficult API, almost duplicate interfaces of World and Universe
Changes in almost everything that touches ECS
Code gets ugly as &Entities is passed around
Developers can still mess up if worlds don't share a namespace

Oct 19 '22 10:10 MDeiml

Exclusive systems for example will get a &mut Universe instead of &mut World

Is this overly restrictive in terms of blocking all worlds at once? I can imagine scenarios where one small world may want to exist with an exclusive system that doesn't block all of the other worlds. Would it still be feasible to have world-exclusive as well as universe-exclusive systems?

It's been a while since I looked at this part of the code, so I don't have any technical feedback at this time.

Oct 19 '22 13:10 SamPruden

Universe would also only have a &Entities, not &mut Entities or Entities. So it's non-blocking with regard to other worlds. This should pretty much be ok since the only things you can't do without a mutable reference to Entities are flushing and deleting entities, both of which would be done during the global sync step.

Oct 19 '22 13:10 MDeiml

Closing for now: this needs more thought in the context of what we've learned about pipelined rendering and I want to let others try their hand at a design for this!

I remain strongly in favor of an API to support this sort of design in the future though.

Jul 31 '23 19:07 alice-i-cecile
