stride Script and Task improvements Epic

Various ideas this Epic covers:

Make our task system alloc-free (possible with recent C# version) so that people are not afraid to create thousands of scripts with await
Reorganize the task system so that scripts can wait on various points of the engine. So far we only have Update/Draw, but I suspect many scripts want much more control over when they run (after physics, etc.). Not sure yet whether we want some kind of high-level sync point graph editor or something more programmatic.
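
To make the two ideas above concrete, here is a minimal sketch, not the engine's implementation. `SyncPoint`, `SyncPointWaiter` and `FrameScheduler` are hypothetical names invented for illustration; only `IValueTaskSource` and `ManualResetValueTaskSourceCore<T>` are real .NET types. It assumes a single game thread and no `SynchronizationContext`, so continuations run inline when a sync point is signalled.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical sync points; the issue itself only names Update/Draw and "after physics".
public enum SyncPoint { Update, AfterPhysics, Draw }

// A reusable awaitable: awaiting it does not allocate a Task per wait.
public sealed class SyncPointWaiter : IValueTaskSource
{
    private ManualResetValueTaskSourceCore<bool> core; // mutable struct, so not readonly

    public ValueTask Wait()
    {
        core.Reset(); // new version, so tokens from earlier waits are rejected
        return new ValueTask(this, core.Version);
    }

    public void Signal() => core.SetResult(true);

    void IValueTaskSource.GetResult(short token) => core.GetResult(token);
    ValueTaskSourceStatus IValueTaskSource.GetStatus(short token) => core.GetStatus(token);
    void IValueTaskSource.OnCompleted(Action<object?> continuation, object? state,
        short token, ValueTaskSourceOnCompletedFlags flags)
        => core.OnCompleted(continuation, state, token, flags);
}

// Hypothetical scheduler: scripts park on a sync point, the engine resumes them there.
public sealed class FrameScheduler
{
    private readonly Dictionary<SyncPoint, List<SyncPointWaiter>> waiting =
        new Dictionary<SyncPoint, List<SyncPointWaiter>>();

    public ValueTask Next(SyncPoint point, SyncPointWaiter waiter)
    {
        if (!waiting.TryGetValue(point, out var list))
            waiting[point] = list = new List<SyncPointWaiter>();
        list.Add(waiter);
        return waiter.Wait();
    }

    public void Run(SyncPoint point)
    {
        if (!waiting.TryGetValue(point, out var list) || list.Count == 0) return;
        var batch = list.ToArray(); // copy: resumed scripts may re-register (a real system would reuse a buffer)
        list.Clear();
        foreach (var waiter in batch) waiter.Signal();
    }
}

public static class SyncPointDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var scheduler = new FrameScheduler();
        _ = Script(scheduler, log); // runs synchronously up to its first await
        for (int frame = 0; frame < 2; frame++)
        {
            scheduler.Run(SyncPoint.Update);
            scheduler.Run(SyncPoint.AfterPhysics);
        }
        return log;
    }

    private static async Task Script(FrameScheduler scheduler, List<string> log)
    {
        var waiter = new SyncPointWaiter(); // allocated once, reused for every wait
        while (true)
        {
            await scheduler.Next(SyncPoint.AfterPhysics, waiter);
            log.Add("after physics");
        }
    }
}
```

Calling `SyncPointDemo.Run()` simulates two frames; the script is resumed only at the `AfterPhysics` point. Note that the async method itself still allocates its state machine once when it first suspends, so this only removes the per-await allocation.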

Aug 03 '18 06:08 xen2

Is anyone working on this task?

Sep 30 '19 22:09 velhaco20000

You proposed 2 features:

Minimize allocations when a task starts, possibly with minimal thread-local storage;
Create several points at which a script can start, so that a precondition is met.

Is that right?

Sep 30 '19 23:09 velhaco20000

Yes, those are two separate features (probably done together in the same feature branch). I started some prototyping on this task: https://github.com/xen2/xenko/tree/microthread2. I have made a simple library to test some of the concepts. The next step is to decide exactly how to integrate it into Xenko and switch various parts of Xenko over to it.

Oct 01 '19 16:10 xen2

Do you want to create a mini batch system?

Oct 01 '19 18:10 velhaco20000

I will read the code and make a proposal.

Oct 01 '19 18:10 velhaco20000

FYI the code is still a prototype/playground. It is not yet integrated and a lot of things will change.

Oct 01 '19 19:10 xen2

Just a little update, and correct me if I'm wrong:

You wrote a mini batch system with a scheduler and a synchronization system;
Normally, users don't access such systems directly. There is a layer in the engine that breaks large tasks into small tasks, and this layer and the weight of the small tasks depend on the system configuration (e.g. core count, core capabilities, etc.);
If those assumptions are correct, the next step is to build this layer, which breaks large tasks into small tasks, and to create the synchronization points; DX12 and Vulkan help with this job;
I don't know whether you are thinking about transparency in this layer, allowing the user/programmer to control it beyond creating custom sync points. Let me hear your observations and corrections.

Oct 02 '19 04:10 velhaco20000

Little Update 2: (correct me if I'm wrong)

Xenko uses SDL as the window system, SharpDX as the DX11/12 binding, SharpVulkan as the Vulkan binding, and OpenTK (Xamarin) as an OpenGL/OpenGL ES/Metal binding/layer;
It seems the engine was implemented in a way that favors compatibility with older APIs;
One simple feature would be to implement async compute in order to use MicroThreads with Vulkan/DX12/Metal: use MicroThreads to populate the queue and get the response. This will break compatibility in components;
This implies dividing the engine implementation between high-performance devices and low-performance (mobile) devices.

Let me hear your observations and corrections.

Oct 03 '19 04:10 velhaco20000

Little Update 3: (corrections and observations are welcome)

I will create an async compute language subset using ANTLR4, generating the parser in C# (any restrictions/directions for the grammar?);
The subset will produce Metal/Vulkan-OpenGL/HLSL compute shaders;
The compute shaders will be processed by MicroThreads using the scheduler;
A usage scenario will be created so that the solution can be better evaluated. Are there any restrictions on this course of action?

Oct 03 '19 17:10 velhaco20000

Little Update 4:

There is a Metal API in Xamarin, so the compute queue could be enabled for testing purposes;

Oct 03 '19 20:10 velhaco20000

@velhaco20000 I would suggest you make a draft pull request to a new branch based on master from a fork of this repo and talk about that over there. This is mostly to reduce spam on epics, and you'll be able to explain yourself better using the PR template.

Oct 03 '19 20:10 fdrobidoux

@fdrobidoux , you are right. I just don't want to create something useless or something somebody else is doing right now.

Oct 03 '19 20:10 velhaco20000

Hey @xen2 - got a question about this Epic: what do you mean by a "high-level sync point graph editor"? Do you mean visual scripting? Because that would be nice to get. :)

Nov 29 '19 16:11 HeadClot

Has anything more happened on this topic? I see that the tasks in stride 4.0.01 are still .NET 4.0 tasks and not ValueTasks, AFAIK.

Dec 13 '20 17:12 CodingMadness

@Shpendicus did you check the changes in the linked WIP branch?

Dec 13 '20 18:12 tebjan

@tebjan what is the WIP branch? Where can I check it out?

Do you mean this one? https://github.com/xen2/stride/tree/microthread2

But isn't this more or less deprecated and now officially replaced by Stride?

Dec 13 '20 19:12 CodingMadness

There was some further discussion on Discord about this topic, and this is a summary of it. (@Eideren helped write a lot of this summary.)

Rename "Script" (ScriptSystem, SyncScript, etc.) to be more fitting and intuitive. Additionally, the word "script" can refer to other things, such as files, and so can cause confusion. Current suggestions are:

ExecutionComponent
WorkerComponent
GameplayComponent
FunctionalityComponent
TaskComponent
LogicComponent
Routine
Procedure

Condense StartScript, SyncScript, and AsyncScript, and perhaps ScriptComponent, into a single component to simplify and streamline the API.

Change the implementation to allow for more flexibility and extensibility. These are the current ideas.

Interface based:
- The different sync points are implemented as interfaces: IUpdate, IFixedUpdate, IRender, etc. Users add interfaces to their class declaration and implement the corresponding function.
- Pros/Cons:
  - Very little boilerplate.
  - Discoverability of available interfaces is poor (perhaps having a naming convention could help, such as I____Receiver).
  - Access modifiers are public or only through interface.
  - Performance cost for adding a component to an entity, while very small, would increase linearly for each new interface type we add with a naïve implementation.
Register from entry point:
- From a single start method, users register/add the component's methods to run on specific events/sync points.
- Pros/Cons:
  - A fair amount of boilerplate.
  - Good discoverability.
  - Access modifiers/encapsulation retained.
  - Very little performance cost but a small allocation cost for each one of those instance methods passed through those ‘Add’ calls.
Code gen:
- When users implement methods whose signature matches one of the predefined event ones, those methods will be picked up by a code generator and automatically added in a similar way to above.
- Pros/Cons:
  - No boilerplate.
  - Discoverability is bad.
  - Fixed access modifiers.
  - Same performance hit as above.
  - Feels like magic.
- Pros/Cons from a partial implementation:
  - Small boilerplate (add partial to class definition).
  - Best discoverability.
  - Fixed access modifiers.
  - Same performance hit as above.
  - Feels like magic.
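
As a rough illustration of the first two approaches above, here is a minimal sketch. `IUpdate` and `IFixedUpdate` are the interface names suggested in the summary; `SyncPoints`, `LogicComponent` (borrowed from the naming suggestions), `ComponentHost`, `Spinner` and `Mover` are hypothetical names, and nothing here reflects an existing Stride API.

```csharp
using System;
using System.Collections.Generic;

// Interface based: a class opts in to a sync point by implementing an interface.
public interface IUpdate { void Update(float dt); }
public interface IFixedUpdate { void FixedUpdate(float dt); }

public sealed class Spinner : IUpdate
{
    public float Angle;
    public void Update(float dt) => Angle += 90f * dt; // has to be public (or explicit)
}

// Register from entry point: sync points are exposed as events the component subscribes to.
public sealed class SyncPoints
{
    public event Action<float> Update;
    public event Action<float> FixedUpdate;
    public void RaiseUpdate(float dt) => Update?.Invoke(dt);
    public void RaiseFixedUpdate(float dt) => FixedUpdate?.Invoke(dt);
}

public abstract class LogicComponent
{
    public abstract void Start(SyncPoints points);
}

public sealed class Mover : LogicComponent
{
    public float X { get; private set; }

    // One delegate allocation per registered method, as noted in the list above.
    public override void Start(SyncPoints points) => points.Update += OnUpdate;

    private void OnUpdate(float dt) => X += 2f * dt; // stays private
}

// Hypothetical host standing in for the engine's script/processor system.
public sealed class ComponentHost
{
    private readonly List<IUpdate> updatables = new List<IUpdate>();
    public SyncPoints Points { get; } = new SyncPoints();

    public void Add(object component)
    {
        // Naive version: one type check per known interface, hence the linear cost.
        if (component is IUpdate update) updatables.Add(update);
        if (component is LogicComponent logic) logic.Start(Points);
    }

    public void Tick(float dt)
    {
        foreach (var update in updatables) update.Update(dt);
        Points.RaiseUpdate(dt);
    }
}
```

A usage example: create a `ComponentHost`, `Add` a `Spinner` and a `Mover`, then call `Tick(0.5f)`. Both components advance, but only `Mover` keeps its update method private.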

Aug 08 '22 18:08 MechWarrior99

Another thing that came to mind: we should consider whether the current way of having scripts directly attached to entities is the way forward. Currently scripts may be a bit too universal, so it's easy to misuse them instead of using a component+processor pair. I think it would be nice to see a new entry point into the scene where beefier and usually single-instance objects would be registered for execution. If they need to be tied to a single entity's lifetime, maybe we could leverage cancellation tokens that would be cancelled when an entity is removed from the scene. The second thing to keep in mind is to promote asynchronous code with good "wait for event" APIs: wait for an input combination, wait for a physics event, rather than wait for every frame and check something.
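
The cancellation-token idea could look roughly like the sketch below. It is only an illustration under assumptions: `Entity`, `FrameClock` and `LifetimeDemo` are hypothetical stand-ins, not Stride types, and the code assumes a single game thread with no `SynchronizationContext`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical entity that exposes its scene lifetime as a CancellationToken.
public sealed class Entity
{
    private readonly CancellationTokenSource removed = new CancellationTokenSource();
    public CancellationToken Removed => removed.Token;
    public void RemoveFromScene() => removed.Cancel();
}

// Hypothetical frame source; each wait ends on the next Tick or when the token is cancelled.
public sealed class FrameClock
{
    private readonly List<(TaskCompletionSource<bool> Tcs, CancellationTokenRegistration Reg)> pending
        = new List<(TaskCompletionSource<bool>, CancellationTokenRegistration)>();

    public Task NextFrame(CancellationToken lifetime)
    {
        var tcs = new TaskCompletionSource<bool>();
        var reg = lifetime.Register(() => tcs.TrySetCanceled(lifetime));
        pending.Add((tcs, reg));
        return tcs.Task;
    }

    public void Tick()
    {
        var batch = pending.ToArray(); // copy: resumed code registers for the following frame
        pending.Clear();
        foreach (var (tcs, reg) in batch)
        {
            reg.Dispose();           // avoid piling up one registration per frame
            tcs.TrySetResult(true);  // no-op if the wait was already cancelled
        }
    }
}

public static class LifetimeDemo
{
    // Scene-level logic tied to one entity: it stops by itself when the entity is removed.
    public static async Task<int> CountFramesAsync(FrameClock clock, CancellationToken lifetime)
    {
        int frames = 0;
        try
        {
            while (true)
            {
                await clock.NextFrame(lifetime);
                frames++;
            }
        }
        catch (OperationCanceledException)
        {
            // Entity left the scene: fall through and finish.
        }
        return frames;
    }

    public static int Run()
    {
        var clock = new FrameClock();
        var entity = new Entity();
        var task = CountFramesAsync(clock, entity.Removed);
        clock.Tick();
        clock.Tick();
        clock.Tick();
        entity.RemoveFromScene();
        clock.Tick(); // ignored by the finished task
        return task.Result;
    }
}
```

The point of the design is that the task needs no explicit "is my entity still alive?" check; removal surfaces as an `OperationCanceledException` at the current await.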

Aug 08 '22 19:08 manio143

The AsyncScript should remain separate, although it could receive a similar renaming and a move to ValueTask. Having callbacks mixed with async is probably not the way to entice people to write the recommended code style. If instead of listening for callbacks people can await all the events they need, the resulting code will be much easier to write, read and debug.

No need to touch cancellation tokens and whatnot in the one callback to affect some other method's execution flow, in the process having to define all sorts of random fields to keep track of things which suddenly need to be accessible in multiple places, etc. And you'll find people unaware that they gave away half of the performance they won by using async due to that one high-frequency callback that they still have.

Aug 08 '22 20:08 ericwj

I think it would be nice to see a new entry point into the scene where beefier and usually single instance objects would be registered for execution and if they need to be tied to a single entity's lifetime

Do you mean like a scene-level/'global' system for managers and systems, where you will only have a single instance of them? Things where you wouldn't need an entity for it? (An example could be a system that spawns enemies in the scene.) If not, can you explain more? If so, I was actually thinking just the other day that it could be a good idea. How would you see this changing the design of the current implementation though?

wait for input combination, wait for a physics event, rather than wait for every frame and check something.

Genuine question: how would the performance differ from this approach, as opposed to one where certain methods are called when those events happen, or events are invoked? And would you say the wait approach is a nicer design?

Aug 08 '22 21:08 MechWarrior99

The AsyncScript should remain separate, although it could receive a similar renaming and a move to ValueTask

It would be needlessly redundant in my opinion. If we go with, say, the interface implementation, simply having an IAsyncExecute interface would have the same end result, while being cohesive with the rest of the system.

Aug 08 '22 21:08 MechWarrior99

simply having an IAsyncExecute interface would have the same end result

Async is much more than just an interface. It's a style of coding. Mixing the two yields spaghetti. Or worse: async over sync over async, deadlocks, and crashes. There are many excellent articles on this by Stephen Toub on devblogs.microsoft.com.

Aug 08 '22 21:08 ericwj

Do you mean like a scene-level/'global' system for managers and systems, where you will only have a single instance of them? Things where you wouldn't need an entity for it? (An example could be a system that spawns enemies in the scene)

Yes, exactly something like that. Currently I'd either create a script and place an otherwise unused entity in the scene to hold a reference to it, or I'd need to implement a GameSystem, register it with the game class and potentially do some additional wiring to have it interact with the scene. With the task system we could have multiple representations of task sources, e.g. a component placed on an entity, a one-off task executed on the game thread by another thread (think of the dispatcher in UI frameworks), or a scene procedure invoked when a scene is activated (added as root or added as a subscene to an active scene). This of course would mean extending the scene asset, which needs to be done with care.
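
One of the task sources mentioned above, the one-off task posted to the game thread from another thread, can be sketched as follows. `GameThreadDispatcher` is a hypothetical name chosen for illustration, and the sketch assumes the game loop calls `Drain` once per frame at a point of its choosing.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical dispatcher: any thread may post work, only the game thread runs it.
public sealed class GameThreadDispatcher
{
    private readonly ConcurrentQueue<Action> queue = new ConcurrentQueue<Action>();

    // Safe to call from any thread.
    public void Post(Action work) => queue.Enqueue(work);

    // Called by the game loop on the game thread; returns how many items ran.
    public int Drain()
    {
        int ran = 0;
        while (queue.TryDequeue(out var work))
        {
            work();
            ran++;
        }
        return ran;
    }
}

public static class DispatcherDemo
{
    public static (int ran, bool onGameThread) Run()
    {
        var dispatcher = new GameThreadDispatcher();
        int gameThreadId = Environment.CurrentManagedThreadId; // the caller plays the game thread
        bool onGameThread = false;

        // A worker thread posts work instead of touching game state directly.
        var worker = new Thread(() => dispatcher.Post(
            () => onGameThread = Environment.CurrentManagedThreadId == gameThreadId));
        worker.Start();
        worker.Join();

        int ran = dispatcher.Drain(); // the posted work executes here, on the game thread
        return (ran, onGameThread);
    }
}
```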

(regarding async wait for event) how would the performance differ from this approach, as opposed to one where certain methods are called when those events happen, or events are invoked? And would you say the wait approach is a nicer design?

The performance of a direct virtual method call is one thing, but currently we have a scheduler such that all callbacks of script methods happen at the same moment of a frame (sequentially, ordered by priority). This adds some overhead but may allow additional robustness (it would be great to document Stride.Core.MicroThreading in greater detail first to understand it better). I imagine the main benefit of the async/await pattern is context persistence: rather than manually managing a state machine and fields of my object with single-method callbacks, I can first await one event, then await a new frame, modify some entity, and await a different event. Throughout all that, any context will be captured and the compiler will do all the heavy lifting, leaving me with nice-looking sequential logic that would otherwise be spread across multiple callbacks.
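
The context-persistence point can be illustrated with the same small behaviour written both ways. This is a sketch only: `Signal`, `DoorCallbacks` and `DoorAsync` are hypothetical names, and it assumes a single thread with no `SynchronizationContext`, so awaiting code resumes inline when a signal is raised.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical awaitable event: Next() completes the next time Raise() is called.
public sealed class Signal
{
    private TaskCompletionSource<bool> tcs = new TaskCompletionSource<bool>();

    public Task Next() => tcs.Task;

    public void Raise()
    {
        var done = tcs;
        tcs = new TaskCompletionSource<bool>(); // swap first: resumed code then waits for the *next* raise
        done.SetResult(true);
    }
}

// Callback style: progress has to be tracked by hand in a field.
public sealed class DoorCallbacks
{
    private enum Step { WaitingForPlayer, Opening, Open }
    private Step step = Step.WaitingForPlayer;
    public string State { get; private set; } = "closed";

    public void OnPlayerNear()
    {
        if (step == Step.WaitingForPlayer) step = Step.Opening;
    }

    public void OnFrame()
    {
        if (step != Step.Opening) return;
        State = "open";
        step = Step.Open;
    }
}

// Async style: the compiler-generated state machine remembers where we are.
public sealed class DoorAsync
{
    public string State { get; private set; } = "closed";

    public async Task Run(Signal playerNear, Signal frame)
    {
        await playerNear.Next(); // wait for the event
        await frame.Next();      // then wait for a new frame
        State = "open";          // then modify the entity
    }
}

public static class DoorDemo
{
    public static (string callbackDoor, string asyncDoor) Run()
    {
        var playerNear = new Signal();
        var frame = new Signal();
        var callbackDoor = new DoorCallbacks();
        var asyncDoor = new DoorAsync();
        _ = asyncDoor.Run(playerNear, frame);

        // Frame, player arrives, frame.
        callbackDoor.OnFrame(); frame.Raise();
        callbackDoor.OnPlayerNear(); playerNear.Raise();
        callbackDoor.OnFrame(); frame.Raise();

        return (callbackDoor.State, asyncDoor.State);
    }
}
```

Both doors end up open after the same sequence of events. The difference is that `DoorCallbacks` needs the `Step` field and a check in every callback, while `DoorAsync` reads top to bottom.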

Aug 08 '22 22:08 manio143

(regarding async wait for event) how would the performance differ from this approach

The performance difference is also in the number of callbacks required, since in the synchronous case the callback will always be called for a whole class of events, even if you aren't actually interested in the specific one happening. For example, you collide and you asked to be notified, but the current collision is with something you don't care about.

In the async model it would be fairly easy to specify in detail what events you care about at that particular place in the code and there will be no callback until the actual event you care about occurs. It won't be easy to optimize this to the fullest, but the end result could be close to simply never having to filter anymore - if the physics engine reports an event, exactly the right callbacks would then be instantly known from the carefully crafted context.

An additional benefit obtained automatically is that the priorities of callbacks are much less relevant, because callbacks happen much more rarely and at different times, so there will be less sorting and generally a faster task system.
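
A minimal sketch of that idea, assuming events can be keyed by something like a tag: waiters are stored per key, so reporting an event looks up exactly the interested continuations and nothing else runs. `CollisionEvents`, `NextCollisionWith` and the string tags are hypothetical and only serve the illustration; a single thread with no `SynchronizationContext` is assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical physics event hub where waiters are indexed by the tag they care about.
public sealed class CollisionEvents
{
    private readonly Dictionary<string, List<TaskCompletionSource<string>>> waiters =
        new Dictionary<string, List<TaskCompletionSource<string>>>();

    // Counts how many continuations were actually resumed.
    public int Resumptions { get; private set; }

    public Task<string> NextCollisionWith(string tag)
    {
        if (!waiters.TryGetValue(tag, out var list))
            waiters[tag] = list = new List<TaskCompletionSource<string>>();
        var tcs = new TaskCompletionSource<string>();
        list.Add(tcs);
        return tcs.Task;
    }

    public void Report(string tag)
    {
        // No per-waiter filtering: uninterested code is never called at all.
        if (!waiters.TryGetValue(tag, out var list) || list.Count == 0) return;
        var batch = list.ToArray(); // copy: resumed code may wait on the same tag again
        list.Clear();
        foreach (var tcs in batch)
        {
            Resumptions++;
            tcs.SetResult(tag);
        }
    }
}

public static class FilteredAwaitDemo
{
    public static (int pickups, int resumptions) Run()
    {
        var events = new CollisionEvents();
        int pickups = 0;

        async Task Player()
        {
            while (true)
            {
                await events.NextCollisionWith("coin"); // walls never wake this code up
                pickups++;
            }
        }

        _ = Player();
        events.Report("wall");
        events.Report("coin");
        events.Report("wall");
        events.Report("coin");
        return (pickups, events.Resumptions);
    }
}
```

In a callback model the player's collision handler would have been invoked for all four reports and would have discarded two of them itself. Here only the two coin collisions resume it. This sketch still allocates a `TaskCompletionSource` per wait, so it illustrates the dispatch idea rather than the alloc-free goal.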

Aug 08 '22 22:08 ericwj

Make our task system alloc-free (possible with recent C# version) so that people are not afraid to create thousands of scripts with await

Might you be interested in adopting ProtoPromise? Benchmarks show it to be the most efficient alloc-free async library.

Apr 12 '24 09:04 timcassell

Make our task system alloc-free (possible with recent C# version) so that people are not afraid to create thousands of scripts with await

Might you be interested in adopting ProtoPromise? Benchmarks show it to be the most efficient alloc-free async library.

Looks very good. An implementation within the engine could be something. What do you guys say?

Apr 13 '24 18:04 CodingMadness

Make our task system alloc-free (possible with recent C# version) so that people are not afraid to create thousands of scripts with await

Might you be interested in adopting ProtoPromise? Benchmarks show it to be the most efficient alloc-free async library.

As it's your library, it would make sense for you to write a prototype of an integration, showing the advantages/disadvantages of what currently exists vs. your library.

You can make a WIP PR to let us see the progress.

It's the same with BEPU: nicogo made a showcase of the integration and showed the advantages, and everyone was in the "lets goo" mood :D

Apr 13 '24 19:04 IXLLEGACYIXL
