Tracking issue for removing operators
Context
Dora-rs operators were built to enable intra-process communication: they make it possible to run multiple operators within the same process. This could reduce the number of processes and let us use green threads instead of OS threads.
Challenges
The problem is that the implementation and abstractions that come with operators are large, and the further we advance in dora, the more issues we see:
- People are confused about the difference between operators and custom nodes
- People are confused about how to program operators
- Operators are very verbose
- They add a level of hierarchy to the dataflow
- Multiple Python operators do not work with the GIL
- Rust operators built as shared libraries are pretty hard to get right and add a lot of complexity
- The same goes for C/C++, which leads to a complex build step because the C/C++ operators have to be compiled
And we don't see people caring about intra-process communication or using the deadline time-management functionality.
So we are thinking about deprecating operators in favor of nodes, which are the current custom nodes.
We'll provide a migration guide and release a minor version for it.
We will migrate functionality that is unique to operators, such as hot-reloading, to nodes.
API
The Python API will look as follows:
from dora import Node
import pyarrow as pa

state = "XYZ"

if __name__ == "__main__":
    node = Node(hot_reload_states=[state])
    for input in node:
        ...
        node.send_output(pa.array([]))
with the graph:
nodes:
  - id: node_1
    path: something.py
    inputs:
      - input_1: "image"
      - input_2: "audio"
    outputs:
      - output_1
      - output_2
The rest is free for users to define as they like.
For C/C++/Rust, the API will be the current custom node API, and support for operators will be removed.
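To make the migration path more concrete, here is a minimal sketch of how an operator-style class could be driven from a node-style event loop. Everything in it is a stand-in written for illustration: `Status`, `CountingOperator`, `FakeNode`, and `run_as_node` are hypothetical names, the `on_event(event, send_output)` shape is assumed, and nothing here imports dora itself.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for the status an operator returns."""
    CONTINUE = 0
    STOP = 1


class CountingOperator:
    """Operator-style class: keeps state and handles one event at a time."""

    def __init__(self):
        self.seen = 0

    def on_event(self, event, send_output):
        if event["type"] == "INPUT":
            self.seen += 1
            send_output("count", self.seen)
        return Status.CONTINUE


class FakeNode:
    """In-memory stand-in for a node: iterable events plus send_output."""

    def __init__(self, events):
        self._events = list(events)
        self.sent = []

    def __iter__(self):
        return iter(self._events)

    def send_output(self, output_id, data):
        self.sent.append((output_id, data))


def run_as_node(operator, node):
    """Feed every node event to the operator until it asks to stop."""
    for event in node:
        if operator.on_event(event, node.send_output) is Status.STOP:
            break


node = FakeNode([
    {"type": "INPUT", "id": "input_1"},
    {"type": "INPUT", "id": "input_2"},
])
run_as_node(CountingOperator(), node)
print(node.sent)
```

With an adapter like this, existing operator logic could in principle be reused unchanged while the surrounding process becomes a plain node. Whether the real migration guide takes this route is open.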
What's next
This should make using dora a lot simpler.
It should also reduce the burden on maintainers. We will then focus more on making IPC as efficient as possible, for example by making GPU IPC available.
Follow-up TODO:
- [ ] Make hot-reloading available for Python custom nodes
- [ ] Remove Runtime and Operators from the code base.
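As a rough illustration of what hot-reloading with preserved state could mean for a Python node, the sketch below reloads the handler code while keeping a state object alive across the reload. This is not how dora implements hot-reloading. `load_handler`, `on_input`, and the two source strings are hypothetical, and the assumption is only that "hot reload states" are values that survive a code reload.

```python
def load_handler(source):
    """Compile handler source code and return its `on_input` function."""
    namespace = {}
    exec(compile(source, "<handler>", "exec"), namespace)
    return namespace["on_input"]


# Two versions of the same handler, standing in for an edited source file.
VERSION_1 = """
def on_input(state, value):
    state["total"] += value
    return state["total"]
"""

VERSION_2 = """
def on_input(state, value):
    state["total"] += 10 * value
    return state["total"]
"""

# The state lives outside the reloaded code, so it survives the reload.
state = {"total": 0}

handler = load_handler(VERSION_1)
first = handler(state, 1)

handler = load_handler(VERSION_2)  # "hot reload": new code, same state
second = handler(state, 1)

print(first, second, state)
```

The design point is that only the code is swapped. Anything the user lists as reloadable state is owned by the long-lived process rather than by the reloaded module.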
Additional note: for people who would like to build Rust applications that rely heavily on green threads, we believe the way to go is to use a native thread pool such as tokio, rayon, ... We think this is a more direct and intuitive approach to multithreading.
How about we create a branch called next that we point all breaking changes at? This way, we could implement this step-by-step across multiple PRs and do the breaking release once everything is ready (implementation, testing, migration guides, docs, etc.).
So the thing is that we don't have to remove operators right away. We can probably keep them as is in the codebase, with a warning, before retiring them in a couple of versions, and make the necessary changes at the node level for hot reloading in the meantime.
I tend not to be a fan of big releases, as they are always a bit stressful and fixes are hard to deal with. I think this is more of the current gitflow paradigm, but I'm open to discussion.
Good point! Let's try to keep things backwards-compatible for now then.
Agreed. We should keep things backward compatible when introducing new programming patterns. There were good reasons for introducing the concept of operators for complex use cases, such as stateless operators, stateful operators, fault tolerance/redundancy, etc.
I created a PR at https://github.com/dora-rs/dora/pull/478 to implement the new dataflow parsing logic without the extra nesting behind the custom field. I was able to implement this in a backwards compatible way, so existing dataflow definitions should continue to work. In the future, we can then deprecate the old format at some point.
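For readers who want a feel for what backwards-compatible parsing means here, this is a small sketch in Python rather than the actual parsing code from the PR. It accepts both a node nested behind `custom` and the flat form, and returns the flat form. `normalize_node` is a hypothetical helper, and the assumption that the legacy format names the executable `source` is mine.

```python
def normalize_node(node):
    """Return a flat node definition, accepting legacy `custom:` nesting."""
    if "custom" not in node:
        # Already in the new flat format: return a copy unchanged.
        return dict(node)

    legacy = dict(node["custom"])
    flat = {key: value for key, value in node.items() if key != "custom"}

    # Assumption: the legacy format calls the executable path `source`.
    if "source" in legacy:
        flat["path"] = legacy.pop("source")

    # Remaining legacy fields (inputs, outputs, ...) move up one level.
    flat.update(legacy)
    return flat


legacy_node = {
    "id": "node_1",
    "custom": {
        "source": "something.py",
        "inputs": [{"input_1": "image"}, {"input_2": "audio"}],
        "outputs": ["output_1", "output_2"],
    },
}

flat_node = {
    "id": "node_1",
    "path": "something.py",
    "inputs": [{"input_1": "image"}, {"input_2": "audio"}],
    "outputs": ["output_1", "output_2"],
}

print(normalize_node(legacy_node) == normalize_node(flat_node))
```

Normalizing at parse time means the rest of the code only ever sees one shape, so the old format can later be deprecated by touching this one function.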
> We will then focus more on making IPC as efficient as possible, for example by making GPU IPC available.
Just curious because (again) I'm new to this space.
Are the latency / throughput requirements explicit? Something like "Must be faster than X else no one will use dora, but ideally target of Y to differentiate".
Hey @Michael-J-Ward, they're not 😅 but in general, I tend to think that being able to achieve low latency means that you're able to be lean and don't have too much slack.
Kind of lean management applied to software production.
In the case of GPU IPC, it's something a couple of people have shown interest in, and it would definitely push the industry forward, so let's do it 🔥
Looking deeper into the code, I understand the desire to simplify things by removing operators.
> There were good reasons for introducing the concept of operators for complex use cases, such as stateless operators, stateful operators, fault tolerance/redundancy, etc.
@heyong4725 - Could you elaborate? If there are important use cases that can only be implemented by operators, then that would prevent their removal, right?
If they can be removed, then would a path forward look like:
- update any examples / docs to use nodes instead of operators
- new release with a deprecation warning on the operator API
- then start nuking operator code for a future release
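For the deprecation-warning step, a minimal sketch on the Python side could be a class decorator that emits a standard `DeprecationWarning` whenever an operator is instantiated. `deprecated_operator` and `LegacyOperator` are hypothetical names used only for illustration, and the existing operator classes may well be wired up differently.

```python
import functools
import warnings


def deprecated_operator(cls):
    """Class decorator: warn on instantiation that operators are deprecated."""
    original_init = cls.__init__

    @functools.wraps(original_init)
    def init(self, *args, **kwargs):
        warnings.warn(
            f"{cls.__name__}: operators are deprecated; use a node instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this wrapper
        )
        original_init(self, *args, **kwargs)

    cls.__init__ = init
    return cls


@deprecated_operator
class LegacyOperator:
    """Hypothetical operator class that still works but now warns."""

    def __init__(self):
        self.ready = True


# Capture the warning explicitly so the behavior is visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    op = LegacyOperator()

print(len(caught), caught[0].category.__name__, op.ready)
```

Existing code keeps running unchanged, and users get a clear signal one release before the operator code is actually removed.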
@Michael-J-Ward Originally, we designed dora operators so that we could potentially offload complex use cases from developers to the dora framework through the dora daemon/runtime. However, I agree that we can demote operators, which depend on the dora framework/daemon, and promote the node API to keep things simple. We may revisit the operator design in the future.