Tracking Issues for removing operators

Open haixuanTao opened this issue 1 year ago • 12 comments

Context

Dora-rs operators were built to enable intra-process communication: they make it possible to run multiple operators within the same process. This could reduce the number of processes and use green threads instead of OS threads.

Challenges

The problem is that the implementation and the abstractions that come with operators are large, and the further we advance in dora, the more issues show up:

  • People are confused about the difference between operators and custom nodes
  • People are confused about how to program operators
  • Operators are very verbose
  • Operators add a level of hierarchy to the dataflow
  • Multiple Python operators do not work with the GIL
  • Rust operators built as shared libraries are hard to get right and bring a lot of complexity
  • The same goes for C/C++, which leads to a complex build step because the C/C++ operators have to be compiled

We also don't see people caring about intra-process communication or using the deadline time-management functionality.

So we are thinking about deprecating operators in favor of nodes, meaning the current custom nodes.

We'll provide a migration guide and release a minor version for it.

We will migrate functionality that is unique to operators, such as hot-reloading, to nodes.

API

Python API will look as follows:

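from dora import Node
import pyarrow as pa  # needed for pa.array below

state = "XYZ"

if __name__ == "__main__":
    # hot_reload_states is the parameter proposed in this issue
    node = Node(hot_reload_states=[state])

    for input in node:
        ...
        node.send_output(pa.array([]))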

with the graph:

nodes:
    - id: node_1
      path: something.py
      inputs:
        - input_1: "image"
        - input_2: "audio"
      outputs:
        - output_1
        - output_2

The rest will be free for users to define to their liking.

For C/C++/Rust, the API will be the current custom nodes API, and support for operators will be removed.

What's next

This should make using dora a lot simpler.

It should also reduce the burden on maintainers. We will then focus more on making IPC as efficient as possible, for example by making GPU IPC available.

Follow Up TODO:

  • [ ] Make hot-reloading available for Python custom nodes
  • [ ] Remove Runtime and Operators from the code base.

haixuanTao avatar Apr 17 '24 16:04 haixuanTao

An additional note for people who would like to build heavily green-threaded Rust applications: we believe the way to go is to use a native thread pool such as tokio, rayon, ... We think this is a more direct and intuitive approach to multithreading.

haixuanTao avatar Apr 18 '24 11:04 haixuanTao

How about we create a branch called next where we point all breaking changes? This way, we could implement this step-by-step across multiple PRs and do the breaking release once everything is ready (implementation, testing, migration guides, docs, etc).

phil-opp avatar Apr 18 '24 12:04 phil-opp

The thing is that we don't have to remove operators right away. We can probably keep them as is in the codebase, with a warning, before retiring them in a couple of versions, while making the necessary changes at the node level for hot reloading.

I tend not to be a fan of big releases, as they are always a bit stressful and make it hard to deal with fixes. I think this is more of the current gitflow paradigm, but I'm open to discussion.

haixuanTao avatar Apr 18 '24 13:04 haixuanTao

Good point! Let's try to keep things backwards-compatible for now then.

phil-opp avatar Apr 18 '24 14:04 phil-opp

Agreed. We should keep things backward compatible when introducing new programming patterns. There were some good reasons for introducing the concept of operators for complex use cases, such as stateless operators, stateful operators, fault tolerance/redundancy, etc.

heyong4725 avatar Apr 18 '24 14:04 heyong4725

I created a PR at https://github.com/dora-rs/dora/pull/478 to implement the new dataflow parsing logic without the extra nesting behind the custom field. I was able to implement this in a backwards compatible way, so existing dataflow definitions should continue to work. In the future, we can then deprecate the old format at some point.
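
To make the backward-compatible parsing idea concrete, here is a minimal Python sketch. It is not the code from the PR (which lives in dora's Rust code base); it only illustrates the general approach of accepting both layouts and normalizing them to one. It assumes the old format nests node settings under a `custom` key with a `source` field, and that the new format puts `path`, `inputs`, and `outputs` directly on the node. `normalize_node` is a hypothetical helper, not a dora API.

```python
from typing import Any


def normalize_node(node: dict[str, Any]) -> dict[str, Any]:
    """Return a node definition in the flat format.

    Hypothetical helper: accepts either the assumed legacy layout
    (settings nested under `custom`) or the flat layout, so that old
    dataflow definitions keep working.
    """
    if "custom" not in node:
        # Already flat: return a copy so the caller's dict is untouched.
        return dict(node)

    legacy = node["custom"]
    flat = {key: value for key, value in node.items() if key != "custom"}
    # Assumed mapping: the legacy `source` field becomes `path`.
    flat["path"] = legacy.get("source", legacy.get("path"))
    for key in ("inputs", "outputs"):
        if key in legacy:
            flat[key] = legacy[key]
    return flat


old_style = {
    "id": "node_1",
    "custom": {"source": "something.py", "outputs": ["output_1"]},
}
new_style = {"id": "node_1", "path": "something.py", "outputs": ["output_1"]}
```

Both `normalize_node(old_style)` and `normalize_node(new_style)` yield the same flat dictionary, which is what lets the rest of a loader ignore which format the user wrote.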

phil-opp avatar Apr 18 '24 16:04 phil-opp

> We will then focus more on making IPC as efficient as possible in the likes of making GPU IPC available.

Just curious because (again) I'm new to this space.

Are the latency / throughput requirements explicit? Something like "Must be faster than X else no one will use dora, but ideally target of Y to differentiate".

Michael-J-Ward avatar Apr 23 '24 16:04 Michael-J-Ward

Hey @Michael-J-Ward, they're not 😅 But in general, I tend to think that being able to achieve low latency means that you're able to be lean and not have too much slack.

Kind of lean management applied to software production.

In the case of GPU IPC, it's something a couple of people have shown interest in, and it would definitely push the industry forward, so let's do it 🔥

haixuanTao avatar Apr 23 '24 18:04 haixuanTao

Looking deeper into the code, I understand the desire to simplify things by removing operators.

> There are some good reasons when we introduce concepts of operators for complex use cases, such as stateless operator, stateful operators, fault tolerance/redundancy, etc.

@heyong4725 - Could you elaborate? If there are important use cases that can only be implemented with operators, then that would prevent their removal, right?

If they can be removed, would a path forward look like this:

  • update any examples / docs to use nodes instead of operators
  • new release with deprecation warning to operator API
  • then start nuking operator code for a future release
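
As an illustration of the deprecation-warning step, here is a minimal Python sketch using the standard `warnings` module. The function `load_operator` is a hypothetical stand-in for whatever entry point the operator API exposes; it is not an actual dora function, and the message text is only an example.

```python
import warnings


def load_operator(name: str) -> str:
    """Hypothetical stand-in for an operator entry point."""
    warnings.warn(
        "operators are deprecated and will be removed in a future release; "
        "use a node instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    # Existing behavior stays unchanged during the deprecation period.
    return f"operator:{name}"
```

Python hides `DeprecationWarning` by default unless it is triggered from `__main__`, so users may need to run with `-W default` to see it.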

Michael-J-Ward avatar Apr 29 '24 20:04 Michael-J-Ward

@Michael-J-Ward Originally, we designed dora operators so that we could potentially offload complex use cases from developers to the dora framework through the dora daemon/runtime. However, I agree that we can demote operators, which depend on the dora framework/daemon, and promote the node API to keep things simple. We may revisit the operator design in the future.

heyong4725 avatar Apr 30 '24 18:04 heyong4725