Error about operator node in C++

Open Jia-Baos opened this issue 1 year ago • 19 comments

dataflow.yml file:

nodes:
    - id: node-cpp-api
      path: install/bin/node_cpp_api
      inputs:
          tick: dora/timer/millis/300
      outputs:
          - image

    - id: runtime-node-1
      operators:
          - id: operator-plot
            shared-library: install/lib/operator_plot
            inputs:
                image: node-cpp-api/image

commands to start dataflow.yml

source .venv/bin/activate
dora build dataflow.yml --uv
dora run dataflow.yml --uv

[ERROR] Dataflow failed:

Node runtime-node-1 failed: exited with code 2 with stderr output:

/home/seer/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/bin/python3.11: can't open file '/home/seer/Project-Rust/dora/examples/cmake-dataflow-camera/runtime': [Errno 2] No such file or directory

Jia-Baos avatar Mar 20 '25 09:03 Jia-Baos

Could you create an example so I can reproduce it? Thanks for your work

starlitxiling avatar Mar 27 '25 07:03 starlitxiling

@dora-bot assign me

vedantvijay avatar Mar 29 '25 23:03 vedantvijay

Hello @vedantvijay, this issue is now assigned to you!

github-actions[bot] avatar Mar 29 '25 23:03 github-actions[bot]

Here is the code for operator-plot, whose job is to display the image sent by node-cpp-api. Note that the same logic works fine when I run it as a regular node; only the operator reports an error.

operator.h

#pragma once
#include <memory>
#include "operator/operator_api.h"

class Operator
{
public:
    Operator();
    unsigned char counter = 0; // initialized so the first increment starts from a defined value
};

#include "dora-operator-api.h"

std::unique_ptr<Operator> new_operator();

DoraOnInputResult on_input(Operator &op, rust::Str id, rust::Slice<const uint8_t> data, OutputSender &output_sender);

operator.cpp

#include "operator.h"
#include <iostream>
#include <vector>
#include "dora-operator-api.h"

#include <opencv2/opencv.hpp>

Operator::Operator() {}

std::unique_ptr<Operator> new_operator()
{
    return std::make_unique<Operator>();
}

DoraOnInputResult on_input(Operator &op, rust::Str id, rust::Slice<const uint8_t> data, OutputSender &output_sender)
{
    op.counter += 1;
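    // rust::Str is not null-terminated, so convert it to std::string instead of printing id.data()
    std::cout << "Rust API operator received input `" << std::string(id) << "` with data `" << (unsigned int)data[0] << "` (internal counter: " << (unsigned int)op.counter << ")" << std::endl;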

    // Decode the image data
    std::vector<unsigned char> img_vec(data.begin(), data.end());
    cv::Mat img = cv::imdecode(img_vec, cv::IMREAD_COLOR);
    if (img.empty())
    {
        std::cout << "Failed to decode image" << std::endl;
        DoraOnInputResult result = {rust::cxxbridge1::String("failed decode image"), false};
        return result;
    }
    else
    {
        cv::namedWindow("frame", cv::WINDOW_NORMAL);
        cv::imshow("frame", img);
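        // Read the key first: `int key = a && b` would assign the boolean result
        // and read `key` before it is initialized.
        int key = cv::waitKey(10);
        if (key == 27) // ESC pressed: ask dora to stop, with an empty error string
        {
            DoraOnInputResult result = {rust::cxxbridge1::String(""), true};
            return result;
        }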
    }

    std::vector<unsigned char> out_vec{op.counter};
    rust::Slice<const uint8_t> out_slice{out_vec.data(), out_vec.size()};
    auto send_result = send_output(output_sender, rust::Str("status"), out_slice);
    DoraOnInputResult result = {send_result.error, false};
    return result;
}

node-plot.cpp

#include "dora-node-api.h"

#include <iostream>
#include <ratio>
#include <string>
#include <vector>

#include <opencv2/opencv.hpp>

int main()
{
    std::cout << "HELLO FROM plot image node" << std::endl;
    unsigned char counter = 0;

    auto dora_node = init_dora_node();

    while (true)
    {
        auto event = next_event(dora_node.events);
        auto ty = event_type(event);

        if (ty == DoraEventType::AllInputsClosed)
        {
            break;
        }
        else if (ty == DoraEventType::Input)
        {
            auto input = event_as_input(std::move(event));

            counter += 1;

            std::cout << "Received input " << std::string(input.id) << " (counter: " << (unsigned int)counter << ")" << std::endl;

            std::cout << "Rust API operator received input, id: " << std::string(input.id) <<  std::endl;

            // Decode the image data
            std::vector<unsigned char> img_vec(input.data.begin(), input.data.end());
            cv::Mat img = cv::imdecode(img_vec, cv::IMREAD_COLOR);
            if (img.empty())
            {
                std::cout << "Failed to decode image" << std::endl;
            }
            else
            {
                cv::namedWindow("frame", cv::WINDOW_NORMAL);
                cv::imshow("frame", img);
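                // Read the key first: `int key = a && b` would assign the boolean
                // result and read `key` before it is initialized.
                int key = cv::waitKey(10);
                if (key == 27) // ESC pressed
                {
                    return 0;
                }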
            }
        }
        else
        {
            std::cerr << "Unknown event type " << static_cast<int>(ty) << std::endl;
        }
    }

    std::cout << "GOODBYE FROM plot image node (using Rust API)" << std::endl;

    return 0;
}

Jia-Baos avatar Mar 31 '25 02:03 Jia-Baos

Thanks for reporting this! Could you try whether it works without the --uv flag?

My suspicion is that this error is caused by https://github.com/dora-rs/dora/pull/765, especially these lines:

https://github.com/dora-rs/dora/blob/79f7304457f75725748dc00463230039f448d8ca/binaries/daemon/src/spawn.rs#L172-L179

The error message above seems to indicate that the daemon tries to start the dora runtime through uv runtime.

To solve this, we should only use uv if resolved_path points to a python executable. A PR for this would be appreciated!
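
The proposed rule can be sketched roughly as follows. This is only an illustration in C++ and not the daemon's actual Rust code: the helper names looks_like_python_script and should_spawn_through_uv are hypothetical, and judging "is Python" by the .py extension is an assumption made for the sketch.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <string>

// Hypothetical helper (not part of dora): returns true when `resolved_path`
// looks like a Python script, judged only by its file extension.
bool looks_like_python_script(const std::filesystem::path &resolved_path)
{
    std::string ext = resolved_path.extension().string();
    // Lower-case the extension so that ".PY" is treated like ".py".
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext == ".py";
}

// Hypothetical decision function: wrap the node in uv only when the --uv
// flag was given AND the resolved path is a Python script.
bool should_spawn_through_uv(bool uv_flag, const std::filesystem::path &resolved_path)
{
    return uv_flag && looks_like_python_script(resolved_path);
}
```

With such a check, install/bin/node_cpp_api from the dataflow above would be started directly even when --uv is passed, and only Python paths would go through uv.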

phil-opp avatar Apr 01 '25 13:04 phil-opp

Hmm, it doesn't work without the --uv flag either. I get the following error message when I start the dataflow without the --uv flag:

INFO  dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on daemon `d82182f1-3258-4686-a5d6-ca9a1c8d128f` node-cpp-api stdout: HELLO FROM C++
INFO  dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on daemon `d82182f1-3258-4686-a5d6-ca9a1c8d128f` node-cpp-api daemon: node is ready
INFO  dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on daemon `d82182f1-3258-4686-a5d6-ca9a1c8d128f` daemon: all nodes are ready, starting dataflow
INFO  dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on daemon `d82182f1-3258-4686-a5d6-ca9a1c8d128f` node-cpp-api stdout: terminate called after throwing an instance of 'rust::cxxbridge1::Error'
INFO  dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on daemon `d82182f1-3258-4686-a5d6-ca9a1c8d128f` node-cpp-api stdout:   what():  failed to init event stream
INFO  dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on daemon `d82182f1-3258-4686-a5d6-ca9a1c8d128f` node-cpp-api stdout: 
INFO  dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on daemon `d82182f1-3258-4686-a5d6-ca9a1c8d128f` node-cpp-api stdout: 
INFO  dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on daemon `d82182f1-3258-4686-a5d6-ca9a1c8d128f` node-cpp-api daemon: marking `node-cpp-api` as cascading error caused by `runtime-node-1`
ERROR dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on daemon `d82182f1-3258-4686-a5d6-ca9a1c8d128f` node-cpp-api daemon: exited because of signal SIGABRT. This error occurred because node `runtime-node-1` exited before connecting to dora.
INFO  dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on daemon `d82182f1-3258-4686-a5d6-ca9a1c8d128f` daemon: dataflow finished on machine `d82182f1-3258-4686-a5d6-ca9a1c8d128f`
INFO  dataflow `0195f55f-f836-7a66-9fa6-6c4071bd2150` on default daemon coordinator: dataflow finished


[ERROR]
Dataflow 0195f55f-f836-7a66-9fa6-6c4071bd2150 failed:

Node `runtime-node-1` failed: unknown exit status with stderr output:
---------------------------------------------------------------------------------
spawn failed: failed to spawn node `runtime-node-1`

Caused by:
   0: failed to run runtime 0195f55f-f836-7a66-9fa6-6c4071bd2150/runtime-node-1
   1: No such file or directory (os error 2)

Location:
    /home/runner/work/dora/dora/binaries/daemon/src/spawn.rs:339:18
---------------------------------------------------------------------------------



Location:
    binaries/cli/src/lib.rs:693:17

Jia-Baos avatar Apr 02 '25 07:04 Jia-Baos

@Jia-Baos I tried reproducing the issue but ran into some problems. Could you share a git repo or something similar?

Arlott8 avatar Apr 02 '25 08:04 Arlott8

Yes, you can refer to this:

https://github.com/Jia-Baos/dora/tree/master/examples/cmake-dataflow-camera

I suspect there is a problem with C++ operators, because I saw similar behavior when I started the cmake example.

Jia-Baos avatar Apr 02 '25 08:04 Jia-Baos

Thanks for trying!

Your example doesn't seem too different from the cmake-dataflow example in the main dora repo. Does that example work for you? Could you also try to run cargo run --example cmake-dataflow in the main dora repo?

phil-opp avatar Apr 02 '25 09:04 phil-opp

The cmake-dataflow example does indeed run with the command cargo run --example cmake-dataflow, but it does not run with dora run dataflow.yml. So I built the dora executable from the current main dora repo, and with that build cmake-dataflow ran successfully. My guess is that it fails with the dora version installed from pip but succeeds with a build from the current main git repo. I also tried your code with that build and it seems to work.

Arlott8 avatar Apr 02 '25 09:04 Arlott8

Interesting! Could you try installing dora through cargo instead of pip? The install command is:

cargo install dora-cli --locked

Afterwards you should have a ~/.cargo/bin/dora binary. Not sure whether the one from cargo or pip will take precedence, but you can just run ~/.cargo/bin/dora run dataflow.yml to be sure that the cargo one is used.

phil-opp avatar Apr 02 '25 09:04 phil-opp

The one from cargo does work. They are the same version, though, so I wonder what the reason could be.

Arlott8 avatar Apr 02 '25 09:04 Arlott8

Yeah, that's what I suspected. My guess is that the pip-installed version is not part of the PATH or only installed into some venv. When the daemon is not started in the same environment, the dora command is not available to it. So when it tries to spawn the dataflow and start a dora runtime, it doesn't find any dora binary.
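
To make the PATH guess concrete, here is a minimal C++ sketch of a PATH-style lookup. It is not dora code: find_in_path is a hypothetical helper, it assumes a POSIX ':' separator, and it skips the executable-bit check for brevity.

```cpp
#include <filesystem>
#include <optional>
#include <sstream>
#include <string>
#include <system_error>

// Hypothetical helper: search the ':'-separated directories in `path_env`
// for a regular file called `name` and return the first match.
// Empty entries are skipped and the executable bit is not checked.
std::optional<std::filesystem::path> find_in_path(const std::string &name,
                                                  const std::string &path_env)
{
    std::istringstream dirs(path_env);
    std::string dir;
    while (std::getline(dirs, dir, ':'))
    {
        if (dir.empty())
        {
            continue;
        }
        std::filesystem::path candidate = std::filesystem::path(dir) / name;
        std::error_code ec; // non-throwing overload: missing directories just yield false
        if (std::filesystem::is_regular_file(candidate, ec))
        {
            return candidate;
        }
    }
    return std::nullopt;
}
```

If the directory that holds the pip-installed dora (for example a venv's bin directory) is missing from the daemon's PATH, a lookup like this returns nothing, which would match the "No such file or directory" error above.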

@haixuanTao might know more about the details

phil-opp avatar Apr 02 '25 09:04 phil-opp

Of course this is something that we should handle as part of our dora run command, so this is clearly a bug on our side.

I think the problematic line might be: https://github.com/dora-rs/dora/blob/79f7304457f75725748dc00463230039f448d8ca/binaries/daemon/src/spawn.rs#L305

For non-pip dora installations this would point to a dora binary, which makes the following dora runtime call work. When dora is installed through pip, the binary might (?) be a python binary, which would then result in an invalid python runtime command. Not sure about that, though.
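
A small C++ model of that suspicion, for illustration only: build_runtime_command and is_python_interpreter are hypothetical names, and the assumption is that the daemon launches the runtime as "<current executable> runtime".

```cpp
#include <filesystem>
#include <string>
#include <vector>

// Hypothetical model of the spawn step: the runtime is launched as
// "<current executable> runtime".
std::vector<std::string> build_runtime_command(const std::filesystem::path &current_exe)
{
    return {current_exe.string(), "runtime"};
}

// Hypothetical check: does the executable's file name start with "python"?
bool is_python_interpreter(const std::filesystem::path &exe)
{
    return exe.filename().string().rfind("python", 0) == 0;
}
```

With a cargo-installed binary such as ~/.cargo/bin/dora this yields "dora runtime". With the python3.11 interpreter from the first error message it yields "python3.11 runtime", which Python would interpret as a script named runtime in the working directory. That would be consistent with the "can't open file '.../runtime'" error reported at the top.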

phil-opp avatar Apr 02 '25 09:04 phil-opp

I ran other examples such as rust-dataflow, c-dataflow, and c++-dataflow with the pip-installed dora. It only fails for the C and C++ dataflows.

Arlott8 avatar Apr 02 '25 10:04 Arlott8

Yeah, I think only dataflows with shared-library operators are affected.

phil-opp avatar Apr 02 '25 10:04 phil-opp

@Arlott8, @phil-opp thanks so much, I have also reproduced the above issue.

Jia-Baos avatar Apr 03 '25 01:04 Jia-Baos

I opened https://github.com/dora-rs/dora/pull/940 to fix this.

phil-opp avatar Apr 03 '25 12:04 phil-opp

@vedantvijay has been automatically unassigned from this stale issue after 2 weeks of inactivity.

github-actions[bot] avatar Apr 18 '25 00:04 github-actions[bot]