Fix `dora runtime` spawn command for dora installed through `pip`

Open phil-opp opened this issue 10 months ago • 1 comments

Pip installations of dora use a small Python wrapper script, which then calls a function of the dora_cli module. This means that current_exe() returns the path of a Python binary instead of a standalone dora binary. This leads to errors when trying to start a dora runtime instance for shared library operators (see #900).

This commit fixes this issue by introducing a new DoraCommand struct that is initialized in the respective main function. For normal Rust binaries, this is just initialized with current_exe() as before. When invoked from Python, we also include the first argument, which should be the path to the Python wrapper script invoking dora. We also use the 0th argument instead of current_exe for invoking the python binary because current_exe resolves symlinks on some platforms, which affects import statements (different base directory).
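To make the description above concrete, here is a minimal sketch of what such a struct could look like. The field names (`executable`, `prefix_args`) and the constructor and method names are assumptions chosen for illustration; they are not taken from the actual PR diff.

```rust
use std::ffi::OsString;
use std::path::PathBuf;
use std::process::Command;

/// Hypothetical layout of a `DoraCommand`; field names are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DoraCommand {
    /// Program to execute: the standalone dora binary or the Python interpreter.
    pub executable: PathBuf,
    /// Arguments that must come before any dora subcommand,
    /// e.g. the path to the Python wrapper script.
    pub prefix_args: Vec<OsString>,
}

impl DoraCommand {
    /// Standalone Rust binary: re-invoke the current executable.
    pub fn from_current_exe() -> std::io::Result<Self> {
        Ok(Self {
            executable: std::env::current_exe()?,
            prefix_args: Vec::new(),
        })
    }

    /// Python wrapper: argument 0 is the interpreter as it was invoked
    /// (not resolved through `current_exe`, which may follow symlinks),
    /// argument 1 is the wrapper script.
    pub fn from_python_args<I>(args: I) -> Option<Self>
    where
        I: IntoIterator<Item = OsString>,
    {
        let mut args = args.into_iter();
        let python = args.next()?;
        let script = args.next()?;
        Some(Self {
            executable: PathBuf::from(python),
            prefix_args: vec![script],
        })
    }

    /// Build a `Command` that is ready to receive a subcommand such as `runtime`.
    pub fn to_command(&self) -> Command {
        let mut cmd = Command::new(&self.executable);
        cmd.args(&self.prefix_args);
        cmd
    }
}
```

Under these assumptions, a Python entry point would call `DoraCommand::from_python_args(std::env::args_os())`, a Rust `main` would call `DoraCommand::from_current_exe()`, and the spawning code would only ever see `to_command()`.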

Fixes #900

phil-opp avatar Apr 03 '25 12:04 phil-opp

FYI, I have fixed this issue for the daemon and the coordinator with:

use std::process::Command;

use eyre::{ContextCompat, WrapErr};

fn start_daemon() -> eyre::Result<()> {
    // Pip installs run through a Python wrapper, so the dora entry point
    // is argument 1 (the wrapper script) rather than argument 0.
    let path = if cfg!(feature = "python") {
        std::env::args_os()
            .nth(1)
            .context("Could not get first argument corresponding to dora with python installation")?
    } else {
        std::env::args_os()
            .next()
            .context("Could not get dora path")?
    };
    let mut cmd = Command::new(path);
    cmd.arg("daemon");
    cmd.arg("--quiet");
    cmd.spawn().wrap_err("failed to run `dora daemon`")?;

    println!("started dora daemon");

    Ok(())
}

I think this change creates a lot of additional argument changes that can be annoying down the line.

haixuanTao avatar Apr 06 '25 09:04 haixuanTao

> FYI, I have fixed this issue for the daemon and the coordinator with:

Unfortunately, that's different because this is invoked from the CLI directly.

The run function, on the other hand, is part of the node API. So the dora CLI might not even be installed. But in order to run dataflows with operators, we need a way to spawn the dora runtime as a process. Do you have any idea how we can achieve this reliably?

One simple approach could be to just invoke dora and require that it's in the PATH. This seems a bit hacky and restricting, though.
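For reference, a rough sketch of that PATH-based approach might look like the following. The function names are hypothetical, and the sketch assumes that a `dora` executable providing the `runtime` subcommand is reachable through PATH, which is exactly the restriction mentioned above.

```rust
use std::io;
use std::process::{Child, Command};

/// Build (without spawning) a command that resolves `dora` through PATH.
/// Hypothetical helper, not part of the dora API.
fn runtime_command_from_path() -> Command {
    let mut cmd = Command::new("dora");
    cmd.arg("runtime");
    cmd
}

/// Spawn the runtime and turn a missing executable into a clearer error.
fn spawn_runtime_from_path() -> io::Result<Child> {
    runtime_command_from_path().spawn().map_err(|err| {
        if err.kind() == io::ErrorKind::NotFound {
            io::Error::new(
                io::ErrorKind::NotFound,
                "`dora` was not found in PATH; it is required to spawn the runtime",
            )
        } else {
            err
        }
    })
}
```

The downside is visible in the error branch: a node API user who never installed the CLI would hit the `NotFound` case at runtime.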

phil-opp avatar May 21 '25 14:05 phil-opp

I have opened a simpler PR that I have tested on the C++ dataflow example: https://github.com/dora-rs/dora/pull/1011

haixuanTao avatar May 27 '25 13:05 haixuanTao

Closed in favor of #1011

phil-opp avatar Jun 23 '25 13:06 phil-opp