dora icon indicating copy to clipboard operation
dora copied to clipboard

Simplify the `run.rs` scripts of our dataflow examples using `xshell`

Open phil-opp opened this issue 1 year ago • 4 comments

The xshell crate provides a more convenient way to build and run a Command, which is more similar to traditional bash scripts. Using this crate, we can simplify our run.rs scripts while still being platform-independent and not requiring any external dependencies. We can also still run the examples using cargo run --example.

Alternative to #454 .

phil-opp avatar May 02 '24 11:05 phil-opp

Do we want to proceed with this approach? Personally, I think this is a considerable improvement because it changes the previous Command builder code to something that resembles normal shell commands:

From:

let cargo = std::env::var("CARGO").unwrap();
    let mut cmd = tokio::process::Command::new(&cargo);
    cmd.arg("run");
    cmd.arg("--package").arg("dora-cli");
    cmd.arg("--").arg("build").arg(dataflow);
    if !cmd.status().await?.success() {
        bail!("failed to build dataflow");
    };

To:

let dora = prepare_dora(&sh)?;

cmd!(sh, "{dora} build dataflow.yml").run()?;

Possible alternatives:

  • Remove the run.rs completely and only explain the build instructions in the README.
    • Drawback: No more automated testing on CI.
    • Drawback: No cargo run --example <example_name>
  • Use a some cross-platform scripting language such as just or nu to create a run.sh
    • Drawback: Users need to install and learn a third-party tool to run our examples
    • Automated CI testing and cargo run --example <name> support still require a run.rs wrapper
  • Use a standard Unix tool for building, such as a Makefile or bash script.
    • Simplifies Python venv_ setup as we can use the standard activate script
    • Drawback: Windows users need to install Unix tools, which can be difficult

phil-opp avatar May 22 '24 12:05 phil-opp

Looks good to me !

haixuanTao avatar May 22 '24 12:05 haixuanTao

@haixuanTao I get a ModuleNotFoundError: No module named 'dora' error again with the new python-operator-dataflow example. The venv is created before dora up, so it doesn't seem like the same issue as https://github.com/dora-rs/dora/pull/491#issuecomment-2079464273. Any idea what it could be?

phil-opp avatar May 28 '24 15:05 phil-opp

We decided to wait a bit with this PR until https://github.com/dora-rs/dora/pull/551 is ready. We hope that we can create a proper integration test framework then instead of using the example as integration tests. Perhaps we can find a even better solution for running the examples then (after the testing part is moved to the test framework).

phil-opp avatar Jun 20 '24 15:06 phil-opp