rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

Cargo allow running binaries from development or build dependencies

Open tchernobog opened this issue 3 years ago • 14 comments

In addition to binaries from workspace members, teach cargo run to execute binaries from dependent packages.

Rendered.

tchernobog avatar Aug 24 '21 10:08 tchernobog

(From the RFC:)

if the specified pkgid is not related to a workspace member, use the resolver against the locked manifest to determine acceptable versions of dependencies. The resulting set should have a size of one, or we fail.

I have some questions about what exactly this means:

  • '... pkgid is not related to a workspace member ...': What does 'related' mean? Being a direct dependency? Being a transitive dependency? Or just that the pkgid points to a member of the workspace? Or something else?
  • 'use the resolver against the locked manifest to determine acceptable versions of dependencies': What if there's 1.) a direct dependency on a package and 2.) an transitive dependency on the same package but with incompatible version constraints? Will this prevent me from running executables from the direct dependency?

Thanks a lot for working on this! It'd be great to have this work!

Also, I'm not sure this is the right place to ask questions. If not, please kindly point me to the right one.

soenkehahn avatar Aug 25 '21 16:08 soenkehahn

Thanks @soenkehahn, I addressed the points you raised in the reworked RFC.

In particular:

pkgid is not related to a workspace member

I rewrote this section to clearly mention that pkgid is not the package id of any workspace member. Dependency resolution happens always as a fallback mechanism after checking that.

'use the resolver against the locked manifest to determine acceptable versions of dependencies': What if there's 1.) a direct dependency on a package and 2.) an transitive dependency on the same package but with incompatible version constraints? Will this prevent me from running executables from the direct dependency?

This is still an open point in "unresolved questions". I personally would be inclined to disallow calling a binary from a transitive dependency, and allow only running binaries from direct dependencies, which would also remove this ambiguity.

What are your thoughts on that?

tchernobog avatar Aug 25 '21 19:08 tchernobog

Thanks for the clarification!

One typo did sneak in: '[...] not match a the package id [...]'. I think either 'a' or 'the', but not both.

This is still an open point in "unresolved questions". I personally would be inclined to disallow calling a binary from a transitive dependency, and allow only running binaries from direct dependencies, which would also remove this ambiguity.

What are your thoughts on that?

I completely agree. It would be bad if a transitive dependency could break my build by adding an executable. And then make it impossible to work around the issue. On the other hand requiring to declare a direct dependency in order to call a binary from it is always possible and also seems like a very reasonable and unsurprising requirement.

Also, you mention version stability and I think that's also a really good argument against allowing to run binaries from transitive dependencies. Projects -- especially libraries that don't ship the Cargo.lock file -- have almost no control over which versions of transitive dependencies are going to be used and may get major version updates of transitive dependencies just through minor or patch-level updates of direct dependencies. So it seems good to force authors to specify these packages as direct dependencies.

One other thing:

(From the RFC:)

Disallowing transitive dependencies would be more robust in terms of version stability, and would remove the second condition above (where the package set can have cardinality greater than one).

I think there's a corner case where different members of the workspace can directly depend on the same package with incompatible version constraints. And that direct dependency contains a binary that a user attempt to run with cargo run. So the number of found packages could still be greater than 1, even if transitive dependencies are not included. I personally think that it'd be fine if this ambiguity would result in an error. Because for the direct dependencies of members of a workspace a user is in control of which versions are being chosen. So in most cases where this ambiguity arises it would be easy to change the dependency declarations and remove the ambiguity. This would have two drawbacks I can think of:

1.) There may be cases where different members of a workspace use different versions of a dependency for very good reasons. So it may be difficult to change those dependency version constraints. 2.) You could have two packages that are not in a joint workspace and they work perfectly fine. But putting them into one workspace doesn't work without tweaking them. Which seems undesirable. (@casey brought this up to me offline.)

Personally I don't care about these cases too much, since they seem fairly rare.

(And to be clear: I'm not involved in approving RFCs, or whatever needs to happen to move this forward. I'm not even sure how that works. I'm just a random person that would love to have this feature. So take my opinions with a grain of salt.)

soenkehahn avatar Aug 26 '21 00:08 soenkehahn

Thanks again @soenkehahn, I integrated changes as suggested, and kept the case about incompatible version constraints. I still haven't seen that happen in practice, but since it costs close to nothing to include, let's be comprehensive.

tchernobog avatar Aug 26 '21 11:08 tchernobog

I like the concept of this. I feel that this should be an extension to artifact dependencies, rather than just a related feature as this RFC presents it. In particular, I think it would be appropriate to require that a dependency be an artifact dependency in order to run its binaries via cargo-run. That would unify the two concepts, in that both require depending on the binary rather than just the library.

Given artifact dependencies, it'd be easy enough to simply execute the corresponding binary from a build script, and I think that's the appropriate method. But from the command line, it would be convenient to be able to run a dependency's binaries for a variety of purposes, including testing, package maintenance, and similar. (Consider, for instance, using cargo run to run generators for templates from one of your dependencies, or migration tools, or tools to manage and generate data at compile time to be read in by a corresponding library.)

Would you consider designing this as an extension to artifact dependencies, rather than a mostly independent mechanism?

joshtriplett avatar Aug 27 '21 04:08 joshtriplett

I like the concept of this. I feel that this should be an extension to artifact dependencies, rather than just a related feature as this RFC presents it. In particular, I think it would be appropriate to require that a dependency be an artifact dependency in order to run its binaries via cargo-run. That would unify the two concepts, in that both require depending on the binary rather than just the library.

I agree that would be possible, although it's not, strictly speaking, a necessity. However, what I would like to avoid is creating a functional dependency among the two that might block merging of one feature on the other. I don't see a technical drawback in keeping this functionality more open so that, as you mention...

From the command line, it would be convenient to be able to run a dependency's binaries for a variety of purposes

Having artifact dependencies would just ensure that binary artifacts are always built, instead of lazily built. Additionally, for development dependencies, artifact dependencies are likely built for the target system when cross-compiling (to allow bundling), and not for the host system. Here I see the two RFCs diverging in behavior and hard to reconcile, since the use case differs.

This RFC is more about a developer or CI calling an executable built on the fly, than an artifact to be distributed with the rest of the build (and this is the reason why considering normal dependencies -- not build or development --, is explicitly disallowed by this RFC).

Would you consider designing this as an extension to artifact dependencies, rather than a mostly independent mechanism?

We would need to solve the cross-compilation issue, then it becomes a possibility. However, I am a bit unconvinced the two systems need to be intertwined, as they cover different use cases and likely use different compilation profiles in some scenarios.

tchernobog avatar Aug 27 '21 09:08 tchernobog

It's worth noting that running binaries of arbitrary dependencies is already possible, using a small hack:

  1. Use cargo metadata or the cargo_metadata crate to find the folder where cargo put the dependency source.
    • First, find the root resolve node in resolve.nodes by looking for the node with id resolve.root
    • Next, find the ID of the desired dependency by looking through the deps field of the root resolve node.
    • Then find the package with the dependency ID in packages.
    • The manifest_path field of the package specifies the local path where cargo put the crate.
  2. Now we can change the working directory to the manifest path of the dependency and do a cargo run or cargo run --bin ... there.
    • We don't want to write to this folder, so we use the --target-dir argument to place the target folder somewhere else.

The above steps can be easily wrapped into crate, so it's already possible to create an external cargo subcommand that does what this RFC proposes. So, in my opinion, the question is not whether we should allow running binaries of arbitrary dependencies (it's already possible!), but whether we want to provide a more ergonomic command line API for this.

phil-opp avatar Sep 19 '21 12:09 phil-opp

That is a wonderful point! That suggests that we can experiment with this as a third party subcommand, and uplift to Cargo when the design is worked out.

Eh2406 avatar Sep 19 '21 16:09 Eh2406

  1. Now we can change the working directory to the manifest path of the dependency and do a cargo run or cargo run --bin ... there.
  • We don't want to write to this folder, so we use the --target-dir argument to place the target folder somewhere else.

But is the resolver guaranteed to pull exactly the same transitive dependency versions as the workspace root with this approach? Else, it would violate version immutability principles needed for CI.

The above steps can be easily wrapped into crate, so it's already possible to create an external cargo subcommand that does what this RFC proposes. So, in my opinion, the question is not whether we should allow running binaries of arbitrary dependencies (it's already possible!), but whether we want to provide a more ergonomic command line API for this.

This has however the drawback to still require on CI of running cargo install cargo-task before being able to run a dependency. At that point, we go back to having an out-of-manifest build dependency to the CI process, which is exactly one of the things our devops engineers want to avoid. Especially in a no-network scenario.

tchernobog avatar Sep 23 '21 15:09 tchernobog

But is the resolver guaranteed to pull exactly the same transitive dependency versions as the workspace root with this approach? Else, it would violate version immutability principles needed for CI.

Another very similar point: Is the resolver guaranteed to select the same features for the crate that contains the binary that is being run and all its dependencies?

This has however the drawback to still require on CI of running cargo install cargo-task before being able to run a dependency.

That's very true, as a separate tool this would not really solve the problem at hand. It would make it possible to experiment and hopefully work out all the kinks, without having to coordinate with the cargo project, though. Which sounds like a huge advantage.

soenkehahn avatar Sep 23 '21 20:09 soenkehahn

cargo-metadata gives you all the resolver data you need to then go and compile the binary with the same set of features. Though, you might need a way to have more features enabled for the binary than the dev-library dependency (which is an issue for the RFC as well as the proposed third-party tool), it would be annoying having to compile clap etc. when you're just running tests and not invoking the binary.

Nemo157 avatar Sep 24 '21 11:09 Nemo157

Thanks for posting the RFC! I think the team is warm towards making it easier to run specific versions of CLI tools.

I'm wondering if you could maybe come up with some different and additional use cases. The one stated in the introduction on mdbook doesn't seem like a compelling example. mdbook has a library, and that is the intended method for interacting with it from a build script. In general, we strongly prefer that build scripts use libraries instead of executing commands whenever possible. Additionally, build scripts are not intended to write to any files outside of the OUT_DIR. I suspect this is suggesting to avoid that restriction.

I also think there maybe be some confusion about RFC #3028, as the text here currently mentions that it is specifically for embedding artifacts. The path to the executable is given to the build script so that it can execute it. It is not necessarily just for embedding the contents.

ehuss avatar Sep 27 '21 22:09 ehuss

An embedded use case is flashing/debugging after the compile. Weve actually rewritten a lot of these in rust so they COULD be used as a library, but at the moment the only way to hook post compile commands is the .cargo/config.toml runner so we have to have people install one or more tools in the readme like cargo-flash or probe-run or hf2-cli because the runner is set to use them.

Note some of these tools still need libusb to be linked so this isnt a perfect solution as there might be further system deps needed on some platforms so installing the binary isnt enough

jacobrosenthal avatar Sep 28 '21 00:09 jacobrosenthal

FWIW, the implementation of the artifact dependencies is available as an unstable option. Since this feature was cited above, I thought it could be helpful.

dbofmmbt avatar May 20 '22 07:05 dbofmmbt

Yes I think artifact deps subsumes this and so this can be closed in lieu of https://github.com/rust-lang/cargo/issues/9096

Ericson2314 avatar Dec 13 '22 21:12 Ericson2314

@rfcbot close

Eh2406 avatar Dec 13 '22 21:12 Eh2406

Team member @Eh2406 has proposed to close this. The next step is review by the rest of the tagged team members:

  • [x] @Eh2406
  • [x] @ehuss
  • [x] @epage
  • [ ] @joshtriplett
  • [x] @weihanglo

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot avatar Dec 13 '22 21:12 rfcbot

I don't think artifact-dependencies is a complete alternative. As discussed in this RFC text, it is not possible to directly run binaries from artifact-dependencies on the command-line. There are some use cases where it can be helpful to have a CLI tool that the developer can run, and to have the CLI tool versioned with the local project (and not installed in the global location). An example would be diesel_cli, where you may not want to use a single version across multiple projects.

However, since there hasn't been much discussion about specific use cases where having a locally versioned CLI tool, I'm going to check my box, although more of as a postpone status, since there hasn't been much discussion or clear motivation.

ehuss avatar Dec 13 '22 22:12 ehuss

:bell: This is now entering its final comment period, as per the review above. :bell:

rfcbot avatar Dec 13 '22 22:12 rfcbot

I don't see how https://github.com/rust-lang/cargo/issues/9096 could allow to do something equivalent to npm run with cargo. In my understanding, it would only be useful for build.rs. This concern was also raised on https://github.com/rust-lang/cargo/issues/2267.

Maybe artifact dependencies could be extended to cover the use cases described here and in the issue #2267, as proposed on https://github.com/rust-lang/rfcs/pull/3168#issuecomment-906912941?

dbofmmbt avatar Dec 14 '22 01:12 dbofmmbt

The final comment period, with a disposition to close, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This is now closed.

rfcbot avatar Dec 23 '22 22:12 rfcbot