cargo package --workspace is not very useful
Tracking
Unstable flag: -Zpackage-workspace
Stabilizing this would also close:
- #1169
Implementation
- [x] #13947
- [x] #14433
- [x] #14659
- [ ] When stabilizing, do a cleanup of the d30fde980f5 man page duplicates.
Changes since original plan
- Added `--registry` and `--index` flags to `cargo package` to know what registry will be used for generating the `Cargo.lock` file, as if the internal dependencies were already published
- `cargo publish` is not atomic, but it does verify all packages before publishing
Open questions
- Are we ok with a slight compatibility breakage in `cargo package`? See https://github.com/rust-lang/cargo/issues/10948#issuecomment-2253530842
- Are we ok stabilizing this and #1169 at the same time? Currently, they are behind the same flag
- What is the desired behavior for the publish timeout? #14433 uploads the crates in batches (depending on the dependency graph), and we only time out if nothing in the batch is available within the timeout, deferring the rest to the next wait-for-publish. So for example, if you have packages `a`, `b`, `c`, then we'll wait up to 60 seconds, and if only `a` and `b` were ready in that time, we'll then wait another 60 seconds for `c`.
- What is the desired behavior when publishing some packages in a workspace that have `publish = false` (see the manifest sketch after this list)? #14433 raises an error whenever any of the selected packages has `publish = false`, so it will error on `cargo publish --workspace` in a workspace with an unpublishable package. An alternative interface would implicitly exclude unpublishable packages in this case, but still error out if you explicitly select an unpublishable package with `-p package-name` (see #14356). The #14433 behavior is the most conservative one, as it can change from an error to implicit exclusion later.
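For reference on that last question, here is a minimal sketch of a workspace member that opts out of publishing; the member name and `edition` field are hypothetical:

```toml
# Hypothetical unpublishable workspace member: internal-tool/Cargo.toml
[package]
name = "internal-tool"
version = "0.1.0"
edition = "2021"
# With this set, #14433 makes `cargo publish --workspace` error whenever
# this member is among the selected packages.
publish = false
```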
Known issues
- #14396
Problem
Let's say you have a workspace:

```text
workspace/
  package_a/
    Cargo.toml (name="a", version="0.1.0")
  package_b/
    Cargo.toml (name="b", version="0.1.0", [dependencies] a="0.1.0")
  Cargo.toml (members = ["package_a", "package_b"])
```
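Spelled out as an actual manifest, `package_b`'s Cargo.toml might look something like this. This is a sketch: the `edition` field and the version-plus-path form of the dependency are assumptions, chosen to be consistent with the problem description below.

```toml
# workspace/package_b/Cargo.toml (sketch)
[package]
name = "b"
version = "0.1.0"
edition = "2021"

[dependencies]
# Dual version + path dependency on the sibling workspace member.
a = { version = "0.1.0", path = "../package_a" }
```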
If you have already published `a` to crates.io, then `cargo package --workspace` will complete successfully.
Now, you update `a` to `0.1.1`, and update `b` to use that new minimum version. If you try `cargo package --workspace` again, it will no longer work. You will get an error that `0.1.1` was not found.
This happens because `cargo package` makes a dummy package for your verification, and it strips out all workspace and path information. This means it tries to retrieve `a 0.1.1` from the registry, which it fails to do.
Proposed Solution
If you have a workspace where some package `b` depends on another package `a`, where `a` is both specified by a version AND a path, AND that path is within the workspace members, then the following will happen:
`cargo package --workspace` will make the new dummy project to verify the crates, but it will leave in the path information for `a` within `b`.
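To make the before/after concrete, here is a sketch of `b`'s dependency on `a` after bumping to 0.1.1, what today's stripping turns it into, and what the proposal would keep for the verification build only (the published `.crate` would still have the path removed):

```toml
# In the workspace source, after bumping `a` to 0.1.1:
[dependencies]
a = { version = "0.1.1", path = "../package_a" }

# What `cargo package` currently puts in the dummy verification copy: the
# path is stripped, so Cargo looks for a 0.1.1 in the registry and fails,
# because only 0.1.0 has been published.
#   a = "0.1.1"

# What the proposed solution would keep, in the verification copy only:
#   a = { version = "0.1.1", path = "../package_a" }
```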
Notes
In my experience, I've never had a workspace where all crates are independent. There's always at least one that depends on another crate. When managing private registries, it's not uncommon to `cargo package` and upload the `.crate` file manually.
Currently, it's necessary to package each workspace member individually and upload them in the order they depend on each other.
Ideally, we can run a single `cargo package --workspace` after bumping all versions, then get all of the `.crate` files and upload them in a batch.
This sounds very useful!
I took a look. First time contributor to cargo, so just trying to find the relevant parts of the codebase. It looks to me like `build_lock()` in `src/cargo/ops/cargo_package.rs`, and then `TomlManifest::prepare_for_publish()` in `src/cargo/util/toml/mod.rs`, might be where the logic @conradludgate described lives:

> This happens because cargo package makes a dummy package for your verification and it strips out all workspace and path information. This means it tries to retrieve the a 0.1.1 from the registry, which it fails to do.

So I believe that is where we would need to implement @conradludgate's proposed solution, and to do so in the order the crates depend on one another.
This is after a quick look at the code for the very first time ever, so I may be off base. I'm going to keep looking and try to confirm what I think. Also, I'm on Zulip if people want to chat/mentor there.
EDIT: After this Zulip discussion, and due to the limited bandwidth of cargo team members, new features should be accepted first before opening a PR. I am going to work on some issues raised in that thread that already have momentum, to make my first contribution there.
One piece of background knowledge: `cargo publish` is roughly `cargo package` + a call to the crates.io API. Therefore, some shared code paths between `cargo publish` and `cargo package` might affect each other. The requirement that a dependency exist on a registry might come from what `cargo publish` needs: a `.crate` file is supposed to be distributable.
One thing that is a bit odd is packaging a crate with all verifications while not publishing its path dependencies. If Cargo permitted that, it would feel like we are trying to convince Cargo that our build is verified, but only locally. As mentioned, a `.crate` file is supposed to be distributable, and that rule may no longer hold if we do so. Maybe we need to think more about whether enabling packaging with unpublished path dependencies is the right direction, but I'd say the issue you have does exist and people hit it.
For the time being, a good tool for releasing crates is cargo-release, created by sunng87 and epage. I haven't tried it, but I believe it works. Perhaps @epage can give you more experience on this topic :)
BTW, this is somewhat related to https://github.com/rust-lang/cargo/issues/9260#issuecomment-1219714903. There are more discussions in the wild regarding a more sequential and smooth packaging/publishing process, though I cannot find them now.
cargo-release is a good tool for workspaces. Unfortunately, we can't use it as-is since we don't have the ability to publish using cargo. However, it's something we could fork and fix to work with our own registry.
@jneem and I are looking into this. Some of the aspects we are considering are:
- It's not so much that `cargo package` makes a dummy package with paths stripped, but that the actual package that will get published has its paths stripped. In order to do verification when some of the (workspace) dependencies are not yet published, we will either have to do verification before we strip the paths, or we will have to create a (dummy) package just for verification, with the paths reinstated.
- Whether or not to package one crate at a time (like how it works today), or to first prepare a workspace with all paths intact, and then verify each crate. The latter case would be simpler when there are dependency chains across the workspace.
- Can we assume that all intra-workspace dependencies would be for the latest version, i.e. the path dependency? Or do we need to allow that some crates can depend on a specific version of another workspace crate? It would simplify things if `cargo package` could just throw an error if all workspace dependencies and current crate versions don't match up. In the case where you do want to depend on older versions, maybe you wouldn't try to publish the workspace as a whole anyway, but would rather publish individual crates?
- Currently, Cargo does not allow packaging and publishing a crate that depends on another crate unless that dependency is already published. With the change discussed in this issue, it's possible to package both crates and then only publish one of them. Would we need to add safeguards to `cargo publish` where it confirms that all dependencies exist (basically another round of verification)? If so, we also need to make sure it publishes the crates in the correct order. But this is really a separate concern (multi-package publishing).
If anybody has opinions on this, feel free to chime in.
@torhovland taking on a task like this, you might want to coordinate more with the Cargo team. For example, we have Office Hours.
One way this could possibly be split up is:
- Implement this for `--no-verify`
- Implement verification
> It's not so much that cargo package makes a dummy package with paths stripped, but the actual package that will get published has its paths stripped. In order to do verification when some of the (workspace) dependencies are not yet published, we will either have to do verification before we strip the paths, or we will have to create a (dummy) package just for verification, with the paths reinstated.
For verification, we should be verifying the generated `.crate`, rather than an intermediate, to ensure maximum verification.
I wonder if we can inject patches rather than changing anything about the code. This can all be done in-memory, rather than writing it out to disk.
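For illustration, the effect being described is roughly what a user can get today by writing a `[patch]` table by hand; the suggestion is that Cargo would apply the equivalent in-memory during verification rather than writing anything like this out. A sketch, reusing the `a`/`b` example from above:

```toml
# Conceptual, hand-written equivalent of the injected patch: while verifying
# `b`, resolve the registry dependency `a` from the local, not-yet-published
# workspace copy instead of from crates.io.
[patch.crates-io]
a = { path = "../package_a" }
```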
> Whether or not to package one crate at a time (like how it works today), or to first prepare a workspace with all paths intact, and then verify each crate. The latter case would be simpler when there are dependency chains across the workspace.
I have no preference between whether we package+verify one at a time or package all then verify all. We should likely do separate compilation per verify though. #5931 would speed up the compile times for this.
> Can we assume that all intra-workspace dependencies would be for the latest version, i.e. the path dependency? Or do we need to allow that some crates can depend on a specific version of another workspace crate? It would simplify things if cargo package could just throw an error if all workspace dependencies and current crate versions don't match up. In the case where you do want to depend on older versions, maybe you wouldn't try to publish the workspace as a whole anyway, but would rather publish individual crates?
There are times to depend on old versions of packages (see the sketch after this list):
- Dev-dependencies for comparing across versions (usually for major versions)
- Semver hack (for major versions)
- You can have access to a registry variant of a package through transitive dependencies, either semver compatible or not.
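As an illustration of the first two cases, a workspace member can intentionally depend on an already-published older major version of itself, for example for cross-version comparison tests or the semver trick. The names and versions here are hypothetical:

```toml
# Hypothetical: the in-workspace crate `a` is at 2.0.0, but it also pulls in
# the published 1.x release under a renamed key, so tests can compare
# behavior across major versions.
[package]
name = "a"
version = "2.0.0"

[dev-dependencies]
a_v1 = { package = "a", version = "1.0.0" }
```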
> Currently, Cargo does not allow packaging and publishing a crate that depends on another crate unless that dependency is already published. With the change discussed in this issue, it's possible to package both crates and then only publish one of them. Would we need to add safeguards to cargo publish where it confirms that all dependencies exist (basically another round of verification)? If so, we also need to make sure it publishes the crates in the correct order. But this is really a separate concern (https://github.com/rust-lang/cargo/issues/1169#issuecomment-1753461359).
This isn't relevant to this issue but to the follow-up one. Let's keep the conversations in each issue focused and move this over there.
Thanks for your input, it's been noted.
> @torhovland taking on a task like this, you might want to coordinate more with the Cargo team. For example, we have Office Hours.
Sure, we will show up there.
The rough idea we have so far is:
- Determine all intra-workspace path dependencies.
- Package each crate using the existing code (without trying to compile/build it, or otherwise resolve online dependencies).
- Unpack all packaged crates in a temporary directory, giving each one a random dir name.
- Go through all workspace dependencies from step 1 and fix them up in each crate (see the sketch after this list). Can we do this in-memory?
- See if this builds.
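A sketch of what step 4's fix-up might produce for the unpacked copy of `b`; the temporary directory name is made up, and whether this happens on disk or in-memory is exactly the open question above:

```toml
# Dependency entry for `a` inside the unpacked, temporary copy of `b` that is
# used only for the verification build; the path points at the sibling
# unpacked crate rather than at a registry.
[dependencies]
a = { version = "0.1.1", path = "/tmp/cargo-package-verify-7f3a/a-0.1.1" }
```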
Running into a chicken-and-egg problem with the lock files:
When packaging, Cargo strips away all path dependencies and generates lock files by expecting to find the dependent packages online, so that entries like this can be put into the lock file:

```toml
[[package]]
name = "my-dep"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "780f1cebed1629e4753a1a38a3c72d30b97ec044f0aef68cb26650a3c5cf363c"
```
But when generating a lock file using just a local path dependency, we do not get any `source` or `checksum`. While we could add a `source` and a `checksum` after the fact, we can't really assume that the dependency will end up on crates.io. But we also cannot publish a crate that doesn't specify where to find its dependencies.
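For comparison, when `my-dep` is resolved from a local path, the corresponding lock file entry is roughly this:

```toml
# Lock file entry for a path dependency: nothing records where the package
# can be fetched from, or what its packaged contents hash to.
[[package]]
name = "my-dep"
version = "0.1.0"
```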
So it seems that packaging and publishing are strongly coupled, and there's no way around packaging+publishing one crate at a time.
Unless we introduced something like `source = same-as-this-one`.
Any ideas?
A possible solution is to let `cargo package` have the same `--index` option that `cargo publish` has for indicating which registry to use. So you'll use that to indicate which registry you intend to publish to.
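If that lands, the lock file entry for a not-yet-published workspace member could conceptually be filled in as if the package were already on the chosen registry. This is a sketch only: the index URL is hypothetical, and the checksum would have to be computed from the freshly generated `.crate` file.

```toml
# Hypothetical lock entry for workspace member `a`, written as if it were
# already published to the registry named via --index / --registry.
[[package]]
name = "a"
version = "0.1.1"
source = "registry+https://my-registry.example.com/index"
checksum = "<sha256 of the generated a-0.1.1.crate>"
```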
Assuming we could use the `--index` option suggested above, we still need to figure out how to modify the generated lock files to include a source and a checksum. Ideally, we would like to modify the `new_resolve` in `build_lock()` before it gets serialised. But a `Resolve` seems quite immutable.
We could of course try to modify the lock file after serialisation, but that doesn't seem very nice. The alternative is to modify the resolver code so it can treat a crate as if it was pulled from a registry.
We have two problem steps in `cargo package --workspace`:

- Generating the lockfile for each `.crate`: path dependencies won't be in the registry yet, so this will fail.
  - Above they talked about forcing path dependencies to be used during the lockfile generation and then patching up the lockfile afterwards with the correct source and checksum
  - In theory, we could do some kind of decoration of crates.io to allow this in-memory...
  - Regardless of the method we use to work around that, to generate the source it seems like we need to know which registry the `.crate` files are intended for, so we'd need `--index` / `--registry` flags
- Verify step for each `.crate`: the decompressed `.crate` files will need to depend on each other, rather than their registry, to build
  - We could do in-memory patches
  - Maybe there are alternatives that build on what we did for the lockfile?

@Eh2406 or @arlosi any thoughts or tips on this?