cargo icon indicating copy to clipboard operation
cargo copied to clipboard

fix(publish): Block until it is in index

Open epage opened this issue 3 years ago • 6 comments

Originally, crates.io would block on publish requests until the publish was complete, giving cargo publish this behavior by extension. When crates.io switched to asynchronous publishing, this intermittently broke people's workflows when publishing multiple crates. I say interittent because it usually works until it doesn't and it is unclear why to the end user because it will be published by the time they check. In the end, callers tend to either put in timeouts (and pray), poll the server's API, or use crates-index crate to poll the index.

This isn't sufficient because

  • For any new interested party, this is a pit of failure they'll fall into
  • crates-index has re-implemented index support incorrectly in the past, currently doesn't handle auth, doesn't support git-cli, etc.
  • None of these previous options work if we were to implement workspace-publish support (#1169)
  • The new sparse registry might increase the publish times, making the delay easier to hit manually
  • The new sparse registry goes through CDNs so checking the server's API might not be sufficient
  • Once the sparse registry is available, crates-index users will find out when the package is ready in git but it might not be ready through the sparse registry because of CDNs

So now cargo will block until it sees the package in the index.

  • This is checking via the index instead of server APIs in case there are propagation delays. This has the side effect of being noisy because of all of the "Updating index" messages.
  • This is done unconditionally because cargo used to block and that didn't seem to be a problem, blocking by default is the less error prone case, and there doesn't seem to be enough justification for a "don't block" flag.

In reviewing this change, be sure to look at the individual commits

  • The first makes it possible to write the tests for this
  • The second adds a test that shows the current behavior
  • The third updates the test to the expected behavior, showing all of this works

Fixes #9507

epage avatar Sep 08 '22 18:09 epage

r? @weihanglo

(rust-highfive has picked a reviewer for you, use r? to override)

rust-highfive avatar Sep 08 '22 18:09 rust-highfive

Hmm, hadn't run all of the tests before posting and hadn't considered there isn't a functioning-enough registry to pull from.

Open to people's thoughts on doing some kind of hack for this (env variable, unstable flag to not block, etc) or to try to add an insta-stable flag for controlling blocking.

epage avatar Sep 08 '22 20:09 epage

Hmm, hadn't run all of the tests before posting and hadn't considered there isn't a functioning-enough registry to pull from.

How hard would it be to implement a real publish API? I suspect a simplified implementation should be relatively short (maybe 40-50 lines?). I think the steps would roughly be:

  1. Read the data sent by cargo, and deserialize the JSON.
  2. Save the .crate file into the correct directory.
  3. Convert the publish JSON to the index JSON. I think this should be relatively basic copy of fields into a new serde_json::Value. The only one I think that needs real translation is the registry field of a dependency. I think the cksum would also need to be computed. EDIT: or just use Cargo's types like NewCrate directly (would need Deserialize derives).
  4. Add that JSON to the index, and commit it.

ehuss avatar Sep 08 '22 23:09 ehuss

The fact that almost all of our tests do not require actually running a web server, it's kind of nice for performance and reliability. An alternative implementation to support the tests, would be to add a flag for skipping the new delay.

We should also keep in mind in designing this that not all registries necessarily publish promptly. it is perfectly reasonable for a registry that accepts the publish API to require some kind of intervention before it appears in the index. I could see a registry requiring a user to review the package they just uploaded to verify that it looks correct before adding it to the index. A registry could also do an automated or manual security audit before it is added.

Eh2406 avatar Sep 11 '22 02:09 Eh2406

We should also keep in mind in designing this that not all registries necessarily publish promptly.

For this use case, it seems like we'd want this to be part of registry configuration as passing in a --wait=no every time doesn't seem right.

Maybe we should start out with both an unstable flag and an unstable config value so any users can unblock themselves and we can also use this as a way to collect feedback on what is needed and not block this on naming or other details like that.

epage avatar Sep 12 '22 12:09 epage

We should also keep in mind in designing this that not all registries necessarily publish promptly.

To start speculating on an RFC... Thinking about this more, I think the right place for a registry to tell us how long to wait is in the response to the /api/v1/crates/new call. Either in the response object or in headers. I see a couple of reasonable things a registry could want to tell a user about how long to wait:

  • Retry for some expected period of time (the behavior you're adding here) possibly with the ability for a registry to estimate how long the retries should run for.
  • A URL that needs to be interacted with in order for progress to be made. For the cases where it requires the author to do a manual review.
  • A URL that can be used to track the progress of publication. (Possibly with some specification for how cargo can use it as a retry loop.)
  • Clarification that the retry loop is unlikely to be productive. Say if all publishes require manual review by the registry.

On the other hand, we can change the default behavior now and add such configurability in a backwards compatible way in the future.

Eh2406 avatar Sep 13 '22 16:09 Eh2406

I wonder if we could express all of these with semi-standard HTTP response codes + headers:

Retry for some expected period of time (the behavior you're adding here) possibly with the ability for a registry to estimate how long the retries should run for.

202 Accepted
Retry-After: Fri, 07 Nov 2022 23:59:59 GMT
Content-Location: https://crates.io/api/v1/crates/$crate/$new_version

A URL that needs to be interacted with in order for progress to be made. For the cases where it requires the author to do a manual review.

202 Accepted
Location: https://registry.example.com/approve/$crate/$new_version

A URL that can be used to track the progress of publication. (Possibly with some specification for how cargo can use it as a retry loop.)

202 Accepted
X-Cargo-Track-Progress: v1
Content-Location: https://crates.io/api/v1/crates/$crate/$new_version

Clarification that the retry loop is unlikely to be productive. Say if all publishes require manual review by the registry.

202 Accepted

And to add one: "available now" could be

201 Created
Location: https://crates.io/api/v1/crates/$crate/$new_version

jonhoo avatar Sep 14 '22 22:09 jonhoo

I was curious why this waits after publish, instead of waiting for dependencies to be available? The original intent I had for #9507 is to only block publishing to wait for dependencies to become available. That way, it doesn't affect the majority of cases where someone is publishing a single package. Do you have any thoughts on those approaches?

I'm a little concerned about always blocking until it is available. There are times where there can be a significant delay, and the user may not care, but would be bothered if publishing gets "stuck".

I can see some benefit of blocking before finishing, such as someone wanting to publish an announcement. But I'm a bit wary of forcing that behavior.

ehuss avatar Sep 20 '22 18:09 ehuss

We should also keep in mind in designing this that not all registries necessarily publish promptly. it is perfectly reasonable for a registry that accepts the publish API to require some kind of intervention before it appears in the index. I could see a registry requiring a user to review the package they just uploaded to verify that it looks correct before adding it to the index. A registry could also do an automated or manual security audit before it is added.

Indeed. But right now such a registry would be unuseable with cargo, because to publish a workspace of interdependent crates one would have to wait for the human review cycle to complete before publishing the next crate.

To support a registry with this kind of behaviour, it would be necessary to allow cargo publish of a crate whose dependencies are not yet available via the published view of the regsitry.

ijackson avatar Sep 22 '22 15:09 ijackson

:umbrella: The latest upstream changes (presumably #11111) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Sep 23 '22 01:09 bors

:umbrella: The latest upstream changes (presumably #11069) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Oct 07 '22 20:10 bors

This patch changes the behaviour of cargo publish as listed below:

  • By default, synchronously wait for the publish propagating to crates.io index.
    • The default timeout is 60 seconds. Polling interval is 1 second.
  • To opt out, set the unstable config publish.timeout = 0.

Ed summarizes this feature very well in the PR description. I'd recommend taking a look.

I am going to propose this to be merged, but 1.65.0 is going to release on November 3rd, do we want to postpone the merge after that to maximize the test window?

@rfcbot merge

weihanglo avatar Oct 13 '22 07:10 weihanglo

Team member @weihanglo has proposed to merge this. The next step is review by the rest of the tagged team members:

  • [x] @Eh2406
  • [x] @ehuss
  • [x] @epage
  • [ ] @joshtriplett
  • [x] @weihanglo

Concerns:

  • ~~update-issue~~ resolved by https://github.com/rust-lang/cargo/pull/11062#issuecomment-1286423072

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot avatar Oct 13 '22 07:10 rfcbot

I am going to propose this to be merged, but 1.65.0 is going to release on November 3rd, do we want to postpone the merge after that to maximize the test window?

If we quickly get votes, don't wait for the 10 day period, and immediately update rust-lang/rust, we'll get 9 weeks of testing. 7 or so if votes take some time and we wait for the full FCP period.

Seeing as we've discussed this and the plan, I'm assuming the vote on this is uncontroversial and we can forego the 10 day waiting period.

Either way, if people want to wait until after November 3rd, I can understand. I would ask that we just change the default to 0 and merge as-is because this has already had 3 different semantic merge conflicts, one that took a decent amount of time to resolve.

epage avatar Oct 13 '22 12:10 epage

:bell: This is now entering its final comment period, as per the review above. :bell:

rfcbot avatar Oct 13 '22 13:10 rfcbot

@rfcbot concern update-issue

I'm trying to test this out, but I can't seem to get it to work. I opened #11253 for what I'm seeing. Can someone take a look?

ehuss avatar Oct 17 '22 22:10 ehuss

@rfcbot resolve update-issue

ehuss avatar Oct 21 '22 03:10 ehuss

:bell: This is now entering its final comment period, as per the review above. :bell:

rfcbot avatar Oct 21 '22 03:10 rfcbot

One thing we may want to look at separately is the UI while it is waiting. Cargo just keeps repeating Updating crates.io index. I'm wondering if it might be better to suppress the Updating message? That could maybe be implemented by temporarily enabling quiet mode. I'm also wondering about maybe showing some kind of busy indicator (maybe a progress bar that ticks forward for each second)?

I'm also a little concerned that it feels like there is essentially no test for this feature (that exercises what users will experience). Would someone be willing to follow up with a test for it? I think it shouldn't be too difficult to change the PUT handler to optionally delay the write_to_index call (it could maybe toss it into a thread or something to run in the background).

ehuss avatar Oct 21 '22 04:10 ehuss

The recent Cargo meeting concluded that we want to move forward and maximize the testing period for this default timeout setting before it hits next stable version. I'll merge it now then.

@bors r+

weihanglo avatar Oct 27 '22 15:10 weihanglo

:pushpin: Commit f2fc5ca86d9efbc7dbf8c5d88734ecb498718413 has been approved by weihanglo

It is now in the queue for this repository.

bors avatar Oct 27 '22 15:10 bors

:hourglass: Testing commit f2fc5ca86d9efbc7dbf8c5d88734ecb498718413 with merge 7e484fc1a766f56dbc95380f45719698e0c82749...

bors avatar Oct 27 '22 15:10 bors

:sunny: Test successful - checks-actions Approved by: weihanglo Pushing 7e484fc1a766f56dbc95380f45719698e0c82749 to master...

bors avatar Oct 27 '22 16:10 bors