Tracking Issue for serving an index over HTTP

Open ehuss opened this issue 3 years ago • 6 comments

Summary

RFC: #2789
Implementation: #10470
Documentation: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#sparse-registry
Issues: https://github.com/rust-lang/cargo/labels/Z-sparse-registry

This is a tracking issue for RFC #2789, an experimental extension to serve the index over HTTP instead of via git.

Unresolved issues

  • [ ] Cache invalidation
    • Currently 1 minute TTL. Too long for publish, too short for effective caching.
    • Minor concern about how long an invalidation takes (Cloudfront says 10 to 100 seconds)
    • The cost is expected to be negligible.
    • https://github.com/rust-lang/crates.io/issues/4913
  • [ ] Make sparse+ and registry+ more consistent. https://github.com/rust-lang/cargo/pull/10470#discussion_r832484967
  • [ ] Find a better name (and structure) for index_version https://github.com/rust-lang/cargo/pull/10470#discussion_r832529130
  • [ ] Handle user URLs that start with http. They currently imply registry+, but if sparse+ URLs become more common it would be good to auto-detect. https://github.com/rust-lang/rfcs/pull/2789#issuecomment-737240327
  • [x] Have a plan for lockfiles.
  • [x] Implement support in crates.io https://github.com/rust-lang/crates.io/pull/4661
  • [ ] Check that crates.io is willing to maintain their current system at production levels of traffic.
  • [x] Teach cargo that crates.io has two equivalent indexes. #10722
  • [ ] Make sure we are future compatible with a merkle tree design.
  • [ ] The RFC is intentionally sparse on details, it is almost an eRFC. Do we need another RFC/PR to discuss the details we have chosen?
  • ~~Does not support fuzzy queries. Is that a blocker? Only used by error messages.~~ This was a misunderstanding, indexes don't support name-based typo suggestions at all. Sparse supports the same case-folding and -/_ fuzzy comparison as git indexes. Sparse fundamentally won't be able to support typo suggestions, but since cargo hasn't handled that in the past, this isn't a regression, and more like a "can't do that in the future" thing.
  • [ ] Investigate publish issues. 1 minute update is quite slow if publishing multiple packages (and cargo does not yet have retry logic).

Current best way to test

To try it out, add the -Z sparse-registry flag on a nightly-2022-06-20 or newer build of Cargo. For example, to update dependencies:

rustup update nightly
cargo +nightly -Z sparse-registry update

The feature can also be enabled by setting the environment variable CARGO_UNSTABLE_SPARSE_REGISTRY=true. Setting this variable will have no effect on stable Cargo, making it easy to opt-in for CI jobs.

If you see any issues please report them in new tickets here in the Cargo repo. The output of Cargo with the environment variable CARGO_LOG=cargo::sources::registry::http_remote=trace set will be helpful in debugging.
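For CI jobs, the two environment variables above can be combined in a small shell setup. This is only a minimal sketch: the `sparse_update` helper and the `sparse-registry-trace.log` file name are made up for illustration, and it assumes Cargo writes its log output to stderr.

```shell
# Opt in to the sparse registry; per the note above, stable Cargo ignores this variable.
export CARGO_UNSTABLE_SPARSE_REGISTRY=true
# Trace the HTTP registry source so a bug report can include the details.
export CARGO_LOG=cargo::sources::registry::http_remote=trace

# Hypothetical helper: update dependencies and keep Cargo's stderr
# (assumed to carry the trace output) in a log file for the report.
sparse_update() {
    cargo +nightly update 2> sparse-registry-trace.log
}
```

Calling `sparse_update` then runs the update on nightly and leaves the trace in the log file.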

About tracking issues

Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature. Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Implementation history

  • #10064

  • #10470

  • #10482

  • #10698

  • #10725

  • #10738

  • #10830

  • #10831

  • #10835

  • https://github.com/rust-lang/crates.io/pull/4661

  • https://github.com/rust-lang/crates.io/pull/4826

ehuss avatar Jan 12 '21 19:01 ehuss

Hey, would this mean that when cargo publish exits with 0 the published crate version is actually available (which is not the case now, due to CDN)?

mightyiam avatar Jun 22 '22 05:06 mightyiam

It doesn't address this problem. There is still going to be a delay between cargo publish and the crate being available globally. It would be nice to improve this, but it's more of an issue for server-side crates-io implementation than the registry protocol.

kornelski avatar Jun 22 '22 10:06 kornelski

@mightyiam #9507 is the issue you want to be following.

For background, publishing a crate used to be a blocking operation but to speed up crates.io, they made cargo publish just put the crate into a queue for later publishing, making the operation asynchronous.

Tools like cargo-release poll the server. We could easily do similar in cargo until an improved registry protocol is made. I was looking at implementing it but there was a large hurdle for writing the relevant tests. The new sparse registry code has improved the test infrastructure so I could now write the relevant tests.
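As a rough illustration of the polling idea (not how cargo-release or cargo actually implement it), a script could ask the sparse index until the new version shows up. The helper names, the 10 second interval, and the attempt limit are all assumptions; the path layout follows the index file layout described in Cargo's registry documentation, and the `"vers":"..."` match assumes the index entries are compact JSON lines.

```shell
#!/bin/sh
# Map a crate name to its file path inside a registry index:
# 1 char -> 1/name, 2 chars -> 2/name, 3 chars -> 3/<first char>/name,
# otherwise <chars 1-2>/<chars 3-4>/name (names are lowercased).
index_path() {
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case ${#name} in
        1) printf '1/%s\n' "$name" ;;
        2) printf '2/%s\n' "$name" ;;
        3) printf '3/%s/%s\n' "$(printf '%s' "$name" | cut -c1)" "$name" ;;
        *) printf '%s/%s/%s\n' \
               "$(printf '%s' "$name" | cut -c1-2)" \
               "$(printf '%s' "$name" | cut -c3-4)" \
               "$name" ;;
    esac
}

# Hypothetical helper: poll the crates.io sparse index until the given
# version of a crate is listed, or give up after a number of attempts.
wait_for_version() {
    crate=$1
    version=$2
    attempts=${3:-30}
    url="https://index.crates.io/$(index_path "$crate")"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -F: treat the version as a literal string, not a regex.
        if curl -fsS "$url" 2>/dev/null | grep -qF "\"vers\":\"$version\""; then
            return 0
        fi
        i=$((i + 1))
        sleep 10
    done
    return 1
}
```

After a publish, something like `wait_for_version my-crate 0.2.0 && echo available` would block until the index lists the version. CDN caching can still delay what a given client sees.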

epage avatar Jun 22 '22 14:06 epage

Can the unstable feature be enabled via .cargo/config.toml?

tigregalis avatar Jul 24 '22 13:07 tigregalis

Yes, I think it is:

[unstable]
sparse-registry = true

Eh2406 avatar Jul 24 '22 13:07 Eh2406

Just an update for anyone following this issue. A proposal for how to configure git vs http is available at https://hackmd.io/@rust-cargo-team/B13O52Zko. Additionally #10964 and #10965 contain some more background.

ehuss avatar Aug 31 '22 19:08 ehuss

Hi, has there been any progress on this? Anything I can help with? We would really like to help this get into stable ASAP

luciusmagn avatar Sep 26 '22 07:09 luciusmagn

Thank you for being interested in this feature! The Cargo team and contributors are currently working hard on this topic, as well as collaborating with crates.io team to build up the infrastructure needed (rust-lang/crates.io#5200, rust-lang/crates.io#5112, rust-lang/crates.io#5066, etc).

One of the things we're happy to collect is feedback for this feature. Personally I'd appreciate any user feedback, especially about user experiences. You can find more relevant issues from label https://github.com/rust-lang/cargo/labels/Z-sparse-registry, or in the description of this issue, which contains the development history and unresolved issues.

weihanglo avatar Sep 26 '22 08:09 weihanglo

Based on https://github.com/rust-lang/cargo/issues/10722 I was expecting it to be possible to fetch crates using -Zsparse-registry and then use them without it, but this fails:

> CARGO_HOME=$(mktemp -d) sh -c 'cargo fetch -Zsparse-registry && cargo check --locked --offline'
    Updating crates.io index
  Downloaded serde v1.0.145
  Downloaded 1 crate (76.6 KB) in 0.27s
error: no matching package named `serde` found
location searched: registry `crates-io`
required by package `foo v0.1.0 (/tmp/tmp.mos8FpTA4j/foo)`
As a reminder, you're using offline mode (--offline) which can sometimes cause surprising resolution failures, if this error is too confusing you may wish to retry without the offline flag.

I guess the equivalence is only known for some subset of operations? Is this something that should be supported, or must all operations always use the same variant of crates.io?

Nemo157 avatar Oct 15 '22 10:10 Nemo157

Using CARGO_UNSTABLE_SPARSE_REGISTRY: true with nightly seems to be broken on GHA:

Updating crates.io index
error: failed to get `clap` as a dependency of package `cargo-binstall v0.15.1 (/home/runner/work/cargo-binstall/cargo-binstall/crates/bin)`

Caused by:
  failed to query replaced source registry `crates-io`

Caused by:
  download of config.json failed

Caused by:
  failed to download from `sparse+https://index.crates.io/config.json`

Caused by:
  [1] Unsupported protocol (Protocol "sparse+https" not supported or disabled in libcurl)
Error: Process completed with exit code 101.

NobodyXu avatar Oct 17 '22 11:10 NobodyXu

Any estimate of when this will be available on stable? Using the sparse registry feature is the only available solution for the following issue in Artifactory https://www.jfrog.com/jira/browse/RTFACT-27248 which is currently a blocker for us.

jasal82 avatar Dec 01 '22 12:12 jasal82

@jasal82 for corporate proxies, use net.git-fetch-with-cli = true.

kornelski avatar Dec 01 '22 14:12 kornelski

@kornelski We're already using that option in the clients. However, for production use we must not use the upstream repositories but Artifactory remote caches instead. Now the problem is that Artifactory cannot properly update its remote caches from the upstream repo when a new previously uncached package is requested. This is definitely a bug in Artifactory but they refuse to fix it and the only answer we got from them is that we should use sparse registries instead which seems to work.

jasal82 avatar Dec 01 '22 19:12 jasal82

I'd also be interested in when this might hit stable. We also have CI difficulties/runarounds in docker because Cargo deciding that it needs to clone the index from scratch takes a long time for whatever reason, and I suspect we're not alone in that.

For the curious our docker solution is cargo install lazy-static which forces cargo to get an index before failing out because there's no binary target.

ahicks92 avatar Dec 01 '22 21:12 ahicks92

For what it's worth, rust-analyzer also tends to get blamed for this ("but after I closed Code, I ran cargo clean and cargo check and they were fast, it's RA that's slow!") because it calls cargo metadata when loading the project.

lnicola avatar Dec 02 '22 05:12 lnicola

It's bad even on my gigabit internet at this point if I'm being honest. But since that's once per install that doesn't seem worth complaining about, or something like that. I'll be honest though: I dread that progress bar nowadays.

ahicks92 avatar Dec 02 '22 14:12 ahicks92

A proposal to stabilize this has been posted at #11224.

ehuss avatar Jan 04 '23 18:01 ehuss

Sparse registries are now stabilized and will be available in Rust 1.68, which will be released on 2023-03-09. I'm going to close this tracking issue as I think it has served its purpose. There is still some future work to be continued, such as better support for third-party registries. Those issues can be tracked in the https://github.com/rust-lang/cargo/labels/A-sparse-registry label.

ehuss avatar Jan 30 '23 14:01 ehuss

I'm super glad this got merged, but I am having trouble finding the necessary config steps to enable the new behaviour. Most of the documentation, like the first post in this issue, still mentions it as an unstable feature to be enabled with -Z. Adding [registries.crates-io] protocol = sparse to my Cargo.toml doesn't work; I suppose it needs to be added to some cargo config file somewhere else. Could someone who got it running write some annotations or documentation in some easy-to-find places?

Christoph-AK avatar Mar 22 '23 16:03 Christoph-AK

Cargo.toml doesn't work

Agreed, this is a great feature! According to the release notes you're supposed to add the setting to .cargo/config.toml, not Cargo.toml. Either that, or set the environment variable CARGO_REGISTRIES_CRATES_IO_PROTOCOL=sparse. The release notes have some links to more information as well!
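A minimal sketch of both options, assuming a per-project `.cargo/config.toml` in the current directory (a user-wide config file under the Cargo home directory should work the same way). Note that the value has to be a quoted TOML string:

```shell
# Option 1: per-project config file.
mkdir -p .cargo
# Append only if the table is not already present, to avoid defining it twice.
if ! grep -qsF '[registries.crates-io]' .cargo/config.toml; then
    cat >> .cargo/config.toml <<'EOF'
[registries.crates-io]
protocol = "sparse"
EOF
fi

# Option 2: environment variable, handy for CI jobs.
export CARGO_REGISTRIES_CRATES_IO_PROTOCOL=sparse
```

Either option on its own should be enough; there is no need to set both.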

JockeTF avatar Mar 22 '23 17:03 JockeTF

you're supposed to add the setting to .cargo/config.toml, not Cargo.toml.

Thanks for this and those links, but for future outside readers like me: it looks like this became the default in cargo 1.70, which was released in June 2023 (https://blog.rust-lang.org/2023/06/01/Rust-1.70.0.html#sparse-by-default-for-cratesio), so a better approach now IMO is just to rustup update, which will use the fast index on all repos by default without any per-repo config.

jose-mut-lopez avatar Jun 19 '23 14:06 jose-mut-lopez