rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

crates.io: Remove dev-dependencies from the index

Open Turbo87 opened this issue 1 year ago • 13 comments

Rendered

Turbo87 avatar Jul 31 '24 12:07 Turbo87

it looks like there are no major concerns brought up so far and the point about the large number of deltas is already addressed in the RFC, so let's get this process started:

@rfcbot fcp merge

Turbo87 avatar Aug 16 '24 08:08 Turbo87

Team member @Turbo87 has proposed to merge this. The next step is review by the rest of the tagged team members:

  • [ ] @JohnTitor
  • [x] @LawnGnome
  • [x] @Rustin170506
  • [x] @Turbo87
  • [x] @carols10cents
  • [x] @eth3lbert
  • [x] @jtgeibel
  • [x] @mdtro

Concerns:

  • features using dev-dependencies (https://github.com/rust-lang/rfcs/pull/3674#issuecomment--1978044408)

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot avatar Aug 16 '24 08:08 rfcbot

I wonder if the issues with the excessive deltas/commits could also be reduced by starting the process shortly after having done a index squash, and by doing another index squash shortly after completion of the process.

I am currently (ab)using the presence of dev-dependencies in the index to construct benchmark cases for cargo's new resolver. This is absolutely not a blocker, because the data is available from other places like downloading and extracting the .crate files (or possibly the crates.io database dump depending on implementation details). Alternatively, as a stopgap, I could pin my benchmarking code to the last version of the index before this data was removed.

Eh2406 avatar Aug 19 '24 14:08 Eh2406

I wonder if the issues with the excessive deltas/commits could also be reduced by starting the process shortly after having done a index squash, and by doing another index squash shortly after completion of the process.

yeah, I thought I had put that in the RFC text, but apparently I forgot. my plan would be to keep this running for a couple of weeks/months and when we do a full sync we will likely couple it with an index squash then.

Turbo87 avatar Aug 19 '24 14:08 Turbo87

:bell: This is now entering its final comment period, as per the review above. :bell:

rfcbot avatar Aug 23 '24 14:08 rfcbot

before I forget: we discussed this RFC in the crates.io team meeting last week and once accepted and implemented we are planning to publish an Inside Rust blog post with a cut-off date before we enable this behavior. this should give people that don't follow the RFCs a bigger chance of getting notified of the behavior change and have time to adjust if they are actually using the dev-dependencies in the index for anything.

Turbo87 avatar Aug 27 '24 08:08 Turbo87

@rfcbot concern features using dev-dependencies

On Zulip @shepmaster notified us that cargo and crates.io currently allow publishing crates with features that depend on dev-dependencies. While these features are not usable from the outside, they can be useful for development purposes. One example of this is in the popular syn crate.

While these "private" features are not problematic in isolation, cargo currently throws up a parsing error when features exist that depend on a missing dependency declaration in the index:

failed to parse "3/s/syn" registry package: feature test includes syn-test-suite/all-features, but syn-test-suite is not a dependency

This makes removing the dev-dependencies more complicated, since we can't do it unconditionally anymore and have to check the features declarations first.

I currently see three options:

  1. we cancel this effort and keep the dev-dependencies in the index.
  2. we parse the features and keep only those dev-dependencies that are used in any of the features.
  3. we (recursively?) remove any features that depend on dev-dependencies.

I guess the latter two options are slightly more risky as we have to ensure that our logic matches that of cargo.

It ultimately becomes a question of whether the risk-benefit-ratio is high enough. I wrote this RFC under the assumption that cargo does not use the dev-dependencies in the index in any sort of form, since that was the initial response from the cargo team. Now that we discovered that cargo does indeed need the dev-dependencies in some cases I'm no longer sure whether the risk and additional complexity is worth the benefit anymore.

@rust-lang/crates-io @rust-lang/cargo I'm open to thoughts on how to proceed from here. I'm slightly leaning towards one of these three options, but I'm curious what others think :)

Turbo87 avatar Aug 29 '24 07:08 Turbo87

@rust-lang/crates-io @rust-lang/cargo I'm open to thoughts on how to proceed from here. I'm slightly leaning towards one of these three options, but I'm curious what others think :)

I'm inclined towards option 1 (don't make the change), in this case — option 2 (keep only used dev-dependencies) results in less of a win, and requires additional knowledge on the part of crates.io on how Cargo manifests are processed, and option 3 (remove features that depend on dev-dependencies) scares me a little. (At best, we'd have to understand how tooling is using the features metadata that's currently in the index.)

The lesson is never try.

LawnGnome avatar Aug 29 '24 18:08 LawnGnome

I'm interested in using the devs dependency information for a tool that I'm currently planning which will run the test suite for all your dependencies recurisvely (as this is something cargo doesn't support currently).

VorpalBlade avatar Sep 01 '24 06:09 VorpalBlade

Another concern: does lib.rs need this info @kornelski

VorpalBlade avatar Sep 01 '24 06:09 VorpalBlade

I'm interested in using the devs dependency information for a tool that I'm currently planning which will run the test suite for all your dependencies recurisvely

I'd suggest using the Cargo.toml files of the dependencies for that instead.

Another concern: does lib.rs need this info

lib.rs ingests the daily database dump and parses the Cargo.toml files itself, so I don't think they need this in the index either.

Turbo87 avatar Sep 01 '24 08:09 Turbo87

regarding the concern raised in https://github.com/rust-lang/rfcs/pull/3674#issuecomment-2316922888: in the crates.io team meeting last week we decided to delay our decision on how to proceed a little bit. @shepmaster is currently working on some feature in margo that might help us validate how viable the options are. once we have more information we will be better prepared to take a decision on this. until that time this RFC will stay open.

Turbo87 avatar Sep 01 '24 08:09 Turbo87

I'm currently using dev-dependencies from the index for reverse deps statistics on lib.rs, but I can live without this info.

kornelski avatar Sep 01 '24 16:09 kornelski

We talked about this RFC in the crates.io team meeting today and concluded that the marginal space savings from removing the dev-dependencies are not worth the risk of unintentional breakage of cargo or other third-party tools.

Turbo87 avatar Feb 14 '25 15:02 Turbo87