rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

RFC: Nested Cargo packages

Open kpreid opened this issue 2 years ago • 17 comments

Rendered

This is my first Rust RFC. The idea was previously discussed on IRLO (Private nested Cargo packages), and the feedback was generally positive except for confusion about exactly what was being changed, which I believe I have now precisely defined in this RFC draft.

rust-lang/cargo#2203

kpreid avatar Jul 01 '23 00:07 kpreid

Thanks @kpreid for taking the time to write this up!

epage avatar Jul 01 '23 01:07 epage

Progress on the RFC has been stalled mainly because I have been pondering how to specify the behavior given the premise that inclusion should be based on the presence of dependencies, not just directory nesting (particularly to permit nested “utility” packages which, in the workspace they are developed, are not directory children of the published package, because they are shared between multiple published packages).

I'm currently thinking that:

  • Each dependencies entry must explicitly request nesting of the named package
  • If the nested package is within the parent's directory, keep it there. If not, put it in a designated location, perhaps ./.cargo-package/packages/ (the ./.cargo-package/ parent directory being reserved for future extensions in case something like this comes up again).
  • Sub-packages declaring package.publish = "nested" is still required, to prevent mistakes. But there might be e.g. a flag to cargo publish to override that requirement.
  • Sketch out some means by which "public binary packages" could be achieved, to make sure that we are not making it difficult to add later. Since such packages will often depend on the parent package, not vice versa, they will need explicit manifest information pointing at them, unlike the current version.
    • Perhaps something like package.include-packages = ["..."], and if you specify it then it has to be complete (i.e. it is an error if a dependency entry asks for nesting but include-packages does not mention that package.

I think that covers most of the necessary ground, but I still need to write the corresponding RFC text.

kpreid avatar Jul 26 '23 03:07 kpreid

Huh, just found #2224. @kpreid I'd recommend going through that old RFC discussion to see what is relevant or not.

epage avatar Jul 28 '23 14:07 epage

This would dramatically improve the current state of publishing and managing proc macro crates, and just would be a huge QOL improvement for large-workspace projects, of which there are many.

sam0x17 avatar Aug 18 '23 19:08 sam0x17

Any idea how this feature would impact downstream consumers of Rust crates, like Linux distributions that build packages for them (i.e. Fedora Linux and Debian)?

For example, the fact that no crates published on crates.io can have any path type dependencies is currently a "hard assumption" in our tooling for building packages for crates in Fedora, and introducing them would very likely break our tools and processes in at least some "interesting" ways - like when doing dynamic dependency resolution for a package build, path-type dependencies are not supported at all for Rust crates, since they (currently) don't exist in crates published on crates.io. (Path-type dependencies are supported when building a cargo workspace, but those don't come from crates.io and are handled differently).

decathorpe avatar Nov 11 '23 21:11 decathorpe

Any idea how this feature would impact downstream consumers of Rust crates, like Linux distributions that build packages for them (i.e. Fedora Linux and Debian)?

Without knowing the intricacies of downstream users, its hard to predict.

The one case that might be weird is if there are anti-vendoring rules. One use case for nested packages is allowing one user-visible package to be made up of multiple compilation units, so you should consider the code as if it was under src/. However, depending on how general we make this, it might allow some degree of vendoring of code from outside of the project.

For example, the fact that no crates published on crates.io can have any path type dependencies is currently a "hard assumption" in our tooling for building packages for crates in Fedora, and introducing them would very likely break our tools and processes in at least some "interesting" ways - like when doing dynamic dependency resolution for a package build, path-type dependencies are not supported at all for Rust crates, since they (currently) don't exist in crates published on crates.io. (Path-type dependencies are supported when building a cargo workspace, but those don't come from crates.io and are handled differently).

This sounds like its more of a "tools need updating when nightly implementation is available" rather than "this affects the design or feasibility". Is that right?

epage avatar Nov 12 '23 02:11 epage

Any idea how this feature would impact downstream consumers of Rust crates, like Linux distributions that build packages for them (i.e. Fedora Linux and Debian)?

Without knowing the intricacies of downstream users, its hard to predict.

The one case that might be weird is if there are anti-vendoring rules. One use case for nested packages is allowing one user-visible package to be made up of multiple compilation units, so you should consider the code as if it was under src/. However, depending on how general we make this, it might allow some degree of vendoring of code from outside of the project.

Yeah, that's one use case that we'd not want :) We have strong rules against vendoring sources, and if this "nested crates" approach is used to vendor other third-party crates (for no good reason, like them being an incompatible fork), that's something we'd want to avoid ...

For example, the fact that no crates published on crates.io can have any path type dependencies is currently a "hard assumption" in our tooling for building packages for crates in Fedora, and introducing them would very likely break our tools and processes in at least some "interesting" ways - like when doing dynamic dependency resolution for a package build, path-type dependencies are not supported at all for Rust crates, since they (currently) don't exist in crates published on crates.io. (Path-type dependencies are supported when building a cargo workspace, but those don't come from crates.io and are handled differently).

This sounds like its more of a "tools need updating when nightly implementation is available" rather than "this affects the design or feasibility". Is that right?

Yup - it would be great to have the tooling support in place when this feature hits the "stable" train. Otherwise any crates that start to use this feature would be impossible to update until the tooling support is in place (similar to what happened when the dep:foo and foo?/bar syntax in feature dependencies was introduced ...).

decathorpe avatar Nov 12 '23 11:11 decathorpe

Yeah, that's one use case that we'd not want :) We have strong rules against vendoring sources, and if this "nested crates" approach is used to vendor other third-party crates (for no good reason, like them being an incompatible fork), that's something we'd want to avoid ...

I assume there isn't a reason we need to be taking on this distribution policy (this isn't a rule we've agreed to, we already support other forms of vendoring, and users can always copy/paste code) but that this is more of a note that you'll be needing to patch these packages?

Yup - it would be great to have the tooling support in place when this feature hits the "stable" train. Otherwise any crates that start to use this feature would be impossible to update until the tooling support is in place (similar to what happened when the dep:foo and foo?/bar syntax in feature dependencies was introduced ...).

If you watch the tracking issue, when created, and the stabilization FCP ends with a disposition to merge, that gives you 6-12 weeks warning before people will start using it widely. Even then, people are unlikely to use it and release on Day 1.

Is that sufficient?

epage avatar Nov 13 '23 15:11 epage

Yeah, that's one use case that we'd not want :) We have strong rules against vendoring sources, and if this "nested crates" approach is used to vendor other third-party crates (for no good reason, like them being an incompatible fork), that's something we'd want to avoid ...

I assume there isn't a reason we need to be taking on this distribution policy (this isn't a rule we've agreed to, we already support other forms of vendoring, and users can always copy/paste code) but that this is more of a note that you'll be needing to patch these packages?

I'll need to look into what the actual implementation looks like. If the "private" / "inner" / "nested" crates are just an implementation detail, and they are not exposed to other crates in any way, I don't think that would even count as them being vendored / bundled, since they are effectively "one project".

Yup - it would be great to have the tooling support in place when this feature hits the "stable" train. Otherwise any crates that start to use this feature would be impossible to update until the tooling support is in place (similar to what happened when the dep:foo and foo?/bar syntax in feature dependencies was introduced ...).

If you watch the tracking issue, when created, and the stabilization FCP ends with a disposition to merge, that gives you 6-12 weeks warning before people will start using it widely. Even then, people are unlikely to use it and release on Day 1.

Is that sufficient?

Yes. I've subscribed to all relevant tickets I could find, and that timeframe should give us enough time to make necessary tooling changes (if any).

decathorpe avatar Nov 13 '23 16:11 decathorpe

I'll need to look into what the actual implementation looks like. If the "private" / "inner" / "nested" crates are just an implementation detail, and they are not exposed to other crates in any way

Yes, that's the idea. From the outside view of the package, this should be no different from the fact that a package may compile multiple crates within itself (build script, library dep of binary); it's just generally recursive instead of restricted to specific cases.

(P.S. In general: Sorry I haven't gotten around to updating the RFC for months. I got a little writer's-block on it because of the dismaying prospect of ripping out my "elegant" file-inclusion-based system in favor of something needing more configuration and implementation rules, but I do intend to do that.)

kpreid avatar Nov 13 '23 18:11 kpreid

I have pushed major updates to the RFC text.

  • Nested packages (packages to be published in another package) are now allowed to not be sub-packages (packages which happen to be located inside another package directory). This solves some problems for workspace root packages and allows the possibility of nested packages being duplicated into more than one published package (making a Rust form of “header-only library”, in a sense).
  • Prior art has been added.
  • More drawbacks and considerations have been discussed.

I have not yet re-written the “Reference-level explanation” as requested.

kpreid avatar Feb 05 '24 20:02 kpreid

I have now rewritten “Reference-level explanation” and reorganized the discussion of alternatives, which I believe completes what I can currently do in response to reviews given so far. Two unresolved review conversations currently exist:

kpreid avatar Feb 05 '24 22:02 kpreid

food for thought: if every crate that has a proc-macro sub crate were to adopt this, that would be n * number of dependents less crates that would need to sit collectively in people's dependencies, which might actually be a non-zero percentage of all crates.io traffic depending on how this is implemented

sam0x17 avatar Feb 06 '24 15:02 sam0x17

Trying to raise visibility of this on the respective teams so we can see what more problems we can uncover.

epage avatar Mar 18 '24 17:03 epage

One drawback that I think would be good to include is the risk around having publicly exposed shared types be incompatible if a dependency is built multiple times.

For example: If I have three packages, "a", "b", and "shared". There is a dependency from a→shared and b→shared, and I publish a and b with shared being nested in both, and a and b expose types from shared, then they cannot be used together. That might not be evident even after publishing, since publishing won't validate that a and b work together. Also, the error message from rustc could be fairly difficult to understand.

ehuss avatar Apr 02 '24 15:04 ehuss

We discussed this further in today's cargo team meeting. The above comments opened represent some of the discussion.

Several areas of concern were focused on:

The original appeal of this solution is that is relatively trivial in thought as path dependencies exist. The problem is when getting into the details, particularly dependency resolution / the Index. On top of that, crates.io would likely want to duplicate the Index Summary generation from the .crate file because they want to work with one source of truth, rather than multiple (and its unclear if we can flatten in just Cargo.toml).

This also adds more design choices to the user, making it harder for them to navigate the choice of what tools to use which would include

  • workspaces
  • packages
  • build-targets / crates within a package
  • open namespaces
  • public / private packages
  • nested packages

While this helps to streamline people's workflows by reducing publishing overhead, there isn't any inherent blockers from people solving this today. The closest is likely in communicating semver guarantees and there is interest in something like pub(namespace) as a way solving the problem. For the friction in publishing these packages, we should likely work to improve that larger problem. In several workflows discussed, it seemed likely that a workspace would have more than one non-nested package, meaning workspace workflow improvements are still necessary.

All of that said, the unease wasn't enough to block this proposal at this stage but we would be open to this being explored further to see how much these concerns are addressed or validated when this is put into practice.

As for next steps, we didn't fully resolve that. The RFC affects two teams with crates.io mostly being involved for the Index side of things and would generally be implemented after a go/no-go from the above exploration. We didn't get to the point of discussing whether we'd want to push this RFC forward, experiment with it, and possibly reject it after approval if the concerns were shown to be blocking, or if we'd want to authorize an experiment to be done as part of the writing of this RFC, much like cargo-script.

The big risk would be to where you put your time as this would likely be a big experiment with enough uncertainty around it that the odds of it being stabilized are iffy.

epage avatar Apr 02 '24 20:04 epage