rfcs
rfcs copied to clipboard
RFC: Nested Cargo packages
This is my first Rust RFC. The idea was previously discussed on IRLO (Private nested Cargo packages), and the feedback was generally positive except for confusion about exactly what was being changed, which I believe I have now precisely defined in this RFC draft.
rust-lang/cargo#2203
Thanks @kpreid for taking the time to write this up!
Progress on the RFC has been stalled mainly because I have been pondering how to specify the behavior given the premise that inclusion should be based on the presence of dependencies, not just directory nesting (particularly to permit nested “utility” packages which, in the workspace they are developed, are not directory children of the published package, because they are shared between multiple published packages).
I'm currently thinking that:
- Each
dependenciesentry must explicitly request nesting of the named package - If the nested package is within the parent's directory, keep it there. If not, put it in a designated location, perhaps
./.cargo-package/packages/(the./.cargo-package/parent directory being reserved for future extensions in case something like this comes up again). - Sub-packages declaring
package.publish = "nested"is still required, to prevent mistakes. But there might be e.g. a flag tocargo publishto override that requirement. - Sketch out some means by which "public binary packages" could be achieved, to make sure that we are not making it difficult to add later. Since such packages will often depend on the parent package, not vice versa, they will need explicit manifest information pointing at them, unlike the current version.
- Perhaps something like
package.include-packages = ["..."], and if you specify it then it has to be complete (i.e. it is an error if a dependency entry asks for nesting butinclude-packagesdoes not mention that package.
- Perhaps something like
I think that covers most of the necessary ground, but I still need to write the corresponding RFC text.
Huh, just found #2224. @kpreid I'd recommend going through that old RFC discussion to see what is relevant or not.
This would dramatically improve the current state of publishing and managing proc macro crates, and just would be a huge QOL improvement for large-workspace projects, of which there are many.
Any idea how this feature would impact downstream consumers of Rust crates, like Linux distributions that build packages for them (i.e. Fedora Linux and Debian)?
For example, the fact that no crates published on crates.io can have any path type dependencies is currently a "hard assumption" in our tooling for building packages for crates in Fedora, and introducing them would very likely break our tools and processes in at least some "interesting" ways - like when doing dynamic dependency resolution for a package build, path-type dependencies are not supported at all for Rust crates, since they (currently) don't exist in crates published on crates.io. (Path-type dependencies are supported when building a cargo workspace, but those don't come from crates.io and are handled differently).
Any idea how this feature would impact downstream consumers of Rust crates, like Linux distributions that build packages for them (i.e. Fedora Linux and Debian)?
Without knowing the intricacies of downstream users, its hard to predict.
The one case that might be weird is if there are anti-vendoring rules. One use case for nested packages is allowing one user-visible package to be made up of multiple compilation units, so you should consider the code as if it was under src/. However, depending on how general we make this, it might allow some degree of vendoring of code from outside of the project.
For example, the fact that no crates published on crates.io can have any path type dependencies is currently a "hard assumption" in our tooling for building packages for crates in Fedora, and introducing them would very likely break our tools and processes in at least some "interesting" ways - like when doing dynamic dependency resolution for a package build, path-type dependencies are not supported at all for Rust crates, since they (currently) don't exist in crates published on crates.io. (Path-type dependencies are supported when building a cargo workspace, but those don't come from crates.io and are handled differently).
This sounds like its more of a "tools need updating when nightly implementation is available" rather than "this affects the design or feasibility". Is that right?
Any idea how this feature would impact downstream consumers of Rust crates, like Linux distributions that build packages for them (i.e. Fedora Linux and Debian)?
Without knowing the intricacies of downstream users, its hard to predict.
The one case that might be weird is if there are anti-vendoring rules. One use case for nested packages is allowing one user-visible package to be made up of multiple compilation units, so you should consider the code as if it was under
src/. However, depending on how general we make this, it might allow some degree of vendoring of code from outside of the project.
Yeah, that's one use case that we'd not want :) We have strong rules against vendoring sources, and if this "nested crates" approach is used to vendor other third-party crates (for no good reason, like them being an incompatible fork), that's something we'd want to avoid ...
For example, the fact that no crates published on crates.io can have any path type dependencies is currently a "hard assumption" in our tooling for building packages for crates in Fedora, and introducing them would very likely break our tools and processes in at least some "interesting" ways - like when doing dynamic dependency resolution for a package build, path-type dependencies are not supported at all for Rust crates, since they (currently) don't exist in crates published on crates.io. (Path-type dependencies are supported when building a cargo workspace, but those don't come from crates.io and are handled differently).
This sounds like its more of a "tools need updating when nightly implementation is available" rather than "this affects the design or feasibility". Is that right?
Yup - it would be great to have the tooling support in place when this feature hits the "stable" train. Otherwise any crates that start to use this feature would be impossible to update until the tooling support is in place (similar to what happened when the dep:foo and foo?/bar syntax in feature dependencies was introduced ...).
Yeah, that's one use case that we'd not want :) We have strong rules against vendoring sources, and if this "nested crates" approach is used to vendor other third-party crates (for no good reason, like them being an incompatible fork), that's something we'd want to avoid ...
I assume there isn't a reason we need to be taking on this distribution policy (this isn't a rule we've agreed to, we already support other forms of vendoring, and users can always copy/paste code) but that this is more of a note that you'll be needing to patch these packages?
Yup - it would be great to have the tooling support in place when this feature hits the "stable" train. Otherwise any crates that start to use this feature would be impossible to update until the tooling support is in place (similar to what happened when the dep:foo and foo?/bar syntax in feature dependencies was introduced ...).
If you watch the tracking issue, when created, and the stabilization FCP ends with a disposition to merge, that gives you 6-12 weeks warning before people will start using it widely. Even then, people are unlikely to use it and release on Day 1.
Is that sufficient?
Yeah, that's one use case that we'd not want :) We have strong rules against vendoring sources, and if this "nested crates" approach is used to vendor other third-party crates (for no good reason, like them being an incompatible fork), that's something we'd want to avoid ...
I assume there isn't a reason we need to be taking on this distribution policy (this isn't a rule we've agreed to, we already support other forms of vendoring, and users can always copy/paste code) but that this is more of a note that you'll be needing to patch these packages?
I'll need to look into what the actual implementation looks like. If the "private" / "inner" / "nested" crates are just an implementation detail, and they are not exposed to other crates in any way, I don't think that would even count as them being vendored / bundled, since they are effectively "one project".
Yup - it would be great to have the tooling support in place when this feature hits the "stable" train. Otherwise any crates that start to use this feature would be impossible to update until the tooling support is in place (similar to what happened when the dep:foo and foo?/bar syntax in feature dependencies was introduced ...).
If you watch the tracking issue, when created, and the stabilization FCP ends with a disposition to merge, that gives you 6-12 weeks warning before people will start using it widely. Even then, people are unlikely to use it and release on Day 1.
Is that sufficient?
Yes. I've subscribed to all relevant tickets I could find, and that timeframe should give us enough time to make necessary tooling changes (if any).
I'll need to look into what the actual implementation looks like. If the "private" / "inner" / "nested" crates are just an implementation detail, and they are not exposed to other crates in any way
Yes, that's the idea. From the outside view of the package, this should be no different from the fact that a package may compile multiple crates within itself (build script, library dep of binary); it's just generally recursive instead of restricted to specific cases.
(P.S. In general: Sorry I haven't gotten around to updating the RFC for months. I got a little writer's-block on it because of the dismaying prospect of ripping out my "elegant" file-inclusion-based system in favor of something needing more configuration and implementation rules, but I do intend to do that.)
I have pushed major updates to the RFC text.
- Nested packages (packages to be published in another package) are now allowed to not be sub-packages (packages which happen to be located inside another package directory). This solves some problems for workspace root packages and allows the possibility of nested packages being duplicated into more than one published package (making a Rust form of “header-only library”, in a sense).
- Prior art has been added.
- More drawbacks and considerations have been discussed.
I have not yet re-written the “Reference-level explanation” as requested.
I have now rewritten “Reference-level explanation” and reorganized the discussion of alternatives, which I believe completes what I can currently do in response to reviews given so far. Two unresolved review conversations currently exist:
food for thought: if every crate that has a proc-macro sub crate were to adopt this, that would be n * number of dependents less crates that would need to sit collectively in people's dependencies, which might actually be a non-zero percentage of all crates.io traffic depending on how this is implemented
Trying to raise visibility of this on the respective teams so we can see what more problems we can uncover.
One drawback that I think would be good to include is the risk around having publicly exposed shared types be incompatible if a dependency is built multiple times.
For example: If I have three packages, "a", "b", and "shared". There is a dependency from a→shared and b→shared, and I publish a and b with shared being nested in both, and a and b expose types from shared, then they cannot be used together. That might not be evident even after publishing, since publishing won't validate that a and b work together. Also, the error message from rustc could be fairly difficult to understand.
We discussed this further in today's cargo team meeting. The above comments opened represent some of the discussion.
Several areas of concern were focused on:
The original appeal of this solution is that is relatively trivial in thought as path dependencies exist. The problem is when getting into the details, particularly dependency resolution / the Index. On top of that, crates.io would likely want to duplicate the Index Summary generation from the .crate file because they want to work with one source of truth, rather than multiple (and its unclear if we can flatten in just Cargo.toml).
This also adds more design choices to the user, making it harder for them to navigate the choice of what tools to use which would include
- workspaces
- packages
- build-targets / crates within a package
- open namespaces
- public / private packages
- nested packages
While this helps to streamline people's workflows by reducing publishing overhead, there isn't any inherent blockers from people solving this today. The closest is likely in communicating semver guarantees and there is interest in something like pub(namespace) as a way solving the problem. For the friction in publishing these packages, we should likely work to improve that larger problem. In several workflows discussed, it seemed likely that a workspace would have more than one non-nested package, meaning workspace workflow improvements are still necessary.
All of that said, the unease wasn't enough to block this proposal at this stage but we would be open to this being explored further to see how much these concerns are addressed or validated when this is put into practice.
As for next steps, we didn't fully resolve that. The RFC affects two teams with crates.io mostly being involved for the Index side of things and would generally be implemented after a go/no-go from the above exploration. We didn't get to the point of discussing whether we'd want to push this RFC forward, experiment with it, and possibly reject it after approval if the concerns were shown to be blocking, or if we'd want to authorize an experiment to be done as part of the writing of this RFC, much like cargo-script.
The big risk would be to where you put your time as this would likely be a big experiment with enough uncertainty around it that the odds of it being stabilized are iffy.