cargo icon indicating copy to clipboard operation
cargo copied to clipboard

Nested workspaces

Open nrc opened this issue 8 years ago • 35 comments

Allow multiple layers of workspaces, not just one. This would be useful for tools in the Rust repo, and I think more generally.

nrc avatar Feb 15 '18 19:02 nrc

A lightweight way to help here might be to just ignore nested virtual manifests. If we want to be strict, we could do this only if the outer manifest includes all the workspace members in the inner manifest. This might admit some errors where there is an ignored Cargo.toml somewhere in a tree of workspaces, but I don't think that is much of a hazard, especially if the outer manifest explicitly includes the inner members.

One question is forwards compatibility, if one day we want to allow true nested workspaces, then I'm not sure how well that co-exists with this simpler proposal. I think technically it would be OK, but it might be hard to understand for users.

cc @rust-lang/cargo - thoughts? If this sounds OK, then I'd quite like to implement so that we can make Rustfmt and maybe RLS use workspaces (and even Cargo!).

nrc avatar Feb 27 '18 08:02 nrc

This is already supported with the help of exclude key I think?

λ pwd
/home/matklad/trash/foo

~/trash/foo master*
λ tree .
.
├── bar
│   ├── Cargo.toml
│   └── src
│       └── lib.rs
├── Cargo.lock
├── Cargo.toml
└── src
    └── lib.rs

3 directories, 5 files

~/trash/foo master*
λ cat Cargo.toml
[package]
name = "foo"
version = "0.1.0"
authors = ["Aleksey Kladov <[email protected]>"]

[workspace]
exclude = [ "./bar" ]

[dependencies]
bar = { path = "./bar" }

~/trash/foo master*
λ cat bar/Cargo.toml
[package]
name = "bar"
version = "0.1.0"
authors = ["Aleksey Kladov <[email protected]>"]

[workspace]

[dependencies]

~/trash/foo master*
λ cargo build
   Compiling bar v0.1.0 (file:///home/matklad/trash/foo/bar)
   Compiling foo v0.1.0 (file:///home/matklad/trash/foo)
    Finished dev [unoptimized + debuginfo] target(s) in 0.55 secs

matklad avatar Feb 27 '18 08:02 matklad

I don't think the exclude trick quite works for the case I have in mind. I have:

.
├── Cargo.toml
├── bar
│   ├── Cargo.toml
│   └── baz
│       ├── Cargo.toml
│       └── src
│           └── lib.rs
└── foo
    ├── Cargo.toml
    └── src
        └── lib.rs

Where bar/Cargo.toml is

[workspace]
members = [
    "baz",
]

and ./Cargo.toml is

workspace]
members = [
    "foo",
    "bar/baz",
]

exclude = [ "./bar" ]

When I cargo build in ., I get:

error: package `/Users/nick/hello-ws/bar/baz/Cargo.toml` is a member of the wrong workspace
expected: /Users/nick/hello-ws/Cargo.toml
actual:   /Users/nick/hello-ws/bar/Cargo.toml

And I don't think I can fix that subject to the constraint that baz may be a Git submodule and should build in its own context too.

nrc avatar Feb 28 '18 00:02 nrc

Hm, that looks interesting!

So basically we have . workspace with ., foo and bar/baz and bar workspace with bar and bar/baz. That is, we actually have intersecting workspaces, and not just nested ones!

Why does bar/baz needs to be a member of the outer workspace though? Can it be a regular path dependency?

matklad avatar Feb 28 '18 05:02 matklad

Why does bar/baz needs to be a member of the outer workspace though? Can it be a regular path dependency?

It's not a dependency of anything in the . workspace, it is just part of bar. To be precise there are multiple crates in bar where at least one is a binary that needs to be produced when building . and the others are deps of crates inside bar. So, I think at least the crates with binaries need to members of ..

The concrete example here is tools in the Rust repo when Rust is . and bar is rustfmt or the RLS.

nrc avatar Mar 01 '18 19:03 nrc

I discussed with @alexcrichton on irc, it turns out that this would actually be a breaking change: in my example above, you can't currently build the outer workspace but you can build bar and bar/baz. If we adopt the semantics of ignoring bar's manifest, then we would be changing the semantics of building bar.

An alternative would be to only ignore bar's manifest if we are building in the outer workspace, not in bar or bar/baz. However, although that would be backwards compatible, it would break a key invariant of workspaces that where you build a workspace does not affect which crates are members of the workspace.

So, I think the better solution is the breaking change, since I doubt anyone actually uses nested workspaces right now since they are broken when building at the top. However, since it is breaking, it probably needs an RFC, or at least a bit more advertising and discussion than a quick 'bug fix'.

nrc avatar Mar 01 '18 22:03 nrc

@nrc this was initially motivated by the rust-lang/rust repo, right?

I think that Cargo may have actually progressed far enough at this point to deal with git dependencies gracefully that it may work (previously we had to avoid git dependencies). I wonder if that'd be an option to avoid vendoring rustfmt's source code?

alexcrichton avatar Mar 01 '18 23:03 alexcrichton

@nrc I'm interested to understand what experience you had that motivated this. I understand the mechanics of what you're proposing, and I'd find it helpful to understand what situation the mechanics are targeting.

wycats avatar Mar 01 '18 23:03 wycats

@alexcrichton how would we run rustfmt tests if we used it as a dep?

@wycats the motivation is where you want to pull in a workspace project as a Git submodule and a workspace member of another project. The specific example here is pulling Rustfmt into the Rust repo.

nrc avatar Mar 01 '18 23:03 nrc

@nrc bah excellent point!

alexcrichton avatar Mar 02 '18 19:03 alexcrichton

wycats the motivation is where you want to pull in a workspace project as a Git submodule and a workspace member of another project. The specific example here is pulling Rustfmt into the Rust repo.

I came here to mention this exact use case. At work I've got one project which would like to use a crate from another project. I was initially making project B a git submodule of project A and then using a path dependency to include the crate, but I encountered @alexcrichton's error message.

error: package `C:\$ProjectA\$ProjectB\some-crate\Cargo.toml` is a member of the wrong workspace
expected: C:\$ProjectA\Cargo.toml
actual:   C:\$ProjectA\$ProjectB\Cargo.toml

Michael-F-Bryan avatar Mar 29 '18 04:03 Michael-F-Bryan

@nrc What if, in your example, change ./Cargo.toml to simply

[workspace]
members = ["foo", "bar"]

So, that one of the workspace members is a workspace itself. This is now actually nested workspaces, not intersecting like before. So, every member of the inner workspace should be considered as a member of the outer one.

where build a workspace does not affect which crates are members of the workspace

This invariant should still hold, just some crates (like baz) would be members of several workspaces (in which case the top-level one should be used I think). Nothing is ignored, so everything that worked before would still work, there would be no breaking changes, I think.

Currently this produces this error:

error: multiple workspace roots found in the same workspace:
  /home/user/hello-ws/bar
  /home/user/hello-ws

kuviman avatar Sep 21 '18 23:09 kuviman

Any updates on this? I like the solution proposed by @kuviman since it would reduce redundancy in nested virtual manifests.

Something I haven't seen discussed yet is how nested workspaces should behave wrt implicit members in non-virtual manifests. How would those be handled?

jrobsonchase avatar Jan 09 '19 15:01 jrobsonchase

Bumping for visibility. I would love to see this resolved. Nested workspaces seems like a common use case.

PrismaPhonic avatar Sep 12 '19 02:09 PrismaPhonic

I created a separate branch for the inner repository named "as_submodule". The inner repositories' "as_submodule" branch is identical to master branch, minus the root Cargo.toml workspace. So when somebody wants to include the project as a submodule they do git submodule add -b as_submodule. The master and development branches all act as a workspace which makes testing and stuff easier.

I don't like doing it this way, so the README points to this issue for people who judge :)

TomzBench avatar Dec 08 '19 11:12 TomzBench

This is my first "tru" involvement in the Rust development ecosystem (I'm not really sure how to contribute to Rust and am not very knowledgeable on lexers, parsers and whatnot). I would love to have this resolved. I am currently working on an operating system that is structured like so:

.
├── Cargo.lock
├── Cargo.toml
├── disk.img
├── LICENSE
├── linux-0.2.img
├── pi.vfd
├── README.md
├── src
│   ├── acpi
│   │   └── core
│   │       └── opcodes.rs
│   ├── bits.rs
│   ├── drivers
│   │   ├── bus
│   │   │   ├── mod.rs
│   │   │   └── usb
│   │   │       ├── ehci
│   │   │       │   └── mod.rs
│   │   │       ├── mod.rs
│   │   │       ├── ohci
│   │   │       │   └── mod.rs
│   │   │       ├── uhci
│   │   │       │   └── mod.rs
│   │   │       └── xhci
│   │   │           └── mod.rs
│   │   ├── fs
│   │   │   ├── ext2.rs
│   │   │   └── mod.rs
│   │   ├── hid
│   │   │   ├── keyboard
│   │   │   │   └── mod.rs
│   │   │   ├── mod.rs
│   │   │   └── mouse
│   │   │       └── mod.rs
│   │   ├── iof
│   │   │   ├── io.rs
│   │   │   └── mod.rs
│   │   ├── mod.rs
│   │   ├── net
│   │   │   └── mod.rs
│   │   ├── sound
│   │   │   ├── hda.rs
│   │   │   ├── mod.rs
│   │   │   └── pcspeaker.rs
│   │   ├── storage
│   │   │   ├── ahci
│   │   │   │   └── internal.rs
│   │   │   ├── ahci.rs
│   │   │   ├── ata
│   │   │   │   ├── dco.rs
│   │   │   │   ├── identification.rs
│   │   │   │   ├── security.rs
│   │   │   │   └── smart.rs
│   │   │   ├── ata.rs
│   │   │   ├── gpt.rs
│   │   │   └── mod.rs
│   │   ├── video
│   │   │   └── mod.rs
│   │   └── virtio
│   │       └── mod.rs
│   ├── exec
│   │   └── elf
│   │       └── types.rs
│   ├── gdt.rs
│   ├── interrupts.rs
│   ├── lib.rs
│   ├── main.rs
│   ├── memory.rs
│   ├── pci.rs
│   ├── scheduler.rs
│   ├── smbios.rs
│   ├── tasking.rs
│   ├── ui.rs
│   └── vga.rs
└── x86_64-kernel-none.json

I'd like to divide this up into workspaces: the main kernel tree as the "top-level" workspace, then nested workspaces under that (i.e. "drivers" is a workspace, "acpi" is another, ...). Currently (unless this issue has been resolved) I am unable to make this layout without creating separate repositories, which makes it harder to manage the tree as a hole as I am the only maintainer right now.

ethindp avatar Jan 19 '20 22:01 ethindp

I got it by this issue too. I have a crate that have multiples binaries, and multiple libraries, so I'm using a workspace. I had to vendor another library that itself uses a workspace. In order to do so, I had to modify it to move its workspace from its Cargo.toml to my top-level Cargo.toml. More details on URLO.

robinmoussu avatar Jun 25 '20 17:06 robinmoussu

my-cargo-workspace
│
├── Cargo.toml
├── backend
│   ├── api-server
│   │   ├── Cargo.toml
│   │   └── src
│   │       └── main.rs
│   ├── push-server
│   │   ├── Cargo.toml
│   │   └── src
│   │       └── main.rs
│   ├── websocket-server
│       ├── Cargo.toml
│       └── src
│           └── main.rs
│
├── frontend



how to run?

cheolgyu avatar Oct 17 '20 19:10 cheolgyu

In https://github.com/rust-lang/cargo/issues/5042#issuecomment-423695349, @kuviman wrote:

@nrc What if, in your example, change ./Cargo.toml to simply

[workspace]
members = ["foo", "bar"]

So, that one of the workspace members is a workspace itself. This is now actually nested workspaces, not intersecting like before. So, every member of the inner workspace should be considered as a member of the outer one.

I guess one problem with this is that, in order for Cargo to identify that the child workspace is nested within another when invoked from within its subtree, either the child's Cargo.toml must point at its parent workspace or Cargo must always traverse to the filesystem root searching for a parent workspace. The former would not satisfy the git submodule use case, and the latter would be a breaking change.

One solution could be to add a boolean configuration field, e.g. nestable, to the [workspace] section of Cargo.toml (defaulting to false so as not to break existing behaviour, but perhaps in future defaulting to true?). The child could then indicate nestable = true to have Cargo search for a parent workspace, or alternatively specify the location of the parent with a workspace.parent field.

The parent workspace would, as @kuviman suggested, include the child as one of its members. It would be an error if such a specified child is not nestable, as it would be for a nestable child to have a parent workspace in which the child is not listed as a member.

Would a PR implementing this be acceptable?

eggyal avatar Oct 19 '21 07:10 eggyal

Using a "nestable" attribute doesn’t work either, because submodules can no-longer be unmodified. And its totally break encapsulation. How do you decide if you want your project to be nestable or not?

I think that recursing by default, searching for a nested = "child-name" in the parent [workspace] section would be better. It cannot change existing behavior, because currently no parent have a nested attribute.

robinmoussu avatar Oct 19 '21 10:10 robinmoussu

I believe you always want your project to be nestable, the attribute is only needed since this would be a breaking change

Although I am not sure if this breaking change actually affects anyone. Because it only breaks if the suggested workspace structure was used, but why would anyone use it if it does not work currently. Correct me if I'm wrong

kuviman avatar Oct 19 '21 10:10 kuviman

I believe you always want your project to be nestable, the attribute is only needed since this would be a breaking change

Agreed. Perhaps one could even configure nestable as some sort of out-of-band override, in order to avoid even having to make that change to the submodule.

Although I am not sure if this breaking change actually affects anyone. Because it only breaks if the suggested workspace structure was used, but why would anyone use it if it does not work currently. Correct me if I'm wrong

Perhaps you're right, and the break is sufficiently rare that we could immediately default workspaces to being "nestable". But there are certainly projects that currently have workspaces located in the filesystem beneath other workspaces without them being unified: for example, rust-lang/rust contains library/backtrace, library/stdarch and src/tools/rust-analyzer amongst others—and it isn't the case that the semantics should change, so at very least there needs to be some way of disabling any new nesting behaviour.

eggyal avatar Oct 19 '21 12:10 eggyal

Why modifying the parent project (as I proposed above) and not the child (as you are discussing) doesn't work?

robinmoussu avatar Oct 19 '21 12:10 robinmoussu

But there are certainly projects that currently have workspaces located in the filesystem beneath other workspaces without them being unified

Well, those workspaces do not list nested workspaces as members, right? So would not affect them anyway

Why modifying the parent project (as I proposed above) and not the child (as you are discussing) doesn't work?

I mean, specifying a "nestable" workspace as a member of a parent workspace already serves the purpose, why is any new attribute even needed?

kuviman avatar Oct 19 '21 12:10 kuviman

You are totally rihgt @kuviman. I don’t think that any extra markers are needed, and recursing can be done without risking to break anything.

robinmoussu avatar Oct 19 '21 12:10 robinmoussu

So if a potential parent workspace is found in the filesystem ancestry, but the child workspace neither appears amongst that potential parent workspace's members nor exclude fields, it is assumed to be excluded? This behaviour would differ from that of a mere package in the same position, about which Cargo currently raises an error.

At very least, I think this should generate a warning that can be suppressed by adding the child to the parent's exclude list.

It also means that Cargo will always search for potential parent workspaces anywhere in the filesystem hierarchy, all the way to the root. Is this acceptable?

eggyal avatar Oct 19 '21 13:10 eggyal

It also means that Cargo will always search for potential parent workspaces anywhere in the filesystem hierarchy

Isn't this already done by Cargo for regular packages to find the workspace and show the error you are talking about?

This behaviour would differ from that of a mere package in the same position

This is indeed a valid question, and it should probably error like the regular package so it behaves similarly, but that is now a real breaking change, so a warning is probably the only solution here.

As a side note though, what is the rationale behind that error when a package "assumes" it is in the workspace when it is not? I have faced that issue when generating code and it's kinda annoying tbh, my personal preference would be for that package to not assume anything

kuviman avatar Oct 19 '21 13:10 kuviman

Isn't this already done by Cargo for regular packages to find the workspace and show the error you are talking about?

Kind of. The search currently terminates once a workspace is found. Now the search will always continue to the root.

This is indeed a valid question, and it should probably error like the regular package so it behaves similarly, but that is now a real breaking change, so a warning is probably the only solution here.

Or there needs to be some way to configure whether the child is nestable :)

As a side note though, what is the rationale behind that error when a package "assumes" it is in the workspace when it is not? I have faced that issue when generating code and it's kinda annoying tbh, my personal preference would be for that package to not assume anything

Good question. I wondered the same myself, when considering this. Perhaps it's because it's quite common to add crates beneath a workspace, yet easy to forget to add them into the workspace itself? I saw an issue somewhere that proposed modifying cargo new so that it would default to adding such crates into the presumed workspace—am on mobile right now so not easy to dig it out but I'll link here if I find it.

eggyal avatar Oct 19 '21 13:10 eggyal

Or there needs to be some way to configure whether the child is nestable :)

I guess there's another option: if it is ok to relax the error of a package assumption to a warning (which is something I would personally be very happy with), that would make nested workspaces work exactly same as regular packages

kuviman avatar Oct 19 '21 15:10 kuviman

As a side note though, what is the rationale behind that error when a package "assumes" it is in the workspace when it is not?

I've been looking into this.

RFC 1525 introduced workspaces. Under validating workspaces, it explains that both workspace root and workspace member must be able to find each other in order to maintain a consistent view of the workspace irrespective of the location from which Cargo is invoked (emphasis added):

If, however, this restriction were not in place then the set of crates in a workspace may differ depending on which crate it was viewed from. For example if workspace root A includes B then it will think B is in A's workspace. If, however, B does not point back to A, then B would not think that A was in its workspace. This would in turn cause the set of crates in each workspace to be different, further causing Cargo.lock to get out of sync if it were allowed. By ensuring that all crates have edges to each other in a workspace Cargo can prevent this situation and guarantee robust builds no matter where they're executed in the workspace.

To alleviate misconfiguration Cargo will emit an error if the two properties above do not hold for any crate attempting to be part of a workspace. For example, if the package.workspace key is specified, but the crate is not a workspace root or doesn't point back to the original crate an error is emitted.

The RFC then immediately continues by defining implicit relations:

The package.workspace can be omitted if it would only contain ../ (or some repetition of it). That is, if the root of a workspace is hierarchically the first Cargo.toml with [workspace] above a crate in the filesystem, then that crate can omit the package.workspace key.

It concludes (emphasis added):

Note that these implicit relations will be subject to the same validations mentioned above for all of the explicit configuration as well.

So I think the current behaviour is required by the RFC. Discussion here appears to propose only to imply that a membership relation exists if the root (explicitly or implicitly) includes the member? This does feel a bit brittle to me, and one could easily find that intended workspace members are instead freestanding packages, with Cargo.lock files and target build directories littered throughout a workspace—the solution to which (explicitly or implicitly adding the member to the root workspace) could be non-obvious. I think it's a significant change anyway, and probably not one that can be made without deeper thought.

eggyal avatar Oct 20 '21 10:10 eggyal