Proposal: Monorepo + workspace build for the `esp-idf-*` crates
Overview
This proposal suggests migrating the esp-idf-* crates' GIT repos (and a few others, see below) to:
- A monorepo setup
- A workspace re-organization of the crates
(This change was inspired from the experience I got with the monorepo-yet-micro-crates approach of edge-net which works quite nicely.)
What will NOT be changed by this proposal?
Crates' names and their consumption from crates.io.
From the POV of the end user, everything remains backwards compatible:
- Crates' names are not changed
- Crates are still "micro-crates", i.e.
esp-idf-sys,esp-idf-hal,esp-idf-svcare separate crates
But why?
The main reason is that the esp-idf-* crates are highly inter-dependent at it is just easier to develop them in a single repo, as part of a workspace setup.
Major pain points of the current micro-repo approach, examples below.
Introducing changes across > 1 crate
- A backwards-incompatible change to
esp-idf-halmasteroften implies a backwards-incompatible change toesp-idf-svcmaster:- To model this change, we first need to make the change in
esp-idf-hal. - Further, and with a separate commit (non-atomic GIt changes, not nice) we need to introduce the change in
esp-idf-svctoo. - Finally,
Cargo.tomlofesp-idf-svcneeds to be patched with[patch.crates-io]to use themaster(GIT) repo. And we need not to forget to remove this patch prior to publishing theesp-idf-svccrate
- To model this change, we first need to make the change in
Contrast this with a monorepo with a workspace setup:
- A single, atomic commit / PR is addressing all changes across
esp-idf-sys,esp-idf-hal,esp-idf-svc, and if necessary -embedded-svc(if it is part of the repo), andembuild - NO need to do
Cargo.tomlequilibristics, as - with a workspace setup - the deps between the crates are expressed using apathnotation syntax (whichcargo publishautomatically turns on references to the published crates during publishing). It just works!
Slow CI
CI is currently slow, because of the fact that the build of esp-idf-sys itself accounts to 80% of the total build time of any other downstream crate (hal, svc, the -template).
Because we are using separate GIT repos for each micro-crate, their CIs are in fact repeating the (slow) build of esp-idf-sys multiple times:
- Done once for the CI of
esp-idf-sysitself (I mean, multiple times, but once per ESP IDF version + target + MCU triple + type of build) - Ditto for
esp-idf-hal - Ditto for
esp-idf-svc - Ditto for
esp-idf-template
Contrast this with a workspace setup:
- There will be a single CI, which just builds the workspace. What it would mean:
- a) All the examples, and downstream crates from them
- b) A project generated by
esp-idf-template
... as a side effect, all downstream crates will be built. And with non-incremental build only the first time. Doing no_std and so on re-builds will be incremental.
Worse: Unreliable CI / testing changes
Because CI is so slow, and because we are duplicating what amounts to the same build pipeline across 3-4 crates, there is NO single CI which does all the builds against everything:
esp-idf-sysbuilds against different versions of ESP IDF compared to hal and svc- PIO build is only done by some crates' CI
- builds against ESP IDF master are done by the hal crate, but not others
- CMake builds are only done by the
esp-idf-templatecrate, against released versions of the other crates (i.e. we don't know if the code in upstream crates will break the CMake build until after we release them, which sucks)
No unified examples
Each crate has its own examples folder. On one hand, this is nice, because some examples need only - say - esp-idf-hal so it is somewhat nice to show that these examples can work without esp-idf-svc.
On the other hand, the hal examples cannot use the Rust log crate, as support for it is only in esp-idf-svc, so they use println!, which is not nice at all.
Further, the reality is, NO useful end-user application of the esp-idf-* crates can be implemented without depending on the svc crate, so hal-only examples are a solution looking for a problem really.
(
Digression:
We recognized this problem several months ago, and this is one of the reasons why the hal crate re-exports the sys crate under esp_idf_hal::sys and further - the svc crate re-exports both sys and hal.
This way, the user needs - in 99% of the cases - only one dependency - esp-idf-svc. Via this single dependency, the user can still call into hal and sys as these are re-exported.
Further, such a setup lifts from the shoulders of the user the problem that otherwise she has to think about what hal version matches its svc version.
For niche cases, user can still depend directly or only on sys and/or hal, or any combination of the crates thereof, but that's the exception rather than the rule, so we are not optimizing for it; but also not making it any harder.
In retrospective - we could've renamed esp-idf-svc to just esp-idf or we could've created a new aggregationr crate named esp-idf - if only esp-idf was not taken already by somebody in crates.io - and taken under the sub-optimal esp_idf name, which prohibits the usage of esp-idf (note that dash vs underscore) by somebody else either!
)
While we can move all examples to esp-idf-svc without introducing monorepo and workspace setup, if we move to a monorepo setup, it does make sense to move the examples to the workspace level in a common examples/ folder.
Monorepo without a workspace setup - is it worth it?
I'm not sure, but I'm inclined to say: NO.
Because a lot of the benefits actually come from the workspace setup, not from the monorepo per-se. The workspace setup requires monorepo of course. Examples where workspace would help, but "just monorepo" won't:
- No need to patch
Cargo.tomlfor backwards-incompatible changes. Without workspace setup, we still need to do this - Faster, unified, single CI build: without a workspace build we still need separate builds for each crate. If for nothing else, then for the examples (if all examples were in
esp-idf-svc, we can just buildesp-idf-svcand not even build the sys and hal; but we still need to buildesp-idf-template).
What crates should be part of the monorepo + workspace setup:
Minimum:
esp-idf-sysesp-idf-halesp-idf-svcesp-idf-template- note - we have to be a bit inventive here, as you cannot just "build" this crate. You need to generate a project from it first.
Extra:
embedded-svc- why not? While it is - strictly speaking - not part of the
esp-idf-*crates, pulling it into the workspace would allow to easily implement changes that transcend both the esp-idf crates and this one. This crate can still be consumed independently from esp-hal (baremetal). nothing will be changed here
- why not? While it is - strictly speaking - not part of the
embuild- 90% (if not 99%) of the usage of this crate is by
esp-idf-syshence the same issues apply here: if we changeembuildin a backward incompatible way and want to use it inesp-idf-sys, we have to patch the sys crate (and all downstream crates) so that they use the un-releasedembuild. - One problem with pulling
embuildinto a workspace setup is that it always compiles against the host target triple, not against the MCU triple. I'm unsure that - given that restriction - we can pull it into the workspace. Maybe, maybe not. We need to experiment withforce-targetin Cargo to see how that would work (if at all)
- 90% (if not 99%) of the usage of this crate is by
What is NOT solved by this proposal?
Ability to do builds against different versions of sdkconfig.defaults. I.e. a BT example might need one sdkconfig.defaults, while an Ethernet one - another.
Currently, we "stuff" a single sdkconfig.defaults with all settings for all examples, and it works.
- Is it a bit of a pain? Yes.
- Is it a big pain that needs to be addressed ASAP? Not so sure.
- Can this be solved after we migrate to monorepo + workspace? I hope so
- Is the need to have multiple
sdkconfig.defaultsincompatible with a monorepo + workspace build? I hope not, but it is only a gut feeling. If you have examples where it would be incompatible - show them
Migration strategy
The biggest pain-point and biggest risk is the GIT migration itself. The baremetal team migrated to a single repo by just generating a bunch of patches (patch-per-commit) for all repos which were migrated to the "monorepo" one. And then one of the repos was chosen as the "monorepo" one.
Everything else (implementing the Cargo.toml of the workspace and so on) can be done post-factum, and I don't see major hurdles there.
Possible migration steps:
- Dedicate
esp-idf-svcas the future monorepo - Branch it; all further operations proceed in the new branch below
- Move
esp-idf-svc's root into a sub-folder namedesp-idf-svcto leave room for the other crates to join the monorepo - Clone locally all of sys, hal, -template and - if we decide so - embuild and embedded-svc
- Generate - for each commit on
masterof each of those repos - a bunch of patches in a directory git applyall these commits on the `esp-idf-svc repo- Biggest unknown to be checked: the patches will be applied on a slightly different setup where the sys, hal, -template and so on will be subdirectories of the
esp-idf-svcdirectory. See how exactly that would work
- Biggest unknown to be checked: the patches will be applied on a slightly different setup where the sys, hal, -template and so on will be subdirectories of the
- Push the changes to the
esp-idf-svcbranch - Create a
Cargo.tomlworkspace file - Move around the sys and hal and svc examples into a common
examples/folder - Fix the CI
- Open a PR against
esp-idf-svc - Once the PR CI is confirmed to build fine, merge the PR
- Rename
esp-idf-svcGIT repo to justesp-idf - Change the readmes of sys, hal, -template, and potentially embuild and embedded-svc to have a "h1" that explains that these are archived, and put a link in there to their new locations, in the monorepo
- Archive sys, hal, -template and so on
Done!
I definitely think this is the right direction, we're finding a lot of success with it in bare-metal land.
esp-idf-template
Imo, this should stay separate, just because it's easy to remember if you need the template it's just cargo generate esp-rs/esp-idf-template, but that's just my opinion, feel free to ignore :).
Monorepo without a workspace setup - is it worth it?
I would say yes, just from our experience - but if you can make the workspace work properly then by all means you should use it :D.
Rename esp-idf-svc GIT repo to just esp-idf
Do you not think there might be some confusion with this? I imagine this will make it very hard to find good search results via google etc. I'd suggest keeping it either esp-idf-hal, esp-idf-svc or bikeshedding some other name. Unless you have a compelling reason to call it esp-idf.
esp-idf-template
Imo, this should stay separate, just because it's easy to remember if you need the template it's just
cargo generate esp-rs/esp-idf-template, but that's just my opinion, feel free to ignore :).
True... even though cargo generate esp-rs/esp-idf/template is not that bad either?
Monorepo without a workspace setup - is it worth it?
I would say yes, just from our experience - but if you can make the workspace work properly then by all means you should use it :D.
Would you elaborate on the benefits of a monorepo without a workspace setup for the baremetal projects? As you can see above, I'm struggling to formulate a single meaningful benefit from this. :) Maybe only atomic commits. BUT: without the "auto-patching" of dependent crates that you get when using a workspace, which is IMO "the" or "a" big deal.
Anything I'm missing here?
Rename esp-idf-svc GIT repo to just esp-idf
Do you not think there might be some confusion with this? I imagine this will make it very hard to find good search results via google etc. I'd suggest keeping it either esp-idf-hal, esp-idf-svc or bikeshedding some other name. Unless you have a compelling reason to call it esp-idf.
Would there be confusion or hard-to-search results indeed?
- All crates remain named as they are now (i.e. crates.io indexing in google remains the same)
- We still have
esp-idf-sys/ hal / svc as sub-directories of the future GITesp-idf(or even justidf) monorepo. Wouldn't that be "good enough" for google? I.e. searching for "edge-http github rust" returns as a top result the "edge-net" monorepo site, just as I would expect...?
Would you elaborate on the benefits of a monorepo without a workspace setup for the baremetal projects?
I can just speak for myself here, but in my experience Rust tooling support for workspaces is fine. Rust-Analyzer is the worst offender I can think of, as Monorepos tend to grow, RA slows down to a snail's pace. RA also only works if the virtual cargo manifest sits in the workspace/ folder root.
The most useful thing for me is that my crates live side by side and I can pin dependencies painlessly by defining most of them in the virtual workspace manifest.
I can just speak for myself here, but in my experience Rust tooling support for workspaces is fine.
Rust-Analyzeris the worst offender I can think of, as Monorepos tend to grow, RA slows down to a snail's pace.
Heh. I would not qualify Rust tooling support as "fine" if RA is unusable. Interesting that RA is affected by the size of the workspace, as my assumption was that it is also affected by the number and size of your dependencies? Where I'm getting at is that with a workspace setup, whatever extra weight we get via workspace deps, we'll lose in the form of out-of-workspace deps, so net-net we should be at the same "weight" at the end.
RA also only works if the virtual cargo manifest sits in the workspace/ folder root.
That's an easy restriction to comply with.
The most useful thing for me is that my crates live side by side and I can pin dependencies painlessly by defining most of them in the virtual workspace manifest.
Exactly. As I do here.
rs-matter by the way also has a workspace setup and works quite OK. And is quite large of a code-base!
Heh. I would not qualify Rust tooling support as "fine" if RA is unusable
Yeah, it's a bit of a disappointment. I currently maintain a small/mid-sized codebase with 10kloc. It's a bit macro heavy because of serde and axum, which I think slows RA down drastically. You have to pick your poison here, either you enable proc-macro support in RA and the LSP slows down to a crawl or you can't enjoy full editor support.
I think heavy macro useage is the number one factor which can slow down RA. The problem with serde is that a ton of code is generated per struct, per attribute macro and per serializer (I support json, cobr and bincode).
Since I haven't identified any macro useage in esp-idf-* crates, this shouldn't be as much of a problem.
Just to add, I think maintaining most crates in the esp-idf-* ecosystem would be great.
Heh. I would not qualify Rust tooling support as "fine" if RA is unusable
Yeah, it's a bit of a disappointment. I currently maintain a small/mid-sized codebase with 10kloc. It's a bit macro heavy because of
serdeandaxum, which I think slows RA down drastically. You have to pick your poison here, either you enable proc-macro support in RA and the LSP slows down to a crawl or you can't enjoy full editor support.I think heavy macro useage is the number one factor which can slow down RA. The problem with
serdeis that a ton of code is generated per struct, per attribute macro and per serializer (I support json, cobr and bincode).Since I haven't identified any macro useage in esp-idf-* crates, this shouldn't be as much of a problem.
rs-matter is a very heavy proc-macro user (more of a per-struct though and even if not serde) and I don't observe any slowdowns there.
Are you sure you don't hit something very specific for your setup?
As for the ESP-IDF crates - yes - no proc-macros there.
Just to add, I think maintaining most crates in the
esp-idf-*ecosystem would be great.
Not sure I follow. We are (or at least we fool) ourselves that we maintain them? Or do you mean paid maintenance?
Just to add, I think maintaining most crates in the
esp-idf-*ecosystem would be great.Not sure I follow. We are (or at least we fool) ourselves that we maintain them? Or do you mean paid maintenance?
Whoops, sorry. I meant adding them to a monorepo. It's been a long day.
Heh. I would not qualify Rust tooling support as "fine" if RA is unusable
Yeah, it's a bit of a disappointment. I currently maintain a small/mid-sized codebase with 10kloc. It's a bit macro heavy because of
serdeandaxum, which I think slows RA down drastically. You have to pick your poison here, either you enable proc-macro support in RA and the LSP slows down to a crawl or you can't enjoy full editor support. I think heavy macro useage is the number one factor which can slow down RA. The problem withserdeis that a ton of code is generated per struct, per attribute macro and per serializer (I support json, cobr and bincode). Since I haven't identified any macro useage in esp-idf-* crates, this shouldn't be as much of a problem.
rs-matteris a very heavy proc-macro user (more of a per-struct though and even if not serde) and I don't observe any slowdowns there.Are you sure you don't hit something very specific for your setup?
As for the ESP-IDF crates - yes - no proc-macros there.
I'm not sure if it's something specific for my setup, I actually had a look at rs-matter when initializing my project.
It's a reference implementation for a state of the art zero-touch/zero-trust enrollment standard. I maintain it at https://github.com/hm-edu/open-brski if you want to take a look. This could also be interesting to @MabezDev I suppose.
I think some open question regarding the migration strategy are still not discussed.
How do we manage the interleaving of the PR numbering for past PR's that are already merged - E.g #440 is something different in hal than in svc. Currently we have a CHANGELOG for every crate and that would also not change, but in every crate we would now have a historical break, so at the migration point we hopefully don't get in a situation where we have a higher number in esp-idf-svc than in esp-idf-hal, such that we get the same number referring to the old number in esp-idf-svc and one in esp-idf-hal's combined repo.
Transferring issues - i assume we will only transfer open issues to the combined repo? That is also a step we do pretty late where we know the other stuff works. Also moved PR's are creating a new issue number i think and so bumping the numbering system of PR's / issues. Something to keep in mind.
I think embuild and the template are relative independent, and we first should only focus on sys/hal/svc. If we want we still can include it later but the operation is already complex enough, so i would appreciate we don't do to much at the same time increasing risk of the unknown. Also embuild itself would not be a good fit for in a workspace.It also provides ldproxy that will always be build against a host-target. That is deeply problematic as a rust workspace just doesn't work with mutti target builds.
Also embuild itself would not be a good fit for in a workspace.It also provides ldproxy that will always be build against a host-target. That is deeply problematic as a rust workspace just doesn't work with mutti target builds.
embuild for the purposes of a build-time dependency for sys/svc and hal might not be a problem, as it is a dev-dependency.
As for embuild for the purposes of ldproxy and cago-pio - it might work if we create two workspaces:
- sys/hal/svc + embuild as a dev-dependency
- ldproxy + cargo-pio + embuild as a regular dependency (that is, if you are allowed to have two bin projects in one workspace)
The key is to check if a single crate can be a member of multiple workspaces. If it can, then we don't have an issue.
I think we need to experiment a bit rather than theorizing, so as to see what works and what not.
Also, and regardless whether the above multi-workspace build works or not, we probably want all of sys/hal/svc/embuild/ldproxy/cargo-pio/template in a single monorepo. This would allow us to play with various workspace setups post-migration. And would allow us to have at least atomic commits spanning multiple crates. (UPDATE: and multiple CIs (if we end up with multiple workspaces) triggering on any commit. As in a commit in embuild would trigger the CI for sys/hal/svc.
And users will stop opening issues "in the wrong repo" because very often they can't really distinguish whether an issue belongs to sys, hal, svc, template or embuild.
How do we manage the interleaving of the PR numbering for past PR's that are already merged - E.g #440 is something different in hal than in svc.
We don't? Legacy PRs remain in the legacy repos which will be archived and readonly but not deleted.
Currently we have a CHANGELOG for every crate and that would also not change, but in every crate we would now have a historical break, so at the migration point we hopefully don't get in a situation where we have a higher number in esp-idf-svc than in esp-idf-hal, such that we get the same number referring to the old number in esp-idf-svc and one in esp-idf-hal's combined repo.
I do not understand this. If you could clarify.
Transferring issues - i assume we will only transfer open issues to the combined repo? That is also a step we do pretty late where we know the other stuff works. Also moved PR's are creating a new issue number i think and so bumping the numbering system of PR's / issues. Something to keep in mind.
I don't think we need to move open PRs at all. They anyway would be totally unmergeable with the new repo. Open issues we might indeed move.
git-filter-repo should make joining together the projects as workspace easily.
I would make an esp-idf empty repo with the workspace and a main readme, then import the full history of all the target esp-idf* repos and at the end setup the new CI.