esp-idf-sys Proposal: Monorepo + workspace build for the `esp-idf-*` crates

Overview

This proposal suggests migrating the esp-idf-* crates' GIT repos (and a few others, see below) to:

A monorepo setup
A workspace re-organization of the crates

(This change was inspired from the experience I got with the monorepo-yet-micro-crates approach of edge-net which works quite nicely.)

What will NOT be changed by this proposal?

Crates' names and their consumption from crates.io. From the POV of the end user, everything remains backwards compatible:

Crates' names are not changed
Crates are still "micro-crates", i.e. esp-idf-sys, esp-idf-hal, esp-idf-svc are separate crates

But why?

The main reason is that the esp-idf-* crates are highly inter-dependent at it is just easier to develop them in a single repo, as part of a workspace setup.

Major pain points of the current micro-repo approach, examples below.

Introducing changes across > 1 crate

A backwards-incompatible change to esp-idf-hal master often implies a backwards-incompatible change to esp-idf-svc master:
- To model this change, we first need to make the change in esp-idf-hal.
- Further, and with a separate commit (non-atomic GIt changes, not nice) we need to introduce the change in esp-idf-svc too.
- Finally, Cargo.toml of esp-idf-svc needs to be patched with [patch.crates-io] to use the master (GIT) repo. And we need not to forget to remove this patch prior to publishing the esp-idf-svc crate

Contrast this with a monorepo with a workspace setup:

A single, atomic commit / PR is addressing all changes across esp-idf-sys, esp-idf-hal, esp-idf-svc, and if necessary - embedded-svc (if it is part of the repo), and embuild
NO need to do Cargo.toml equilibristics, as - with a workspace setup - the deps between the crates are expressed using a path notation syntax (which cargo publish automatically turns on references to the published crates during publishing). It just works!

Slow CI

CI is currently slow, because of the fact that the build of esp-idf-sys itself accounts to 80% of the total build time of any other downstream crate (hal, svc, the -template).

Because we are using separate GIT repos for each micro-crate, their CIs are in fact repeating the (slow) build of esp-idf-sys multiple times:

Done once for the CI of esp-idf-sys itself (I mean, multiple times, but once per ESP IDF version + target + MCU triple + type of build)
Ditto for esp-idf-hal
Ditto for esp-idf-svc
Ditto for esp-idf-template

Contrast this with a workspace setup:

There will be a single CI, which just builds the workspace. What it would mean:
a) All the examples, and downstream crates from them
b) A project generated by esp-idf-template

... as a side effect, all downstream crates will be built. And with non-incremental build only the first time. Doing no_std and so on re-builds will be incremental.

Worse: Unreliable CI / testing changes

Because CI is so slow, and because we are duplicating what amounts to the same build pipeline across 3-4 crates, there is NO single CI which does all the builds against everything:

esp-idf-sys builds against different versions of ESP IDF compared to hal and svc
PIO build is only done by some crates' CI
builds against ESP IDF master are done by the hal crate, but not others
CMake builds are only done by the esp-idf-template crate, against released versions of the other crates (i.e. we don't know if the code in upstream crates will break the CMake build until after we release them, which sucks)

No unified examples

Each crate has its own examples folder. On one hand, this is nice, because some examples need only - say - esp-idf-hal so it is somewhat nice to show that these examples can work without esp-idf-svc.

On the other hand, the hal examples cannot use the Rust log crate, as support for it is only in esp-idf-svc, so they use println!, which is not nice at all.

Further, the reality is, NO useful end-user application of the esp-idf-* crates can be implemented without depending on the svc crate, so hal-only examples are a solution looking for a problem really.

( Digression: We recognized this problem several months ago, and this is one of the reasons why the hal crate re-exports the sys crate under esp_idf_hal::sys and further - the svc crate re-exports both sys and hal. This way, the user needs - in 99% of the cases - only one dependency - esp-idf-svc. Via this single dependency, the user can still call into hal and sys as these are re-exported.

Further, such a setup lifts from the shoulders of the user the problem that otherwise she has to think about what hal version matches its svc version.

For niche cases, user can still depend directly or only on sys and/or hal, or any combination of the crates thereof, but that's the exception rather than the rule, so we are not optimizing for it; but also not making it any harder.

In retrospective - we could've renamed esp-idf-svc to just esp-idf or we could've created a new aggregationr crate named esp-idf - if only esp-idf was not taken already by somebody in crates.io - and taken under the sub-optimal esp_idf name, which prohibits the usage of esp-idf (note that dash vs underscore) by somebody else either! )

While we can move all examples to esp-idf-svc without introducing monorepo and workspace setup, if we move to a monorepo setup, it does make sense to move the examples to the workspace level in a common examples/ folder.

Monorepo without a workspace setup - is it worth it?

I'm not sure, but I'm inclined to say: NO.

Because a lot of the benefits actually come from the workspace setup, not from the monorepo per-se. The workspace setup requires monorepo of course. Examples where workspace would help, but "just monorepo" won't:

No need to patch Cargo.toml for backwards-incompatible changes. Without workspace setup, we still need to do this
Faster, unified, single CI build: without a workspace build we still need separate builds for each crate. If for nothing else, then for the examples (if all examples were in esp-idf-svc, we can just build esp-idf-svc and not even build the sys and hal; but we still need to build esp-idf-template).

What crates should be part of the monorepo + workspace setup:

Minimum:

esp-idf-sys
esp-idf-hal
esp-idf-svc
esp-idf-template - note - we have to be a bit inventive here, as you cannot just "build" this crate. You need to generate a project from it first.

Extra:

embedded-svc
- why not? While it is - strictly speaking - not part of the esp-idf-* crates, pulling it into the workspace would allow to easily implement changes that transcend both the esp-idf crates and this one. This crate can still be consumed independently from esp-hal (baremetal). nothing will be changed here
embuild
- 90% (if not 99%) of the usage of this crate is by esp-idf-sys hence the same issues apply here: if we change embuild in a backward incompatible way and want to use it in esp-idf-sys, we have to patch the sys crate (and all downstream crates) so that they use the un-released embuild.
- One problem with pulling embuild into a workspace setup is that it always compiles against the host target triple, not against the MCU triple. I'm unsure that - given that restriction - we can pull it into the workspace. Maybe, maybe not. We need to experiment with force-target in Cargo to see how that would work (if at all)

What is NOT solved by this proposal?

Ability to do builds against different versions of sdkconfig.defaults. I.e. a BT example might need one sdkconfig.defaults, while an Ethernet one - another.

Currently, we "stuff" a single sdkconfig.defaults with all settings for all examples, and it works.

Is it a bit of a pain? Yes.
Is it a big pain that needs to be addressed ASAP? Not so sure.
Can this be solved after we migrate to monorepo + workspace? I hope so
Is the need to have multiple sdkconfig.defaults incompatible with a monorepo + workspace build? I hope not, but it is only a gut feeling. If you have examples where it would be incompatible - show them

Migration strategy

The biggest pain-point and biggest risk is the GIT migration itself. The baremetal team migrated to a single repo by just generating a bunch of patches (patch-per-commit) for all repos which were migrated to the "monorepo" one. And then one of the repos was chosen as the "monorepo" one.

Everything else (implementing the Cargo.toml of the workspace and so on) can be done post-factum, and I don't see major hurdles there.

Possible migration steps:

Dedicate esp-idf-svc as the future monorepo
Branch it; all further operations proceed in the new branch below
Move esp-idf-svc's root into a sub-folder named esp-idf-svc to leave room for the other crates to join the monorepo
Clone locally all of sys, hal, -template and - if we decide so - embuild and embedded-svc
Generate - for each commit on master of each of those repos - a bunch of patches in a directory
git apply all these commits on the `esp-idf-svc repo
- Biggest unknown to be checked: the patches will be applied on a slightly different setup where the sys, hal, -template and so on will be subdirectories of the esp-idf-svc directory. See how exactly that would work
Push the changes to the esp-idf-svc branch
Create a Cargo.toml workspace file
Move around the sys and hal and svc examples into a common examples/ folder
Fix the CI
Open a PR against esp-idf-svc
Once the PR CI is confirmed to build fine, merge the PR
Rename esp-idf-svc GIT repo to just esp-idf
Change the readmes of sys, hal, -template, and potentially embuild and embedded-svc to have a "h1" that explains that these are archived, and put a link in there to their new locations, in the monorepo
Archive sys, hal, -template and so on

Done!

Jun 12 '24 07:06 ivmarkov

I definitely think this is the right direction, we're finding a lot of success with it in bare-metal land.

esp-idf-template

Imo, this should stay separate, just because it's easy to remember if you need the template it's just cargo generate esp-rs/esp-idf-template, but that's just my opinion, feel free to ignore :).

Monorepo without a workspace setup - is it worth it?

I would say yes, just from our experience - but if you can make the workspace work properly then by all means you should use it :D.

Rename esp-idf-svc GIT repo to just esp-idf

Do you not think there might be some confusion with this? I imagine this will make it very hard to find good search results via google etc. I'd suggest keeping it either esp-idf-hal, esp-idf-svc or bikeshedding some other name. Unless you have a compelling reason to call it esp-idf.

Jun 12 '24 09:06 MabezDev

esp-idf-template

Imo, this should stay separate, just because it's easy to remember if you need the template it's just cargo generate esp-rs/esp-idf-template, but that's just my opinion, feel free to ignore :).

True... even though cargo generate esp-rs/esp-idf/template is not that bad either?

Monorepo without a workspace setup - is it worth it?

I would say yes, just from our experience - but if you can make the workspace work properly then by all means you should use it :D.

Would you elaborate on the benefits of a monorepo without a workspace setup for the baremetal projects? As you can see above, I'm struggling to formulate a single meaningful benefit from this. :) Maybe only atomic commits. BUT: without the "auto-patching" of dependent crates that you get when using a workspace, which is IMO "the" or "a" big deal.

Anything I'm missing here?

Rename esp-idf-svc GIT repo to just esp-idf

Do you not think there might be some confusion with this? I imagine this will make it very hard to find good search results via google etc. I'd suggest keeping it either esp-idf-hal, esp-idf-svc or bikeshedding some other name. Unless you have a compelling reason to call it esp-idf.

Would there be confusion or hard-to-search results indeed?

All crates remain named as they are now (i.e. crates.io indexing in google remains the same)
We still have esp-idf-sys / hal / svc as sub-directories of the future GIT esp-idf (or even just idf) monorepo. Wouldn't that be "good enough" for google? I.e. searching for "edge-http github rust" returns as a top result the "edge-net" monorepo site, just as I would expect...?

Jun 12 '24 11:06 ivmarkov

Would you elaborate on the benefits of a monorepo without a workspace setup for the baremetal projects?

I can just speak for myself here, but in my experience Rust tooling support for workspaces is fine. Rust-Analyzer is the worst offender I can think of, as Monorepos tend to grow, RA slows down to a snail's pace. RA also only works if the virtual cargo manifest sits in the workspace/ folder root.

The most useful thing for me is that my crates live side by side and I can pin dependencies painlessly by defining most of them in the virtual workspace manifest.

Jun 27 '24 14:06 juliankrieger

I can just speak for myself here, but in my experience Rust tooling support for workspaces is fine. Rust-Analyzer is the worst offender I can think of, as Monorepos tend to grow, RA slows down to a snail's pace.

Heh. I would not qualify Rust tooling support as "fine" if RA is unusable. Interesting that RA is affected by the size of the workspace, as my assumption was that it is also affected by the number and size of your dependencies? Where I'm getting at is that with a workspace setup, whatever extra weight we get via workspace deps, we'll lose in the form of out-of-workspace deps, so net-net we should be at the same "weight" at the end.

RA also only works if the virtual cargo manifest sits in the workspace/ folder root.

That's an easy restriction to comply with.

The most useful thing for me is that my crates live side by side and I can pin dependencies painlessly by defining most of them in the virtual workspace manifest.

Exactly. As I do here.

rs-matter by the way also has a workspace setup and works quite OK. And is quite large of a code-base!

Jun 27 '24 15:06 ivmarkov

Heh. I would not qualify Rust tooling support as "fine" if RA is unusable

Yeah, it's a bit of a disappointment. I currently maintain a small/mid-sized codebase with 10kloc. It's a bit macro heavy because of serde and axum, which I think slows RA down drastically. You have to pick your poison here, either you enable proc-macro support in RA and the LSP slows down to a crawl or you can't enjoy full editor support.

I think heavy macro useage is the number one factor which can slow down RA. The problem with serde is that a ton of code is generated per struct, per attribute macro and per serializer (I support json, cobr and bincode).

Since I haven't identified any macro useage in esp-idf-* crates, this shouldn't be as much of a problem.

Jun 27 '24 15:06 juliankrieger

Just to add, I think maintaining most crates in the esp-idf-* ecosystem would be great.

Jun 27 '24 15:06 juliankrieger

Heh. I would not qualify Rust tooling support as "fine" if RA is unusable

Yeah, it's a bit of a disappointment. I currently maintain a small/mid-sized codebase with 10kloc. It's a bit macro heavy because of serde and axum, which I think slows RA down drastically. You have to pick your poison here, either you enable proc-macro support in RA and the LSP slows down to a crawl or you can't enjoy full editor support.

I think heavy macro useage is the number one factor which can slow down RA. The problem with serde is that a ton of code is generated per struct, per attribute macro and per serializer (I support json, cobr and bincode).

Since I haven't identified any macro useage in esp-idf-* crates, this shouldn't be as much of a problem.

rs-matter is a very heavy proc-macro user (more of a per-struct though and even if not serde) and I don't observe any slowdowns there.

Are you sure you don't hit something very specific for your setup?

As for the ESP-IDF crates - yes - no proc-macros there.

Jun 27 '24 15:06 ivmarkov

Just to add, I think maintaining most crates in the esp-idf-* ecosystem would be great.

Not sure I follow. We are (or at least we fool) ourselves that we maintain them? Or do you mean paid maintenance?

Jun 27 '24 15:06 ivmarkov

Just to add, I think maintaining most crates in the esp-idf-* ecosystem would be great.

Not sure I follow. We are (or at least we fool) ourselves that we maintain them? Or do you mean paid maintenance?

Whoops, sorry. I meant adding them to a monorepo. It's been a long day.

Jun 27 '24 15:06 juliankrieger

Heh. I would not qualify Rust tooling support as "fine" if RA is unusable

Yeah, it's a bit of a disappointment. I currently maintain a small/mid-sized codebase with 10kloc. It's a bit macro heavy because of serde and axum, which I think slows RA down drastically. You have to pick your poison here, either you enable proc-macro support in RA and the LSP slows down to a crawl or you can't enjoy full editor support. I think heavy macro useage is the number one factor which can slow down RA. The problem with serde is that a ton of code is generated per struct, per attribute macro and per serializer (I support json, cobr and bincode). Since I haven't identified any macro useage in esp-idf-* crates, this shouldn't be as much of a problem.

rs-matter is a very heavy proc-macro user (more of a per-struct though and even if not serde) and I don't observe any slowdowns there.

Are you sure you don't hit something very specific for your setup?

As for the ESP-IDF crates - yes - no proc-macros there.

I'm not sure if it's something specific for my setup, I actually had a look at rs-matter when initializing my project.

It's a reference implementation for a state of the art zero-touch/zero-trust enrollment standard. I maintain it at https://github.com/hm-edu/open-brski if you want to take a look. This could also be interesting to @MabezDev I suppose.

Jun 27 '24 16:06 juliankrieger

I think some open question regarding the migration strategy are still not discussed.

How do we manage the interleaving of the PR numbering for past PR's that are already merged - E.g #440 is something different in hal than in svc. Currently we have a CHANGELOG for every crate and that would also not change, but in every crate we would now have a historical break, so at the migration point we hopefully don't get in a situation where we have a higher number in esp-idf-svc than in esp-idf-hal, such that we get the same number referring to the old number in esp-idf-svc and one in esp-idf-hal's combined repo.

Transferring issues - i assume we will only transfer open issues to the combined repo? That is also a step we do pretty late where we know the other stuff works. Also moved PR's are creating a new issue number i think and so bumping the numbering system of PR's / issues. Something to keep in mind.

Jul 08 '24 09:07 Vollbrecht

I think embuild and the template are relative independent, and we first should only focus on sys/hal/svc. If we want we still can include it later but the operation is already complex enough, so i would appreciate we don't do to much at the same time increasing risk of the unknown. Also embuild itself would not be a good fit for in a workspace.It also provides ldproxy that will always be build against a host-target. That is deeply problematic as a rust workspace just doesn't work with mutti target builds.

Jul 08 '24 09:07 Vollbrecht

Also embuild itself would not be a good fit for in a workspace.It also provides ldproxy that will always be build against a host-target. That is deeply problematic as a rust workspace just doesn't work with mutti target builds.

embuild for the purposes of a build-time dependency for sys/svc and hal might not be a problem, as it is a dev-dependency.

As for embuild for the purposes of ldproxy and cago-pio - it might work if we create two workspaces:

sys/hal/svc + embuild as a dev-dependency
ldproxy + cargo-pio + embuild as a regular dependency (that is, if you are allowed to have two bin projects in one workspace)

The key is to check if a single crate can be a member of multiple workspaces. If it can, then we don't have an issue.

I think we need to experiment a bit rather than theorizing, so as to see what works and what not.

Also, and regardless whether the above multi-workspace build works or not, we probably want all of sys/hal/svc/embuild/ldproxy/cargo-pio/template in a single monorepo. This would allow us to play with various workspace setups post-migration. And would allow us to have at least atomic commits spanning multiple crates. (UPDATE: and multiple CIs (if we end up with multiple workspaces) triggering on any commit. As in a commit in embuild would trigger the CI for sys/hal/svc.

And users will stop opening issues "in the wrong repo" because very often they can't really distinguish whether an issue belongs to sys, hal, svc, template or embuild.

Jul 08 '24 10:07 ivmarkov

How do we manage the interleaving of the PR numbering for past PR's that are already merged - E.g #440 is something different in hal than in svc.

We don't? Legacy PRs remain in the legacy repos which will be archived and readonly but not deleted.

Currently we have a CHANGELOG for every crate and that would also not change, but in every crate we would now have a historical break, so at the migration point we hopefully don't get in a situation where we have a higher number in esp-idf-svc than in esp-idf-hal, such that we get the same number referring to the old number in esp-idf-svc and one in esp-idf-hal's combined repo.

I do not understand this. If you could clarify.

Transferring issues - i assume we will only transfer open issues to the combined repo? That is also a step we do pretty late where we know the other stuff works. Also moved PR's are creating a new issue number i think and so bumping the numbering system of PR's / issues. Something to keep in mind.

I don't think we need to move open PRs at all. They anyway would be totally unmergeable with the new repo. Open issues we might indeed move.

Jul 08 '24 10:07 ivmarkov

git-filter-repo should make joining together the projects as workspace easily.

I would make an esp-idf empty repo with the workspace and a main readme, then import the full history of all the target esp-idf* repos and at the end setup the new CI.

Nov 06 '24 15:11 lu-zero