miden-vm
miden-vm copied to clipboard
Crate organization
Some thoughts about potential crate organization options. First, the current organization roughly looks like this:
Most crates in the above are self-explanatory, but miden-core
deserves a few extra words. Currently, miden-core
contains the following components:
- Some common imports from Winterfell - e.g.,
Felt
,FieldElement
,StarkField
etc. - MAST-related structs - e.g.,
Program
,CodeBlock
,Operation
etc. - Hash function related components (e.g.,
Digest
) - these are mostly re-exported from Winterfell. -
AdviceSet
and structs - e.g.,MerkleTree
,MerklePathSet
etc. - Trace layout constants.
- Some commonly-used utility functions (some re-exported from Winterfell and others defined in the crate).
For v0.4, we already have plans to do the following:
- Move hash function and crypto-related structs (e.g.,
MerkleTree
) into miden-crypto crate. - Move other
AdviceSet
-related components intomiden-processor
crate.
Beyond this, I think we can also do the following:
- Move trace layout constants out of
miden-core
. These can go either into its own crate (e.g.,miden-vm-trace
) or maybe intomiden-air
trace. In the latter case, we'll need to makemiden-processor
depend onmiden-air
, but this may not be a problem. - Rename
miden
crate intomiden-vm
- Add a new crate called
miden-objects
(other suggestions welcome). This create will contain foundational components for the rollup. For example, structs forAsset
,Account
,Note
,Transaction
etc. - Add a new crate called
miden-lib
. This crate will contain Miden assembly modules we'll use in a rollup (e.g., transaction kernel, transaction prologue/epilogue etc.).
Preliminary dependency graph accounting for these changes would look like this:
The above assumes that trace layout constants have been moved from miden-core
to miden-air
- though, we don't need to do this right away.
The two new crates (miden-objects
and miden-lib
) could live in the same repo under 0xPolygonMiden. Probably something like minde-base
.
Also, the current repo will probably be renamed to miden-vm
and moved to the same org.
A couple of suggestions:
Single-import
We could set a rule to never import dependencies of dependencies directly.
In this example, C
cannot require A
as dependencies
; instead, B
should re-export A
, and C
should fetch A
from B
.
Benefits: We will never have a tree conflict of B
requiring a different A
than C
.
Transitive simplification
The following graph suggests that B
should exist either as module of A
or C
, unless there is a strong domain separation between them
Candidates for this check:
- miden-crypto: could be a module of miden-core
I'm going to add a few other considerations to the discussion
- Number of distinct repos
Having a long chain of repos results into slower feature releases. The user-facing program is the end in the dependency chain, so the amount of work is proportional to the number of distinct repos used by the dependencies. IOW updating more repos means more PRs, reviews and CI runs. This problems gets worse as the owner of the libraries changes.
This is a bit unavoidable, as the domain can be so different that most people won't have the competency to work on the whole stack. But, to me that means we shouldn't have overly generic crates like miden-objects
on a distinct repo, because the extra work to update it wouldn't be from a difference in domain knowledge, but just pure code organization.
- Number of distinct crates
I think having more crates is better, one of the big reasons is that crates allows for better compilation parallelism. (There is the caveat of code bloat, since monomorphization is done per crate, but IIRC duplicate code elimination happens at linking time, so this is mainly a disk usage kinda of thing).
But this is only an advantage if the dependencies are broad instead of deep, so I think it is better for miden-core
to be a thin crate, or to not exist at all.
- Naming and discoverability
I think the crate name should clearly inform someone of its contents, without needing to read a README or the code itself. I also think names should be consistent, and that we shouldn't rename things, because it makes it easy to see a use
directive and tell where the code is coming from. For this reason I would avoid crates like miden-object
or miden-core
, since that is not informative, and preferably not rename the crates in the Cargo.toml
. I also think that this consistency should apply to repo names, so miden-crypto
would IMO be a better name for it.
- Single imports
Just to document my perspective. I still haven't really understood the benefit of this approach. What I understood is that every time we add a new type on crate, it must be exported on every parent in the dependency chain. With the dependency graph above, if we add the SMT
on miden-crypto
, we have to update miden-core
, miden-air
, miden-prover
to get it to miden
?
The rationale of avoiding multiple doesn't make sense to me, since that is determined by the dependencies in the Cargo.toml and not source code re-exports.
The rationale of moving code around also doesn't seem super appealing, why would we want to future proof moving code crypto
to air
? I can understand moving a dependency down the dependency graph, and re-exporting it on the previous crate, but that would need re-exports as moves happen, and not the other way around.
This also makes code discoverability harder, because one would have to go over each re-export until finding the real implementation (for instance, while browsing code over GH, instead of local code with a language server)
Another thing to consider with the single imports, this makes searching for code in the generate docs a bit harder, since the same structs are defined multiple times and at first glance there is no way to tell if it is just a re-export or another struct with the same name: