ref-fvm
ref-fvm copied to clipboard
ipld block syscalls naming
Context
The syscalls bound to the invocation container are:
ipld::openipld::createipld::readipld::statipld::cid
The syscalls as named in the kernel are the following, under the BlockOps trait:
block_openblock_createblock_readblock_statblock_link
There are two problems here.
Problem 1: Naming/categorization
We should name these equally on both sides. Two proposals:
- Name the system
ipldand namespace block operations under theblock_prefix, i.e.:ipld::block_open. - Consider that IPLD is a foundational component of the stack, and as such it permeates everything. Having a dedicated
ipldsubsystem sounds odd from those lens. Instead, dropipld::and name this subsystemblock::, i.e.block::open. Block operations use IPLD under the hood, like everything else.
Problem 2: ipld::cid vs. block_link
Both these names are confusing.
cidimplies that only a CID is being computed, when in reality the block is also being added to the reachable set.linkin the context of blocks suggests that a block is being linked to another block, which is not the case.
Proposals and rationale:
block::promoteoripld::block_promote: promotes a block to the reachable set and returns its CID.block::exportoripld::block_export: exports a block outside the actor context and returns its CID.block::publishoripld::block_publish: publishes a block outside the actor context and returns its CID.
Problem 1.1:
We have the block prefix in the Kernel because, e.g., open, by itself, is ambiguous. This mirrors MessageOps (msg_) and NetworkOps (network_). Honestly, I'd go the other way and merge all of the kernel traits into a single trait.
Problem 1.2:
I wouldn't just call it block::, as:
- At the moment, this is the only IPLD API exposed to actors.
- Block is ambiguous (usually refers to "blocks" as in blockchain).
We could rename create to write_block and read to read_block as we will likely introduce non-block-based versions (e.g., an ipld::build and, maybe, an ipld::decode). But I'm not convinced it's worth it.
Problem 2
For some context:
- I called this
cidfirst (because it computes the cid). - Then I called it
link, to mirror the unix "link" syscall and to make it clear that it's an action. But I never renamed the syscall, just the kernel function name.
link was supposed to mirror the "link" syscall. You're giving a "name" to a block. It's not quite the same thing, but it fits with the filesystem theme (open, read, stat, link) where create is the exception.
I'm not totally happy with link, but I much prefer it to promote, export, and publish. The user's mental model should be "I've created a block, now I'm 'linking' it into my state, so I call 'link' and put that CID into my state".
I'd go the other way and merge all of the kernel traits into a single trait.
This is really just cosmetics. I much prefer the trait-per-family approach as it provides organization and makes code easier to navigate. It also enables us to mock/stub operation families. If anything, I'd vote to make them even more granular going forward (e.g. IpldBlockOps, IpldDagOps, etc.)
However, traits by themselves don't provide function namespacing, so we should include the family as a prefix in the function names. For the IPLD ops, I'd clean things up in this manner:
- Syscall names:
ipld::block_open,ipld::block_write... - Kernel side: for now I'd go with a single trait for all IPLD ops:
trait IpldOps {
fn ipld_block_open();
fn ipld_block_write();
...
}
Syscall nomenclature
It would be good to standardise on a nomenclature on both the invocation container and the kernel. On the container side, I'd lean towards <namespace>::[group_]<name>, where group is optional and refers to a subfamily of operations.
## ipld namespace, naturally lends itself to grouping
ipld::block_write
ipld::block_stat
ipld::dag_build
ipld::dag_destroy
ipld::selector_prepare
ipld::selector_apply
ipld::selector_delete
## crypto namespace
crypto::seal_verify
crypto::seal_verify_batch
crypto::seal_verify_aggregate
crypto::hash_compute
...
ipld::link
The user's mental model should be "I've created a block, now I'm 'linking' it into my state, so I call 'link' and put that CID into my state".
I seriously think that link is a confusing choice. And I don't think this mental model is correct. This op in itself does not link a block to my state. Referencing this block from another that eventually connects to the root is what links the block to my state. To put it simply, I could call link on a block, get a CID back and never use it. When the buffered blockstore commits the state to disk, it will discard my block. Therefore calling link never truly linked my block.
Apart from that, the term link is ambiguous when DAGs are involved. It can easily be interpreted as "this operation will create a link from one block to other." (which is actually more functionally equivalent to the Linux link syscall...)
In my head, this operation today just marks a block as "elegible" for writing (because now I have a CID I can use elsewhere). It kinda feels like Git's staging area. Maybe block_stage is another option?
It also enables us to mock/stub operation families
Not unless we have some form of meta "composed kernel" type that lets us layer a mock over a kernel. But I guess we could have that. And I agree that the grouping is nice.
Syscall names: ipld::block_open, ipld::block_write...
Yeah, SGTM. That makes it clear that this is all block-level.
I seriously think that link is a confusing choice. And I don't think this mental model is correct. This op in itself does not link a block to my state. Referencing this block from another that eventually connects to the root is what links the block to my state. To put it simply, I could call link on a block, get a CID back and never use it. When the buffered blockstore commits the state to disk, it will discard my block. Therefore calling link never truly linked my block.
It's not great, but, well, this is the operation that creates a "link" to the block. All the other proposals make it sound like it does something super special and complicate matters. Yes, this is where we treat the block as "reachable" as well, but I really don't want users to think about that. They should focus on the idea of creating IPLD blocks and linking them together, not "promoting", storing, exporting, staging, etc.
My main gripe is still ambiguity. When I think of IPLD I naturally think of DAGs, and in that context link means something entirely different. So if it's confusing for me as an FVM core engineer, I'm suspect it'll be confusing for users too.
How about ipld::block_commit()?
ipld::block_save()?
ipld::block_save() works for me.