Refactor binary encoding of canon builtins for easier future extensibility
Currently canon builtins are primarily encoded as a prefix byte plus any payload immediately afterwards. Over time though we might want to add more options/extensibility to preexisting builtins, such as the `try` idea from https://github.com/WebAssembly/component-model/pull/444. In this situation it's always possible to add new builtin codes at the end of the index space, and functionally there's no issue with that. Conceptually though it'd be unfortunate if the same intrinsic could be defined across multiple opcodes, and it can make implementations a little more awkward to maintain -- e.g. parsing is spread out across major opcodes for the "same intrinsic".
An example of this split today is that `0x03` indicates the `resource.drop` intrinsic while `0x07` is `resource.drop async`. Morally these are the same intrinsic, just with a different option, and spreading it out across two opcodes is a little unfortunate.
What I'd envision in the future is something like:
- Each canon builtin gets a prefix opcode, just as today.
- Each canon builtin is then followed by `flags:varu32`, a leb-encoded 32-bit integer. This integer is a bitset of optional fields that follow.
  - For example bit 0 could mean "async" so `resource.drop async` would be encoded as `0x03 0x01` while `resource.drop` would be encoded as `0x03 0x00`.
- The meaning of each bit would be intrinsic-specific, but a loose guideline would be that each bit may optionally indicate that there are more bytes to decode. For example `async?` wouldn't have any more bytes to decode, but some future flag may require another immediate to decode.
- Intrinsics could still reserve the right to use this extensibility `u32` as a way of completely changing how the rest of the intrinsic is encoded, for example in the future an intrinsic might completely drop a canonopt list or something like that.
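
To make the shape of this concrete, here's a rough sketch of what a decoder for the proposed layout might look like. This is only an illustration and not what any implementation does today: the names `read_varu32`, `decode_resource_drop`, and `FLAG_ASYNC` are made up for this example, rejecting unknown flag bits is an assumption on my part, and any remaining payload of the builtin (for example a type index) is left to the caller to decode from the returned position.

```python
# Hypothetical sketch of the proposed layout: opcode byte, then flags:varu32.
RESOURCE_DROP = 0x03  # opcode for resource.drop cited in this issue
FLAG_ASYNC = 1 << 0   # bit 0 meaning "async" is the example from this issue


def read_varu32(data, pos):
    """Decode an unsigned LEB128 32-bit integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("unexpected end of input in varu32")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            break
        shift += 7
        if shift >= 35:  # a u32 never needs more than 5 LEB bytes
            raise ValueError("varu32 is too long")
    if result > 0xFFFFFFFF:
        raise ValueError("varu32 out of range")
    return result, pos


def decode_resource_drop(data):
    """Decode `0x03 flags` and return (is_async, next_pos)."""
    if not data or data[0] != RESOURCE_DROP:
        raise ValueError("not a resource.drop builtin")
    flags, pos = read_varu32(data, 1)
    # Assumption: bits this decoder doesn't understand are an error, since
    # an unknown bit may imply extra bytes that it can't skip over.
    unknown = flags & ~FLAG_ASYNC
    if unknown:
        raise ValueError(f"unknown flag bits: {unknown:#x}")
    return bool(flags & FLAG_ASYNC), pos
```

With this, `decode_resource_drop(bytes([0x03, 0x01]))` returns `(True, 2)` and `decode_resource_drop(bytes([0x03, 0x00]))` returns `(False, 2)`, so both variants go through one opcode and one parsing path.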
I don't think we should make this change in the near term per se as this is basically just a stylistic concern for the binary format. This might be good to finalize/discuss just before a final release of the component model though.